Currently, the views data cache class rebuilds the data too often, so if table data is not stored on the class, it will rebuild the data (hook_views_data) rather than trying to retrieve the cached data for all tables. This can then be used to populate on the class storage, and then saved as a per table entry too. This should stop alot of rebuilds.
Also, If the data needs to be re cached, it will be set for all tables every time. We want this to just cache the tables we have requested. This will then reduce the overhead on requests when the view cache is rebuilt. Which can take a while when there is alot of tables in the info data.
Let's tidy this class up by trying to firstly retrieve data from the cache first, rather than invoking again, and changing the class logic to only cache tables that have been requested and need to be rebuilt.
Comment | File | Size | Author |
---|---|---|---|
#42 | views-data-cache-rebuild-1944674-42.patch | 459 bytes | mxwright |
#37 | views-data-cache-rebuild-1944674-37.patch | 5.34 KB | Berdir |
#37 | views-data-cache-rebuild-1944674-37-interdiff.txt | 540 bytes | Berdir |
#35 | views-data-cache-rebuild-1944674-35-tests-only.patch | 3.68 KB | Berdir |
#35 | views-data-cache-rebuild-1944674-35.patch | 5.37 KB | Berdir |
Comments
Comment #1
damiankloip CreditAttribution: damiankloip commentedHere is a first attempt. Let's question/discuss the actual approach, then I can add some more test coverage.
Comment #2
damiankloip CreditAttribution: damiankloip commentedComment #3
dawehnerThis comment about getData() doesn't seem to be true anymore with the updated code, as it actually matters whether get() was called without a specific table.
Couldn't we just push directly in the else?
I'm wondering whether we should have one set() method for a single table and one for all, as set() is public, it shouldn't care about internal implementation details(the fact that we have a cache).
Comment #5
damiankloip CreditAttribution: damiankloip commentedThanks for the review!
I updated the buildAll property comment.
Yeah, I thought this to start with, but then realised that if it lives in the else then a non existent table would be added to the requestedTables array, and cached.
How about we just make the set() method protected? As people shouldn't be using this to set anything, they should just be using this to retrieve data.
Comment #6
damiankloip CreditAttribution: damiankloip commentedSorry, The $from_cache variable needs to be set first! This patch should actually work.
Comment #7
dawehnerTagging.
Comment #8
dawehnerThis looks really really good!
The config is justed used in the constructor, so what about not storing it at all? (Probably a follow up)
Over 80 chars
Comment #9
BerdirWhy array_push() and not a simple $this->requestedTables[] = $key?
How useful are the set and cacheGet() methods actually with this? The language suffix handling is a separate helper method now, the only thing they do is the skipCache check, which could just as well be directly in destruct() because it would save us from looping over the requested tables and then not actually saving them.
I have a separate issue that adds a setMultiple() where I already killed the set method so that I can do a single set for all requested cache tables.
Other than that, agreed that this looks nice.
Comment #10
damiankloip CreditAttribution: damiankloip commentedNice, thanks! I have rerolled in those changes. I may add more tests to this today/tomorrow as mentioned in the other issue.
Comment #11
olli CreditAttribution: olli commentedOops.
Not used anymore.
This could be named cacheSet.
$key is undefined here.
This does not need to assign $cid.
Would it make sense to do $this->cacheSet($data) here so parallel requests would read the data from cache?
Comment #12
damiankloip CreditAttribution: damiankloip commentedThanks for your review, some good points.
Regarding the cacheSet, stuff has changed so this is done in destruct() so I think atleast it should stay here.
I will write some more unit tests for this tomorrow.
Comment #13
dawehner.
Comment #14
BerdirTested this manually and it works lovely. Only a handful of views_data:$table:en cache entries are written (node, node_revision, taxonomy_term, users, file_managed, comments), clicking around in the views UI relies on the full cache entry and does write any additional cache entries except the one for views itself. Awesome!
That said, I still think the set() could be inlined in destroy() and the skipCache checked moved into the existing !empty($this->storage) if. Will be easier for me to re-roll the cache()->setMultiple() patch afterwards.
Comment #15
dawehnerYeah it's not public method anymore, so it seems okay to not overabstract that.
Comment #16
BerdirShould be $this->storage[$table] in the new set() call.
Comment #17
dawehnerGood catch!
Comment #18
dawehnerhehe
Comment #19
BerdirOk, this looks great to me. Waiting for the promised tests by @damiankloip but this looks RTBC to me otherwise. One important thing to test would be to make sure that multiple requests for a missing table definition does not result in multiple views data rebuilds/writes, a thing that is currently plaguing 7.x-3.x
Also marking as needs backport, while the API itself is completely different, this would be extremely valuable thing to have in 7.x-3.x
Comment #20
damiankloip CreditAttribution: damiankloip commentedOK, so I added alot more test coverage for this.
It did uncover a sneaky bug, which is basically what berdir suggested in #19; When an invalid table was requested, getData() was being called again because if there was no cache for $cid, it was just getting all data again in the else, So I changed this to an
elseif(!this->fullyLoaded)
, so if data has already been fully loaded, and it doesn't find the table, don't get it again.The interdiff is slightly wrong, I changed
\Drupal::state()
to$this->container->get('state')
in the setUp for the test.Comment #22
damiankloip CreditAttribution: damiankloip commentedSorry, and without a test file in there that wasn't meant to be!
Comment #24
damiankloip CreditAttribution: damiankloip commentedWe need to clear the views data now it actually works properly, funny what things escape being seen when the data cache is rebuilt for table it can;t find.
Comment #25
BerdirCan't really parse that sentence.
Minor but when reading how it was used, I was wondering if the argument should be inverted to $increment = TRUE. I think that would make more sense, reading something like assertCountIncrement(TRUE) sounds like it would actually test that it was incremented.
Comment #26
damiankloip CreditAttribution: damiankloip commentedThanks berdir, leave it with me!
Comment #27
damiankloip CreditAttribution: damiankloip commentedHere are those changes.
Comment #28
BerdirGreat, thanks for the quick re-roll. Great work, I think this is ready to go. Very happy to have this resolved for 8.x, now we just need to find a way to backport what we can :)
Comment #29
xjm#27: 1944674-27.patch queued for re-testing.
Comment #30
catchLooks much better! Committed/pushed to 8.x.
Moving to 7.x for backport.
Comment #31
BerdirSeeing the per-table cache in action on a big, message site, I think this is major.
Here's a start to implement this in 7.x-3.x. No tests yet, sadly, but seems to be working quite fine.
Comment #32
BerdirGrml, same patch with correct prefix.
Comment #33
BerdirHad to update the if statement to not ignore empty tables.
Comment #34
damiankloip CreditAttribution: damiankloip commentedYes, that looks better.
I guess we should at least add some tests? We've generally learnt the hard way from things breaking for peoples sites :)
Comment #35
BerdirYes, we did :)
Comment #36
dawehnerThanks for even provide tests!
Could we also merge these two lines?
Comment #37
BerdirSure.
Comment #38
dawehnerThank you very much.
Comment #40
nicholasruunu CreditAttribution: nicholasruunu commentedRemoving this seems to have broken views cache with memcache: https://www.drupal.org/node/2153517#comment-9337341
Comment #41
mxwright CreditAttribution: mxwright commentedThe patch in #40, since committed and included in 7-3.8, has been causing trouble for my site and others. See here for more:
#2153517: Randomly getting broken/missing handler
#2305905: Massive query after cloning aggregator views crashes site
and potentially #1954348: Fields not available anymore in fields add (cache issue)
As nicholasruunu points out in #40, the removal of this code seems to be causing the problem:
Adding that code back to cache.inc seems to solve the problem. Without it, my site crashes and 7-3.8 becomes unusable and we have to revert to 7-3.7, which throws constant security warnings. Please help.
Comment #42
mxwright CreditAttribution: mxwright commentedHere is a patch that puts that code back into the latest dev version.
Comment #43
mxwright CreditAttribution: mxwright commentedComment #44
mxwright CreditAttribution: mxwright commentedComment #45
damiankloip CreditAttribution: damiankloip commentedI think this might be an issue with the memcache max data size of 1MB?
There is an issue for that: https://www.drupal.org/node/435694
Comment #46
nicholasruunu CreditAttribution: nicholasruunu commented#45 @damiankloip,
Seems like it, patch has been working for us the last 17 days.
Comment #47
mxwright CreditAttribution: mxwright commented@damiankloip, @nicholasruunu:
Seems like a lot of the reports include memcache - but we don't use memcache (neither do a few of the users in some of the above linked issues), so that can't be it. The fix in #40/41 seems to work though, but I don't know the implications (why it was removed, etc).
Comment #48
sokrplare CreditAttribution: sokrplare commentedFor those of us who it was memcache related, they just released memcache-7.x-1.4 today which includes the patch for #435694: >1M data writes incompatible with memcached (memcached -I 32M -m32) and has resolved this issue for us.
Comment #49
rviner CreditAttribution: rviner commentedMy site also heavily depends on views but doesn't use memcache. Since upgrading to 3.8 random view data would go missing and the admin view config screen would display the field as broken handler until the cache is cleared.
I have applied the patch in #42 and will see how it goes but I would also be interested in why this was removed.
Comment #50
steveoriolMERCI ! :-)
Patch #42 worked for me,
for errors like "This view has been automatically updated to fix missing relationships" on editing views
with D 7.34 + PHP 5.6.5 and MariaDB 5.5.5-10.0.15
Comment #51
joelpittetThis seemed like a much more scalable approach, may I ask why was it removed? Seems from reading the Issue Summary removing the line above that hunk would be a much better approach?
Right now the views_data cid on a commerce site, with many fields and modules is > 2.5MB
To make #42 work maybe we need to also remove the save all data, and if all the keys are needed maybe just save the keys that are saved to help get 'views_data:' . $key.
Comment #52
hey_germanoThe patch in #42 solved the broken/missing handler problem for me, thanks mxwright!
Comment #53
torgosPizzaWe are seeing a similar issue where Views Blocks would suddenly vanish. We rely on Views for a vast majority of the site, so I would very much like to see this resolved. I'll test out the patch in #42, but the comments in #51 make me think this patch may still need some work? Would love to get a maintainer's thoughts.
Comment #54
das-peter CreditAttribution: das-peter commentedMemcache / Redis users: There's another issue with
_views_fetch_data()
that affects you: #2475669: _views_fetch_data() prone to inconsistency (Especially with redis / memcache)Comment #55
das-peter CreditAttribution: das-peter commentedSome thoughts
Imagine this scenario:
_views_fetch_data()
triggers a full rebuild ->_views_fetch_data_build()
We end up with something like:
views_data:en
= array('table_a' => array(), 'table_b' => array(), 'table_c' => array());views_data:table_a:en
= array()views_data:table_b:en
= array()views_data:table_c:en
= array()Redis (LRU config) / memcache will start to evict items from cache to make space for new stuff.
Imagine
views_data:table_b:en
andviews_data:table_b:en
are deleted from the cache._views_fetch_data()
runsviews_data:table_b:en
is a miss, and triggers the full load of the views data cache, which fills the static cache.As this is registered as a lost
views_data:table_b:en
- the cache entry is restored.But
views_data:table_c:en
is a hit now! Since the static cache is in charge now. Thusviews_data:table_c:en
isn't considered "lost"._views_fetch_data()
runsNow
views_data:table_b:en
is a hit butviews_data:table_c:en
is a miss, triggers the full load of the views data cache and its dedicated cache item is restored too.First this means we would need quite long to "restore" the missing table specific cache items - one per request.
Second, if you're running memcache with the 1M item limit, absolutely every call would lead to a rebuild of the full cache.
So in perspective of this it would make sense to re-add the code that sets the table specific cache items whenever the full cache is rebuild.
However, for "Second" I guess in a real world scenario you'll always end up with a call to fetch the whole cache, which negates the adjustment again. And I really don't see another solution to this than adjust the memcache configuration (Think of it as the size of the table column
data
in the db based caching, you can't just change this to varchar - it has to be a longblob to work.)Comment #56
dawehnerWell, at least with version 1.5 of memcache module that particular problem is solved, as big entries will be split up automatically.
Comment #57
das-peter CreditAttribution: das-peter commented@dawehner Well I guess then it's mostly about if we want to have a ton of extra
cache_set()
calls when the full cache is rebuild. Which translates to a ton of db queries when the standard caching backend is used (in my case this would be around 530 calls for the project I work on atm ;) ). As I try to always use an in-memory cache backend I wouldn't mind to have the additional sets.Comment #58
mxwright CreditAttribution: mxwright as a volunteer commentedI've updated to Views 3.11 and tested it at length against the trouble I reported above. Everything now seems resolved. Without further information I'll have to attribute this to various seemingly-related things mentioned in the release notes:
And return this issue back to 'Fixed' status.