The core cache bins have tended to be tied to specific features rather than types of stuff, 'cache' is the exception rather than the rule.
I think we should refactor this so that they reflect the type of data being used rather than which module they might be related to.
There are the following categories of things we cache:
Registries and configuration - arrays and objects that aren't tied to content. cache + cache_bootstrap are probable fine for this as is.
Rendered HTML that is tied to arbitrary combinations of content - cache_block + cache_page, entity render cache. Strings...
Caches related to multiple content items, variable cache keys: (cache_path and cache_menu, views results) - not rendered HTML. Usually cached by path but not always.
Caches attached to specific entities - entity/field cache.
If we try to distil that into cache bin naming, maybe something like this:
{cache_bootstrap} - same as now - reserved for things you may want to put in a 'local' cache like APC. If bootstrap gets seriously slimmed down by the CMI and WSCCI initiatives we might be able to merge this back into {cache}, but it's still useful at the moment.
{cache_default} - all 'global' registries and configuration - some configuration/registries in other cache bins should move here, anything content-related should move out.
{cache_render} - all cached HTML strings go here - page + block (possibly including the check_markup() cache in core). This can get very big, the items are usually expensive to build, and don't last long.
{cache_path} - all per-path (but not HTML string) caches. Similar to cache_render but for lower level stuff. This can also get very big, but the items tend to last longer (since they're lower level) than rendered stuff.
{cache_entity} - renamed field cache bin, but could potentially be used by entity caching too if that gets added to D8.
Advantages of this:
- With MySQL, items requested very often (config and registries) will be in smaller bins with less frequent updates - so less likely to accumulate overhead with MyISAM, more efficient use of the query cache (the query cache is maintained at a per-table level for expiry.
- With Memcache (and anything else that is LRU or has limited space), it'd be easier to allocate memory to instances based on usage - we'd document that cache and cache_bootstrap are expected to have a finite number of items, and that cache_render, cache_path and cache_entity are more likely to get filled up.
There may be better divisions than this, but posting up for comment.
Comment | File | Size | Author |
---|---|---|---|
#98 | interdiff-1194136-98.txt | 724 bytes | damiankloip |
#98 | 1194136-98.patch | 71.7 KB | damiankloip |
#94 | 1194136-94.patch | 65.42 KB | Berdir |
#92 | 1194136-92-interdiff.txt | 7.71 KB | Berdir |
#92 | 1194136-92.patch | 64.55 KB | Berdir |
Comments
Comment #1
bleen CreditAttribution: bleen commentedIt is important that we keep the cache_form data separate because there are several issues that arise when the form cache is cleared:
#500646: An unrecoverable error occurred. Form missing from server cache
#795004: An unrecoverable error occurred. This form was missing from the server cache. Try reloading the page and submitting again.
#959200: Add ability to exclude bins (tables) from memcache
#722322: This form was missing from the server cache. Try reloading the page and submitting again.
#1309928: This form was missing from the server cache
...
I would be in favor of renaming that table something that doesn't include the word cache at all (form_storage? form_states?) but most importantly I just want to make sure it is kept completely separate
Comment #2
R.Muilwijk CreditAttribution: R.Muilwijk commentedAtleast content cache should be splitted from the configuration/registries caches because they take a different cache strategy.
Comment #3
BerdirGetting this started.
- cache_image is gone.
- There's now cache_config. I guess this will replace large parts of cache_bootstrap, now that the system table is on it's way out. We can possibly merge the remaining stuff in cache_bootstrap with cache once that has progressed a bit more.
- Introduced the html, path, entity bins as suggested in the initial post.
- cache_form currently moved into cache_path, the form_state part will move to KV and then it's jut a normal cache, as catch described in the initial post.
No upgrade path yet.
Comment #4
BerdirForgot the #cache definition in block.module. Also removed a new duplicated cache clear. There might be more of those.
Comment #6
BerdirForgot to update hook_cache_flush(), let's see how this goes.
Comment #8
BerdirMessed up the naming, changed 'html' to render.
Comment #10
BerdirFixed some remaining manual queries, unit tests and so on. Upgrade path is a bit tricky, quite a mess currently but seems to pass locally.
Comment #12
BerdirOk, found more incorrect render/html cache bins. The dblog test errors were interesting. The invalid cache flush bins resulted in exceptions in cron runs and the only place this showed up were those tests because the number of watchdog entries didn't match.
This might pass but there are still two things to figure out:
- Complete pgrade path, hopefully one that doesn't require the additional try/catch statements. Maybe we need to add some of these bins in the early fix function.
- This is blocked on #512026: Move $form_state storage from cache to new state/key-value system which in turn is blocked on the cache DIC issue. Which could benefit from this issue, because it would get the number of cache bins down.
Comment #14
BerdirWell, it would help if I wouldn't comment the cache clear calls, no wonder that test fails then ;)
Comment #15
Berdir#14: unify-cache-bins-1194136-14.patch queued for re-testing.
Comment #17
BerdirHere is a re-roll.
Haven't touche the views bins yet, should we incorporate them into this too?
Comment #19
BerdirUpsie. Missed one merge conflict.
Comment #21
damiankloip CreditAttribution: damiankloip commentedI'm not sure how views would fit into this?
(sorry if I'm telling people what they already know).
So we have two cache tables currently:
So I guess the new cache_render table could house the cached markup side of a view that's currently in cache_views_data? But then we would still want the results too, so could this live somewhere else? and could the current cache_views_data stuff just live in the general cache table? There could be quite a few entries in here if you had a site with alot of different languages, but if not, you would have one row for everything, and then one row containing the broken up data for each base table.
Comment #22
BerdirOk, this should fix the test failures in the cache tests and also removes the views_results cache bin.
I'm not sure about views_info yet either, but as suggested in the views_fetch_data() -> service issue, We might want to look into only storing the tables separately that are actually requested. That would mean that we initially have to do a few loads of the whole cache but that should stop quickly and it would save us possibly hundreds of useless cache_set()'s for sites with a few hundred fields/views tables. After that, I think it should be fine to merge views_info into the default cache bin?
Comment #24
Berdir#22: unify-cache-bins-1194136-22.patch queued for re-testing.
Comment #25
Berdir#22: unify-cache-bins-1194136-22.patch queued for re-testing.
Comment #27
BerdirAll the issues this depended on finally went in I think, here's an initial re-roll.
This will very likely fail all over the place as a lot of changes happened since the last patch.
Comment #29
BerdirThis should fix the installer.
Comment #31
BerdirThis should fix most failures, not sure sure about book but too lazy to install again just to run that locally ;)
Comment #33
BerdirThis should fix the last test fail.
Comment #34
Berdir#33: unify-cache-bins-1194136-33.patch queued for re-testing.
Comment #36
BerdirRe-roll, patch got a lot easier due to the removal of the cache table schema definitions just a bit ugly to re-roll :)
Updated the update function to remove the cache tables, will need to do the same for filter, block and views_info cache bins.
Comment #37
jibran#36: unify-cache-bins-1194136-36.patch queued for re-testing.
Comment #39
damiankloip CreditAttribution: damiankloip commentedSorry.
Comment #40
damiankloip CreditAttribution: damiankloip commentedSorry, totally wrong patch. IGNORE
Comment #42
damiankloip CreditAttribution: damiankloip commentedLet's try and start this again. It's been such a long time, I didn't bother with a reroll, just started again. Too much stuff has changed in the mean time.
Comment #44
damiankloip CreditAttribution: damiankloip commentedComment #45
catchComment #46
catchComment #47
catchLooking at the list of bins, we're missing a place for the Views results cache.
This most closely maps to the things in cache_path - menu links, path alias pre-load lists etc. are collections of entities by some kind of criteria, which later get loaded and rendered, which is similar to views results. Difference is that views results aren't per-page.
Might just need renaming - cache_listing or cache_collection could be OK. Not much will use this.
Comment #48
damiankloip CreditAttribution: damiankloip commentedJust a reroll for now. Will work on this a bit more today.
Comment #49
damiankloip CreditAttribution: damiankloip commentedOk, made a few more changes after speaking to catch:
- Renamed path cache bin to data
- Removed views_info cache bin, this stuff now lives in data
- Views result cache now also uses data bin
Comment #51
damiankloip CreditAttribution: damiankloip commentedMissed a couple. This should pass now.
Comment #52
BerdirStale comment...
This is in line with what I'm doing in #597236: Add entity caching to core.
FieldInfo is about field metadata, not sure if this should be in the entity bin, would be nice to only use that for the entity storage.
Looking at what it stores, it's up to 3 small caches (fields/instances/extra fields) per entity_type/bundle.
Not sure if we want it in cache.cache?
Well, at least it shouldn't be necessary to delete it twice :)
Applying the patch, manually dropping all cache tables in the database, core/rebuild.php => all well. Pretty cool :)
Some notes based on that:
- The list of cache bins looks really nice, way better than right now.
- What about cache_toolbar? Was not mentioned in here before, it creates a cache per user, so on a site with many users that have access to the toolbar, that could be quite a list, but not sure if that deserves it's on cache bin for that?
- cache contains per template cache entries that contain nothing more than a timestamp if I understand that correctly to compare the file changed timestamp. Should probably live in a different bin, also sounds like a job for a cache collector (not something to worry about here). Not that problematic, as it's only used if auto_refresh is turned on, which you want to disable on production...
- cache_bootstrap is down to 3 entries, hook_info, module_implements, and system_list. Seems pretty useless :)
Trying to understand a bit more what's going on, so added the following line to DatabaseBackend::getMultiple():
Gives:
(Why are we caching libraries by module?!)
Comment #53
tim.plunkettThe cached block annotations are not "render" information, IMO.
Comment #54
damiankloip CreditAttribution: damiankloip commentedThanks for the reviews.
Totally, this was in error. Was cache.block > C&P > cache.render :) We want this in either cache.cache or cache.data.
afaik, drupal_get_library() was always per module, but the conversion to yaml added the caching layer too for this. Sun has an issue to consolidate this. We have identified that this is not cool :)
I am not sure about cache_toolbar right now. This table could easily get waaaay big. So I feel like we should leave it as its own bin for now? and maybe discuss any alternatives once we have this in? Just to try and keep the scope of this issue in a bit.
One other question, I currently have cached info data, whether this is info, or cache plugin definitions, split between cache.cache and cache.data. I feel like maybe I should just move all of these to cache.cache for consistency? thoughts on that?
Patch contains those changes from above.
Comment #56
Wim LeersRE:
cache_toolbar
: I think the only reason the Toolbar module has its own cache, and its own (peculiar) caching technique, is because the underlying caching problem in the menu system hasn't been solved yet: #1805054: Cache localized, access filtered, URL resolved, and rendered menu trees. Once that is solved, Toolbar pretty much can get rid of its own cache. Again, AFAIK/IIRC.Sadly, #1805054: Cache localized, access filtered, URL resolved, and rendered menu trees hasn't gone anywhere in the past half year. Should it be a critical instead of a major? Should it be a beta blocker? I don't know.
Comment #57
BerdirThat's what I thought at first too. (waaay big). But cache_entity is going to have a row for every single loaded entity, now cache_toolbar seems pretty small :)
Re huge cache_entity, That's something I'm wondering about in #597236: Add entity caching to core, if we should use a bin per entity instead, but that's currently problematic, while we could easily get it dynamically from the cache factory, we currently can not clean it up like we can the others? maybe do that in entity_module_preuninstall() ?
Anyway, I'm fine leaving that, especially if it's hopefully only temporary/a workaround.
Comment #58
damiankloip CreditAttribution: damiankloip commented#2209035: Remove remaining/stray menu_link references from book module code broke this in the mean time.
Comment #59
catchentitycache in contrib has a bin-per-entity exactly to avoid them getting out of hand. That's worth doing but could definitely be its own issue.
toolbar is somewhere between cache_render and cache_data, but yes it's only there because of the other issue. I'd be fine with making that critical.
Comment #60
damiankloip CreditAttribution: damiankloip commentedok, so we should leave this as cache entity for now then? We can then have a follow up when the entity cache issue and this is in?
Toolbar can be left for now, and fixed in said issue above.
This just leaves the question of info, should we make this more iniform and put say all of them in cache.cache?
Comment #61
catchYep. Also it's no worse than the field cache in 7.x which was for all entities.
Info - most of those should really be in cache_bootstrap based on the intention of that bin - that was supposed to be for things that are a requirement for any full page request and so they can live somewhere that gets less polluted than the default bin.
Comment #62
Wim LeersReviewed for potential RTBC; found one mistake (rendered blocks are now stored in
cache_render
, but the wrong service was being injected:@cache.cache
instead of@cache.render
.And found these:
Which still needs to be addressed, or get a linked follow-up issue.
Comment #63
Wim Leers@catch: So, let's make #1805054: Cache localized, access filtered, URL resolved, and rendered menu trees critical?
Comment #64
damiankloip CreditAttribution: damiankloip commentedYes, these were added in by this patch. I get the feeling these will need to utilise cache tags. As clearing out whole bins like this is not that helpful. when a field is changed or something.
Comment #65
catchBumped #1805054: Cache localized, access filtered, URL resolved, and rendered menu trees.
On those views clears, there shouldn't be any need to clear views results just because a field instance was created. I can see views info caches needing clearing but that's very different.
Comment #66
damiankloip CreditAttribution: damiankloip commentedWell, before (currently) both result and info cache gets cleared out. Or are you saying we definitely don't need this for the created case in particular?
Comment #67
catchI don't see why we'd clear out result - how can new field instances invalidate existing views?
Comment #68
damiankloip CreditAttribution: damiankloip commentedThat's what I'm saying; you mean just ditch this from views_field_instance_config_create()? :)
Comment #69
catchYep :)
Comment #70
damiankloip CreditAttribution: damiankloip commentedrerolled and removed the create hook implementation.
Comment #72
damiankloip CreditAttribution: damiankloip commentedSorry, rebase/branch fail.
Comment #73
BerdirI added a @todo to #2216543: Fill in topic/@defgroup docs for Cache overview to add documentation about common cache bins and for what they should be used, I hope that will land asap (reviews welcome). The cache API really deserves some accurate and up to date documentation :)
Comment #74
damiankloip CreditAttribution: damiankloip commentedNeeds another reroll after the cache tags testing patch.
Comment #76
damiankloip CreditAttribution: damiankloip commentedForgot to change the cache bin in the new cache tags base test class.
Comment #77
damiankloip CreditAttribution: damiankloip commentedAnother reroll...
Comment #78
catchI think this is ready.
One extra thing we could do is rename the 'cache' bin to 'cache_default', would mean cache('default') instead of cache('cache') but separate issue I think.
Comment #79
Berdir+1 to all of that. RTBC, cache_default and doing that in a separate issue :) Defaulting to 'default' is way less confusing than 'cache' and we don't need to special case 'cache' anymore.
Comment #80
catchOpened #2221065: Rename default 'cache' cache bin to 'default'.
Comment #81
Wim LeersThanks for this. I love the clean-up :)
My only concern is that many comments haven't been updated and are therefore now confusing/misleading. E.g.:
The menu cache object for this controller.
inBookAdminEditForm
,The page cache object.
inPerformanceForm
, etc.Comment #82
Wim LeersComments updated!
I wanted to rename the comment in
BookAdminEditForm
but I noticed that all of that cache clearing was in fact unnecessary:BookManager::saveBookLink()
andNode::save()
already invalidated the affected cache tags. And the static reset forbook_menu_subtree_data
was entirely ludicrous, since that function does not exist anymore!While updating comments, I noticed something. This patch highlights in how many places we're still doing "let's clear all the things in the bin" instead of using cache tags to only clear affected cache entries. With the move towards fewer cache bins in this patch, that problem only grows in importance.
Comment #84
Wim Leers#2216543: Fill in topic/@defgroup docs for Cache overview broke this. Cache docs have moved to
core.api.php
, hence the interdiff.Comment #85
BerdirYep, sorry :)
Any chance that we can resolve the @todo in there that points to this issue about providing some documentation on the common cache bins and what they should be used for? I'm also OK with doing that on d.o if it's considered to be too much in-depth, but then we should create a stub page on d.o below Cache API and link to it.
Comment #87
BerdirCan't reproduce the installation fail, updated documentation a bit and fixed a @todo in EntityViewBuilder pointing to this issue. This might introduce some new fails that expect the cache.cache bin to be used.
Comment #89
BerdirLooks like only Wim's new cache tag tests care about which bin this is stored in ;)
Comment #90
Wim LeersThanks for the rerolls! Nice to see the @todo that was blocked on this issue resolved in #87.
Back to RTBC per #78.
Comment #92
BerdirRe-roll after #2158003: Remove Block Cache API in favor of blocks returning #cache with cache tags, less conflicts than I feared (just one). Let's see if I found all the new cache usages.
Comment #94
BerdirAh, I assumed that I missed at least one reference to the block cache but couldn't find it.
Comment #96
BerdirAh, defaulting the render cache to use 'render' was not yet done here, but makes a lot of sense I think. Fixed the second place where it defaulted to 'cache' and using 'render' in the tests.
Comment #98
damiankloip CreditAttribution: damiankloip commentedNeeded reroll, and fixed test fail.
Comment #99
BerdirYeah, missed one more. Thanks, I think this can go back to RTBC now.
Comment #100
catchCommitted/pushed to 8.x, thanks!
Comment #102
Wim LeersAwesome! :)
Comment #104
claudiu.cristeaThe only problem here is that core provides also a non-bin cache table:
cache_tags
. And that has the same name pattern as the bins. If you manually truncate the DB cache you may accidentally truncate alsocache_tags
(I did it!). So, IMO, cache bins should have a different name pattern, likecache_bin_*
.Comment #105
Wim Leerscache_tags
is an implementation detail of the database-powered cache back-end. But I agree with the rationale in #104; right now it's relatively confusing.Comment #106
claudiu.cristeaThen, here's a follow-up patch making DB cache bin naming more clear. Changed also the priority to 'Minor'.
Comment #107
BerdirCan you open a new issue for this? There's no reason to re-open this long-fixed issue for this, which is really a new task, and not a bugfix or so.
Comment #108
claudiu.cristea@Berdir
Yes, sure. Sorry.
It's here as a followup: #2302843: {cache_tags} conflicts with cache bin database table names