Problem/Motivation
Render caching made sense when it was introduced in Drupal 7, and even when it was expanded to be used for many more things in Drupal 8.
But since then, Dynamic Page Cache was added in #2429617: Make D8 2x as fast: Dynamic Page Cache: context-dependent page caching (for *all* users!). It's enabled by default. The benefit is that there's a single cache hit to render most of the page, rather than many fragments to retrieve from render cache. This saves a lot of I/O. And has a lower complexity.
Also important: there have been zero noteworthy bug reports during the 19 months that Drupal 8 has been out. In that time, Dynamic Page Cache has accelerated millions of page views.
That begs the question: do we still need render cache? Dynamic Page Cache is caching at a coarser level, so avoids lots of work. Maybe we don't need the granular caching layers as much anymore. That also greatly improves understandability/maintainability/DX. And the render cache bin is the one that has hundreds of thousands of cache items on big sites: #2526150-155: Database cache bins allow unlimited growth: cache DB tables of gigabytes!, so that'd go away too.
Can we remove render caching altogether? Or do we just disable it for e.g. blocks & entities, but keep it for Views, where it's likely most impactful?
Proposed resolution
TBD
Remaining tasks
TBD
User interface changes
TBD
API changes
TBD
Data model changes
TBD
| Comment | File | Size | Author |
|---|---|---|---|
| #61 | optimise-render-cache-2888838-61.patch | 2.57 KB | hmdnawaz |
| #20 | do-not-render-cache-auto-placeholder-2888838-20.patch | 3.46 KB | berdir |
Comments
Comment #2
wim leersDiscuss!
Comment #3
dawehnerTo be able to make a proper decision I think we need proper profiling for a real world scenario. /me is starring at berdir.
Comment #4
fgmJust looked at a magazine site we deployed a few days ago:
* render bin has 4 x the size of the dynamic page cache bin, and has > 90% hit rate
* dynamic page cache bin stays around 65% hit rate.
* qps rate on render bin is about 30 x qps rate on dynamic page cache
* EDIT: after 10 days since last flush, the render cache holds 500k items, while the DPC bin has 55k only
Regrettably, I have no idea of the proportion of queries to the render bin being for the page cache vs the rest of the render cache: memcache doesn't allow dump of stats beyond 1 MB per slab, AFAICS. Any suggestion to split the internal page cache from the rest ? A quick scan of the slabs suggests most of the entries are either page cache items or entity view items.
Comment #5
wim leersD'oh, I wanted to add the tag but forgot! Thanks @dawehner!
Yes! I thought there already was an issue for this, but apparently not: https://www.drupal.org/project/issues/drupal?component=page_cache.module Mind creating one?
EDIT: also: THANK YOU for that very valuable data, @fgm! 👏
Comment #6
catchI'd think we want to keep render caching of menus, footers and other site-level items since they could potentially be built millions of times even with dynamic_page_cache, but cached per-role/language.
Entity views are much less of an issue with dynamic_page_cache though, especially the full view mode since that's almost always displayed-per-page.
There are some which are much tricker - individual comment renders on this issue will have a much higher cache hit-rate than the dynamic_page_cache, even if they're only rendered in one context - i.e. if this was 8.x each comment would have been rendered once (excluding edits) but the page will have been rendered every time a new comment is posted, which has been more than one per hour. Similar with 'teaser' type view modes in listings and similar.
Lots of trade-offs.
Yet another option, disable render caching when the database cache is used? Then it can still be on for memcache.
Comment #7
wim leersYes. But I figured it was a discussion worth having.
Comment #8
moshe weitzman commentedInteresting proposal! Maybe we disable it and then add back a few targetted use cases when it makes sense.
Comment #9
fgmAdded issue about splitting the page cache at #2889603: Split the internal page cache from the rest of the render cache.
Comment #10
wim leersPatch posted for #2889603: Split the internal page cache from the rest of the render cache.
Comment #11
berdirDidn't see this issue yet, just staring at me unfortunately doesn't make we aware of something, especially not if the staring happens virtually :)
Moving the page cache away seems like a good idea.
Not sure yet what we should keep render cached and what not, but as discussed with Wim already a while ago, there are definitely parts that aren't worth caching. local actions/local tasks/breadcrumb blocks are a typical example that result a huge amount of variations on most sites but most likely aren't worth caching as they over time end up being by url/route and therefore the same as dynamic page cache.
The thing is that we don't really have a mechanism for that at the moment. max-age 0 is not what you want because that will mark it as uncacheable and will be placeholdered. We want something like CACHE_NOT_WORTH_IT. Similar to the never-finished issue of a min-age, that could be something that doesn't bubble up but just applies to its own render element/level. That would basically be the same as removing the cache keys, but the problem with that is that if someone removes the cache keys and then e.g. a hook_entity_build_defaults_alter() adds a key, it gets weird.
I think I would prefer to introduce such a mechanism and disable render caching where we know it's not worth it. There are many examples where it is definitely still very useful, the example from catch with a long list of comments, my standard example is always a news site frontpage (https://md-systems.github.io/drupal-8-caching/#/invalidation, the slides below that).
Another thing is that we could extend the render cache with something similar as auto_placeholder_conditions, but for things that we would not write to cache. Then we could add url/route cache contexts there, so all kinds of blocks/things that vary by route/url would no longer be cached separately. For contexts that bubble up, we would still end up doing a cache get but then we'd simply not save.
Comment #12
Andre-BWe had some major issues with cache_render on multiple projects. Drupal 8 in default configuration generates just way too many entries in cache_render table and most of it isn't used at all since the invalidation rules will invalidate multiple cache_render entries at once. So anything that is url specific is a big problem for sites with mainly anonymous traffic.
We were left with adding lots of storage to our database and frequently scanning the usage of the cache_render table - downloading 20-40GBs of cache_render just to scan it on a local machine. The biggest issue there is that we need the cache_render bin since it holds the page cache, hopefully that changes in the future, once that's done moving cache_render onto memcached becomes an option. A nice side effect on this is due to the reduced IO (saving those caches costs!) the overall page load time for the first hit becomes much better as well.
Anyways going forward we killed most of the redundant cache entries from our cache_render bin by using a module called cache_split https://github.com/derhasi/cache_split/ this adds overhead to the general cache gets and sets though - and with more rules to be inserted becomes slower. After cleaning up / identifying the parts worth keeping the render cache is actual usable for us - since there are some elements that are used across multiple pages. That cache_split module suggests using null cache backend - which I disagree with, memorybackend is a better alternative.
Comment #13
mxh commentedRelating this one #2877045: Add an option to pass through caching for instances implementing CacheableDependencyInterface which seems to have a need for @Berdir's suggested
CACHE_NOT_WORTH_ITin #11. Defining very granular elements which are fine to be cached inside bigger parent elements but not being cached as themselves would be helpful. Thinking of something likeCache::PARENT.Imo the render cache is still an important and helpful component of Drupal core, but there needs to be an easy way for being able to reduce cache granularity of the render cache - either by
['#cache']definitions and/or by configuration.(Probably offtopic, sorry) Has maybe someone already tried using the filesystem as backend for the render cache (not using
cache.backend.phpof course)? In theory, this might be an option for sites running on providers with small databases but enough file storage.Comment #14
ndobromirov commented@mxh This might work for smaller systems, but once the site needs to use multiple web-servers, the file-level cache can not work consistently.
Comment #15
mpdonadioFew random thoughts.
I wonder if CacheBackendInterface needs an isLRU() method on it, or maybe getStrategy() with some constants that we define. Then core could decide whether or not to cache something depending on LRU/strategy. Perhaps render caching only make sense on LRU or something that sanely prevents runaway growth or braindead strategies like the one APC uses.
I also wonder if instead of CACHE_NOT_WORTH_IT, we have should have a #importance key, that maybe goes from 0-10, and defaults to 5. The importance factor could then be used somehow. Maybe exposed as config 'dont_cache_below_this_importance', which also defaults to 5. And then #importance could also be used for purging. Also allow it to bubble up, so the parent #importance is the max of its children's #importance.
Similarly, we could do something with the size of what is being cached.
Comment #18
wim leersMakes sense. But is it worth it? That would mean no radical simplification, which is what this issue is about.
Hm, like a
no_cachecache context, that'd bubble up? @Fabianx proposed something along those lines a long time ago.Did I understand that correctly?
That's now solved in #2526150: Database cache bins allow unlimited growth: cache DB tables of gigabytes!.
Comment #19
berdirA cache context to actually prevent caching might be an option too, but not quite sure how that would work.
What I mean is something else, see #2232375-54: Support CacheOptionalInterface in BlockViewBuilder, use it in language switcher block to prevent max-age 0 bubbling.
I tried to come up with a proof of concept but the problem is that the auto-placeholder conditions are used for two things.. for actually deciding when to use auto-placeholder *and* to disable dynamic-page-cache completely on those.
What I'd like is a (new) setting that triggers auto-placeholdering but wouldn't disable dynamic_page_cache and then also prevents to actually write cache entries (maybe that should also be done for normal auto-placeholdered entries?). Then we would enable that for example for url.path and url.query_args.. so for example a block that depends on that would be auto-placeholdered, no longer cached *but* we don't want to disable dynamic_page_cache just because something else, for example the main content block depends on url.query_args.
Thoughts? One option would be to add a new setting just for the logic in \Drupal\dynamic_page_cache\EventSubscriber\DynamicPageCacheSubscriber::shouldCacheResponse() but not sure if that would be considered a BC break if someone customized the other setting?
Comment #20
berdirI'm sure this isn't perfect but it does seem to go in the direction I want when testing with #2232375: Support CacheOptionalInterface in BlockViewBuilder, use it in language switcher block to prevent max-age 0 bubbling.
Basic D8 installation with 2 languages and the language switcher block in sidebar.
On /user/ID
dynamic page cache: miss then hit
dynamic page cache has no url.query_args cache context (only url_query_args:_wrapper_format)
language block is auto-placeholdered and *not* cached
On /frontpage
dynamic page cache: miss then hit
dynamic page cache has url.query_args cache context because of the frontpage view.
language block is auto-placeholdered and *not* cached
If others think that this change would make sense then I'll open a separate issue. Also expecting test fails because I assume there are explicit tests that placeholders are render cached. Also some of my changes are likely not correct.
Comment #22
Andre-B@wim-leers
not solved- just hidden. It fixed the symptom not the cause. There's still IO involved without getting a real benefit from it- finding those bottlenecks is just harder now.
Comment #23
wim leers@Andre-B: I respectfully disagree. CPUs and operating systems also cache things without knowing for certain it'll be used many times/for a long time. That's part of caching.
However, if you would say , then I'd completely agree!
Comment #24
wim leersComment #25
wim leersI'd swear we have a pre-existing issue for the whole "not worth caching" thing that is being discussed here and that is sort of taking over this issue. But having scanned #2429287: [meta] Finalize the cache contexts API & DX/usage, enable a leap forward in performance, I can't find anything :(
Yes!
I think it does! To retain BC, and also to simplify configuration, I might suggest changing it from duplicating
auto_placeholder_conditionsto inheriting that, and then subtracting a whitelist of cache contexts that Dynamic Page Cache does consider cacheable at the page level. We'd then effectively be counteracting/contradicting this statement from the docs:Because those conditions indicate poor cacheability, and if it doesn't make sense to cache parts of a page, then neither does it make sense to cache an entire page..(EDIT: I think #20 may actually just work because you're auto-placeholdering
url.query_args, which means that even without the Dynamic Page Cache changes in #20 you'd get the behavior you described in #20, since the language switcher block is simply being placeholdered and never ends up being cached in Dynamic Page Cache.)Comment #26
Andre-B@Wim Leers I've seen performance increases of more than 30% due to avoiding of cache writes for unused caches. Since CPUs do not use a MySQL-Database as a Backend it probably doesn't matter that much over there ;)
btw. the performance impact of such a change with a memcached backend for instance is much lower - nature of things.
Comment #27
mxh commentedThis one is something related.
Comment #28
berdirCreated #2935804: Add cache context exclusion list to RenderCache::set()
Comment #29
wim leers#26: I was merely commenting on the fact that significant amounts of cache storage were used. We weren't evicting old cache items. That has been a design flaw in Drupal's caching system since day 1. Caches are always meant to be bounded in size. Ours wasn't. That's also what I'm referring to in my CPU-related comment in #23: CPUs also don't know for certain which things are most worthwhile to cache, that's why it's also continuously evicting things from its caches.
To reduce the I/O cost of evicting something too early, it even has a layered cache: L1, L2, L3 caches. L1 has lowest latency. L2 has higher latency than L1, but still much faster than reading from RAM/SSD. Drupal has that too:
\Drupal\Core\Cache\ChainedFastBackend, where the L1 and L2 caches are configurable, by default APCu is used for L1.Yes, the latencies of APCu are much higher than a CPU's L1 cache, and MySQL's much higher than a CPU's L2 cache. Then again, Drupal isn't hardware … and isn't designed or tested with remotely the same rigor.
Reducing response time with 30% by saving unnecessary cache I/O totally is possible. Ideally, yes, we wouldn't be caching things we know aren't really worth caching. But "worth caching or not" also depends on your setup. For starters, MySQL on the same host has a much lower query latency than MySQL on a separate host (because network latency).
My point is that there is no way to perfectly optimize for every particular Drupal site on every possible hosting setup. We are gradually getting better. Please help us get better faster!
Comment #30
wim leersThanks, Berdir!
Comment #31
dawehnerIs it #2241377: [meta] Profile/rationalise cache tags you had in the back of your head?
Comment #33
fabianx commentedThat is why I wanted an alter hook on block plugins [nutch nutch - not giving up ...], so they can have a say that they don't want to be cached (e.g. they can remove the #cache keys before being rendered), however contrary to what I might have said before:
- "no-cache" would not bubble, but it is a one-time instance, but it is hard to understand. Maybe drupal-do-not-cache-this is a better tag name.
- Generally the cache hierarchy is needed to avoid re-doing work again and again. If we don't use cache_block, but cache_render that is the same mistake as cache_page being in cache_render, the same for entities cache_render_node, etc. might be better. => e.g. lets have more granularirty. Smaller tables are better.
And that that the approach of cache hierarchy works we can see in e.g. the umami theme, where dynamic_page_cache is UNCACHEABLE for many pages.
Also list-invalidation is a big problem for DPC.
Much of the concerns above are related to MySQL as backend, but larger sites should not be using MySQL as cache backend anyway. Those could indeed use local filesystem OR SQLite3, which is blazingly fast even with pretty HUGE databases (yes it is possible just to put cache backends on SQLite3 backend).
- No idea how the MySQL issue was solved, but I found that we implemented cache tags wrongly (in an inefficient way). By using timestamps (as the checksum in the schema) we can with just one query cleanup 90% of old data in MySQL (e.g. all the expired items). That maps much more to last-modified and is an industry best-practice in HTTP anyway.
e.g. the big problem is that we never cleanup, not that cache items do not expire. We rather expire too much right now, e.g. for listings.
And while this is a generic brainstorming issue:
- I think there should be an drupal-auto-placeholder cache tag so that you don't need to specify #create_placeholder, but can also do it easily from lower levels, e.g. a block as it can't access it's final #cache properties directly ;).
- I think there should be more lazy builders used in general, but potentially need to be able to blacklist that I don't want this e.g. big_pipe'd. (new proposal: ['#cache']['blacklist_placeholder_strategies'] = ['big_pipe'];)
- We still need to implement the CachedPlaceholderStrategy (Did I create an issue for that? I can't find it right now ...)
- We can cache some things for sites where granularity in the server cache is not an option, in localStorage in the browser instead
Can we have new statistics from cache_render now that page cache was moved? cc @fgm
Comment #34
mxh commentedThis procedure during garbage collection currently kills our sites with huge page cache tables because of locking. Instead we switched to Gzip compression, which has shown that it is very efficient. Take a look at Andre's module Compressed Cache.
Comment #35
fabianx commentedI have questions to that:
- Which garbage collection? Are you using temporary (e.g. max-age != PERMANENT items)? This is more like a ban-crawler and would not affect that many items at once.
- What kinda locking is happening?
- Why are you not utilizing memcache / redis for such large installations?
Compression can now be useful, because D8 is - thanks to PHP7 no longer CPU bound, but again a little bit DB-bound, but usually the difference is only for pretty large items feel-able (e.g. 10 MB data structures) in a per-item comparison.
Comment #36
mxh commentedDrupal\Core\Cache\DatabaseBackend::garbageCollection()performs a delete statement regards expired items. It doesn't matter whether the items are expired or not. That statement is expensive and locks rows which others cannot access at this time. This is problematic especially when the table has a ton of rows. We then get a lot of lock timeout exceptions when this statement is being run, e.g.Uncaught PHP Exception Drupal\Core\Database\DatabaseExceptionWrapper: "SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: INSERT INTO {cache_*}We currently have to deal with a limited set of capabilities. It's also a matter of economic considerations when looking at the projects and decisions what solutions to use. Technically - of course - this makes sense at all when using Drupal 8 at these sizes, but not always economically.
Actually we currently wouldn't use a small Memcache for Page Cache at all, since we'd want as many existing Page Cache items as possible, not being thrown out via LRU because of other busy cache bins like render or data caches.
With compression enabled we now have doubled our storage capabilities for render, data, page and entity cache, without sacrificing noticable performance (have a look at some results here).
Comment #37
fabianx commented#36 Thanks for the elaboration:
So you are saying the:
is giving you trouble even if you don't have any max-age -1 items?
Can you try to just add a SELECT in front of it to see if any rows would be affected?
The other thing that I read in a very old thread is to LIMIT the amount of data removed per item.
e.g.
do GC in the following way:
which is recommended anyway and a major bug IMHO, because with an unbounded query it needs to do a full index scan, before returning control, while with LIMIT other processes can make progress in between the deletes.
(A limit has replication problems though and we need to order on expire in that case, too: http://mysql.rjweb.org/doc.php/deletebig)
Can you change that and report back if your deadlocks go away? (even with just one SELECT check at the beginning)
Another thing is that caches should not be inserted while being in a transaction (that is the cache consistent issue), I believe if we changed that to a simplified version of cache_consistent, we could win a lot (need to post on other issue):
While in a transaction:
- cache_get: Allow - if cache_prefix or cache tag had not been invalidated in this request [use simplified memcache key prefix algo, we could also be more strict and do invalidation by bin]
(else we run 100% cache-less, which is not really what we want even though most things are static caches anyway).
- cache_set: Disallow, treat like a record_cache_clear_all($cid), record
- cache_clear_all: Disallow, treat like a record_cache_clear_all($bin, $prefix)
- tag invalidation: Disallow, treat like record_cache_tag_invalidation
After the transaction is committed => commit all the cache clears, which is simple as just need to execute it.
That will solve the cache inconsistency for memcache / redis AND also reduce lock wait timeout exceeded problems within transactions as there will be no writes to the cache bins during the transaction.
Comment #38
mxh commentedThanks very much @Fabianx for sharing your thoughts and suggestions about this topic, appreciate this a lot.
Unfortunately this is not that easy to elaborate. We'll try to find some time for testing this (in a simple way).
Yes, this is the affected statement, produced by
DatabaseBackend::garbageCollection. As from our observations noted in #36 though, it seems that it doesn't matter what the expires value is. What does seem to matter is the huge amount of existing records in the table. I'm not an expert regards the inner guts of MySQL, but it seems that the created index on the expires column can't save us from deadlocks when big tables are in the game.Seems to conform with our deadlock observations noted above.
Agree. Could you post a link to that issue here if you have one, please?
Regards that lock timeout problems I recently thought about an approach based on a Drupal state variable:
1. Set the state variable "cache_rebuild" to true
2. do the cache rebuild stuff including garbage collection
-> while "cache_rebuild" is set to true, Drupal would let any attempt of cache writes run into a Null backend
3. When cache rebuild process is complete, state variable "cache_rebuild" is set to false so that Drupal "allows" cache writes again
Comment #39
Andre-B#37
DELETE FROM cache_X WHERE expired <> -1 AND expired < %d
Gives us troubles because it needs to read the whole table ignoring any indexes:
Result.
Using where - no sign of using index, leading to a full table scan. having a couple of gb worth of data this locked the table for good. btw. this happens also for
EXPLAIN DELETE FROM cache_page WHERE cid = 'someurl:html'also states Using where. So I am not sure if that information is trustworthy at all.
re select statement:
EXPLAIN SELECT cid FROM cache_page WHERE expire <> -1 AND expire < UNIX_TIMESTAMP()the select itself does not cause any problems.
this happens on a system with
innodb_version: 5.6.32-78.1
re:
this should / will reduce the time taken for the locking of that table - given a small limit such as 100-1000. Following up on http://mysql.rjweb.org/doc.php/deletebig adding a sleep 1 in the loop might be a good idea.
Comment #40
fabianx commented#39 We are getting off-topic here, but at least for unlimited storage (e.g. not using the row limit for the DB table) this clearly shows that the temporary items make trouble - as we don't have many in core anymore anyway.
- Probably might be good to move those to a cache_page_temporary table?
On the other hand a secure and fast and scalable deletion method is needed, because in the future we should add some kind of GC for cache tagged data.
One potential - albeit crazy solution - would be to use three tables:
- cache_page and cache_page_archive
Every X hours / days, cache_page_old is REMOVED, cache_page_archive is renamed to cache_page_old, cache_page is renamed to cache_page_archive
- An item not found in cache_page is tried to be found in cache_page_archive and if it is found and valid, then it is re-written to cache_page (chained refresh hit).
- While there is more effort for that, we only keep active items in the cache, tables never get fragmented
----
Another idea for GC would be to: Just set the:
- checksum => -1 (invalid)
- data => NULL
to save space in the table, but not run into the deletion problem. Eventually few rows of invalid items should be cleaned up however, but then that could be ordered by cid, checksum.
I'll ask our database specialist today about potential solutions for the whole deletion thingy ...
Comment #41
Andre-B#40 Reducing the occupied space of the table itself is a big performance gain (which is reflected in the garbage collection as well). See https://www.drupal.org/project/drupal/issues/1281408 and https://www.drupal.org/project/compressed_cache
Comment #42
fabianx commentedI spoke with @nnewton about it and while we are not yet sure about the performance characteristics of different DELETE methods (and chunking still might make sense), we strongly advise to set the transaction isolation level to 'READ-COMMITTED', because with 'REPEATABLE-READ' you have lots of gap locking problems, which would explain your deadlocks.
Comment #43
stsharp commentedThis may be off topic here but we are still seeing issues with very large database cache even running 8.5.3.
We were told my our hosting team that our database cache's were getting huge. We had just done some backups and part of that process runs a cache clear so the database cache was cleared. Within 24 hours we noticed a 500MB size database cache! I checked Google Analytics to see if there was some traffic spike but there were only 600 users and a couple thousand page views - small volume traffic. I will also say that we are running varnish.
So we made changes to the settings.php file as was recommended in several articles we read even though the articles mentioned that this issue had been fixed in 8.4 We tried this anyways. These are a few of the changes we made as an experiment
$settings['database_cache_max_rows']['bins']['page'] = 500;
$settings['database_cache_max_rows']['bins']['dynamic_page_cache'] = 500;
$settings['database_cache_max_rows']['bins']['render'] = 1000;
$settings['database_cache_max_rows']['bins']['config'] = 500;
We cleared the cache on the server and saw the database cache clear out - it wasn't at 0 but it was very low. After implementing this change in the settings.php file, after just 20 minutes of normal site usage we saw the following results:
+--------------------------+--------------+------------+
| Table | Num. of Rows | Size in MB |
+--------------------------+--------------+------------+
| cache_render | 7110 | 56.16 |
| cache_menu | 692 | 20.80 |
| cache_data | 2314 | 14.11 |
| cache_entity | 1192 | 11.69 |
| cache_dynamic_page_cache | 125 | 11.67 |
| cache_page | 67 | 11.36 |
| cache_default | 212 | 3.55 |
| cache_config | 1656 | 2.77 |
| cache_discovery | 127 | 2.55 |
| cache_container | 2 | 1.53 |
+--------------------------+--------------+------------+
Notice that cache_render is at 7110 rows when the settings.php file says to limit it to 1000 rows.
Does anyone see a syntax problem in the settings.php file or is there some other files in the drupal install that might be overriding this change that we can check?
We have been trying to resolve this for three days now with no effect.
Comment #44
berdirThe settings are just about purging existing cached data. They do not and do not intend to prevent data from being stored in the first place. So yes, it will grew above that and will be deleted again afterwards.
It doesn't make sense to set values *that* low IMHO, then you could just as well disable dynamic page cache or set the cache backend to null. It also doesn't make sense to limit config. That won't grow above those values, it's just how many configs you have + translations and so on. But it's very little data.
You will need to investigate which parts of your site result in that many variations for the render cache, e.g. by looking at the first N characters of the cache_render table, look for the mots frequents one and then how many variations they have.
Comment #45
acbramley commentedAgree with @Berdir here, you should investigate what's causing the large cache rows in the first place. I discovered recently that with search_api_tag caching on our search api views, each cached entry was around 5MB and was completely redundant as the data varied on search keys, which were also represented in the URL so the whole page cache varied on the same thing. Not all caching is good caching :)
Comment #47
berdirUpdating the title and adding the needs issue summary tag.
This is not about removing render caching, that's definitely still useful IMHO, it's about optimizing it, with issues like #2935804: Add cache context exclusion list to RenderCache::set(), so we for example have a way to not cache certain elements anymore in regular render cache.
Comment #53
andypostComment #54
andypostComment #59
kristiaanvandeneyndeSmall update:
Now that we have VariationCache and made sure that nothing in core still (ab)uses the render cache to store things other than render arrays, it could be easier to remove the render cache. Having said that, I still see value in keeping it. The comments also seem to be in favor of keeping it for some parts, so perhaps we could update the IS to strike out the part that suggests outright removing the render cache?
Having said that, I think it's hard to determine beforehand what we may or may not want to exclude. For instance the IS suggests disabling it for entities and blocks but keeping it for views. But what if I have a block in my footer that is really expensive to calculate and only ever varies by what day it is, I'd rather that block be render cached and every page request from thereon out gets the cached copy than have to recalculate said block on every page request for it to then be cached by DPC.
I really like @Berdir's suggestion in #11 but I'd turn that upside down. What if we do not cache anything in render cache unless we specify that it's expensive to calculate. This is a lot safer when you consider outside alters: If I flag something as CACHE_NOT_WORTH_IT and then someone alters my render array and adds in something expensive, we have a problem. But if my render array is by default considered as not worth caching and then someone alters in something expensive, adding RENDER_CACHE_REQUIRED, then that would make more sense.
Figuring out what we want cacheable or not by default is still possible that way, we'd only need to add RENDER_CACHE_REQUIRED to the things we know are often expensive.
Comment #60
catch#59 sounds promising, also seems like it would fit quite well with lazy builders/autoplaceholdering where you have to explicitly define something in a way it can be placeholdered.
Comment #61
hmdnawaz commentedPatch for Drupal 10.3
Comment #62
berdirI think render cache is by default still more valuable than not. Entity types can already easily opt out, even per bundle, and we do that for paragraphs for example, where we can safely assume that they are not reused outside of of a single node and by default, if one is invalidated by a save, then all of them are. But nodes very commonly still are absolutely useful to cache. Rendering entities is surprisingly expensive with view display config, many formatters, often multiple child entities like medias, paragraphs, layout builder and so on. It's quite easy to end up with something like a view in a block on all pages that is going to be frequently invalidate your complete dynamic page cache and then you have to rebuild all your node pages again which are most likely still cached.
Media entities I'm on the fence, I can see those being not really worth it and enough to inherit in the parent, I've also had multiple cases with per parent access and customization, where you have to consider the parent anyway. But we don't need any new API's for that, it's just a flag we could set on the entity type. But even for media, don't forget about things the media library, where we render a lot of media entities.
Also for blocks, IMHO the majority are still worth to cache, menu blocks can be very expensive with many links for example, it's IMHO also much easier for BC to add an opt out rather than an opt-in.
We now have rendering time built-into the render debug output, it would be fairly easy to automatically highlight all elements that take less than a configurable/estimated/measured render cache get. We could even build that into the performance tests?
for example, the default powered by drupal block for me reports as "RENDERING TIME: 0.003705978". The default frontpage node from the paragraphs_demo module with 7 paragraphs is "RENDERING TIME: 0.229816914". You can search for RENDERING TIME: 0.00 to find things that are propably not worth to cache:
on olivero, for me, additionaaly:
site branding: RENDERING TIME: 0.005095959 (interesting example, because if you do some more complex things with the logo, e.g. we often have per-language logos, or inline SVG logos, that could easily get slower)
search form: RENDERING TIME: 0.004688025
account menu is RENDERING TIME: 0.009571075
messages block: RENDERING TIME: 0.002686024
syndicate block: RENDERING TIME: 0.004869223
so, doesn't get much faster than ~0.03 on my system. didn't compare yet how fast a render cache get is.
Comment #63
kristiaanvandeneyndeThe render time avenue is worth exploring, but how would that work? The debug info is only available after rendering, so we can choose not to set it when we had to render it, but we'd still be trying to get it from the cache every time it is to be built. Ideally, we don't go to the cache at all, which would be possible using a tag like CACHE_NOT_WORTH_IT or RENDER_CACHE_REQUIRED.
So either we depend on render time and eat unnecessary cache lookups, but have the benefit of a "smart' render cache. Or we go with the aforementioned special flags/tags, but require manual intervention.
If we do choose to go with a smart system based on render time, we'd need a way for people to opt out of it. Let's say you have a block that renders fast but pings an external API, you might wanna cache that block regardless for a given while as to not run into a rate limit. (I know there's ways around that by using a service with an internal cache, but just giving a feasible example)
A downside of adding the render time checks to StandardPerformanceTest is that the time is machine-dependent, so we'd be introducing another test that can randomly fail on GitLab CI.
Either way, not a bad idea. Definitely worth exploring.
Comment #64
catchYeah I've been specifically avoiding any assertions based on times, but we could add something to the OpenTelemetry spans so that render elements show up and/or to webprofiler module.
Comment #65
catchSince the last comments here, we've done #3493911: Add a CachedPlaceholderStrategy to optimize render cache hits and reduce layout shift from big pipe which is not directly related to this issue but does help.
And #3500683: Allow access policies to opt out of caching which added the 'not worth caching' interface discussed here, although not yet applying it to render arrays yet.
For any render array that appears on a lot of pages (e.g. the new navigation toolbar), if you both set cache keys and also #create_placeholder TRUE, then it gets its own render cache item, but also it's only included in the dynamic page cache as a placeholder. That was already true before that issue, but now it will also skip big pipe rendering replacement and be fetched direct from the render cache. It also means that the cache tags are isolated from dynamic page cache item (which might help for things like node:list cache tags and similar).
This means less 'double caching' of items between the dynamic page cache and the render cache, although the placeholder metadata itself still ends up in the dynamic page cache so for tiny items it probably wouldn't save anything.
I think if we combine more placeholdering for some things, with the not-worth-caching logic for others, we should be getting closer to what this issue was trying to do.
Comment #66
catchWe use
'CACHE_MISS_IF_UNCACHEABLE_HTTP_METHOD:form'to prevent caching of forms on POST requests now, so adding another 'special' cache tag would be consistent with that.I think manual intervention is fine: we're mainly interested in:
1. Very high cardinality cache items that are usually inside something else, mostly to save writes/space. Like the paragraphs example but that already has a separate opt-out.
2. Very high frequency cache items that are as-cheap to render without caching, mostly to avoid unnecessary gets, like some of the 'site chrome' blocks identified by @berdir.