Problem/Motivation
Spin-off from #3493406: Add render caching for the navigation render array.
We could render the entire navigation menu via a placeholder. This would have the following benefits:
1. With dynamic_page_cache, we'd only cache the placeholder HTML in the cache item, reducing overall cache bin sizes by quite a lot.
2. With big pipe, we'd be able to send HTML <head> and the page header, and start rendering them earlier. On some sites this will significantly speed up Largest Contentful Paint.
3. With big pipe, navigation-specific libraries will be added via Big Pipe in isolation from other libraries on the page. This should improve CSS and JavaScript aggregation efficiency and cache hit rates, because they won't end up bundled with other aggregates that may or may not include other libraries loaded by different front and admin-facing pages.
However there is one potential drawback:
The navigation bar is consistently in the same spot on every page and it currently loads fairly quickly. Placeholdering it may make it load later relative to other elements in the main viewport. That could increase Cumulative Layout Shift (jank) and potentially make Largest Contentful Paint worse (contrary to the pro above) if it moves elements around that would otherwise have finished rendering.
If this is an issue, we might be able to improve it with a big pipe preview, but then we want to avoid flickering, layout shift etc. from that too. It would at least need to be exactly the same size container just without the inner text or something. Because it's full height and fixed width (I think) that might be doable.
We should be able to get an idea on the last point via manual testing between Standard, Umami and Drupal CMS.
Steps to reproduce
Proposed resolution
Add a CachedPlaceholderStrategy - this will fetch any placeholders from the render cache if they're in there and immediately replace the placeholder, operating before Big Pipe and other strategies run. The result of this is that with a warm render cache, big pipe rendering may be completely bypassed, eliminating the Cumulative Layout Shift it would otherwise cause. On total or partial cache misses, big pipe will be used to placeholder the items that aren't cached.
To support the CachedPlaceholderStrategy, add render cache and variation cache ::getMultiple() methods to optimize the cache operations.
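A minimal sketch of the hit/miss split described above, written as a language-neutral Python model rather than Drupal's PHP API. The function name, the dictionary standing in for the render cache, and the placeholder markup are all assumptions made for illustration.

```python
def cached_placeholder_strategy(placeholders, render_cache):
    """Split placeholders into (replacements, leftovers).

    placeholders: maps placeholder markup to a (hypothetical) cache ID.
    render_cache: dict standing in for the render cache bin.
    """
    # One batched lookup for every cache ID on the page.
    wanted = set(placeholders.values())
    hits = {cid: render_cache[cid] for cid in wanted if cid in render_cache}

    replacements = {}  # placeholder -> cached HTML, swapped in immediately
    leftovers = {}     # placeholder -> cache ID, left for later strategies
    for markup, cid in placeholders.items():
        if cid in hits:
            replacements[markup] = hits[cid]
        else:
            leftovers[markup] = cid
    return replacements, leftovers


placeholders = {"<ph id='nav'>": "navigation", "<ph id='msg'>": "messages"}
warm = {"navigation": "<nav>...</nav>", "messages": "<div>...</div>"}
partial = {"navigation": "<nav>...</nav>"}

all_hits, none_left = cached_placeholder_strategy(placeholders, warm)
some_hits, some_left = cached_placeholder_strategy(placeholders, partial)
```

With the warm cache nothing is left over for Big Pipe; with the partial cache only the messages placeholder is.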
Remaining tasks
User interface changes
Introduced terminology
API changes
Data model changes
Release notes snippet
Issue fork drupal-3493911
- non-recursive: MR !11069
- 3493911-add-a-cachedplaceholderstrategy: MR !11064
Comments
Comment #2
catch
Comment #3
catch
Comment #4
catch
Comment #5
catch
Thought about this more. Because this should have a high cache hit rate and be on every page, it's more part of 'site chrome' than most things. So instead of trying this now, I think we should get #2880237: [meta] Refactor system/base library and similar issues done first - which would result in more other js/css added via placeholders. And #3438976: Implement a caching strategy for the menu links which would reduce the amount of HTML stored by the dynamic_page_cache.
Comment #6
catch
Another thought.
For dynamic_page_cache efficiency, we would want this in a placeholder.
But to avoid cumulative layout shift, we might not want it to be rendered via BigPipe.
BigPipe already has non-js placeholders, for placeholders within attributes etc. which are blocking and sent with the main response. Maybe we can add some kind of additional hint for the placeholder strategy so that the navigation placeholder always gets rendered via a nojs placeholder (i.e. inline with the main response) - this would address both caching efficiency and layout shift then.
Comment #7
catch
Talking to Fabianx:
Comment #8
fabianx commented
To comment on this myself:
The reason we have a ChainedPlaceholderStrategy in core by default is not only to support big_pipe, but because I envisioned that a lot of things that are not bound by URL would become placeholders in the first place.
As can be seen from the linked issue in #7 I had this whole idea of making it super easy to put e.g. a shopping cart placeholder on the page and replace it via JS, AJAX, etc.
And therefore the natural way to go about this is to use the chain:
- CachedPlaceholderStrategy
- BigPipePlaceholderStrategy
- SingleFlushPlaceholderStrategy
The reason is that the cached placeholder strategy can do a cache->getMultiple() [for anything that can be cached] while the SingleFlushPlaceholderStrategy is just rendering them one by one.
So even for just core this can save performance.
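To illustrate why a single getMultiple() can be cheaper than rendering or fetching items one by one, here is a small Python model that only counts backend round trips. The CountingBackend class and the cache IDs are hypothetical and not part of Drupal.

```python
class CountingBackend:
    """Toy cache backend that counts how many times it is queried."""

    def __init__(self, items):
        self.items = dict(items)
        self.round_trips = 0

    def get(self, cid):
        self.round_trips += 1
        return self.items.get(cid)

    def get_multiple(self, cids):
        self.round_trips += 1
        return {cid: self.items[cid] for cid in cids if cid in self.items}


backend = CountingBackend({"cart": "<div>cart</div>", "menu": "<nav>menu</nav>"})
cids = ["cart", "menu", "messages"]

# One query per placeholder.
one_by_one = {cid: backend.get(cid) for cid in cids}
single_trips = backend.round_trips

# One query for all placeholders.
backend.round_trips = 0
batched = backend.get_multiple(cids)
batched_trips = backend.round_trips
```

Here the one-by-one approach costs three round trips and the batched lookup costs one, while both find the same two cached items.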
Comment #9
fabianx commented
For whoever works on CachedPlaceholderStrategy, here is how the rough architecture should look (untested code):
The reason for the recursion is to support the following scenario (part of ORIGINAL 2014 big pipe demo):
- There is a sleep(1) on the page in a cacheable block content, which sets max-age=0
- Then this makes the whole block uncacheable.
- As soon as you create a placeholder within the block for the current_time, then the block becomes cacheable again. You still want to create a placeholder for the block as it contains other content that is independent of the page.
With the CachedPlaceholderStrategy with the above implementation:
- If the block is uncached then the whole block is streamed via big_pipe.
- If the block is cached then just the placeholder is streamed via big_pipe.
Without the recursive calling of the strategy, this would fail and the `current_time` would again make the whole page slow.
Comment #10
catch
So to make this work:
Add a CachedPlaceholderStrategy - this does a multiple get on render cache items, replaces the placeholder directly with the cached HTML.
This runs all the time, before big pipe's own strategy (if big pipe is enabled). Big pipe then will only deal with placeholders which aren't already render cache hits.
When you have several placeholders render cached for stuff within the viewport, this will significantly reduce cumulative layout shift and maybe largest contentful paint too, because all the HTML gets sent together until it can't be.
It will also work very nicely in tandem with #3394423: Adopt the Revolt event loop for async task orchestration because then anything expensive in the uncached placeholders that can be done async will be.
edit: crossposted with @Fabianx.
Comment #11
fabianx commented
While this is untested, here is the code that Grok + ChatGPT created together with some direction for VariationCache::getMultiple():
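As a rough illustration of the idea (a Python model, not the PHP posted in this comment), a batched variation-cache lookup can follow redirects level by level, doing one multi-get per level instead of one get per item. The Redirect class, make_cid() and the cache ID format are assumptions for this sketch, and it assumes redirect chains never loop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Redirect:
    """Stored in place of data when more cache contexts must be applied."""
    contexts: tuple


def make_cid(keys, contexts, context_values):
    # Hypothetical cache ID format: keys plus the value of each context.
    parts = list(keys) + [f"[{c}]={context_values[c]}" for c in sorted(contexts)]
    return ":".join(parts)


def get_multiple(backend, requests, context_values):
    """requests: {name: (keys, initial_contexts)} -> {name: data} for hits."""
    pending = {
        name: make_cid(keys, contexts, context_values)
        for name, (keys, contexts) in requests.items()
    }
    results = {}
    while pending:
        # Stand-in for one batched backend lookup per redirect level.
        wanted = set(pending.values())
        fetched = {cid: backend[cid] for cid in wanted if cid in backend}
        next_pending = {}
        for name, cid in pending.items():
            item = fetched.get(cid)
            if item is None:
                continue  # cache miss, drop it
            if isinstance(item, Redirect):
                keys, _ = requests[name]
                next_pending[name] = make_cid(keys, item.contexts, context_values)
            else:
                results[name] = item
        pending = next_pending
    return results


context_values = {"user.roles": "editor", "theme": "claro"}
backend = {
    "block:nav": Redirect(("user.roles",)),
    "block:nav:[user.roles]=editor": "<nav>editor</nav>",
    "block:footer:[theme]=claro": "<footer/>",
}
requests = {
    "nav": (("block", "nav"), ()),
    "footer": (("block", "footer"), ("theme",)),
    "cart": (("block", "cart"), ()),
}
found = get_multiple(backend, requests, context_values)
```

In this example the footer is a direct hit, the navigation item is found after following one redirect, and the cart is a miss.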
Comment #12
fabianx commented
With that in place, we can simplify our strategy significantly:
Let's just add a getMultiple() to RenderCache as well (help from ChatGPT).
Then our CachedStrategy looks like:
Comment #13
catch
Discussed with @plopesc and I'm moving the navigation stable blocker tag to this issue from #3438976: Implement a caching strategy for the menu links.
#3493406: Add render caching for the navigation render array just landed which unblocks this issue.
We might need to split this issue into separate 'add the API' and 'use a placeholder' issues, but since the end goal of this issue is that there's zero impact on the front end, except for a render cache miss served by BigPipe, not sure how to test all of this yet, so will be nice to be able to piggy-back off the existing test coverage and manually test etc.
Comment #14
catch
Bumping to major because this is potentially a big Cumulative Layout Shift improvement for all big pipe responses. It will also work very nicely with the cold cache optimisations that #3394423: Adopt the Revolt event loop for async task orchestration will enable.
Comment #15
plopesc
I could have some bandwidth to start working on this issue this week.
I went through the comments and it's not 100% clear to me where I should start my investigation, or even whether all the pieces needed to start are already in place.
It would be great if you could add a comment indicating the possible steps to follow, to help me kick it off.
Comment #16
catch
@plopesc so #12 has a draft of what it should look like.
We need:
1. RenderCache::getMultiple() as a new method - this is allowed via the 1-1 rule I think. It's not strictly required to make this work, but it'll be an additional performance improvement. If it turns out to be hard we can split it to its own issue.
2. The CachedPlaceholderStrategy itself. This needs a priority so it runs before any other placeholder strategy (e.g. so cached placeholders are replaced first, then bigpipe et al get to process what's left over).
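The ordering requirement in item 2 can be sketched as a priority-sorted chain in which each strategy only sees what earlier strategies left behind. This is a Python model with made-up priority numbers and strategy handlers, not Drupal's service definitions.

```python
def run_chain(placeholders, strategies):
    """Run strategies in descending priority; each sees only the leftovers.

    strategies: list of (priority, name, handler), where handler takes the
    remaining placeholders and returns the ones it chooses to process.
    """
    handled_by = {}
    remaining = list(placeholders)
    ordered = sorted(strategies, key=lambda s: s[0], reverse=True)
    for _priority, name, handler in ordered:
        if not remaining:
            break
        taken = set(handler(remaining))
        for placeholder in remaining:
            if placeholder in taken:
                handled_by[placeholder] = name
        remaining = [p for p in remaining if p not in taken]
    return handled_by


render_cache_hits = {"navigation", "branding"}  # pretend these are cached

strategies = [
    (0, "big_pipe", lambda phs: list(phs)),  # takes whatever it is given
    (100, "cached", lambda phs: [p for p in phs if p in render_cache_hits]),
]

result = run_chain(["navigation", "branding", "messages"], strategies)
```

Even though big_pipe is listed first, the higher priority means the cached strategy runs first, so big_pipe only ends up handling the messages placeholder.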
I am pretty sure that a working MR will cause performance tests to fail - e.g. on bigpipe warm cache requests different things will happen. If they fail in the right way and most other things pass, then it's working.
Not sure how best to add explicit test coverage yet but that might be clearer once it's up and running.
I'm pretty excited about the possibilities this issue opens so please ping me in slack if the above isn't useful or if something else comes up. I can't speak for fabianx but my guess is he'd be the same given it was his original idea 8+ years ago.
Comment #18
plopesc
Thank you for your feedback.
Started to work on a draft MR based on the boilerplate code from #11 & #12 with some minor adjustments to make it work, but found some issues related to recursion of CachedStrategy & Big Pipe. After the 1st load, once the cache is warm, the page content is not being loaded and the following error is thrown instead:
The #lazy_builder property must have an array as a value, containing two values: the callback, and the arguments for the callback. in assert() (line 327 of core/lib/Drupal/Core/Render/Renderer.php)
This seems to happen because the same placeholder is rendered twice in BigPipeStrategy::createBigPipeNoJsPlaceholder() from different places, which ends up duplicating the same #lazy_builder callback. Hence, when the #attached arrays are merged as part of the page build process, the callback array has 4 elements instead of 2. This is happening for the Logout link placeholder, which is rendered both in the User Account menu and in Navigation.
Steps to reproduce
I'm not sure how to proceed here, we could indicate at build time that #lazy_builder callbacks should not be merged, but that could be complex. Another option could be to keep track of the generated placeholders somewhere and avoid duplicates.
Any idea?
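The failure mode described in this comment can be modelled with a generic deep merge that concatenates list values. This is a simplified Python model under that assumption, not Drupal's actual merge code; the placeholder string and builder name are made up.

```python
def merge_deep(a, b):
    """Recursively merge b into a copy of a, concatenating list values."""
    result = dict(a)
    for key, value in b.items():
        current = result.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            result[key] = merge_deep(current, value)
        elif isinstance(current, list) and isinstance(value, list):
            result[key] = current + value
        else:
            result[key] = value
    return result


def merge_keep_first(a, b):
    """Alternative: treat each placeholder as atomic and keep the first copy."""
    return {**b, **a}


# The same placeholder attached from two places on the page.
logout = {"<ph logout>": {"#lazy_builder": ["build_logout_link", []]}}

broken = merge_deep(logout, logout)
safe = merge_keep_first(logout, logout)
```

Merging the duplicate deeply yields a four-element #lazy_builder (callback, arguments, callback, arguments), while keeping the first copy of each placeholder preserves the expected two elements.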
Comment #19
catch
One question on the MR but it could be complete misdirection because I didn't try to verify or anything.
Comment #21
catch
So the current placeholder implementation explicitly supports duplicate placeholders on the page like CSRF links, I think it would be a bc break to prevent that.
I did some debugging of the original recursion bug that plopesc found, tried some different versions of the MR etc. and tried to fix it without a bc break (but keeping the recursion) in various unsuccessful ways.
The recursion is definitely resulting in the big pipe placeholder strategy being run twice on the same placeholder - once called from within CachedPlaceholderStrategy and once by itself when the chain iteration reaches it. This does not happen without the recursion.
I have read and re-read this from #9 and I can't see how it's the case:
This MR has zero impact on a cache miss - it will behave exactly the same as now with everything streamed via big_pipe. I'm not seeing how anything can get worse on a cache hit either, placeholders within something that is returned from a cache hit will be picked up as they otherwise would too (e.g. by bigpipe if that's what ends up handling them).
So... I think the answer here is to remove the recursion. Would be good to get either confirmation from Fabianx that #9 was mistaken or a test case to demonstrate that there really is a problem.
Comment #22
catch
Tagging as a release priority for navigation.
And also as a release highlight since it's a general render caching improvement.
Comment #24
catch
Briefly discussed the recursion with Fabianx in slack - he said it would only optimise an extreme edge case which may not matter in practice. I think it would be good to have a follow-up to investigate whether it's needed for that edge case, and if so how to implement it transparently without a bc break, but definitely no need to block this issue on getting it working (which I am glad about because I tried for a couple of hours or so and didn't get anywhere).
Comment #25
smustgrave commented
No additional input, it just appears to need a rebase.
If you are another contributor eager to jump in, please allow the previous posters at least 48 hours to respond to feedback first, so they have the opportunity to finish what they started!
Comment #26
plopesc
Back to Needs Review once conflicts have been solved and last comments in the MR addressed.
If the changes here are good and the approach is validated, it might be time to figure out how to implement specific tests.
Comment #27
kristiaanvandeneynde
Reviewing the whole thread, bear with me:
+1
Re #9 and #21 Would indeed be great to get some extra information here from @fabianx.
Re #11: Holy shit, AI wrote that? Because that looks like it might work.
Re #12: I need to start believing in AI more, both this and the above proof of concept look amazing.
Will check the MR next.
Comment #28
kristiaanvandeneynde
Okay so I'm willing to RTBC this from a code point-of-view, provided a few things are cleaned up.
First and foremost: The new methods' documentation on the interfaces is far too verbose and opinionated. If you want, I can rewrite them for you but I'll give you first dibs.
Secondly, I don't think it's wise to gut VariationCache::get() to make it use ::getMultiple(). It's slower, it's less readable and we still have ::delete() and ::invalidate() using the original ::get() body, so we'd be confusing people trying to read and understand VariationCache as to why ::get() is doing weird stuff.
Finally, the same could be said about RenderCache::get() but I'm a bit less fussed here. I still think we should preserve as many performance optimizations as the original code had so I'd like to see a bin existence check earlier in the process of getMultiple().
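One way to read the "bin existence check earlier" request is sketched below: group the elements by cache bin and skip anything whose bin cannot be loaded before doing any other work. This is a Python model only; the element shape, the default bin name and the function name are assumptions, not RenderCache's real signature.

```python
DEFAULT_BIN = "render"


def render_cache_get_multiple(elements, bins):
    """Return {name: cached value} for the elements that are cache hits.

    elements: {name: {"keys": [...], "bin": optional bin name}}
    bins: {bin name: dict standing in for that cache bin}
    """
    cids_by_bin = {}
    for name, cache in elements.items():
        if not cache.get("keys"):
            continue  # not cacheable, nothing to look up
        bin_name = cache.get("bin", DEFAULT_BIN)
        if bin_name not in bins:
            continue  # bin unavailable: bail out before building a cache ID
        cids_by_bin.setdefault(bin_name, {})[name] = ":".join(cache["keys"])

    results = {}
    for bin_name, cids in cids_by_bin.items():
        backend = bins[bin_name]
        # Stand-in for one batched lookup per bin.
        for name, cid in cids.items():
            if cid in backend:
                results[name] = backend[cid]
    return results


bins = {"render": {"block:nav": "<nav/>"}}
elements = {
    "nav": {"keys": ["block", "nav"]},
    "cart": {"keys": ["block", "cart"], "bin": "missing_bin"},
    "clock": {},
}
found = render_cache_get_multiple(elements, bins)
```

Only the navigation element is returned: the cart is skipped because its bin does not exist, and the clock has no cache keys at all.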
Overall really exciting MR, though. Thanks for working on this!
Comment #29
kristiaanvandeneynde
Also at this point should we tackle navigation in here or in (yet another) follow-up? It seems the discussion and MR here went in a totally different (yet logical) direction :)
Comment #30
catch
With navigation it might be easier to do it in a spin-off issue. It's soft-blocked on this issue because this issue fixes cumulative layout shift from placeholders, but the implementation itself doesn't require anything here, so splitting would allow us to work on, review and commit them in parallel. afaik the only thing we need to change for navigation is from #pre_render to #lazy_builder + setting #create_placeholder = TRUE.
I haven't fully digested the re-use vs not question on the methods, but if we think it's better optimized to keep the logic separate, that seems fine.
Comment #31
plopesc
I already made the Navigation changes in the old MR that was replaced by the current one, so I could easily cherry-pick them. On the other hand, keeping the focus here would make this issue simpler.
I'm totally OK if you feel it's better to keep the methods separate. I just wanted to reduce the amount of code a bit, but the downsides could outweigh the benefits.
Regarding docblocks, I can take a look there. The current ones were just a copy&paste of those suggested in #12.
Comment #32
kristiaanvandeneynde
Thanks, feel free to ping me on Slack for feedback or assistance with the last few bits. Again, this is looking amazing so I definitely want to give you my full support on getting this in.
Comment #33
catch
Went ahead and opened #3504386: Use a placeholder for the navigation toolbar - we can cherry-pick the commits from the MR here over to that issue.
Comment #34
plopesc
Worked on the bits mentioned in the MR. I think this is ready for a new review.
Comment #35
catch
One very minor comment on the MR but no actual complaints, this is looking really good to me.
Comment #36
catch
Re-titling now that the navigation change is split out.
Comment #37
plopesc
Made a rebase once #3504386: Use a placeholder for the navigation toolbar was merged and addressed the last comment in the MR.
Comment #38
kristiaanvandeneynde
Some documentation still mentions redirects where we're supposed to be agnostic of its existence. Will update myself, but as far as I am concerned this is RTBC. Just give me a minute to make some final adjustments.
Comment #39
kristiaanvandeneynde
Done. If I had only updated the docs I would RTBC, but I made a few changes to RenderCache to make it more readable and bail out early if we could not load any bin, just like ::get() does. For that reason, I'll yield to @catch to RTBC.
But everything I did not touch and the general concept is RTBC to me now.
Comment #40
berdir
Reviewed.
Comment #41
kristiaanvandeneynde
Addressed all feedback. Some nice finds, @berdir.
Code looks a lot cleaner after this last round of feedback too :)
Comment #42
berdir
Thanks, looks good to me now.
One thing I noticed is that CacheFactory does not have a static cache, every time you ask it for a bin, it creates a new instance. That's likely the reason why the Renderer in the past used the service and not the cache factory directly.
This probably doesn't have a big impact on the database. It does for redis because it stores some information like the last deleted flag, but the redis cache backend factory does have a static cache per bin for this reason.
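The difference between a factory that builds a new backend on every call and one that keeps a static cache per bin can be sketched as follows. This is a Python model with hypothetical class names, not the Drupal or Redis module classes.

```python
class Backend:
    """Toy cache backend holding some per-instance state."""

    def __init__(self, bin_name):
        self.bin_name = bin_name
        self.last_deleted = None  # state that is lost if instances multiply


class PlainFactory:
    """Creates a fresh backend every time a bin is requested."""

    def get(self, bin_name):
        return Backend(bin_name)


class MemoizingFactory:
    """Keeps one backend instance per bin (a static cache per bin)."""

    def __init__(self):
        self._bins = {}

    def get(self, bin_name):
        if bin_name not in self._bins:
            self._bins[bin_name] = Backend(bin_name)
        return self._bins[bin_name]


plain = PlainFactory()
memo = MemoizingFactory()

plain_same = plain.get("render") is plain.get("render")
memo_same = memo.get("render") is memo.get("render")
```

With the plain factory each request yields a different object, so any state such as a last-deleted marker is not shared; the memoizing factory hands back the same instance for the same bin.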
Comment #43
catch
Just committed #3493813: Drupal\Core\Theme\ComponentNegotiator::negotiate uses a lot of memory which may cause merge conflicts here.
Comment #44
catch
Yes, needs a rebase.
Comment #45
berdir
Rebased this, the only conflicts were in PerformanceTest and StandardPerformanceTest, and the resulting numbers are identical, it's just that there were already some decreases in HEAD.
So I think it's OK if I set that back to RTBC.
Comment #47
catch
So glad that Drupal CMS performance testing uncovered a way to improve navigation caching which in turn resurrected Fabianx's 8 year old idea to do this, and that we got it done.
This unblocks #3437499: Use placeholdering for more blocks which should in turn increase the impact of several other issues.
Committed/pushed to 11.x, thanks!
Comment #49
andypost
It needs a summary update about the API change at least, or a change record.
Comment #50
catch
Added a change record and filled out the proposed resolution in the issue summary for any git archaeologists.
https://www.drupal.org/node/3504958
Comment #51
catch
Fabianx pointed out we should add some explicit test coverage, opened #3505115: Explicit tests for VariationCache::getMultiple() and RenderCache::getMultiple() for that. There's a lot of implicit test coverage that this doesn't break anything but a more specific test would help us understand why if/when we do break it.
Also opened #3505117: Evaluate recursive placeholder replacement in CachedPlaceholderStrategy to evaluate the sleep(1)/recursion edge case.
Comment #53
catch
Comment #54
borisson_
The change record that is linked to this issue is not correct, it seems?
Comment #55
kristiaanvandeneynde
It is. It's the new API that we introduced here.
The CachedStrategy itself isn't that CR-worthy (IMO) as it is merely an optimization.