Problem/Motivation
The toolbar takes ~15 ms and ~17,000 function calls to render. On every page load. That's ridiculously slow. See #21.
The Toolbar module has optimized the rendering of the menu tree, by aggressively caching it. But… the rendering of the "main" toolbar (the black bar at the top of the page, with the white trays underneath it if you click any of the black items) is not yet cached.
Proposed resolution
We have cache context bubbling (#2429257: Bubble cache contexts). Use that: let every item returned by hook_toolbar()
specify cache contexts if it depends on contextual information.
See #22 for detailed performance numbers. But here's the gist of it:
Remaining tasks
Review.
User interface changes
None.
API changes
There's a case of "API refinement":
- Roughly speaking: no API changes.
- Strictly speaking: we now require
hook_toolbar()
implementations to associate cache contexts if they are dynamic. Since that would be an API break in Drupal 8.x.0, marking this is as critical.
Original report by Wim Leers
Problem/Motivation
Given a relatively optimized setup (PHP 5.5 with OpCache enabled, no xdebug/xhprof), super simple pages like /contact
take 190–200 ms on my computer to generate. That's ridiculously slow.
So I did some profiling and easily found a piece of tasty low-hanging performance fruit: toolbar_pre_render()
.
The Toolbar module has optimized the rendering of the menu tree, by aggressively caching it. But… the rendering of the "main" toolbar (the black bar at the top of the page, with the white trays underneath it if you click any of the black items) is not yet cached.
Render caching the "main" toolbar seems easy enough, but the user tray is personalized for each user and the shortcuts tray is not by default, but might be. The easiest solution for that is to cache the toolbar per user, but that easily becomes very expensive.
Proposed resolution
The only mechanism that we have to cache per role but still be able to personalize per user is #post_render_cache
. If we apply that, we get the following improvements (measured using the excellent https://drupal.org/project/webprofiler), using PHP 5.5 with OpCache enabled:
/
-
Before After Δ DB queries 183 150 -33 Response time (ms) Variance too high. Memory (MB) 13.8 13.2 -0.6 Cache gets 74 66 -8 Config gets 50 46 -4 /node/1
-
Before After Δ DB queries 173 140 -33 Response time (ms) Variance too high. Memory (MB) 13.8 13.2 -0.6 Cache gets 73 65 -8 Config gets 54 50 -4 /contact
-
Before After Δ DB queries 136 103 -33 Response time (ms) ~190 ~170 -~20 Memory (MB) 11 10.2 -0.8 Cache gets 46 38 -8 Config gets 41 37 -4
Remaining tasks
Review.
User interface changes
None.
API changes
None.
Postponed until
Comment | File | Size | Author |
---|---|---|---|
#86 | 2254865.86.patch | 13.56 KB | alexpott |
#86 | 83-86-interdiff.txt | 3.38 KB | alexpott |
#83 | interdiff.txt | 3.73 KB | lauriii |
#83 | toolbar_pre_render-2254865-83.patch | 13.12 KB | lauriii |
#73 | interdiff.txt | 5.02 KB | lauriii |
Comments
Comment #1
Wim LeersIn the process, I've discovered a new set of problems previously undiscovered:
drupal_render()
call to render the entire page. However, in practice,DefaultHtmlFragmentRenderer
renders the specialpage_top
andpage_bottom
regions first, and only then renders everything else. This causes#post_render_cache
callbacks inside those two special regions to be executed again when rendering everything else.drupal_render_collect_post_render_cache()
was running too earlydrupal_render_children()
by definition is a recursive call, but it was pretending to be the root call every time it rendered a child, therefore executing#post_render_cache
callbacks far too early (toolbar.module usesdrupal_render_children()
, hence I discovered this)Comment #3
jessebeach CreditAttribution: jessebeach commentedGiven the significant changes coming down the pipe in #2099131: Use #pre_render pattern for entity render caching and the nature of the failures in #1, perhaps we should postpone this work for the moment. We've seen errors like this before in our
#pre_render
issue work.Comment #4
Wim LeersAgreed.
Comment #5
Wim LeersWorking on #2241235: Shortcut/ShortcutSet entity types should use cache tags in the mean time, that will allow this patch to make an even bigger impact, because it'll allow us to render cache the shortcuts as well!
Comment #6
Wim Leers#2099131: Use #pre_render pattern for entity render caching landed, but let's get #2241235: Shortcut/ShortcutSet entity types should use cache tags done first.
Comment #7
jessebeach CreditAttribution: jessebeach commentedThis is unpostponed.
Comment #8
sunWait, why are we (ab)using the render cache for everything now?
The issue summary mainly talks about caching data?
Also, it refers to per-role caching, but the toolbar actually needs to be cached per-session. (?)
Comment #9
Wim Leers#7: I had left it marked as postponed as per #6 ;)
#8:
Comment #10
Wim Leers#2241235: Shortcut/ShortcutSet entity types should use cache tags landed! Now let's get this one done :)
Comment #11
Wim LeersRerolled, now that both #2099131: Use #pre_render pattern for entity render caching and #2241235: Shortcut/ShortcutSet entity types should use cache tags have landed. No interdiff; too many changes, and this is the first patch up for review anyway.
Now, instead of re-rendering the shortcut links on every page load, they're also render cached.
So, overall, in a Standard profile install with this patch:
#post_render_cache
callback replaces this with the shortcut tray contents, which are render cached themselves as well (to prevent all shortcut entities from being loaded on every page load)#post_render_cache
callback replaces this with the user tray contents, which are not cheap to render but are user-specificThe difference has become even slightly more stark now (D8 @ a0961ff65b273483f1a1d2f92318b1d21bfaceae): at
/contact
without this patch, 136 DB queries are performed (down from 139 when this issue was opened), and with this patch, 103 DB queries are performed (down from 108 when this issue was opened)… so a difference of 33 DB queries!.There's only one special hunk in this patch, which is the one for
DefaultHtmlFragmentRenderer::render()
. Instead of the crazy hacks I applied in #1, I propose a much saner approach here: don't letDefaultHtmlFragmentRenderer::render()
abuse the renderable page array anddrupal_render()
in unintended ways.DefaultHtmlFragmentRenderer::render()
must create aHtmlPage
, andHtmlPage
expects both "top-content" (setBodyTop()
) and "bottom-content" (setBodyBottom()
) to be separate things, and both of those most already be rendered HTML. Butpage_bottom
andpage_top
are part of the renderable array like all other regions, even though they should be rendered separately. So the fix is simple: we pull them out of the renderable page array and treat them separately, just likeHtmlPage
expects us to do.Without this change, any
#post_render_cache
callback that was applied for$page->setBodyTop(drupal_render($page_array['page_top']));
would be re-applied for$page->setContent(drupal_render($page_array));
, because$page_array
is a superset of$page_array['page_top']
, hencedrupal_render()
re-applies#post_render_cache
callbacks.(Yes, we're still cleaning up the half-baked mess that was introduced in [#2170551]…)
Comment #12
jessebeach CreditAttribution: jessebeach commentedLook at some performance numbers for the patch in #11. This is my standard testing scenario: a single, simple article with body, image and tags. I compared 8.x warm cache to PATCH warm cache, running 6 page loads for each and picking the fastest run to compare.
I get a 64ms(!) drop in page load time for a single article. Wowzers, that's awesome!
Comment #13
jessebeach CreditAttribution: jessebeach commentedIt seems like this was outright missing from the #pre_render work we did, no?
Just curious. Why the unsetting? Wouldn't this renderable just be marked as 'printed' after rendering?
Oh, you unset the page top and bottom above so that the cache tags won't get mingled together with the content of the page, right?
Comment #14
Wim Leersdrupal_render()
has some things that descend down subtrees that have been marked as 'printed'. This sidesteps that problem. Plus, it's conceptually more honest: we have three parts of the page that must be rendered independently, so then let's do it truly independently.Comment #16
Wim LeersSadly, the promises of this patch were too good to be true.
I made a HUGE mistake: I failed to consider the "menu" tray (called
toolbar_administration
in code)… which is of course rendered menu tree. I was caching this with the rest of the toolbar, so render cached per role, theme and language, instead of creating another post-render cache placeholder for it (hence the test failures). If I fix that, and render that tray in another#post_render_cache
callback, the deltas look like this:That's of course not great.
The only sane solution is to finally fix #1805054: Cache localized, access filtered, URL resolved, and rendered menu trees. So I'm going to work on that now.
What this patch does overall is still valid, because the different parts of the toolbar have different personalization aspects (for menu trees, see #1805054: Cache localized, access filtered, URL resolved, and rendered menu trees; for the shortcuts tray: each user may have a different shortcut set, but many have the same; for the user tray: this is obviously per-user, but is cheap to render).
So, this is now postponed on #1805054, once that is done, we can reroll this patch and it should still be a net win. But the huge win lies in #1805054.
Comment #17
Wim LeersLots of blockers for #1805054: Cache localized, access filtered, URL resolved, and rendered menu trees went in, I'm currently rerolling #1805054: Cache localized, access filtered, URL resolved, and rendered menu trees. I rerolled this patch as well to get a sense of the performance improvements of this in *today*'s HEAD with #1805054's patch (WIP) applied.
This can save an additional 6 DB queries per page, ~8 ms, and 5000 function calls.
I might merge this into #1805054, we'll see how that evolves. In the mean time, uploading an updated patch. The majority of the changes in
toolbar.module
from the above patches has already been committed to Drupal core, hence this patch is about half as big.Comment #18
jessebeach CreditAttribution: jessebeach commentedIn my queue to review.
Comment #19
Wim Leers#1805054: Cache localized, access filtered, URL resolved, and rendered menu trees is moving forward again, and once that lands, we'll be able to do this too.
Comment #20
Wim Leers#1805054: Cache localized, access filtered, URL resolved, and rendered menu trees landed, so this is now finally unblocked!
Comment #21
Wim LeersTo decide whether this is still relevant, I did some profiling and benchmarking. The answer is a resounding "YES", this is still worth doing.
Testing as UID 1, the root user. The costs for other authenticated users would be even higher.
Toolbar::preRenderToolbar()
ab
requestsIf the solution for this turns out to be impossible to do in 8.1, this will therefore be bumped to critical as per the guidelines in #1744302: [meta] Resolve known performance regressions in Drupal 8
Comment #22
Wim LeersThe patch to solve this is completely different than previous patches. It is much simpler. Because we now have #2429257: Bubble cache contexts, so we let the system take care of it instead of making
hook_toolbar()
complicating implementations.The performance difference is very stark.
This is profiled as UID 1 on the frontpage, using the Standard install profile. Fresh install. UID 1 because that is *cheaper* in terms of access checking (lots of things return early for UID 1) and because that user can access the toolbar.
In other words: the numbers below present the minimum expected improvement, for more complex set-ups and for non-admin users, the improvement will be much bigger.
Profiling (XHProf)
Benchmarking (
ab -n 1000 -c 1
)Before vs. after
IS updated. Also marking this as critical, because AFAICT this would be considered an API break in Drupal 8.x.0, and it exceeds the ">=10 ms faster for warm caches" requirement.
Comment #23
Wim LeersComment #24
Wim LeersOh, and as a bonus, this should also reduce the number of failures in #2429617: Make D8 2x as fast: Dynamic Page Cache: context-dependent page caching (for *all* users!).
Comment #26
dawehnerWhy do we have to explicit care about those cache contexts here. Won't things bubble up automatically? OH I see there is a call to permissions in here.
So ideally #access should be able to contain this cacheablity metadata, right?
The entry edit profile is also per user ...
Will this be automatically done via a placeholder in the future?
Comment #27
Wim LeersEasy fix, that points to another bug in the cacheability metadata.
Obviously, all of this still needs test coverage. The toolbar module has test coverage, but not for this area.
Comment #28
Wim Leers#26 Thanks for the review!
Comment #29
catchIs there any value to creating a 'current users's shortcut set' cache context? Then if a single shortcut set is used across the site, we'd not need to vary per-user.
Would have to lookup the shortcut set each time in the cache context implementation - bit of overhead there but given a lot of sites just have one shortcut set is probably a good trade off?
Comment #30
Wim LeersHahaha! Glad you asked! This is exactly what crossed my mind as well :) I figured to not bring it up to not distract the discussion :)
Reasons to not do that here:
user_toolbar()
is already per-user anyway.Comment #31
Wim LeersComment #32
Wim LeersAlso, fun fact: see #1805054-144: Cache localized, access filtered, URL resolved, and rendered menu trees, which contains the perf numbers for the issue that was blocking this one. It is that issue that brought down the UID1 front page load time down to 105 ms. This one is bringing that 105 ms down to 90 ms. They're adding up quite nicely!
Comment #33
Wim LeersUnassigning. I'm working on other performance patches now, hopefully somebody else can write the test coverage.
Comment #34
Fabianx CreditAttribution: Fabianx for Acquia commentedCNW for test coverage, code looks great!
One question:
Similar to dawehner I wonder why this needs to be set for the whole thing. Can't it be set on the "Edit user account" render array entry and bubble up?
Comment #35
Wim Leers#34: no, that's impossible, because what you're seeing is NOT a render array, but the value for the
#links
property.Comment #36
Fabianx CreditAttribution: Fabianx as a volunteer commentedWould the "Edit user account" link be anytime be differently cacheable than per user? So could the link itself have this? (not to block this - the solution is fine, just very curious).
Comment #37
Wim Leers#36: Well, it's the route parameter that is being set to the current user:
So, no, the link itself could not have this, because this is not a
MenuLinkInterface
object. It's just a simpleUrl
object.Comment #38
Fabianx CreditAttribution: Fabianx for Acquia commentedOpened #2495779: Make #theme => links take cacheability metadata as an argument to clean this up. Thanks for the explanation!
Comment #39
lauriiiWriting tests for this
Comment #40
lauriiiComment #42
lauriiiOopsie..
Comment #45
lauriiiComment #46
googletorp CreditAttribution: googletorp at Reveal IT commentedLooks good.
Comment #47
lauriiiMinor typo s/Disbale/Disable. Can be fixed during commit:)
Comment #48
webchickMethinks this is a catch-patch.
Comment #49
alexpotthis => their, he => they
Duplicate #cache keys - which one is right - atm only the last is being used.
Comment #50
catchAlso this was bumped to critical due to an API change, but we're missing a change record.
I didn't spot anything that wasn't in #47/#49 otherwise.
Comment #51
Wim LeersWe repeat this everywhere. Let's put that in a
$required_cache_contexts
variable and merge that with the other arrays. That'll allow us to remove repetition, and will hence make it clearer what the actual expectations for each specific thing are.In fact… why not create something like:
So that we can iterate over the various toolbar + other module combinations and define the expectations?
s/menu/toolbar tab+tray/
s/toolbar/toolbar tab+tray/
This is not testing the right thing yet. Anonymous users never get to see the toolbar, which is why they have the same expected cache contexts across all situations!
What this really needs to be testing is that if one authenticated user has the "access contextual links" permissions and another does not, that for both of those users the
user.permissions
cache context is present.Comment #52
Wim LeersWe could clean up the
$links_cache_context
stuff if we land #2495779: Make #theme => links take cacheability metadata as an argument. That has a patch now.Comment #53
lauriii#51.4 Isn't those cache contexts attached even if the user don't have the permissions? I guess my test cases proves that they are attached even without permissions because of the user in my test is never given the permissions because the cache contexts are always there even without permissions.
Comment #54
Wim LeersYes, they are, but the anonymous user doesn't see the toolbar at all, so those tests as the anonymous user don't really test anything :)
Comment #55
lauriiiWorking on fixing comments from #51
Comment #56
lauriiiFixed the nits on the tests.. Might be working on this later on. Unassigning so others can take a look if they are willing to do so
Comment #57
Wim LeersThis looks vastly better! IMO this is RTBC-worthy. What did you plan to do later on? (It sounds like you still wanted to make some changes, but I'm not sure what still needs to be improved?)
Granting additional permissions would be slightly better, but wouldn't actually matter here.
Nit:
string[]
.s/has/have/
Cache::mergeContexts()
Comment #58
borisson_Fixed nitpicks 2, 3 and 4 of #57.
Comment #59
Fabianx CreditAttribution: Fabianx for Acquia commented#49 was not yet addressed.
Still needs a change record for the API changes.
Comment #60
Wim LeersTo address #49, use
Cache::mergeContexts()
.Comment #61
Wim LeersCR created. This reminds me: we must also update the documentation for
hook_toolbar()
.Comment #62
borisson_Addressed both remarks from #49
Comment #64
Fabianx CreditAttribution: Fabianx for Acquia commentedNeeds to be an array and not an array of arrays.
Also first parameter needs to be an array, too.
Comment #65
JeroenTFixed the failing tests as suggested by fabianx in #64.
Comment #67
JeroenTAdded interdiff in txt format instead of patch format.
Comment #68
Fabianx CreditAttribution: Fabianx for Acquia commentedRTBC, looks great now!
@Core Committers:
Can I haz commit credit, please? (for code reviews)
Comment #69
lauriiiDouble commit credit for Fabianx!
Comment #70
webchickBack to catch, Fabianx is checked. :)
Comment #71
alexpottCalling the same method lots of times with a test method makes it much harder to debug when something goes wrong within the
testToolbarCacheContexts
method. Personally I think making the method an assert would make it easier to know what has gone wrong. But at the very least outputting some kind of verbose message in thetestToolbarCacheContexts
to let us know what is being tested would be good.Let's wrap this into a private method to install a module since atm we're doing an unnecessary
drupal_flush_all_caches()
on the first call to the method. Also I think thedrupal_flush_all_caches()
should be a $this->rebuildContainer(). So the test's container is properly updated.This didn't get fixed since it was reported... :(
Comment #72
lauriii#71.2: The problem is that $this->rebuildContainer() doesn't rebuilt routes why it didn't work in this case
Comment #73
lauriii#71.1 & 3 fixed
Comment #74
tim.plunkettInstead of making this a private function to avoid simpletest from running it (since it starts with the word test), please please make it protected and name it something else.
Comment #75
Wim Leers#71.1: Oops! I should've caught that.
#74: +1, please rename it to
assert*
, which is the convention. i.e. in this caseassertToolbarCacheContexts()
.Comment #76
dawehnerOnce you use assert() the line number are changed and so its harder to debug that test. Do you mind adding some
$this->pass("Some helpful string of what happens next")
here?Comment #77
Wim LeersPerhaps "verify"is better, then.
We have a strange test system.
Comment #78
alexpott@dawehner really? If it fails on any lines of the private
testToolbarCacheContexts()
you actually do know which call to it failed - which is silly. It's a custom assertion knowing where it failed intestToolbarCacheContextsCaller
is the entire point.Comment #79
alexpottWe're still doing an unnecessary
drupal_flush_all_caches()
. IU think we should still wrap the module installs and this in a method on the class.Comment #80
dawehnerPlease reread my comment. I haven't argued against using
assert*()
but rather making it easier inside the assert function to see where you are.Comment #81
alexpott@dawehner sorry - obviously had our real life discussion in my head :(. oops.
Comment #82
dawehnerDon't feel bad, I feared this came.
Comment #83
lauriiiI think I have addressed all the points since last patch in this patch
Comment #84
dawehnerIf this is the only thing which can happen, I think we should be more explicit and rebuild the router:
\Drupal::service('router.builder')->rebuild()
Comment #85
Wim LeersNW for #84.
Comment #86
alexpottassertMethods()
should return a boolean... and I've removed passing the message on to assertCacheContexts so we get the helpful verbose output from assertIdentical.Fixed #84 so that we do the absolute minimal about of work after installing a module. Imo WebTestBase should have a helper method for this - but that is for another issue.
Comment #87
dawehner+1 for not create another master assertion, which is hard to handle.
Comment #88
Fabianx CreditAttribution: Fabianx for Acquia commentedRTBC + 1
Comment #89
catchCommitted/pushed to 8.0.x, thanks!
Comment #91
Wim LeersAwesome! Thanks :) :)
Comment #92
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedThis seems to have caused a regression with the "Back to site" link.