Here are current flame graphs of node/1. For background on understanding and interacting with these, see #2370667: [Meta] user/password flame graph. Lets discuss and open subissues as needed.
Some average page timings:
- Anon, Page cache disabled: 188ms (was 203)
- Anon, Page cache enabled: 16ms (was 15)
- Auth (regular user), Page cache disabled: 442 (was 280)
Environment
- PHP 5.5
- Zend Opcache enabled
- APCu enabled
- node/1 is an article node with 3 comments and comment form.
| Comment | File | Size | Author |
|---|---|---|---|
| #11 | Screen Shot 2015-08-19 at 10.12.19.png | 189.21 KB | wim leers |
| #11 | Screen Shot 2015-08-19 at 10.10.55.png | 190.71 KB | wim leers |
| #11 | Screen Shot 2015-08-19 at 10.09.51.png | 50.71 KB | wim leers |
| #8 | 2552873-comment_form-8.patch | 553 bytes | tim.plunkett |
Comments
Comment #2
dawehnerRandom thoughts:
IMHO we should remove block config entity access control handler and just fetch the plugin and start from there. Could save a bit to be honest
Comment #3
berdir* the node links changes are in #2530846: Fix and improve comment cache tag usage currently, need to open a separate issue.
No. The access control handler checks visibility conditions, which we moved out of the plugin and also calls the block access hook.
Comment #4
moshe weitzman commentedComment #5
moshe weitzman commentedComment #6
moshe weitzman commentedI added an authenticated flamegraph in the issue summary. That comment form jumps out as a sad panda. Perhaps discuss that at #2526472: Ensure forms are marked max-age=0, but have a way to opt-in to being cacheable
Comment #7
moshe weitzman commentedComment #8
tim.plunkettMoved the comment form to another page.
Comment #9
wim leers#2
#2526472: Ensure forms are marked max-age=0, but have a way to opt-in to being cacheable, also related: #2504139: Blocks containing a form include the form action in the cache, so they always submit to the first URL the form was viewed at
Caching of cache context results (per request) would hugely help there I think.
Comment #10
moshe weitzman commentedre: @berdir in #3 - what part of #2530846: Fix and improve comment cache tag usage speeds up node links? I'm not immediately seeing it.
Comment #11
wim leersI've heard the concern raised that "OMG rendering is so expensive". But this is not actually true! It's important that you first learn how to interpret flamegraphs before jumping to conclusions :)
See http://www.brendangregg.com/flamegraphs.html. I'll quickly quote the summary, because it captures the gist:
So, the large "rendering" bars do NOT represent how long rendering took. They are so large because they are present *in the callstack*.
Interpreting the authenticated user flame graph in this issue
HtmlRenderer::renderResponse(). So that's already 25% that definitely went to other things, like bootstrapping, routing, access checking, etc.HtmlRenderer::renderResponse(), we get to see the most frequent callstacks beyond that. It splits up in two parts there:HtmlRenderer::prepare(), which takes 12.52% to render the main content. That seems acceptable. But then 61.56% is taken up by thatRenderer::renderRoot()call. What's up with that? Within that 61.56%, 17.12% goes to rendering a specific template. I'm pretty certain that's thehtmltemplate, but the compiled Twig templates make that darn hard to see. It'd be useful if we'd have the template name in the compiled Twig template filename too, and not just a hash. Anyway, let's look at the meaty part: 43.23% went toRenderer::replacePlaceholders(). Woah! 43% of the total time to render the page is for replacing placeholders!? Seems like that merits further scrutiny!Renderer::replacePlaceholders()bar, we get to see where roughly half the time of rendering the page was spent. Did we really just spent half the time of generating a HTML response in the renderer? Let's find out.So 23.83% (roughly a quarter of the page generation time) went to
CommentLazyBuilders::renderForm(). But woah, 12.93% went to just building the comment form array structure and applyingFormBuilder(not even rendering it), 5.72% of which went to just generating the plain form array structure. And to be able to do that in the first place, we have to generate an emptyCommententity, which on its own took 10.88% of the total page generation time! Let that sink in for a while: 10.88% of the total page generation time went to generating an empty Comment entity object. So if we can make building the form array structure of entity forms and instantiating empty entity objects significantly faster, we can make this page generate ~15% faster.However, also when rendering the form do we find a gross inefficiency: what's previously described all just resulted in a render array to be rendered. When rendering it, we end up calling
Editor\Element::preRenderTextFormat(), which gathers the assets (CSS, JS, settings) that should be loaded to have CKEditor loaded. That alone took up another 7.3% of the page load time. Ouch! Seems like that could be an easy win.So, roughly speaking:
That's 56.5% of the time not spent on rendering, and only having looked at the huge things. Contrary to what the visual representation of flamegraphs would have you thinking initially/intuitively.
I'm sure the rendering system can be improved still, but it's not where the big gains are to be found.
The 7.5% thing is in an area I know well, so I opened an issue to fix that: #2553695: Cache EditorManager::getAttachments().
Comment #12
wim leersPatch posted for #2553695: Cache EditorManager::getAttachments().
Accurate observation. But:
Definitely true though. It means you always have to look at the children whenever you see any "render" stuff happening, as also demonstrated in #11.
Comment #13
dawehnerAnd this is why a flamegraph is bad. It makes it quite hard to look at what actually is slow, because you don't want to see the flamegraph but basically the first derivative of it.
You want to see what takes time exclusive relative to its parent.
Comment #14
berdir@moshe: the comment cache tag change itself does not help that much directly (but it helps on real sites, a lot, see my throughput graph that I just added to the issue summary there). But what helps this is what I experimented in comment #10 and earlier, which I now moved to a separate issue: #2556767: Remove placeholdering of node links
Comment #15
berdirOpened #2558395: Cache CacheContextsManager::convertTokensToKeys() (or specific cache contexts).
Comment #16
berdirSo...
We, in some cases almost accidentally, made some very interesting changes.
* The comment form is now cached thanks to the changes in #2571909: CommentForm selects using the user formatted name
* #2556767: Remove placeholdering of node links was fixed, which has nice to insane performance improvements, depending on how many nodes you are looking at.
* Smartcache got in ;)
So it's time to redo the flamegraphs I think. Additionally to page cache, we should also test with and without smartcache.
Note: To test in a meaningful way with smartcache, you need to use a user that does not have access to the toolbar as that currently adds the user cache context and causes smartcache to opt-out.
The class name still doesn't, but the filename does have it now and it's also inside the file. So if you run those flamegraphs yourself, you just need to find that class and you can see it easily now.
Comment #17
moshe weitzman commentedI have updated the IS with new flame graphs as of today. This time I added 3 comments to the article node.
Oddly, the authenticated performance is much worse. Perhaps I have a profiling misconfiguration or something. Click the links to view the flamegraph. Reminder that you can click the bars in the flame graph for ease of drill-down.
A reaction to each graph:
Comment #18
berdirYeah, reproduced. That's not good.
#2571909: CommentForm selects using the user formatted name backfired. Badly. Yes, we no longer need to vary it by user. But it's still a form, with a form token, which forces max-age 0. That bubbles up to the node and completely disables render caching for nodes :( That's critical I think?
#2463567: Push CSRF tokens for forms to placeholders + #lazy_builder should fix that I think, so we could just get that in, or we need to re-add the #create_placeholder thing back for now, would be a one line fix but go back to having to build that form on every page.
Comment #19
wim leers#2463567: Push CSRF tokens for forms to placeholders + #lazy_builder alone will not be enough, we'll also need #2578855: Form tokens are now rendered lazily, allow forms to opt in to be cacheable.