Problem/Motivation
To reproduce:
1. Create an article
2. Add a comment
3. Go to frontpage (still logged in), comment shows up
4. Separate browser window, anon, frontpage, so that it gets cached
5. Back to the logged in user. add another comment.
6. Back to frontpage with that user. Comment count is correct, thanks to placeholdered node links
7. Refresh page in anon browser window. Still shows the old comment count.
Additionally, node/1 with comments adds the comment_list tag. That seems like a really bad idea, because any added or updated comment will invalidate all the render cached nodes that have comments.
The effect of this cache tag can be seen on the following graph:
Each of those bumps is a comment.
Proposed resolution
Unless I'm missing something, I think we can fix both issues by removing the comment_list cache tag from the comment formatter and instead invalidate the cache tag of the commented entity. I've actually been doing this in a custom hook for months in my install profile, I just noticed another related bug there that is specific to my install profile when (finally) enabling page cache by default which reminded me of this.
Additionally, I think we can actually remove the placeholdering of node links then. Opened #2556767: Remove placeholdering of node links for that.
Remaining tasks
User interface changes
API changes
Data model changes
Comment | File | Size | Author |
---|---|---|---|
#31 | node_pages_throughput.png | 28.88 KB | Berdir |
#28 | comment-caching-2530846-28-interdiff.txt | 1.25 KB | mbovan |
#28 | comment-caching-2530846-28-comment-only.patch | 8.96 KB | mbovan |
#28 | comment-caching-2530846-28-test-only.patch | 3.84 KB | mbovan |
#20 | comment-caching-2530846-20-interdiff.txt | 4.81 KB | Berdir |
Comments
Comment #1
BerdirShould be that easy :)
For testing, also removed the placeholdering of node links. Can still revert that if we want to do that in a separate issue.
Comment #2
BerdirAdding the user.permissions cache tag.
Comment #3
BerdirAnd here is a patch with just the comment changes.
Comment #4
BerdirComment #5
BerdirThat needs to use getCacheTagsToInvalidate() now of course. Not going to upload a 4th patch until test results are back, though ;)
Comment #6
Fabianx CreditAttribution: Fabianx as a volunteer commentedActually I like to use placeholders for the whole comment area, to avoid especially that problem that the node is invalidated when comments are newly created.
However removing the list cache tag is fine I think.
Comment #10
Berdir@Fabianx: I don't really think that would work with how placeholders work right now. We'd have to rebuild the comments every time or another cache inbetween. That said, it's a formatter, so whatever that one is doing is very easy to replace, you can just write one that uses a placeholder.
Note that the node links section has nothing to do with the actual comments, so removing placeholders for that doesn't mean anything for the actual comments.
That said, we can't remove the list cache tag without adding a replacement, and invalidating the node tag will not work with your idea. So for that we'd need
Fixing a few tests, prevent fatal errors if there is no commented entity. that wouldn't require to change unit test but I only had that idea after I changed it already ;)
Comment #11
Wim LeersUnnecessary change.
<3
We can do this thanks to #2429257: Bubble cache contexts, which predates the use of placeholdering for node/comment links!
This should yield a small performance improvement too.
Now outdated.
How do we know that we want
user.permissions
here? Is it just a "sane default"?Comment #12
Berdir4. Yes, I was just wondering if that makes a difference with the tests. The problem is that we are using #theme links I think, so #2495779: Make #theme => links take cacheability metadata as an argument. Maybe we can just switch to #type links?
Also, I think statistics failed, not sure what to do with that :(
Comment #15
BerdirIf you're displaying 50 or so nodes on a page then the performance improvement that you get from this is not small :)
We can open a separate issue to do the node links switch, but what are we going to do about statistics_node_links_alter() That's just not cacheable and we definitely don't want to invalidate to support that..
My only idea right now is that we set a max-age to a configurable time, like an hour or so. Page cache caches it anyway, indefinitely, so what do we lose, really?
Comment #16
catchStatistics should probably use a placeholder itself? Or mixed placeholder/js like history module.
1 hour TTL will affect cache hit rate for everything else and there's an issue to bubble max age to the page cache.
Comment #17
BerdirI thought about placeholder too, but that won't help with page cache, that will still cache this indefinitely (yes, this is already a problem in HEAD). And even less sure it's worth to have an additional JS request just to display that number.
That's the problem with optimizations.. it really depends on the site what makes sense. If you have a lot of requests, I'd imagine it's a lot more effective to just update the node once per hour compared to having a placeholder/JS request that needs to be called every time.
Comment #18
Fabianx CreditAttribution: Fabianx as a volunteer commented#17: The trick is to do both:
Compared to #post_render_cache, #placeholders can be lazy created _and_ cached at the same time! :)
Example:
That way the nodes will be cached indefinitely, while the statistics are refreshed e.g. every hour.
Once we have multipleGet (on the roadmap, but no API change needed, so just major priority), all placeholders can be lazily gotten at the page / smart cache / render strategy level.
And with render_strategies someone would be flexible to convert that even into a JS callback instead or remove it or ...
Comment #19
Wim LeersComment #20
BerdirFixed the test fail, addressed #11.1, the other points there about the node links, no longer part of this patch. Will open a new issue for that.
Please welcome a new member to our zoo aka test suite ;) test-only is supposed to fail, let's see if that works as expected.
Comment #22
Wim LeersThis is actually plural, even though we only ever herd a single hippopotamus. In which case the singular form would be
.Extraneous newline.
Hrm, aren't the camelid and hippopotamida completely separate entities? Ah, this is because before this patch, we set the comment list cache tags. Can you add a comment to the test to make that clear?
Comment #23
Berdir1. You, sir, win the nitpick of the month award ;) I've copied this from https://en.wikipedia.org/wiki/Hippopotamidae (or actually https://en.wikipedia.org/wiki/Hippopotamus) which never uses the singular version? Also nothing prevents us from adding more than one hippo? ;)
3. Exactly. Suggestions on how to write that comment? Comments that refer to "that's how it worked in the past" are IMHO always a bit tricky, on the other side, this is a regression test for exactly that not happening.
Comment #24
Wim Leers1. YAY!!! :P The Wikipedia page says:
, which means it's plural. And, the typehint prevents us from having multiple hippos assigned to that one variable, so :)3. A comment like "Ensure only the commented entity is affected" should be sufficient, and not suffer from the "X in the past" problem?
Comment #25
catchHippopotamidae is the family for hippopotamus in the sense that canidae is the family of canis and vulpes. So it's not really plural, it's 'the thing above a genus'. By the way it is fun discussing taxonomy on a d.o issue :P
So it's hippopotamus for singular and any of hippopotamuses/hippopotami/hippos for plural according to wikipedia, although we could probably use family/genus if we wanted to.
Or... there is the collective noun which can be any of bloat/pod/dale/herd.
Comment #26
webchickNERDS!! :P ;)
Comment #27
BerdirWhat I wanted to get is the same relationship with hippos as the test already has by using Camelids (which is already plural, btw, just not the variable) and Llama.
Yes ;)
Comment #28
mbovan CreditAttribution: mbovan at MD Systems GmbH commentedRerolled and fixed #22.2 and #22.3.
Comment #30
Wim LeersComment #31
BerdirAdded the following new relic throughput graph to the issue summary:
Also updated the issue summary a bit and opened #2556767: Remove placeholdering of node links.
Comment #32
catchThanks for the throughput graph, flat throughput graphs are nicer.
Committed/pushed to 8.0.x, thanks!