Quoting @effulgentsia in #2428703: Add a 'user.permissions' cache context (was: "Should cache contexts be able to associate a cache tag?"):
How about this to address #74: we embed the hierarchy in the context name / service ID only when that hierarchy is intrinsic to the meaning of the context, not merely an implementation choice.
[…]
In a followup, I think we can add a service tag to convey an implementation-driven parent relationship. For example:
cache_context.pagers: ... tags: - { name: cache.context, parent: cache_context.url.query_args:page } cache_context.route: ... tags: - { name: cache.context, parent: cache_context.url }And thereby allow CacheContexts::optimizeTokens() to optimize for implementations where pagers are solely determined by a single URL query argument and routing is done exclusively by URL.
| Comment | File | Size | Author |
|---|---|---|---|
| #57 | 2453835-56.patch | 23.06 KB | Abhishek_Singh |
| #56 | interdiff_55_56.txt | 983 bytes | Abhishek_Singh |
| #55 | reroll_diff_52-55.txt | 14.83 KB | pooja saraah |
| #55 | 2453835-55.patch | 23.06 KB | pooja saraah |
| #52 | allow_non_intrinsic-2453835-52.patch | 23.21 KB | berdir |
Issue fork drupal-2453835
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #1
wim leersComment #2
wim leersComment #3
fabianx commentedDumping some thoughts here:
- Maybe one parent is enough, maybe not, languages with parameter is special right now for that in core.
Especially with languages:user for user dependent language content, but that would likely automatically have 'user' - if its per user.
More tricky is if language negotiation takes a users preferred language into account AND uses url based language neg.
Hm, maybe not a problem too as that page is then per user, too.
Still thinking of a use case where caller would not specify the additional cache context itself and where the cache contexts of the cache contexts are needed.
Comment #4
wim leers#2428703: Add a 'user.permissions' cache context (was: "Should cache contexts be able to associate a cache tag?") landed, this is now unblocked.
Comment #5
wim leersThis will be quite important for some advanced caching use cases, but AFAICT it's a pure API addition, so could even happen in 8.1 or later.
Comment #6
berdirHere is the comment that I promised in #2512866: CacheContextsManager::optimizeTokens() optimizes ['user', 'user.permissions'] to ['user'] without adding cache tags to invalidate that when the user's roles are modified.
The API added here seems to be a very bad fit for that issue overall. It would make both cache context classes and the manager more complex and as far as we see, we would either have to make that a required definition (which would be a considerable API change) or it would not be possible to achieve the main purpose of that other issue: Return an empty list to have no parent (since the returning nothing would imply to use the default hierarchy).
Instead, I suggested that this is easier to solve by introducing a new optional interface (e.g. DynamicParentContextInterface::getParents()): 100% BC, optional, self-documenting.
Comment #7
effulgentsia commentedI don't think I understand #6. I think once we do #2512866-84: CacheContextsManager::optimizeTokens() optimizes ['user', 'user.permissions'] to ['user'] without adding cache tags to invalidate that when the user's roles are modified, this issue becomes trivial:
How about we postpone this on #2512866: CacheContextsManager::optimizeTokens() optimizes ['user', 'user.permissions'] to ['user'] without adding cache tags to invalidate that when the user's roles are modified and re-evaluate then?
Comment #8
fabianx commented+1 to #6
The main goal here is that we state which assumptions we actually make in core here in regards to cacheability (e.g. route is derived by url only, theme is (potentially) user and url based, etc.).
So getParents() is a much better way to solve that, because if you exchange e.g. the UserPermissions service, you also probably have to update the cache context anyway if it has new assumptions.
Edit: X-Post with #7
Comment #9
fabianx commentedPostponed on #2512866: CacheContextsManager::optimizeTokens() optimizes ['user', 'user.permissions'] to ['user'] without adding cache tags to invalidate that when the user's roles are modified
Comment #10
berdirYes. but that's what the service ID already implies. And now try to do the opposite. Try to make user.permissions *not* depend on user (for whatever reason). If you don't return anything then it will use the default, so it will still be user. Unless we make the definition there required and completely ignore the service ID. Which would then be an API change.
But yes, waiting is fine either way, not a focus right now.
Comment #11
wim leers#2512866: CacheContextsManager::optimizeTokens() optimizes ['user', 'user.permissions'] to ['user'] without adding cache tags to invalidate that when the user's roles are modified landed, so this is unblocked.
Comment #12
berdirPer #2429617-188: Make D8 2x as fast: Dynamic Page Cache: context-dependent page caching (for *all* users!), something we should consider here is that user.roles could be optimized away if *either* user or user.permissions is provided, so it would be nice if this API allows for that use case as well.
Comment #13
wim leers#12++
@Berdir++
Great thinking.
That is possible since #2417895: AccountPermissionsCacheContext/PermissionsHashGenerator must special case user 1, since permissions don't apply to it.
Comment #17
berdirHere's a first proof of concept patch with unit tests and some questions in form of @todo's.
Very unsure about naming.
Comment #18
berdirAs mentioned, one interesting side effect of being able to optimize contexts is to be able to avoid cache redirects if it's covered with the initial/default contexts. For example the user roles or a specific role as implemented here.
But another example is timezone. So far we said that a cache context doesn't hurt if you only have one possible value... but that's not true. Most nodes have the timezone as additional context because they display some kind of date. So they all need a cache redirect, but if your site only has one timezone and doesn't support configurable then that's pointless and a waste of time and cache storage.
So I think we can also use this to e.g. optimize the timezone context away without a specific parent, either with a trick (by dynamically defining theme as the parent) or a separate, boolean method.
Comment #20
berdirTest fails are mostly expected, user.roles no longer part of those cache context lists. Not sure about the cache tag differences...
Comment #21
wim leersVery interesting! I think a separate boolean method that inspects site configuration makes sense. We could use that same API to optimize
languages:language_interfaceaway when there's only a single interface language.Question: Should this also support calculated cache contexts?
I'm leaning towards , but I haven't fully thought that through. Perhaps we can design this in such a way that that could be added later?
Because of the location of this code (i.e. the code that runs before & after this in this method), the current implenentation:
This indicates test coverage is incomplete.
Comment #22
berdirAdded isGlobal() as a test. Please suggest a better name ;) But I'm not sure yet. The thing is that while timezone and language are both global in that there is only one value, that one value can change. So we would have a problem if a site decided to change the default timezone. Question is if we really bother to support that, we could do a render cache clear like we do for some other changes like date format changes? Otherwise we could also add this as a services.yml setting for sites to opt in to ignoring certain contexts, but that requires a lot more insider knowledge to optimize your site.
Added a few more tests, as discussed, calculated cache context services are supported, just not per argument, add a test to show that providing an argument or not is irrelevant for optimizeTokens() (the test service doesn't even implement the interface). Also explicitly documented that parents of parents are not automatically resolved, as discussed I think it would be problematic to support that completely.
Comment #23
wim leersHow about
hasNoVariations()orhasOnlySingleVariation()?Optimizing a certain cache context away requires that cache context's cacheability metadata to be returned.
This is also optimizing away of a cache context.
So I think
TimeZoneCacheContext::getCacheableMetadata()should return the cache tag forsystem.date.That then also answers:
Comment #24
kristiaanvandeneyndeSo why not call it
hasVariations()or given how it's supposed to be interpreted:applies()? The docs would then have to explain why a context can mark itself as "does not apply to this site right now".Do we really want to introduce a system where we can have it both ways? While I am a fan of the idea of allowing the context itself to define (multiple!) parents in a method rather than a magic name, we already have a system to indicate a (single) parent. Allowing developers to define parents in two different ways sounds like the type of thing we want to avoid on account of obscure DX.
P.S.: We already have an issue for this example: #2749865: Rename the 'user.permissions' context to 'user.roles.permissions', to ensure correct cache context optimization
is this really necessary? I can see it leads to cleaner code further down, but isn't this a bit tedious to ask people to state the obvious or perhaps unknown?
By invalidating anything that is cached using this context when the timezone changes globally.
That said, I am excited about the idea behind this, though!
Comment #25
berdirhasVariations() or so would work for me.
If you have a context as A.B.C, then we will optimize it away by default if either A or A.B is provided. With the new interface, just returning A.B will *not* be enough, you need to provide both A and A.B. Since this is in a fairly critical path, I'm not sure how much recursive parsing/checking we want to do.
Not sure what you mean exactly with both ways. It was always the goal to expand the current hardcoded system and allow for overriding it.
#2749865: Rename the 'user.permissions' context to 'user.roles.permissions', to ensure correct cache context optimization is the wrong way, would need to be user.permissions.roles, as roles can be optimized away for permissions, *not* the other way round. And it would be an API change unless we introduce a context alias/BC system, which would also be more complexity. IMHO that can be closed as a duplicate of this.
Comment #27
kristiaanvandeneyndeGrab the return value, explode every entry using '.' as a separator and merge that into one big array?
I can't speak for the original intent as I wasn't there when the cache context system was conceived.
My point is that we historically have had multiple means in core to achieve the same thing. This often led to confusion as you'd be left guessing why something was acting the way it was and where in the codebase it was occurring.
Suppose you want to debug why a cache context is optimized away (weird, but suppose). Right now all you have to check is the names of all defined contexts. With this system in place you'd still have to check that and all implementations of the introduced interface.
I understand that it's hard to move to the new system without breaking BC, so maybe there is no alternative here. But if there isn't, could we please make sure that we document this new behavior very clearly? Perhaps even mark the old "magic name" approach as deprecated and remove it in D9?
I really don't want to derail this thread with the contents of that issue, but I can assure you that the linked issue has it right. Please have a look at its comments. If the patch in this issue lands, we could repurpose that issue to change the user.permissions context to implement the interface mentioned here. But it is far from a duplicate of this issue.
Comment #28
kristiaanvandeneyndeCaveat: If we want to do away with the old "magic name" system in D9, we should actually list all parents. Because we can otherwise not guarantee that the functionality will still work when we remove the old system. So I am fully aware that my previous comment was self-contradicting in a way :)
Comment #29
berdirSome more work on this, will try to respond more later.
* Method renamed
* Lots of tests updated, but added a @todo to the majority of them, because we kind of lose test coverage. Maybe for some we can update the required cache contexts to be able to still test them in a useful way
* Added a NodeCacheTagsTest variant that disables configurable timezones (this is enabled by default interestingly, wondering if we want to consider changing that in a follow-up, how many sites really need that?), that's why no additional tests failed.
* The added cache tag will however likely result in a bunch of additional test fails.
Comment #30
berdirAlso noticed some weird things about NodeCacheTagsTest and its parents:
\Drupal\Tests\node\Functional\NodeCacheTagsTest
- \Drupal\system\Tests\Entity\EntityWithUriCacheTagsTestBase
- - \Drupal\system\Tests\Entity\EntityCacheTagsTestBase (deprecated)
- - - \Drupal\system\Tests\Cache\PageCacheTagsTestBase (deprecated)
- - - - \Drupal\simpletest\WebTestBase
NodeCacheTagsTest is already moved to being a BTB test, but its parents are still WebTestBase, some of the parents are deprecated, others are not...
Not exactly sure how this is even still working :) I guess because run-tests.sh is nice and finds the tests despite being in the wrong location. Or to be more specific, finds tests in both locations and then runs them based on the parent class and not the location/namespace.
Comment #32
berdirMore test fixes.
Comment #33
wim leersI agree this has a downside. But the point is that we've shipped with (and have gotten by so far) with only non-dynamic parents. Declarative parents, if you will. It's simple. It's clear.
However, even when we introduced that, we knew that there were some use cases that could hugely benefit from dynamic parents: cache contexts whose behavior depends on a particular site's configuration. For example: if your site is entirely in English or entirely in Dutch, or is always in a single timezone, then you're suffering from an unnecessary cache redirect. (See #18.) If we optionally allow cache contexts to have some logic so they can be made aware of site-specific configuration, then we can further optimize this.
Of course, the central question then is:
I think that question still needs to be answered. We need hard numbers showing a significant performance/scalability gain. Otherwise I agree we should not add this capability.
WTF!
Can we also start doing this for
cache_context.languages? Single interface language, single content language, single URL language. They're all possible.Then we'd have two sample implementations optimizing based on site configuration, rather than one.
AFAICT this is because the
user.rolescache context is being optimized away, and hence the current user's cache tag is being added?I don't see how the presence of these cache tags is correct?
Comment #34
berdirThat's a bit more complicated. The language cache context is defined in Drupal\Core. \Drupal\Core has no knowledge of configurable languages, That's in language.module. We could do this, but then we need a dummy implementation of the language context in core that doesn't have variations. And replace it with language.module which then implements the dynamic check with the cache tag. That also implies that for the (very theoretical) scenario of a different language module/system/provider, it would be responsible for providing its own cache context implementation.
Which is I think why we might want to move that to a follow-up issue.
Comment #35
berdirProfiling might be a bit hard, especially on a standard installation. But I think the benefits are *definitely* there and easy to prove.
To reproduce on a standard install, all you need to do is create a node, go to node/1. Yo you can see that you now have 2 rows starting with "entity_view:node:1:full:" in cache_render. Apply the patch, clear cache, make sure you have configurable dates disables, refresh. Now there is only one record.
One additional query doesn't make a huge difference, but I just tried the patch on a site we are currently buildig. Without the patch, I have 54 node render cache entries just from the frontpage, with the patch, I have 27. So, out of the box, with the correct configuration, you get a cache redirect on every single node that is displayed, this avoids that (This happens by the way even if you do *not* display the article date for a node because template_preprocess_node() always renders the created date.)
Comment #36
wim leersI lost track of this. What's next here?
Comment #37
berdirGood question. There's IMHO still the problem that this does basically the opposite of #2749865: Rename the 'user.permissions' context to 'user.roles.permissions', to ensure correct cache context optimization IMHO, that doing both doesn't make sense and there has been some discussion about that.
I've now included the latest patch in our install profile to do some testing "in the field". I'll check if it has an impact on the number of redis get/writes that happen and report back about that.
Comment #38
kristiaanvandeneynde@Berdir: Suppose we do the logical thing semantically and change user.permissions and user.roles as suggested in #2749865: Rename the 'user.permissions' context to 'user.roles.permissions', to ensure correct cache context optimization. What parts of it would clash with what you're trying to do here? I appreciate the value of your patch here a lot, so finding a way to make both issues land in core would be awesome.
Comment #39
berdirIf user.permissions is renamed to user.roles.permissions but this issue optimizes user.roles away in favor of its children user.roles.permissions, which also happens to be a default required context, how does that make sense? :) That's the opposite of how we optimize otherwise. And optimizing in the other direction doesn't make sense as long as user.roles.permissions is a required default context.
We deployed this patch to our first site now. It's quite hard to compare as the amount of requests vary quite a bit. Redis currently only has 1GB memory and it's a pretty big site, so a lot of stuff is constantly evicted. So it's full within a short time either way. But it seems to be working well so far.
One thing that I'm seeing is that the total number of cache entries and distribution across cache bins changed quite a bit. Before there were 112k entries total, 33k render cache, 27k entity, 11k data. Now the total is only 80k, 30k entity, 24k render, 12k data. And timezone vanished from the used cache contexts. My interpretation of that is that there are far less small cache redirect entries now and instead we can store more actual, bigger cache records.
I'll keep an eye on redis get/set statistics over the next days to see if I can see a trend there.
Comment #40
kristiaanvandeneyndeHmm, that does make it a bit difficult.
So basically the problem is that semantically and logically user.permissions should become user.roles.permissions, but practically it might be better if the parent-child relationship were the other way around. So how about this:
This would allow you to make two classes that extend the user roles and permissions contexts, define getParentContexts() on them and swap out the original contexts for the new ones. It would also keep the original contexts in core behave the way they semantically and logically should.
I really need to look into this whole cache redirect thing as Wim Leers pointed out in #2749865-22: Rename the 'user.permissions' context to 'user.roles.permissions', to ensure correct cache context optimization. There might definitely be something I'm missing, but if the current system would make it more practical to let a context designate its child as its parent, there may be something wrong with the system itself :)
Comment #41
dawehnerThis sounds like a good compromise of explicitness vs. magic. In retrospective I would have voted for making it explicit only. Its not obvious that this feature exists in the first place.
Comment #42
slashrsm commentedThis is a reroll against 8.3.x. In case anyone needs to patch their projects...
Comment #43
kristiaanvandeneyndeJust to clear things up. This should preferably be committed after #2749865: Rename the 'user.permissions' context to 'user.roles.permissions', to ensure correct cache context optimization lands, which in turns needs to wait for #540008: Add a container parameter that can remove the special behavior of UID#1. The latter being ready to be reviewed (ignore its last patch for now).
Comment #45
berdirAnother rebase for 8.3.x projects. I'll try to catch up on the related issues soon, ignore this for now.
Comment #47
berdirSo much about those plans, here's a 8.4.x reroll for now.
Comment #52
berdirGoing to try and revive this ;)
8.4.x => 9.3.x was an.. interesting reroll. Lets see how this does.
I'm considering to split the interface up and just focus on hasVariations() here. I think that is less controversial and the user.roles has some tricky bits even though it would be interesting. For example, the cache tag for the current user that it gets is weird for user.permissions and doesn't make sense there. For user, that's fine, as the cache is then for that specific user anyway, and invalidating it if that user changes is OK. But for user.roles => user.permissions, we'd add the cache tag of the first user that the page gets cached for. the user permissions has in its current permission does consider the current roles, so that's still fine, the cache tag is just bogus and results in a few test changes as well
Comment #55
pooja saraah commentedFixed failed commands on #52
Attached patch against Drupal 9.5.x
Comment #56
Abhishek_Singh commentedFixed drupal cs issue.
Comment #57
Abhishek_Singh commentedHere is the updated patch.
Comment #61
kristiaanvandeneyndeFollowing up on #43: I now think #2749865-52: Rename the 'user.permissions' context to 'user.roles.permissions', to ensure correct cache context optimization can be closed because of the comment I linked. Meaning this issue is no longer blocked by it nor by #540008: Add a container parameter that can remove the special behavior of UID#1. Having said that, it no longer makes sense to fold either user.roles into user.permissions or vice versa (as explained in the other issue), so we should definitely not add that as an example in the documentation.
Comment #63
berdirTried to investigate the failing test. There's an issue with bubbling of that stuff. Within VariationCache::set(), that cache context is correctly optimized away and the cache tag added, but the information doesn't bubble up because RenderCache::set() doesn't do anything with the cacheability object. and I think it should. But it seems to work in the other test method, just not the node within the test entity.