Problem/Motivation
#2158003: Remove Block Cache API in favor of blocks returning #cache with cache tags introduced the concept of cache contexts. #2167039: Regression: page cache tags broken, also: some routes incorrectly use _controller -> No page object in template variables -> Fatal in seven_preprocess_page() introduced CacheableInterface
.
CacheableInterface
only had getCacheKeys()
, because that's what was relevant to the scope of that issue. About cache keys, from CacheContexts::convertTokensToKeys()
:
* Cache keys may either be static (just strings) or tokens (placeholders
* that are converted to static keys by the @cache_contexts service, depending
* depending on the request). This is the default cache contexts service.
But, by now we're using CacheableInterface
for use cases where we don't want to return any static keys, but only tokens, because the object that implements CacheableInterface
is not itself cacheable, but affects the cacheability of a containing object.
In other words, we now have situations where it's confusing to implement getCacheKeys()
because we want implementations to not return actual keys, but only to return tokens (cache contexts) by which the cached thing should be varied.
Proposed resolution
Add CacheableInterface::getCacheContexts()
.
This will allow both BlockBase()
and AccessResult
to be easier to understand.
When building a CID for a cacheable object, we'll just have to do array_merge($o->getCacheKeys(), $o->getCacheContexts())
.
Remaining tasks
Discuss.
User interface changes
None.
API changes
CacheableInterface::getCacheContexts()
Comment | File | Size | Author |
---|---|---|---|
#18 | cacheable_keys_contexts-2329101-18.patch | 35.23 KB | Wim Leers |
Comments
Comment #1
Wim LeersBlocked on #2287071: Add cacheability metadata to access checks.
Comment #2
Wim LeersDiscussed this with msonnabaum at DrupalCon Amsterdam — he strongly agreed, and was even very much surprised it wasn't already the case.
Patch attached.
I was just opening a new issue for this, but my browser's autocomplete functionality suggested this title, hence I knew I'd already created this >1 month ago — nice :)
Comment #3
msonnabaum CreditAttribution: msonnabaum commentedI remember now that I designed it the way it was with the idea that we'd never treat contexts as anything other than lazy keys, but now that we know we want to bubble the contexts, I agree that they need their own method.
Patch looks sensible to me.
Comment #4
catchI know we came up with a special case for access checks, but that requires executing the access checkers to get their granularity/contexts prior to generating the cache key for the parent. Also it took me a couple of iterations to get my head around the chicken/egg situation there.
For other cases like say an entity rendering, how do I get the cache contexts of the children? Surely the parent has to know about the implementation of CacheableInterface to call downwards (for example whether an entity has field formatters that require access checks on entities, or none), but that's different to automatically bubbling up.
Comment #5
Fabianx CreditAttribution: Fabianx commentedFWIW, I just designed the same for contrib then saw that CacheableInterface does not support getCacheContexts().
Ultimately we want #cache to match CacheableInterface, too.
So ['#cache']['contexts']?
There is merits to both approaches. One issue I ran into (that does not apply to D8) is the order and where to put the cache contexts section into the in the end generated Cache ID, having the Cache Contexts as part of the keys is nice for that, as your Cache IDs still make sense.
Playings devils advocate here:
- Isn't it enough to return getCacheKeys() and use isCacheable() == FALSE for things where only the context matters?
Without needing that API change?
Comment #6
Fabianx CreditAttribution: Fabianx commentedThe solution Wim and me came up in IRC was to:
- Enforce usage of @cache_context.user as placeholder - this matches symfony syntax and it _is_ a service after all
- Create Cache::filterContexts(), which just uses array_filer($keys, function($key) { return strpos($key, '@') === 0; } );
That makes it both performant and container independent to filter the dynamic cache contexts, also improves DX as a dynamic placeholder can easily be distinguished from a static one.
The reason to be able to find the Cache Contexts is that a child might want to know the cache context of its parent and render a placeholder when the cache context is not matching. This needs to be added to the issue summary.
Comment #7
Wim LeersLet's finally finish this one, since it has repercussions for at least #2099137: Entity/field access and node grants not taken into account with core cache contexts and #2396333: BlockContentBlock ignores cache contexts required by the block_content entity.
#4:
For entities + fields, we have #2099137: Entity/field access and node grants not taken into account with core cache contexts to solve that. And yes, the parent has to know that
CacheableInterface
is implemented by the children. But no matter how it ends up calling those children, it still needs a reliable way of getting only the cache contexts, not also the cache keys. And it's perfectly possible that they return both.I don't see why this is a reason to not do #2?
#5: the first half of the comment actually is an endorsement of the patch in #2. The second half is playing devils advocate, but the answer is: no, that is not enough. It's perfectly possible for something to be cacheable (
isCacheable() === TRUE
) and still return cache contexts. Something may be individually render cacheable (and hence return both keys and contexts in itsgetCacheKeys()
method), but also need to affect a parent when that is cached.#6 feels like a painful solution, one that is pretty much equally problematic as what we have today, because it still relies on a specifically formatted string, which is not a strong guarantee (nothing forbids cache keys from beginning with an
@
).Comment #8
Wim LeersQuite incredibly, #2 applies cleanly after 4 months! (With the sole exception of the
HtmlFragment
hunk, which is obsolete now since that class is gone.)Comment #9
Wim LeersFabianX and I talked on IRC. We came to the following conclusion + plan:
::getCacheContexts()
+#cache['contexts']
) part out from "keys" (::getCacheContexts()
+#cache['contexts']
), how do we ensure that they're always combined into a CID in the same way? If one place first uses the contexts, the other first uses the keys, then they'd have different CIDs.A: By convention, we generate a CID by first using the keys, then the contexts. This also aids debuggability, because it ensures that the things unique about the thing are listed first, and the contexts it must be varied by are listed last.
A: D8 doesn't support that: the API doesn't list it anymore, and the hashing we do in the DB cache back-end to ensure we don't exceed the limit of 255 characters for the CID already prevents using prefix-based invalidation anyway.
#cache['keys']
, but also#cache['contexts']
. This would only be a minor disruption, since very little code in core uses cache contexts anyway (it's almost all in the block and entity view builders, which generate almost all renderables). And it'd make things much clearer.CacheContexts
service be more strict: rather than only converting the placeholders it is given that are actually cache contexts, we can now throw an exception whenever it is given a non-existing cache context, because the calling code knows exactly which things are contexts and which aren't. This further helps DX: if there's a typo, the developer will instantly know why things aren't working as expected.'cache_context.user'
as the cache context token/placeholder is because using just'user'
had a 99% probability of collisions (i.e. a static cache key accidentally being'user'
too and hence end up being treated as a cache context unintentionally). The use of'cache_context.user'
made that a 1% collision probability.By separating the two (keys vs. contexts), and introducing the above exception, we make it a 0% probability. Which is quite important for Drupal 8 in the long run, because somebody will probably end up causing a collision even in HEAD's code…
'user'
instead of'cache_context.user'
, precisely because we won't have to parse the cache keys and try to identify the cache contexts among them; we'll already know which strings are cache contexts. So this will again make the DX better: rather than having to type the pretty awkward, implementation detail-leaking'cache_context.language'
, they'll just be able to type'language'
— much, much better.'cache_context.language'
->'language'
) by keeping the service declarations identical, but changing the compiler pass, to only pass the service ID without the'cache_context.'
prefix, in combination with requiring cache context services to have that prefix (and throwing an exception otherwise). That way, we leverage the container's preexisting requirement of all services being required to have a unique ID to ensure that all cache contexts and associated cache context services have a unique ID.The many DX benefits listed above should make the cache APIs less intimidating, and hence they will end up being used more often.
Comment #10
Wim LeersTurns out we need less than 30 net additions to make this happen! :)
Comment #11
Fabianx CreditAttribution: Fabianx commentedIs this intentional?
And should we not define default_cache_contexts as well?
This reminds me:
Should we order cache contexts alphabetically when creating the CID to avoid fragmentation?
Overall looks really good, almost RTBC (just feedback to those two points).
Comment #12
Wim LeersYes, we could define
$default_cache_contexts
as well, but the current solution just handles that single context in a hardcoded manner (it resolves to'core'
) because the theme is always going to be the same. I can change it to$default_cache_contexts
+ use the cache contexts service though.RE: sorting: It would definitely be harmless. But I can't think of a single reason why it would be necessary though? As long as the cache contexts are passed around without changing the order, and the code that actually specifies cache contexts is deterministic (i.e. always returns things in the same order), we're fine. If there was a problem in this area, we would have encountered it a long time ago already?
Comment #13
Fabianx CreditAttribution: Fabianx commentedI was thinking of cases where we pass cache contexts upstream in the future, but we can do the sorting and deduplication indeed then do during the merging itself.
RTBC
Comment #14
Wim Leers#2318437: Replace the hardcoded langcode key on blocks with the 'language' cache context landed, which conflicts in several places. Straight reroll.
Comment #16
Wim Leers#10 failed only because testbot is so ridiculously backlogged that a whole bunch of commits happened before the patch even got to being tested… #14 is the reroll to fix that. Hopefully it comes back green.
Comment #18
Wim LeersFailure due to #2415441: Automate finding @covers errors which also landed after #10 was posted; exception due to an incorrect change in a test. Keeping at RTBC, because there are no actual changes.
Comment #19
effulgentsia CreditAttribution: effulgentsia commentedI don't get this. What's an example where we want the implementor of this interface to decide what should go into its cache ID, but that isn't something that needs to bubble up? Seems to me that implementors of this interface should only implement getCacheContexts() and never implement getCacheKeys(), in which case, should we remove getCacheKeys() from the interface? As an example, BookNavigationBlock::getCacheKeys() currently returns the book's active trail, but why is it we wouldn't want that to bubble up to whatever larger element contains that block?
Comment #20
effulgentsia CreditAttribution: effulgentsia commentedSorry, I hate un-RTBC'ing issues, especially for a question that might have an answer that doesn't require any change to the patch, but I'd feel better having an answer to #19 before this gets committed.
Comment #21
Wim LeersLet's look at a generic example to explain this.
Every block must be varied by language and theme. Because a block config entity has a label that is displayed to the user and can be different for every language. And because a block is rendered into HTML markup, and the templates may differ per theme. Hence
::getCacheContexts() === ['language', 'theme']
.But just those cache contexts don't make up the cache ID: there is nothing that uniquely identifies the block. So the cache keys we use (by default) are
['entity_view', 'block', $entity->id()]
. The first two of those keys are still generic: they apply to all blocks. The last key is what uniquely identifies the block. But the first two keys are important parts too: they already help narrow down what thing it is that is being cached: . Omitting the first two keys would make it easy to have collisions with something else. So: the cache keys allow us to uniquely identify the object.Now, when we render a block, the things it must be varied by (language & theme) "infect" the parents/containers of the block too: regions in the page, and the page itself. In other words: if a block must be varied by theme and language, then so must the region that contains it, and so must the page.
However, looking at the cache keys, it doesn't make sense to cache the page using a cache ID that includes the keys
['entity_view', 'block', $entity->id()]
, because that triple uniquely identifies the block, not the page. Imagine if we did include in the page's cache ID: we'd have to include the cache keys of every single block, every single node, user, term, etc. That doesn't make sense.I hope that was clear.
Looking at the example you pointed at —
BookNavigationBlock::getCacheKeys()
/SystemMenuBlock::getCacheKeys()
— you pull out the one example in all of core that is confusing :PCache contexts allow you to say "my thing depends on this bit of the request context, vary by it when caching". Examples: user, user's roles, URL, theme, language — all of those are clearly "request contexts". Of each of those, there is one and only one of them per request. Which is why we can use them as placeholders: we know that the placeholder/token can be (efficiently) replaced with the effective value for the current request… precisely because there is only one value for them for every request.
Now, the problem with the active trail (in menus/books — they're the same concept, same implementation, same trickiness) is that there is no
, there are multiple active trails! That is, there is an active trail per menu, an active trail per book!The only way we could possibly turn that into a cache context, is by allowing a cache context to receive parameters: we could then have an "active menu trail" cache context that receives the menu name as a parameter, in which case "one and only one per request" is true again, given that parameter to select a menu to get the active trail for.
Until now, we've chosen not to go that way. We could. But IMO that would be a follow-up issue.
OTOH, as soon as a parameter needs to be passed, it indicates that you're no longer dealing with mere request context. You're also dealing with a fair bit of application logic. In which case, you could argue — as we've done so far — that you're no longer dealing with "vary by a request context", but "object uniqueness". And uniquely identifying an object is the responsibility of the cache keys again.
I'd generally be in favor of having "parametrized cache contexts", but IMO that'd belong in a follow-up issue.
Comment #22
Wim LeersFiled a follow-up issue: #2428563: Introduce parameter-dependent cache contexts.
Comment #23
Fabianx CreditAttribution: Fabianx commentedThis is just an idea I had when reading #21, but thinking about this some more, maybe we should add a cache tag whenever we add a cache context, so that its no longer the one adding the cache contexts to also add the cache tags, which might not even be known at that time or double the logic.
Just food for thought and not in this issue for sure.
Comment #24
Wim Leers#23: that is exactly what I proposed more than a year ago in #2158003-30: Remove Block Cache API in favor of blocks returning #cache with cache tags :) Let's continue that discussion in another follow-up: #2428703: Add a 'user.permissions' cache context (was: "Should cache contexts be able to associate a cache tag?").
Comment #25
webchickThis smells catch-ish.
Comment #26
Wim LeersComment #27
catchI'm holding off committing this because I'm not sure about #2429257: Bubble cache contexts, and I think that's the only issue blocked by this.
Comment #28
Wim Leers#2428563: Introduce parameter-dependent cache contexts is also blocked by this.
Comment #29
Fabianx CreditAttribution: Fabianx commentedI think this patch is a necessity for Drupal regardless of the other parts, here is why:
Cache Keys - Identify the thing, based on its configuration (e.g. view_mode, language, ...) - its the unique address
Cache Tags - Allow invalidation of the thing
Cache Contexts - show the variation of the thing - based on some state that is _outside_ of the thing.
Already in render_cache I learned that e.g. modified time is something that is usually not a true variation (later versions overwrite earlier versions) and view_mode is something that is configured on the thing, but 'phase of the moon' - being external - is a true variation where the same configuration has different outputs.
It is important to distinguish both cases as it makes talking about this things way simpler and also makes the whole system simpler to understand.
This also much improves the DX of being able to write just $build['#cache]['context'][] = 'user';
Comment #30
Wim Leers#27: Even if we don't do bubbling of cache contexts, then we still want this to happen. Because we still want e.g. the entity view builder to gather the cache contexts of the field formatters that it's about to use.
Plus, as #29 indicates, this makes the distinction between keys & contexts much clearer. This will give us a clear cache metadata trifecta:
Technically, we already have those 3 things. But we mix the keys & contexts, which makes it significantly less clear.
Comment #32
catchOK found and read the google doc which covers all the points I raised on #2429257: Bubble cache contexts. If we're going to do some combination of cid caching, placeholdering and cache context comparison/hierarchy then this should not only allow for much better performance but also help to resolve intractable issues like the entity/field access ones.
One typo fixed on commit.
context.
Committed/pushed to 8.0.x, thanks!
Comment #34
YesCT CreditAttribution: YesCT commentedchanging to the more commonly used tag, so I can delete the one lesser used