Problem/Motivation
CacheContextsManager::optimizeTokens() contains logic to remove redundant cache contexts. For example, if a rendered entity has one field that varies by 'user'
and another that varies by 'user.permissions'
, the rendered entity can safely be cached just by 'user'
, because for any given user, his or her permissions do not vary by any other context. This optimization can also be leveraged by ESI integration modules, where reducing the context variations / Vary
headers can have a large impact.
However, while AccountPermissionsCacheContext::getContext()
doesn't vary by any context other than 'user', it does change when various configuration changes, such as when permissions are added to or removed from one of the user's roles. Therefore, when 'user.permissions'
is optimized away in favor of 'user'
, the cache tags for those roles needs to be added to the cached item, so that the cache is invalidated when those roles are edited.
This is a general problem, not limited to user.permissions. For example, whenever the route.menu_active_trails:main_menu
context is optimized away into route
, the item needs to be tagged with all of the objects (in this example, the main menu's cache tag) that affect the active trail.
Proposed resolution
Remember that "caching" is basically "avoiding unnecessary computations". Therefore, optimizing a context away can be thought of as caching the result of the context service's getContext()
method. In this case, it's an implicit cache (the value is discarded rather than stored), but the effect is the same: on a cache hit, the getContext()
method is not called, hence: computations avoided.
So, the proposal is to do for cache context services what we do elsewhere for expressing that something is cacheable: use CacheableDependencyInterface
. But, given that we also have calculated cache contexts, that receive parameters, we cannot actually implement CacheableDependencyInterface
(since the cacheability metadata may depend on the parameter given to the calculated cache context, e.g. the route.menu_active_trails:main_menu
cache context needs the config:menu.main_menu
cache tag).
Therefore, we choose to use the next best thing: we update the two cache context interfaces:
interface CacheContextInterface {
…
/**
* @return \Drupal\Core\Cache\CacheableMetadata
*/
public function getCacheableMetadata();
}
and
interface CalculatedCacheContextInterface {
…
/**
* @return \Drupal\Core\Cache\CacheableMetadata
*/
public function getCacheableMetadata($parameter = NULL);
}
Then, the logic in CacheContexts::convertTokensToKeys()
needs to return not a string[]
containing the keys, but a new value object containing those keys plus cacheability metadata that is bubbled.
In thinking this through/analyzing this, we've also come to realize that HEAD is wrong in optimizing away user.node_grants
when user
is present, since node grants can have arbitrary complexity. But a specific site still may have perfectly cacheable node grants.
Thanks to the fact that cache contexts now return cacheability metadata, including max-age, we can now actually correctly leverage that for fixing this bug: the node grants cache context must return a CacheableMetadata
object with max-age = 0
. This tells the cache-context-optimizing-away logic that if it chooses to optimize the user.node_grants
cache context away because user
is also present, that it'll need to make the resulting cache item cached for at most zero seconds: i.e. it makes it uncacheable. So, this allows the logic to instead decide to not optimize the user.node_grants
cache context away, because that'd in fact be worse than optimizing it away.
However, a specific site can override the default node grants cache context, and return a max-age of e.g. 3600 (1 hour), because its node grants are cacheable for at least an hour. On those sites, it'd then be possible to do the ['user.node_grants', 'user']
-> ['user']
optimization after all, but consequently only cache the result for an hour at most.
Remaining tasks
- Tests.
- Update the patch to have
getCacheableMetadata()
never return NULL, but always return aCacheableMetadata
object.
User interface changes
None
API changes
- Added
CacheContextInterface::getCacheableMetadata()
- Added
CalculatedCacheContextInterface::getCacheableMetadata($parameter = NULL)
- Return value of
CacheContextsManager::convertTokensToKeys()
is changed
Data model changes
None
Beta phase evaluation
Issue category | Bug because stale data is returned when configuration/content changes. |
---|---|
Issue priority | Unclear if Critical or Major. It's kind of an information disclosure bug (permissions change, but you still see something you now shouldn't), but you can only see what you already once saw, so unclear if that would qualify as a true security vulnerability needing an SA. |
Disruption | Not disruptive, because the new interfaces are optional. |
Comment | File | Size | Author |
---|---|---|---|
#127 | 2512866-127-interdiff.txt | 7.45 KB | Berdir |
#127 | 2512866-127.patch | 53.73 KB | Berdir |
#123 | 2512866-123-interdiff.txt | 749 bytes | Berdir |
#123 | 2512866-123.patch | 52.58 KB | Berdir |
#121 | 2512866-121-interdiff.txt | 6.52 KB | Berdir |
Comments
Comment #1
Wim LeersFiling as major because I'm too tired right now to thoroughly test it. Feel free to remove the "potential" keyword from the title and bump to critical.
Comment #2
Wim LeersDiscovered this by working on a fix for the failing tests in
ToolbarAdminTest
at #2429617-130: Make D8 2x as fast: Dynamic Page Cache: context-dependent page caching (for *all* users!), yet the fix at #2217985-5: Replace the custom menu caching strategy in Toolbar with Core's standard caching. does not solve it, this is AFAICT the explanation. Did not have time yet to dig deeper.Comment #3
Fabianx CreditAttribution: Fabianx as a volunteer commentedYes, we should have used 'permissions', because the permissions of a user are not part of the user itself and hence not invalidated, but generated by an external service that is even pluggable.
Proposed resolution:
Internally re-parent user.permissions to just permissions, so we don't need to rename all things ...
- user.permissions
- tags:
- { name: 'cache_context_reparent', 'new_context_name': 'permissions' }
Then push that information to the cache contexts service to reparent contexts before simplifying.
Alternative syntax:
Then take aliases into account in cache contexts and do the rename as usual ...
Security issues are critical per definition.
Comment #4
Wim Leers+1 to
permissions
instead ofuser.permissions
.Permissions depend on roles, not on a user. This is why
user.roles
is correct: roles are a property of a user. Butuser.permissions
is wrong, because permissions are a property of roles. So, there's a level of indirection, anduser.permissions
pretends there isn't one.Comment #5
Wim LeersIMO keeping
user.permissions
everywhere is problematic, because:user.permissions
instead ofpermissions
) to be used everywhereNote that in important mitigating factor is that 99% of access checks are using
allowedIfHasPermission(s)()
andcachePerPermissions()
. That means none of those need to be updated!Therefore I think we should bite the bullet and do a BC break.
Comment #6
Fabianx CreditAttribution: Fabianx as a volunteer commentedI am okay with breaking BC, just showed a possibility to keep it very easily.
We could keep BC also for a few beta releases, then remove it - not sure how much that CC is used already.
Lets ask berdir.
Comment #7
lauriiiComment #8
Wim LeersFrom IRC:
My reply:
Comment #9
Fabianx CreditAttribution: Fabianx as a volunteer commentedThe gist of #8 is:
We will create a BC break patch first, then get core committer feedback.
Comment #10
effulgentsia CreditAttribution: effulgentsia at Acquia commentedI don't know if this one is, but keeping it as such for now. How would you get information disclosure? You cached some rendered content for a user that the user had permission to see at the time you cached it. You then removed a role from the user or a permission from one of the user's roles. Now the user can see the same cached content even though he or she shouldn't based on the new configuration. But, you haven't disclosed any information that the user didn't already see at one time.
I don't think I follow that logic. The permission hash generation is pluggable, but is there ever a use case for the permissions to depend on any dynamic information (e.g., the requested URL or time of day) other than the user? If so, then I agree with making it top-level, but if not, then I think that's incorrect. Because if all this is about is needing to associate some cache tags when folding a nested context into a parent one, can we then do that instead?
Comment #11
lauriiiComment #12
effulgentsia CreditAttribution: effulgentsia at Acquia commentedWell, it's probably an edge case, but since it is theoretically possible, I guess it's correct for
permissions
to be top-level. However, I'd still like to be able to fold permissions into user when we get #2453835: Allow non-intrinsic (implementation-dependent) cache context services to specify their parents via DynamicParentContextInterface::getParents() done, which means at that time, we'll still need a way to add cache tags as part of that folding. No need for that to block this issue from proceeding as planned though.That's a nice way to ensure core is following the correct pattern everywhere. Don't know yet if we'll need to add the BC-layer for contrib.
Comment #13
lauriiiThere is some modules which are using Drupal permissions handling to create time based permissions e.g. Tupas Authentication. There it wouldn't make sense to create custom cache context because what they are only depending on is user permissions. So I think that permissions can change on the fly.
Comment #14
Wim LeersMe neither, I think the pluggability of permissions is the weakest argument.
catch was strongly against that, but I can't find the issue at the moment. I proposed letting cache contexts specify cache tags at some point, but catch had pretty solid arguments for not doing this.
I think my explanation in #4 is better/clearer:
Comment #15
Fabianx CreditAttribution: Fabianx as a volunteer commented#14: Is not correct either, the permissions hash theoretically allows also user based permissions that are only granted for this user and maybe even just for a certain time frame.
With the hash based approach that works perfect obviously - but yes usually permissions are still per user so our original thing made sense in hindsight.
That said another approach would be:
- user should incorporate its children in the return of the 'user' cache context, if you purely need the users ID use [ user.uid ].
Maybe that is a better BC break?
Reasoning:
- [ user ] is similar to route that also has the [ url + request paramters ]
- user.uid is a strict subset of user
while currently
- 'user.roles' is a subset of 'user'
but
- 'user.permissions' is not a subset of 'user' (the output of it)
( which is the problem here)
So either:
- we cannot fold and move over to 'permissions'
OR
- we incorporate permissions hash into 'user' [user.uid:user.permissions] and ensure that code that only needs the users ID uses 'user.id' instead, which could be correctly folded into user as its again a subset.
Comment #16
lauriiiDid a search and replace to change user.permissions to permissions. Also added test coverage for this.
Comment #17
lauriiiComment #20
catchThis seems sensible to me.
@Wim while I didn't want to associate cache tags with cache contexts in the previous discussion, that was more about a 1-1 relationship between contexts as tags, whereas this is more about invalidating the cache contexts themselves. So not 100% against, but if we do 'user' vs. 'user.uid' that seems happier.
Comment #21
lauriiiComment #22
lauriiiStarted implementation for user.uid and user to be uid + permissions. I don't have a interdiff here because it wouldn't make any sense because there is very few things similar for the previous patch and I didn't even use it as a base for this.
Comment #23
dawehnerDo you mind to just make that implement CacheContextInterface given that there is simply no need for it? It is also semantically wrong now to extend it
Yeah some more docs about when to use what would be nice
We should explain why its not enough to just use the ID
Nitpick: Single quotes :P
Comment #24
larowlanThis looks pretty close
nit: needs a space before {
[1] looks a bit magic/fragile. Can we document this line or is there a constant we could be using? getRoles has a method to exclude locked built-in (anonymous/authenticated) roles, so could we use that and then
reset()
instead of hard-coding a 1? Using 1 is relying on an implementation detail, which we could avoid if we exclude the locked roles and use reset.Comment #25
Wim LeersEh… so
['user']
should auto-expand to['user', 'user.is_super_user', 'user.node_grants', 'user.permissions', 'user.roles']
?Logically speaking, that makes sense: "
user
impliesuser.*
".That was also the reasoning behind #2432837: Make cache contexts hierarchical (e.g. 'user' is more specific than 'user.roles'). But… that would completely undoes the optimizing part of #2432837. Are we okay with that? Or are you saying we only do the optimizing part in the
X-Drupal-Cache-Contexts
header?I'm not yet convinced.
Comment #26
Fabianx CreditAttribution: Fabianx as a volunteer commented#25: No, that is not what this is doing:
It includes those things in the result, not expands it out.
e.g.
user => u.[user.uid].[user.permissions] as "result" means that all cases:
user.is_super_user => yes, included
user.roles => yes, included
user.node_grants => yes, included (because node grants are checked per user)
user.permissions => no, not completely included => add it hard-coded into the 'user' cache context.
Your main concern was that there is a legitimate case to only cache something by the users uid (not based on permissions) and user.uid (or user.id) solves that.
=> A [user] is the users UID plus the permissions hash.
Comment #27
Wim LeersOh wow that's even more confusing :(
I think it makes no sense to add a
user.uid
cache context. That's whatuser
is intended to represent. But it could be argued that that is the best way out. Which doesn't make it "right", but could make it acceptable.But, a stronger counter argument is: "where does it stop?" — i.e. what if the Organic Groups module in Drupal 8 adds a
user.og
cache context. That's very similar to the currentuser.permissions
cache context: there's a level of indirection there. Just like a user has roles, a user can be a member of organic groups. But just like permissions exist on the role (user -> role -> permissions), the membership of a user can be determined at the Organic Groups level (user -> organic group -> members). Or, even, group-based permissions (user -> group -> permissions). Hence the same problem will exist there.Do we just keep expanding the
user
cache context like this then? Yes, the Organic Groups module could override theuser
cache context with its own implementation. But… what if multiple modules do something like this? Only one module can override the service — one will win.Comment #28
dawehnerWell, in case we would tell people how to use proper decorator services, this would be possible, but I agree its an odd statement.
Just an idea, could we get rid of using the service ID as magic name but rather require a service tag used to specify the cache context name? Once you have that, you could support multiple and take all their information into account.
Comment #29
catchSo for user.permissions and user.organic_groups.
I think we can fix both of those by attaching the current user's cache tags and those of their roles to those cache contexts.
If a user's organic groups membership changes - invalidate the cache tag for that user (and possibly the organic group itself but that's not an issue here).
if a role is updated with a new permission, then the role cache tag is updated.
This is where I argued against cache tags for cache contexts: https://www.drupal.org/node/2428703#comment-9737035
I think the arguments there (for language, and organic groups not associating the cache tags of all the groups) still apply, and this is really an exception.
However, it would be good to make it not-straightforward to add cache tags to cache contexts, since I think that could massively confuse people who are getting to grips with the concepts in general.
Comment #30
effulgentsia CreditAttribution: effulgentsia at Acquia commentedCorrect me if I'm wrong, but doesn't this problem exist anywhere that CacheContextsManager::optimizeTokens() optimizes away a child context in favor of a parent one, when in fact the child one's getContext() method returns something that depends on some configuration?
So, for example,
route.menu_active_trails
depends on the configuration of menu links in addition to the route, just asuser.permissions
depends on the configuration of roles in addition to the user.So I think what we have here is a concept that cache tags for these configurations isn't needed so long as CacheContextsManager::optimizeTokens() doesn't optimize away the context, because the getContext() method itself returns a different value for a different configuration state. But, we need to add these cache tags whenever the context is optimized away in favor of its parent.
Not sure if such a more limited concept of when to apply the cache tags helps address #29.
Comment #31
Fabianx CreditAttribution: Fabianx for Drupal Association commented#30: That is correct.
Lets look at the big picture of the cache context hierarchy:
There is:
The big picture of the idea is that e.g. Varnish can reliably cache things, because all things collapse to either:
[url] or [user]
The idea is also that at least [url] is added before the final cache contexts are processed in a response subscriber of e.g. a varnish module, so that everything collapses to [url] that is dependent on that.
Yes, indeed then all the depends-on-config-using-cache-tags-being-invalidated will fail also in Varnish.
e.g. route.menu_active_trails
=> Having cache tags on contexts is a useful concept to also ensure *SI will work well.
For user on the other hand, the context hierarchy is checked downwards and appropriate values are returned (so that caching just user.roles or just user.permissions is possible).
However that result is returned by a controller handler cached by the users session in Varnish, so we would like to re-auth the user when e.g the permissions change, as then their hash changes. Currently the naive implementation is just time-based.
While the controller could very well hard-code cache tags for known things for now and provide an alter hook, that is a fragile solution when contrib/ comes into play.
=> Without cacheable metadata on contexts, any controller exposing context values to another system will need to hard-code the assumptions the contexts make. With cacheable metadata the merging of values will just work. (e.g. max-age=10 min for permissions hash, tags for config invalidation)
Therefore:
Having a proper API of getCacheTags(), getCacheContexts() (should be [] all the time), getCacheMaxAge() on cache contexts itself would be good, e.g. cacheable dependency interface.
That API is used not generically, but either when values are collapsed or when values are retrieved for caching the context values itself, not something using the context values.
Does that make sense?
Comment #32
pwolanin CreditAttribution: pwolanin as a volunteer and at Acquia commentedComment #33
lauriiiComment #34
lauriiiAgain completely new approach. This is definitely a WIP patch but I would like to hear feedback of different ideas how we could pass parameter for CacheContexts implementing CalculatedCacheContextInterface.
Comment #35
Wim LeersInteresting — this does not return cache context objects, but cache context *services*.
I think we should keep
$keys[] = $this->getService($context_id)->getContext($parameter);
, but just have that return a cache context value object. That'd look like this:Then this can go away.
This architecture cannot actually work for calculated cache contexts, i.e. cases like
user.roles:<role name>
. Because the cache tags need to vary based on the specified role, and this architecture doesn't allow us to do so.Hence my proposal above.
Comment #36
Fabianx CreditAttribution: Fabianx for Drupal Association commentedclass CacheContextsValue {
$value;
$cacheableMetadata;
}
we should probably also support max-age, though disallow cache contexts for obvious reasons ...
Comment #38
Wim LeersRationale?
Comment #39
lauriiiThanks for the review! I was quite crazy with the previous patch. Hopefully I have included a little bit less craziness in this patch :P
Comment #40
Fabianx CreditAttribution: Fabianx for Drupal Association commentedWell, max-age is not critical.
There is nothing to inherit from?
--
I thought we also have plain config:user.role cache tag, which I think would be better?
@Wim: Thoughts?
I still prefer just passing one cacheable metadata in as second argument that can be modified and is passed to each context.
This will add 100s of calls for before-cache-case to getCacheTags() even though its always empty ...
@Wim: Thoughts?
Comment #41
lauriiiComment #43
effulgentsia CreditAttribution: effulgentsia at Acquia commentedRight. I don't think we should do it there. Per #30 / #31, we should only call the context service's getCacheTags() method when that context is being optimized away (currently in optimizeTokens(), though currently that function doesn't have a place to attach those tags to, which is maybe the challenge this patch is having).
Actually, I think a cache context service implementing getCacheContexts() has some good use cases, such as in the case of a cache context implementation varying by more than its ID implies. For example, core's implementation of user.permissions only varies by user (and configuration for which there are tags). But not by, e.g., the URL or time of day. But, suppose that some crazy contrib module swapped out how permissions were calculated and made them vary by URL or time of day. In this case, that user.permissions implementation could implement a getCacheContexts() method that returned those extra contexts, and optimizeTokens() could be improved to not optimize away contexts that depend on additional contexts unless those additional contexts are already in the token set. This could all be a non-critical followup, just pointing out that I don't think there's any conceptual reason to forbid this part of the interface.
Comment #44
effulgentsia CreditAttribution: effulgentsia at Acquia commentedRe: the issue summary changes in #32:
Per #10, I continue to be doubtful of this requiring an SA if it were discovered after release or being considered a security vulnerability even if discussed publicly, like it is here, after release. I still think it makes sense to work on until we know for sure whether it can be safely downgraded, and even after, but if we're going to keep it a critical, I'd like to know why it's a security vulnerability.
Comment #45
xjm(Unassigning from @lauriii as he mentioned that more feedback is needed before proceeding.)
Comment #46
Fabianx CreditAttribution: Fabianx for Drupal Association commentedI still feel strongly about:
a) keeping the API change to a minimum
b) not add a performance regression for OOP-nicety
My proposal is:
That makes it optional, though we always pass it in and hopefully is not even an API change.
It also has just one CacheableMetadata instantiation as cost, which is cheap.
Comment #47
Wim Leerss/cacheability metadata/cache tags/
This is wrong. Plus, no need for cache tags here (since it only cares about the specific user, not the values of the user). This can be removed.
I see the intent here (fix the specific problem described in the IS), but this doesn't solve the generic problem.
To solve the generic problem ,we'll make sure that the cache contexts that the
user
cache context (this one) implies (the implied ones beinguser.permissions
,user.roles
etc.), when they are optimized, have their cache tags still be retained, if they have any.Fabian said
… but that's an exaggeration, there will only be as many calls as there are things with both
#cache[keys]
and#cache[contexts]
. So, perhaps dozens of cheap calls. This is not at all concerning.This doesn't test the
user.permissions
cache context specifically, but the case of that cache context being optimized away. So I think the test name should be updated.Great test :)
This is where we go from
['user', 'user.permissions']
to['user']
.Hence we need to make sure here that the remaining
'user'
cache context gets the cache tags of the'user.permissions'
cache context.(In other words: the cache tags of a context that is optimized away are bubbled to the cache context that implies them.)
#43:
-1, but congrats on blowing my mind.
This opens the door to infinite recursion. Besides, alternative implementations of
user.permissions
could just specify their parent cache context with #2453835: Allow non-intrinsic (implementation-dependent) cache context services to specify their parents via DynamicParentContextInterface::getParents(), so in your example,user.permissions
would be implied byurl
.Definitely out of scope here.
Comment #48
Wim LeersSo then… you're instantiating a
CacheableMetadata
object whenever you callgetContext()
. Because it is required to always collect the cache tags. IfCacheContextsManager
were to callgetContext()
with$cacheable_metadata === NULL
, then we would indeed do less work, but we also wouldn't be solving this issue's problem. (Or am I missing something?)So how is that any different from having
getContext()
return aCacheContextValue
object, which by the way has fewer properties, so should be marginally faster?Comment #49
Fabianx CreditAttribution: Fabianx for Drupal Association commentedWell, I was suggesting a reference, so the context itself would instantiate the cacheable metadata.
Given we still have very limited use cases, what about having an interface instead.
If not CacheableDependencyInterface, what about CacheContextDependencyInterface with one getCacheContexts($value = NULL) method?
Then we have one simple instanceof check, have no API change and you can very easily opt-in.
We also open potential extensions for the future.
Comment #50
pwolanin CreditAttribution: pwolanin as a volunteer commentedLooks like we need to update the issue summary with proposed solutions and any new information
Comment #51
effulgentsia CreditAttribution: effulgentsia at Acquia commentedWhat about matching AccessibleInterface::access() and doing
getContext([$parameter = NULL,] $return_as_object = FALSE)
?Comment #52
effulgentsia CreditAttribution: effulgentsia at Acquia commentedOr adding a separate method like getContextMetadata()?
Comment #53
Fabianx CreditAttribution: Fabianx for Drupal Association commented#51: That was only, because we could not control access, but with cache contexts the only other user that directly looks at cache contexts is probably me.
I would really like a new Interface on this that is specific to this use-case - and avoid introducing any overhead as little as it may be - , but allows extension if needed.
Comment #54
effulgentsia CreditAttribution: effulgentsia at Acquia commentedDiscussed with Fabianx, and here's what we came up with:
For BC, contrib cache context implementations can choose to just implement CacheContextInterface/CalculatedCacheContextInterface, rather than the Cacheable* versions, in which case, they won't be optimized away. Core's should implement the cacheable ones, to set a good example.
Comment #55
Fabianx CreditAttribution: Fabianx for Drupal Association commentedTagging ESI, because #54 is exactly what ESI (for 8.1) needs.
I think it is okay to use the same interfaces as everyone else (CacheableDependencyInterface) - even though at the moment we might only care for cache tags.
Comment #56
effulgentsia CreditAttribution: effulgentsia at Acquia commentedCompletely rewrote issue summary.
Comment #57
effulgentsia CreditAttribution: effulgentsia at Acquia commentedComment #58
jibranWhy we have to put "Follow-up for #2432837" in title? Removing it for now.
Comment #59
effulgentsia CreditAttribution: effulgentsia at Acquia commentedI changed my mind on that. For example,
UrlCacheContext
isn't itself cacheable -- you can't optimize it into any other context, which means it's nonsensical to express that its getContext() result can be cached. So, I think that's a good example of a class that should not implement CacheableCacheContextInterface. It could, but then to express itself properly, it would need to implement getCacheMaxAge() as returning 0. But I think not implementing the interface at all is a cleaner expression of its nature.Comment #60
effulgentsia CreditAttribution: effulgentsia at Acquia commentedRetitling to focus on the bug independent of whether it's a security problem (which per issue summary, I don't necessarily think it is). Leaving the security tag though until it's clearer whether that can be removed.
Comment #61
catchFwiw +1 to #54
Comment #62
Wim LeersI'm very concerned about one implication of #54: a cache context can return other cache contexts. The recursion potential there is scary.
Fabianx/effulgentsia, could you please also clarify the pseudocode in #54? Because
public function getCacheableMetadata($parameter = NULL);
does not make sense if it's an implementation ofCacheableDependencyInterface
.Comment #63
lauriiiWorking on this now since we have a clear direction where to go
Comment #64
Fabianx CreditAttribution: Fabianx for Drupal Association commented#54: If someone misuses that, they will get an error. I don't think mis-use is a reason to babysit things. If contrib does not implement the new interface, someone that wants to ESI their site can:
a) either exchange the service and/or write a patch for the contrib module, which then likely is correct.
The CacheableCalculatedCacheContextInterface does _not_ implement CacheableDependencyInterface, but just has a method:
Comment #65
Wim LeersBut the non-calculated cache context interface does:
… and you're saying that the calculated one does. That seems highly inconsistent.
Comment #66
lauriiiStarted implementing the idea described in the IS.
Comment #68
lauriiiReroll after #2443323: New convention: CacheContextInterface implementations should mention their ID in their class-level docblock
Comment #70
dawehnerLet's provide $tags = []; to ensure that the result of the function is always an array.
The comment refers to all kind of cacheable metadata, i guess mentioning cache tags is enough. Does someone documenting why just cache tags?
Once we document please let us make it clear what is the difference between them or why we need both rather.
Comment #71
Fabianx CreditAttribution: Fabianx for Drupal Association commentedNeed to update all callers of this function, e.g. views cache plugins and tests, that should be responsible for many of the failures.
This should just use:
$cacheable_metadata = $cacheable_metadate->merge($context->getCacheableMetadata());
I like the idea and simplicity of this.
The best way is to use the pattern of:
CacheableMetadata::createFromRenderArray($elements)->merge($contexts['cacheable_metadata'])
->applyTo($elements);
Comment #72
lauriiiThis should be green patch
Comment #73
Fabianx CreditAttribution: Fabianx for Drupal Association commentedOverall this looks wonderful to me!
Unless I miss something the createFromObject is redundant as two cacheable metadata should be able to be merged directly.
---
I could not find more, assigning to Wim for review.
Comment #74
lauriiiIt's not possible because merge requires the parameter to be CacheableMetadata instead of CacheableDependencyInterface
Comment #75
Fabianx CreditAttribution: Fabianx for Drupal Association commentedYes, right, thanks #74!
Comment #76
Fabianx CreditAttribution: Fabianx for Drupal Association commentedI think given #74 it is better to indeed return a CacheableMetadata object here.
And just use the CacheableMetadata::createFromObject($this);
Thoughts?
Comment #77
lauriiiHmm that sounds good to me. Did that change and some doc updates. I also implemented the logic that cacheable metadata will be bubbled up only from the cache contexts that are optimized away
Comment #78
dawehnerCan we expand the unit test to also test the new inheritance?
Comment #79
BerdirSeems OK overall, some feedback below.
Note that I haven't read most comments, just the issue summary and the patch. It will possibly also overlap with the review from Wim that he's working on.
* It might be worth mentioning the issue that introduced user.permissions in the issue summary. That was introduced to solve this problem for user.roles but the optimize use case was forgotten then. Discussed with @WimLeers, having user.permissions is still useful, even though we could make user.roles functionally equivalent by adding those cache tags there too and always using it. Agreed that this is preferable
* The other example mentioned in the issue summary is route.menu_active_trails, that doesn't seem to be addressed yet, would we do that in a follow-up? that one has arguments AFAIK, so it might actually be worth exploring here if my ideas below make any sense about caring about parameters when collecting the metadata.
Should we maybe use $rid to clarify what it is (I know that getRoles() is a bad method name, please don't hit me)
We have many different ways to do something like this now:
* BlockRepository now uses a by-reference parameter (there it is actually a list of cacheable metadata).
* Link generation uses an object (GeneratedLink or so I believe)
I don't think we're using arrays yet, maybe it would be better to follow the by-reference approach, which has the advantage of not having to enforce an API change if we don't want to (but we can, by making the argument required)
*have* been optimized
Is it worth considering the $parameter here somehow and e.g. pass it to getCacheableMetadata()?
That might lead to less cache tags that need to be verified, for example.
Either *a* cache context or use plural I think?
What do we gain by supporting *both* getCacheableMetadata() and CacheableDependencyInterface? We have an easy shortcut to do this with createFromObject() that would save quite a few lines of duplicated code/docs? Except if we want to support that optoinal parameter argument but then we should only have one method..
Why implement the interface, nothing is using it here? It's the permission cache context that actually needs it?
Ah.. that extends from this so it gets the other methods through this. That doesn't really seem like a good idea to me, since it means that we need to merge int data from more contexts that aren't doing anything?
Ah, unit tests might be the reason this was done, although I actually can't see any that would return something.
It is possible though, although a bit convoluted because you need a returnValueCallback(), an example is in BlockPageVariantTest::testBuild()
Comment #80
Wim LeersLet's add an
@see \Drupal\Core\Session\PermissionsHashGenerator::generate
.This is not receiving the
$parameter
yet in case of a calculated cache context.This points to a gap in the test coverage.
This is my biggest concern.
The first implements
CacheableDependencyInterface
, the second does not. This is for a good reason: the first doesn't need a parameter, but the second does.Therefore, the second must use a different way of specifying cacheability metadata: a
getCacheableMetadata()
method. But, for consistency reasons, we then add that method to the first as well.At that point though, it seems a rather convoluted system. It feels like it'd be better to have
Cacheable(Calculated)CacheContextInterface::getContext()
change the return value ofgetContext()
fromstring
to aCacheableContextValue
(name TBD), which contains thestring
being returned in HEAD (inCacheContextInterface
), plus cacheability metadata.IOW: what I said in #48.
Performance difference of returning a string versus a value object should not be a concern: http://3v4l.org/qKsPF/perf vs http://3v4l.org/AO74i/perf.
There's no parameter in this case.
Comment #81
Fabianx CreditAttribution: Fabianx for Drupal Association commented#80: Lets not introduce an API change here - if we can do it without.
I am okay with both just defining getCacheableMetadata($value = NULL) and then using in the non-calculated cache contexts as well:
This is likely even faster as before we needed to create the CacheableMetadata anyway from an object.
I don't think we need to implement CacheableDependencyInterface, because we have dedicated Interfaces providing the getCacheableMetadata() method.
Edit:
And yes, I agree we need test coverage for the calculated case.
Comment #82
lauriiiComment #83
lauriiiAssigning away from Wim
Comment #84
BerdirOk, I've been discussing this issue at length here with @WimLeers and @lauriii. We basically made two circles and a looping around every option we could think of to solve this issue and discussing pros and cons.
First, to get this out of the way, we think that taking #2453835: Allow non-intrinsic (implementation-dependent) cache context services to specify their parents via DynamicParentContextInterface::getParents() into account here implies we are solving at least a significant portion of that issue. Which would then pretty much mean solving *that* issue as part of this *critical* issue. I will add a comment there why we should not build on top of this to solve that issue.
Now, to this issue, here's what we suggest:
1. Add *only* getCacheableMetadata() / getCacheableMetadata($parameter = NULL) to the existing two interfaces. This is an API change, but:
a) None of us are aware of any contrib/custom implementations since this is all fairly new, and contrib hasn't really picked this up and there really aren't that many implementations possible. (same argument as the one that moved all the cache context classes to a new namespace)
b) It avoids the complexity of having a total of 4 interfaces to choose from for anyone implementing a cache context. It also gives us a good place to document when and why you need this, since you are basically forced to implement it. (We could allow a null return value to avoid creating empty objects and merging them).
2. We use the CacheableMetadata object even though we only use tags and max-age, we just document on the new methods that any cache contexts set on the object will be ignored, even if set.
3. However, we do add a new value object CacheContextKeys (...Metadata?) that contains keys + tags + max-age, with getKeys(), getTags(), getMaxAge() and applyTo(). CacheContexsManager::convertTokensToKeys() would return it. That is again an API change but so would almost all other options there (except an optional argument) but we think that makes it easy to use and makes it clear that there is something additional that you have to care about if you're using it.
We discussed a number of other things, including interfaces to hint whether a context should be optimized away or not, many different options to return/collect the cacheability metadata but I think those three points above are a good compromise for solving this and keeping it easy to use.
As a last resort, we also discussed whatever-we-give-up-everything-is-stupid option to resolve this issue: Remove context optimization entirely since core itself doesn't need it. But I think that's not needed.
Comment #85
Wim Leersberdir++
lauriii++
Comment #86
effulgentsia CreditAttribution: effulgentsia at Acquia commented#84 works for me.
Per #59,
UrlCacheContext
(and others) will need to either returnNULL
or something likeCacheableMetadata::createFromObject(NULL)
. I think both are sufficiently clear, so I don't have a preference on whether we allow the former or not.Would it be ok to add a @todo to this doc indicating that that might change in #2453835: Allow non-intrinsic (implementation-dependent) cache context services to specify their parents via DynamicParentContextInterface::getParents()?
Comment #87
Fabianx CreditAttribution: Fabianx for Drupal Association commented+1 to #84.
I like the approach and I am onboard with the API change, especially if its just one method added.
Just one question as that did not get clear:
- Why explicit getTags(), getMaxAge(), ... on that new value object instead of just encapsulating the metadata?
Is this to _protect_ from anyone misusing context here?
If that is the reason I would argue an assert() (atm. _trigger_error) would be better here that cache contexts are indeed empty.
If there is another reason, I would love to hear about it.
Comment #88
lauriiiOnce again a new solution.
In this patch I didn't create getKeys(), getTags(), getMaxAge() and applyTo() methods but instead a getCacheableMetadata method which return CacheableMetadata object.
One question is that do we explicitly want to disallow adding cache contexts inside cache context services?
Comment #90
lauriiiFixed the fatal and some clean up
Comment #91
Fabianx CreditAttribution: Fabianx for Drupal Association commentedLooks great already!
The if looks wrong here as its always true.
Unused in here.
Still needs tests for a FakeCalculatedCacheContext with a dynamic value.
Comment #93
lauriiiMaybe better luck this time. Also fixed #91
Comment #95
BerdirFixing the test fails there.
Comment #96
Wim LeersNice, green!
Given the new
CacheContextOptimizationTest
in this patch, we do have tests, so removing the corresponding issue tag. That being said, we do want to expandCacheContextsManagerTest
as well I think.But, what this patch really needs now, is a discussion/review with Alex Pott and catch, to get their take on the design/consequences of this. Assigning to catch to make that clear.
Comment #97
Fabianx CreditAttribution: Fabianx for Drupal Association commented#96: You asked for test coverage of a calculated cache context in #80.2 and CacheContextOptimizationTest does only test what is in the issue here, but not with a parameter.
Hence:
"Still needs tests for a FakeCalculatedCacheContext with a dynamic value."
Comment #98
BerdirI always found the 'keys' key in #cache to pretty bad (I would have preferred cid_parts or something) and this isn't clear either.
I'm not saying that we should change it, but we should atleast clarify what we are actually talking about here, possibly also on the convert method.
needs a description.
Same.
has => have.
Maybe reword it a bit. Something like "Contexts can return NULL if they have no cacheable metadata that needs to be bubbled up/merged."
I'd say the parameter is string|null?
Yes, you could write something like 'whatever:5' but it's still a string when being passed in here.
Should also clarify what NULL means maybe? not sure how the other methods do that.
This is the one that also needs something according to the issue summary. Then we also have something to be used for the parameter-test?
This extends UserCacheContext, so it doesn't have to re-implement this...
This test seems to move the cache tags assertion around, makes it a bit harder to review?
return NULL?
As I said, cid_parts is so much clearer ;)
Comment #99
BerdirOk, partially addressing my own review, but we have a few problems:
* MenuActiveTrailsCacheContext misunderstands what NULL means for getActiveTrailIds(). It does *not* mean all the menus, it means not restricted to a specific menu. So using that cache context without a parameter will not do what you think at all. I think that should just throw an exception that this is isn't supported?
* NodeAccessGrantsCacheContext: As far as I see, this must *not* be below user. There is no cacheable metadata that we can add that would fix this. This returns whatever hook_node_grants() returns and that can be *anything*.
Comment #101
xjmTagging per #96.
Comment #102
catchWhich parameter value?
Agreed with berdir on this, $cid_parts is much clearer on how this gets used.
Cache keys sounds like a list of $cids which is not what it is. And are they even keys or really cache context values?
Do we care that they're optimized?
We discussed static caching of cache contexts in other issues, and the cacheability metadata would allow that to happen (i.e. use the cache.static backend from #2501117: Add static caching for PermissionsHashGenerator::generate() the cache tag invalidations work even during the request). The important thing is that the render cache itself doesn't get the extra cacheability metadata unless the context is optimized away, but the information can still be useful for other use cases when it isn't.
Nit 'have'.
I'm not sure about this:
We have only a single use case in core for a context with cacheable metadata - the user permissions cache context.
Contrib may have use cases - for example "user's organic groups" might get optimized to user, although even then I'm not sure whether that would need to be tagged with the groups, possibly not.
On the other hand, it's very easy to get turned around by this concept, and think that the language cache contexts should have language cache tags as metadata when they don't at all.
So having an extra, esoterically named interface that 99% of cache context implementors can ignore might not be so bad, compared to adding the method that encourages people to put something in there, when they really shouldn't.
However, I don't feel strongly about this at all. The number of people actually implementing cache contexts is going to be tiny anyway, since core provides what's needed for most cases - so the finer points of interface vs. method are not likely to affect many people at all, whereas fixing the user permissions case in core is important to get right.
As effulgentsia has pointed out, the consequences of not getting this right are that things get stale (or for my concerns get invalidated too often) - as opposed to other cacheability issues where different people see the wrong content. It's more of a cache race condition compared to other cache poisoning/information disclosure we're dealing with elsewhere, which is less of a worry if contrib gets it wrong too.
Comment #103
BerdirMy cid_parts/cache keys feedback was not very helpful. The problem is we *are* using cache keys all over the place, most notably in
['#cache']['keys']
and the cid we return is cache keys + keys from the contexts. So I don't think we should rename it here, it was more a remark a long "I wish we'd have used cid_parts instead of keys 1/2 years ago".Comment #104
catch#103 also makes sense.
Comment #105
Wim LeersDiscussed the proposed solution in this issue with @catch, @alexpott, @effulgentsia, @Gábor Hojtsy, @Berdir and @dawehner. Both @catch and @alexpott approve of this solution, so we can work on finishing this patch, notably the additional test coverage that we want and updating the patch to have
getCacheableMetadata()
never return NULL, but always return aCacheableMetadata
object.IS completely rewritten.
I hope I captured everything.
Comment #106
BerdirStarted implementing the things we decided, will need to update tests, also reverted my bogus changes in CommentRssTest.
Comment #108
Wim LeersInterdiff review
Let's use strict equality.
Accidental change.
Nit: It's "active trails", plural.
Patch review
s/based on the parameter value//
(This is just a copy/paste error; it comes from
CalculatedCacheContextInterface
, where it is correct.)The "or NULL" part needs to be removed.
If this just contains
CacheableMetadata
, why not just change this toclass CacheContextKeys extends CacheableMetadata
?Incomplete.
s/through/over/
s/has/have/
These bits are identical. Shall we put them in a closure (or protected helper method)?
s/If c/If a c/
In case of a calculated cache context, there's a parameter, and that parameter needs to be passed to
getCacheableMetadata()
. That's missing here.The "or NULL" part is wrong here too.
The remaining changes in these files can be reverted.
Signature change without docblock update.
The
$contexts
variable name feels wrong/misleading now.This one was missed in the update in #106.
Comment #109
BerdirWas a bit early for a nitpick review, I just hacked on this until I was too tired to continue, but addressing it now :)
Interdiff:
3. Then the documentation that was added recently was wrong too, I copied from there ;)
patch:
3. Inheritance vs. Composition. I like it this way, it's a different kind of object.
5. I pointed this out in my reviews twice but forgot to update it too.
6. Not sure about the bets way to do this. What if we check this in parseTokens() and throw an exception there? We could also change the return value to be array($context_id => $parameter) instead of array(array($context_id, $parameter)), then we can avoid the list(). Why is this public?
7. Same as 5.
12. Yes. While doing this, it actually occurred to me that ContextCacheKeys is a much better name than CacheContextKeys. I think. So renamed.
Also fixed the unit tests. Hopefully without fatal errors now.
Comment #111
Wim Leersinterdiff.3: #sadpanda :( Sorry about that!
patch.3: Discussed with Berdir, he agrees with #108.patch.3 now, considering that that is then consistent with
GeneratedUrl
,GeneratedLink
andFilterProcessResult
.(Calculated)CacheContextInterface
can't do that because it needs to accept a parameter; that's why it's okay for that one to diverge from this pattern.patch.6: Hrm. I'd swear this has already been discussed in the past, but I can't find it. I think it's wrong to throw an exception in
parseTokens()
because then it's no longer just parsing. However, I realized we can basically do this:Nit: 80 cols.
Comment #112
BerdirComment #113
BerdirNote: The last patch does not yet address the 80 characters nitpick.
Comment #115
Wim LeersNit: This is technically out of scope, but yeah, the current FQCN was wrong. So +1. But this should not have removed the parameter name, let's bring that back :).
This one should not have a parameter.
Nit: s/user based/user-based/
Nit: UserCacheContextBase
Nit: s/service/class/
Comment #116
Wim LeersI re-reviewed the entire patch, with a focus on the test coverage, and didn't find any problems.
Nit: s/cache context keys/cache contexts' keys/
"cacheable" vs "cache" vs "cacheability". Let's be consistent.
Nit: s/associated//
s/change/changes/
80 cols.
Comment #117
BerdirTests fixed, addressed the nitpicks. Changed ContextCacheKeys to extend from CacheableMetadata. Can't say I really like it, but I can live with it ;)
Comment #119
BerdirWeird typo.
Comment #120
Wim LeersCan be simplified to
Or, if
ContextCacheKeys
had asetKeys()
method — just likeFilterProcessResult
,GeneratedUrl
etc. have setters also — then the initially created object could in fact be aContextCacheKeys
object already, and it'd be as simple as:Still an 80 cols violation.
s/cacheable metadata/cacheability metadata/
This should be deleted.
Why does this have
max-age = 0
? We should document the reason.Comment #121
Berdir1. I didn't think that works, using that now. I don't like setKeys() that means it is no longer an immutable object :) (as far as the primary value goes)
5. I guess this was based on #2483181: Make book navigation block cacheable, but but I was just guessing. Looked into it now and documented and implemented it properly. You're welcome to find any flaws in my thinking there ;)
Comment #123
BerdirComment #124
Wim LeersTwo typos that can be fixed upon commit.
s/that/then/
s/we/it/
Hurray! :) Berdir++
Comment #125
alexpottLet's improve the documentation to mention what it means when this method just returns an empty CacheableMetadata object.
Is this still true?
The SafeMarkup use statement can be removed.
Is it worth mentioning in the documentation that if the argument is invalid that the implementation should throw an
InvalidArgumentException
I think we should document what interfaces implementations are expected to implement.
According to @Berdir this class is not supposed to implement this interface.
Comment #126
Wim Leers@throws \LogicException
, because that's whatvalidateTokens()
throws.sprintf()
should be used instead now. But note that this is a deleted line, so :)CacheContextInterface
andCalculatedCacheContextInterface
that this change was made. (HEAD was wrong there, but PHP being PHP allowed it.)Comment #127
BerdirImproved the docs and @throws, fixed the nitpicks from #124.
Comment #128
Wim LeersLooks splendid!
Comment #129
alexpottCommitted 46e6f72 and pushed to 8.0.x. Thanks!
Removed a redundant (and incorrect - should be themselves) word.
Comment #131
Wim LeersDocumentation updated: https://www.drupal.org/node/2459039/revisions/view/8630634/8652850.
Comment #132
Wim LeersI was investigating the failures in #2429617-160: Make D8 2x as fast: Dynamic Page Cache: context-dependent page caching (for *all* users!), the toolbar failures there were unexpected.
That's how I discovered that AFAICT we forgot about something in this issue all this time:
user.permissions
continues to work when a user receives *additional* roles. And because we "fixed" the issue here by adding role cache tags when optimizing awayuser.permissions
, that case is *still* broken: it only captures the changes of permissions assigned to roles, it does not capture the changes of roles assigned to the user!But fortunately we already have the necessary infrastructure in place thanks to this issue, because we solved it generically here. At least one solution is to use the same approach that we used for
user.node_grants
: consideruser.permissions
impossible to optimize away.Thoughts?
Comment #133
catchWhat about adding the user account cache tag to the per-user context?
Comment #134
Fabianx CreditAttribution: Fabianx for Drupal Association commentedYes, we should add the 'user:[uid]' cache tag in that case as its a property of the user.
I am also okay to make it non-optimize-away-able in addition, but we should add that cache tag anyway.
Comment #135
YesCT CreditAttribution: YesCT commentedDo we need a separate issue to do what @catch asked for in #133, or is there an already existing one?
Comment #136
Fabianx CreditAttribution: Fabianx for Drupal Association commented#135: Yes, we need a follow-up.
Comment #137
YesCT CreditAttribution: YesCT commentedHere it is #2533768: Add the user entity cache tag to user.* cache contexts that need it I wasn't sure about the priority or if it should similarly have the security tag or not.
Comment #138
Wim LeersThanks @YesCT for creating that stub, I've helped flesh it out further now.
This issue can now be closed automatically in 2 weeks :)