Problem/Motivation
Currently the EntityAccessControlHandler::createAccess()
method does not take the optional $context
array into account when statically caching the access result. This leads to unpredicted behaviors when you want to perform create access checks with different contexts.
For example, think of an entity that might only be created if some context parameter has a special value (e.g. the ID of a parent entity). When performing several access checks with different context values, the first call gets cached and its result is always used regardless of what the context array contains.
Current code is:
$cid = $entity_bundle ? 'create:' . $entity_bundle : 'create';
if (($access = $this->getCache($cid, 'create', $context['langcode'], $account)) !== NULL) {
// Cache hit, no work necessary.
return $return_as_object ? $access : $access->isAllowed();
}
As you can see, only $context['langcode']
is respected when statically caching the create access result.
The only solution right now, is to completely override the createAccess()
method with custom logic for the cache ID in an access handler (if you have the chance to do so).
Proposed resolution
- Move cache id generation into a helper method, so it is easier for inheriting classes to customise.
- By default, skip static caching when there is context other than langcode.
Remaining tasks
Provide a patch to fix this issue- Review patch
User interface changes
n/a
API changes
n/a
Data model changes
n/a
Comment | File | Size | Author |
---|---|---|---|
#40 | 2886800-40.interdiff.txt | 670 bytes | claudiu.cristea |
#40 | 2879087-40.benchmark.txt | 2.22 KB | claudiu.cristea |
#40 | 2879087-40.patch | 4.89 KB | claudiu.cristea |
#39 | 2886800-39.patch | 7.12 KB | claudiu.cristea |
#39 | 2886800-39.interdiff.txt | 17.93 KB | claudiu.cristea |
Comments
Comment #9
claudiu.cristeaThis is a valid request. I'm facing this in #2879087: Use comment access handler instead of hardcoding permissions , where I need to cache different access results based on the commented entity or even on the parent comment, in case of replies. Right now with the
comment
entity, regardless of$context
content, the static cache location is the same.@hctom,
What if we keep the same structure of
$this->accessCache[$account->id()][$cid][$langcode][$operation]
but we allow the classes extendingEntityAccessControlHandler
to provide their own$cid
? I'm thinking on creating a new method onEntityAccessControlHandler
:Then your class, extending
EntityAccessControlHandler
, will just have to extend::buildCid()
and create a custom cid.Comment #10
claudiu.cristeaHere's a patch based on #9. Does anybody think that this needs tests?
Comment #11
claudiu.cristeaThis blocks #2879087: Use comment access handler instead of hardcoding permissions .
Comment #12
hctom@claudiu.cristea: Yeah that is definitely a good idea to move the actual cache ID creation out of the
createAccess()
method, so the developer experience is improved, because you only need to override thebuildCreateAccessCid()
method from then on to get correct cached results based on whatever additional conditions your custom access control handler might have (e.g. specific context values).But for the record: This will not solve the actual problem - but as we discussed on the Slack call: This might never be solved by Drupal core's default entity access control handler(s) because you never know what kind of values are passed as context and thus you are lost normalizing them for a unique cache ID. Perhaps this is something that should be documented in the code or wherever instead...
Comment #13
jonathanshawThis seems perfect as is. We have no current test coverage for the internals of EntityAccessControlHandler, and this is a very minor refactoring.
Comment #14
larowlanThis looks like a good approach to me, but without any tests, it might break in the future.
Can we add an implementation to one of the test entities that verifies this works as we expect (and intend to use in #2879087: Use comment access handler instead of hardcoding permissions )
Thanks.
Comment #15
Sam152 CreditAttribution: Sam152 as a volunteer and at PreviousNext commentedIs there a reason that
$context
can't be serialized and hashed as the default cid? While it allows individual entity types to fix the bug at their own discretion, I do feel like thi does leave an open bug by default for all other entity types.Comment #16
larowlanI'm guessing performance is the main consideration
Comment #17
claudiu.cristea@Sam152, @larowlan
Me and @hctom, we had a discussion on this. Indeed serializing the whole context would cover all cases. However, we both agreed that this could be performance cost. Context may contain arbitrary data, including entities. We probably would need to index context by keys, then serialize the whole array and, finally, convert to a hash. I'm sure this would be too expensive.
Added tests.
Comment #18
jonathanshawWe could serialise if there were no objects in context, and throw an exception (or for now trigger a deprecation error) if there were. So any entity type that provided objects in context would be warned that it needed to provide custom cache id logic.
Comment #21
claudiu.cristeaI don't understand what's the deal with that deprecation.
@jonathanshaw,
I fact we plan to pass objects in #2879087: Use comment access handler instead of hardcoding permissions . And I think we complicate too much a minor refactoring. A deep discussion about static caching should cover all aspects, static caching is problematic in many places. Probably, in many places we better move to a memory caching service that allows also modern cache invalidation and varying by using cache metadata. I would not overcomplicate this particular issue.
Comment #23
jonathanshawThat makes sense, especially if there's a related issue we can link to.
My concern is the same as @Sam152 in #15.This isn't simply a minor refactoring task. It's filed as a bug report. And we're currently not actually fixing the bug, just enabling others to fix it more easily.
We could do trigger_error / throw() if getCacheId is called with context values other than langcode, indicating a need to implement a custom getEntityCacheId not rely on the default implementation.
We could also punt this suggestion to a bug follow up and change the present refactoring issue to a minor task. Or create a new issue for the task and postpone this bug on the task.
Comment #24
claudiu.cristea@jonathanshaw my main point here is that we are overdoing and trying to add some complex logic for something that doesn't add to much value. How many custom entity types would enrich the cache ID with context vars? We hit this in the other ticket, with comments. Finally, I will not touch this issue but I'll unblock #2879087: Use comment access handler instead of hardcoding permissions and move this trivial fix in the scope of that issue. Then this can be fixed as a bug, but I really don't have time resources to work on such issue. The comment ticket is blocked here and that blocks other big comment ticket #2786587: [PP-1] Add thread depth configuration to comments, both I need to my current project at European Commission.
Comment #25
jonathanshawHow's this? If there is context that need to be considered for caching, then don't cache.
Comment #26
jonathanshawWIP test coverage.
Comment #27
jonathanshawself review:
should be bundle_2
Comment #28
Sam152 CreditAttribution: Sam152 as a volunteer and at PreviousNext commentedI'd be interested in some profiling that proves fixing this properly is slow. For the vast majority of handlers that don't pass a large number of complex objects, I'm guessing it wouldn't be a huge penalty?
Comment #29
jonathanshawTests fixed.
Comment #30
jonathanshawComment #31
jonathanshawI opened #3180960: EntityAccessControlHandler::createAccess() and EntityAccessControlHandler::access() should use whole context array when caching as a feature request to allow caching when there is context "aka fix this properly" (per #21 or #15/#28), focusing this issue on solving the bug that the current implementation can return false positive cache hits.
Comment #33
jonathanshawMinor coding standards and test fix.
Comment #34
jonathanshawComment #35
claudiu.cristeaAcronyms should be uppercased: id > ID. See https://www.drupal.org/drupalorg/style-guide/content#abbreviations
Lower camel case local variable naming are allowed by Drupal coding standards but should not be mixed with "snake case" naming. See https://www.drupal.org/docs/develop/standards/coding-standards#s-functio....
Also the variable name doesn't reflect exactly what the variable is capturing. There is also the entity type as standard context var. Probably,
$has_non_standard_context
and cast the whole result to(bool)
.The main problem here is that this patch is breaking the static cache when a non-standard context var is passed, even that var has no intention to participate to the cache ID value. This is clearly a regression.
Comment #36
jonathanshawCould you help me understand this? My thinking was: if someone is supplying context, then they must be supplying it so that it will be used in access checking. In which case we get false positive cache hits if we ignore it.
It's true that if someone is providing context unnecessarily, then the static cache will stop working for them. But that seems to me like an acceptable outcome: it's better to only use cache when we can do so safely. If you want caching, don't provide unecessary context. Providing context means "consider this context".
I reviewed all the uses of createAccess I could find in http://grep.xnddx.ru that provided a context array.
The following seem affected by this bug, and would be helped by the proposed fix.
The following use context, but are not affected by this bug. They are also not affected by the fix I propose because they override createAccess() entirely with their own implementations.
I found no cases of a contrib module that would suffer from the regression you suggest, no cases where a contrib module provides context but does not use if for checking access.
Am I misunderstanding something?
Comment #37
hctom@jonathanshaw: +1 for your solution. I also think it is acceptable that static cache will be bypassed for contexts other than language. Everything is better than getting wrongly cached results that might even result in security issues!
Just one more minor issue in addition to claudiu's:
As this is a generic class that is used for by default for different entity types, this might lead to trouble when two or more entity types have bundles of the same name. With this implementation (without the entity type ID in the cache ID), a cache ID would be returned due to the fact that different
entity_type_id
context values are not handled on their own. Even though this would only happen when checked on the same page... this might be quite unusual but may happen at some point.Comment #38
jonathanshawTechnically this should be a seperate issue, but it's simple to handle here as it doesn't seem to have any BC edges. Looks like our default cache ID should be create:entity_type:bundle.
Comment #39
claudiu.cristeaThis is a try with the original idea of the ticket and I've benchmarked the solution against the current implementation from HEAD. The benchmark script is included in the patch just to understand how it was tested.
Benchmark results:
This gives an answer to @Sam152 question from #28.
Comment #40
claudiu.cristeaThere's a mistake when creating the hash.
However, I think the benchmark shows that serialising big objects has a cost.
Comment #41
claudiu.cristeaRe. #37
@hctom, that's not exactly true. Yes, it's the same class but it's not the same instance. Each entity type will create a separate instance because the class constructor receives the entity type. For each entity type, a new instance is created and each instance encapsulates its own static cache. There's no potential collision.
Comment #42
jonathanshawYou're right, thanks for clarifying that. I'd got confused by us calling it a "static cache" when it's not actually declared static.
@claudiu.cristea what do you think in response to #36?
Comment #43
hctom@claudiu.cristea: Sorry, of course you are right. I also got misleaded by the term "static cache"... I should have taken a closer look at the code.
Comment #44
claudiu.cristeaRe. #42. So, we're trading here to fix a security bug with the cost of losing some static caching for some projects.
Side note: There might be more modules we're breaking the internal cache, than the list from #36, as we don't know what's outside, in the wild, in custom code.
Thinking on this aspect, I'm more and more tempted by the solution from #36. What about creating an issue in each project mentioned in #36 as soon as this is RTBCed with "Postponed" status. So that we inform the maintainers that when this goes in, they will have to fix the internal caching in their project?
Comment #45
jonathanshaw+1
No need to postpone it. They're already have a security issue, #33 just changes it to a caching issue :)
Comment #46
joachim CreditAttribution: joachim at Factorial GmbH commentedExpanding the scope -- this affects access() as well as createAccess().
I'm seeing this problem in a kernel test where I am testing the result of requests made to a REST service:
My test fails because subsequent requests for the same node get the same cached access value from EntityAccessControlHandler::$accessCache.
Instead of a class property that statically caches the access values, shouldn't we inject MemoryCache instead?
The only service that uses that currently is entity.memory_cache which is used for entity storage, so we might want to define a new service so it's a separate cache bin.
But then that should take care of cached data being invalidated automatically, if we set the proper tags.
Comment #47
jonathanshawRe #46 moving away from static caching to memory caching was discussed in #21, but suggested as a separate issue, with this issue just a minor refactoring.
Comment #48
joachim CreditAttribution: joachim at Factorial GmbH commentedThe current approach has been worked on for 4 years, so I don't think it counts as a minor change. It actually sounds like the current proposed fix is more complicated than switching to using memory cache, because of the problem with other modules doing unexpected things with $context.
And anyway, including the $context that's in createAccess() isn't a complete fix for this bug: A module might implement hook_entity_create_access() and return an access value that's based on something else. It would correctly add a cache tag for that dependency into the returned AccessResult, only to have that ignored by the static caching.
Comment #54
claudiu.cristeaThis is at least Major if not Critical
Comment #55
jonathanshawWe had converged on consensus on the solution in #36. But #48 makes this point:
I think this is a valid concern.
It seems therefore that in order to fix the bug here, we should use MemoryCache.
Comment #56
jonathanshawI think I'm wrong above in #55.
I don't see a way to get information back from the cacheable metadata added by hooks about what createAccess() $context they used in their logic. Therefore we cannot handle context intelligently even with MemoryCache.
So we either have to serialize the whole context, which has performance concerns, or we just don't do static caching when there is context. Which is the approach in #36 we mostly all like.
This is valid regardless of whether we use MemoryCache or not in the future. $context could contain anything, and so we shouldn't automatically be stuffing it into MemoryCache either.
MemoryCache would solve the issue identified in #48, but it's not the right solution for this, so let's rule #48 out of scope here and move it to another issue.
Comment #57
needs-review-queue-bot CreditAttribution: needs-review-queue-bot as a volunteer commentedThe Needs Review Queue Bot tested this issue.
While you are making the above changes, we recommend that you convert this patch to a merge request. Merge requests are preferred over patches. Be sure to hide the old patch files as well. (Converting an issue to a merge request without other contributions to the issue will not receive credit.)