Problem/Motivation
This is a child issue of #1805054: Cache localized, access filtered, URL resolved, and rendered menu trees.
Unfortunately, because Drupal only wants to show menu links to a user that are accessible for that user, rendering menus presents a complex performance problem. A menu consists of menu links arranged in a tree. Each menu link points to a route (or an external URL). Each route can have any number of access checks defined. And, finally, those access checks can be of arbitrary complexity, and therefore of arbitrary cacheability. Some may be globally cacheable, others per role, yet others per user, per page, per language, and whatnot.
But we currently don't even have that cacheability metadata!
This issue is about adding such cacheability metadata.
Footnote
Note: we currently are caching rendered menu trees, but as #1805054: Cache localized, access filtered, URL resolved, and rendered menu trees says:
#2158003: Remove Block Cache API in favor of blocks returning #cache with cache tags enabled render caching of menu blocks, but wrongly assumed that all menu links can be cached per role. Current HEAD is hence broken in that regard (steps to reproduce: create two non-admin users of the same role, give them the "view own unpublished content" permission, let create user A an unpublished node, put that in a menu that's in a menu block, look at that menu block with user B and you'll see the link to the unpublished node).
Proposed resolution
Allow access results to specify if they are cacheable, and if so, which cache contexts (@cache_context.url
, @cache_context.language
, @cache_context.theme
, @cache_context.user
, @cache_context.user.roles
, etc.) its results should be varied by.
We can do this by making access results objects rather than the various options of "boolean" or "boolean or null" or "one of three string constants" or "one of three integer constants" that we have today. Those access results objects can then provide cacheability — if they want — by also implementing CacheableInterface
(which already exists in HEAD).
Remaining tasks
- Build consensus
- Review
- Commit
User interface changes
None.
API changes
- Addition:
\Drupal\Core\Access\AccessResultInterface
— the entire patch is centered around this. - Addition:
\Drupal\Core\Access\AccessResult
, which implementsAccessResultInterface
andCacheableInterface
. Because it implementsCacheableInterface
, it can provide cacheability metadata. However, other access checks can choose to use anotherAccessResultInterface
implementation that does not implementCacheableInterface
— those will be treated as uncacheable.
- Before
-
if ($route->getRequirement('_access') === 'TRUE') { return static::ALLOW; } elseif ($route->getRequirement('_access') === 'FALSE') { return static::KILL; } else { return static::DENY; }
- After
-
if ($route->getRequirement('_access') === 'TRUE') { return AccessResult::allowed(); } elseif ($route->getRequirement('_access') === 'FALSE') { return AccessResult::forbidden(); } else { return AccessResult::create(); }
- Deletion:
\Drupal\Core\Access\AccessInterface
— this only served as a home for theALLOW/DENY/KILL
constants, which were the possible return values in many cases before this patch. Since we now haveAccessResultInterface
instead, they've become obsolete. - Converting all the access-checking logic to
AccessResultInterface
— this clearly shows how inconsistent access checking is today in HEAD:TRUE
/FALSE
→AccessResultInterface
\Drupal\Core\Access\AccessManagerInterface::check()
\Drupal\Core\Access\AccessManagerInterface::checkNamedRoute()
\Drupal\Core\Entity\EntityAccessControlHandlerInterface::checkAccess()
\Drupal\Core\Entity\EntityAccessControlHandlerInterface::checkCreateAccess()
\Drupal\Core\Field\FieldItemListInterface::defaultAccess()
\Drupal\content_translation\ContentTranslationHandlerInterface::getTranslationAccess()
\Drupal\quickedit\Access\EditEntityFieldAccessCheckInterface::accessEditEntityField()
shortcut_set_edit_access()
shortcut_set_switch_access()
- Before
-
$access = $access_manager->check(...); if ($access) { ... }
- After
-
$access = $access_manager->check(...); if ($access->isAllowed()) { ... }
AccessInterface::(ALLOW|DENY|KILL)
→AccessResultInterface
:\Drupal\Core\Access\AccessibleInterface::access()
(also affectsFieldItemListInterface
andEntityInterface
, which extend this interface)
- Before
-
$form[$name]['#access'] = $items->access('edit');
- After
-
$form[$name]['#access'] = $items->access('edit')->isAllowed();
NODE_ACCESS_ALLOW
/NODE_ACCESS_DENY
/NODE_ACCESS_IGNORE
→AccessResultInterface
:hook_node_access()
TRUE
/FALSE
/NULL
→AccessResultInterface
hook_block_access()
hook_entity_access()
hook_entity_create_access()
hook_ENTITY_TYPE_access()
hook_ENTITY_TYPE_create_access()
hook_entity_field_access()
\Drupal\node\NodeGrantDatabaseStorageInterface::access()
-
The following access checking logic used to return
FALSE
(the equivalent ofAccessResult::forbiden()
) in case it didn't care about an operation (e.g. the file access control handler only handled the "download" operation and returnedFALSE
in all other cases). But this is bad, because it prevents contrib modules implementing access checking logic for those other operations. Therefore, nowAccessResult::create()
is returned, to allow contrib modules to affect the access checking logic, for the following:LanguageAccessControlHandler
shortcut_set_edit_access()
shortcut_set_switch_access()
ShortcutSetAccessControlHandler
UserAccessControlHandler
FileAccessControlHandler.
MenuLinkContentAccessControlHandler
This effectively allows every access check to provide cacheability metadata, the overwhelming majority of which is then cacheable. Notably, this allows entity access checks to be cacheable (typically per role)!
Comment | File | Size | Author |
---|---|---|---|
#106 | access_cacheability_metadata-2287071-106.patch | 264.42 KB | Wim Leers |
#46 | xhprof runs.zip | 332.39 KB | Wim Leers |
Comments
Comment #1
Wim LeersComment #2
Wim LeersComment #3
Wim LeersComment #4
pwolanin CreditAttribution: pwolanin commentedA bunch of the code comments reference "this and any parent". Is that carried over from the menu links patch?
I'm not sure what parent means in terms of the general access check API
Comment #5
Wim Leers#4:
This is not related to the menu system in any way. It refers to parent things/objects.
For example: a certain complex value to calculate may depend on access check results. In that case, it should also be varied by the cache contexts of the access checks that it depends upon, otherwise the result will be incorrect. i.e. if the complex value is in principle globally cacheable, but it depends on e.g. a per-role access check, then the result should be cached per role.
Applied to menu tree render caching: a rendered menu tree depends on the access checks associated with the menu links in the tree. Some links may be cacheable per role, others per user, yet others per theme. That means that the cached result should be varied per role *and* user *and* theme.
Hence: "the cache contexts that this and any parent should be varied by".
Comment #6
pwolanin CreditAttribution: pwolanin commentedIn that case, "parent" doesn't really seem like the right term. Or at least I don't know if we're using it elsewhere in the caching system?
I might use a word like "encompassing" or "enclosing" or maybe "aggregate"?
I'm not sure how to concisely express "other cache items that need to be invalidated if the cached result of this service is invalidated"
Comment #7
msonnabaum CreditAttribution: msonnabaum commentedI think "parent" makes sense here, but we might qualify it a bit. Maybe parent cacheable? None of the other suggestions in #6 are clearer.
One thing that's not clear to me, is how we deal with changes to the cache contexts. For example, if an access check adds by-role to it's parent's contexts, and a permission is added to that role, is the parent's cache tagged with that role so that it'll get cleared? Maybe that's a larger issue than what this patch is doing.
I winced at that name at first, but I can't really think of anything better at the moment.
In general, I think this patch is going in the right direction.
Comment #8
effulgentsia CreditAttribution: effulgentsia commentedHuge +1 for adding API support to make access checks cacheable!
Why do we need a new interface for this issue instead of using CacheableInterface? The difference seems to be an assumption that we'll only want to cache access checks as part of some bigger structure (e.g., a rendered menu), but that's an implementation detail that shouldn't leak to the access checkers themselves. There's no reason we couldn't choose to cache access check results on their own, in a {cache_access} bin, if we wanted to. And if a containing item does want/need to affect its own cache info with the contained item's cache info, is there any reason it needs the contained item to report its info via a differently named interface?
What if we do this: instead of making AccessInterface extend CacheableInterface, give it a new method (e.g., getCacheInfo($route, $request, $account)). This needs to return an object that implements CacheableInterface. Instead of requiring each such access checker to create its own unique class for this, perhaps we should add a \Drupal\Core\Cache\CacheableInfo class that's just a dumb value object that receives keys, tags, etc. in its constructor, so that getCacheInfo() implementations can instantiate this class with the desired arguments.
If the above seems like a good idea, let's do it in this issue, rather than making AccessInterface extend CacheableInterface here, and then undoing it later.
Comment #9
Wim Leers#7:
Great remark. That's indeed how it's intended to work, but setting that role cache tag (in your example) has not yet been done.
This is exactly what I was getting at 4 months ago over at #2158003-30: Remove Block Cache API in favor of blocks returning #cache with cache tags:
What we chose to do there back then, is to put the burden of setting a cache context (a special key, essentially) and its corresponding cache tag (if any) manually. We probably want to do that here also, but this is another case of where it'd be great to have cache context affect both the keys and the tags…
#8:
:)
You make good points :)
However… there is still at least one important conceptual mismatch between cacheable access check results and
CacheableInterface
:CacheableInterface
is designed to be implemented by objects that themselves are cacheable. Hence they specify the cache keys that will be combined into a cache ID. In the case of access checkers, we could indeed make the assumption that it would be okay for the result to be cached. But we definitely also need the ability to affect cacheable parent objects. But "cache keys for caching the results" (i.e. generate a CID) versus "cache keys for affecting cacheable parent objects" (i.e. alter a CID to vary by some things) are not the same.In #2158003: Remove Block Cache API in favor of blocks returning #cache with cache tags, we removed "cache granularity", in favor of placeholder keys/constants that are expanded into their final value — because that's what needs to happen eventually anyway: the granularities caused a different CID to be generated. That change has been fine so far. But here we do need that distinction again: the keys that determine the CID, versus the keys that affect the CID, to vary it by certain factors/contexts/granularities/variations/whatever you call it.
If we say that access checks results could be cached in the future, then that implies that each of them should provide the necessary keys to generate a CID. We could do that, even though it'd be completely useless (at least for now), but even then, we'd still need to distinguish between "CID keys" versus "variation keys".
I *do* think that's a good idea. It's essentially taking this patch "all the way". So I worked on that all day today.
This is my concrete proposal:
CacheabilityAffectorInterface
, I removed that completely.CacheableInterface
also doesn't work because that requires the additional metadata that the access check depends upon (e.g. a node) would be missing, since it's only passed to the::access()
method of an access checker. Hence another reason forAccessCheckResult
. I propose to capture the cacheability metadata instead in aCacheable
(better name welcome) value object, that stores whether the associated thing is cacheable at all, the keys, tags, bin, max age and… stores the cache contexts separately, which hence allows us to fix/sidestep the "CID keys" versus "variation keys" problem.getCacheInfo()
method is flawed though, because that would mean replicating the majority of the logic in many access checkers to actually get and validate the entity from the request attributes, for example. So, instead, I propose to create a newAccessCheckResult
value object, with two properties: one for the actual value (AccessInterface::(ALLOW|DENY|KILL)
), and one for the cacheability metadata.EntityAccessCheck
and friends will still not be cacheable. The nice effect of going "all the way" is that we can effectively make entity access checks cacheable! (If the actual entity access hooks on a site *are* cacheable, of course — I only mean to say that it now becomes *possible* to cache them, rather than utterly impossible by design.)Attached is a WIP patch that implements this. All the changes are trivial, it's just a lot of grunt work. I would appreciate feedback on the overall approach. (This essentially means redoing the patch completely, so the interdiff is rather useless; just providing it for completeness.)
Comment #10
effulgentsia CreditAttribution: effulgentsia commentedI think you make a good point in theory, but our only current implementations of getCacheKeys() in HEAD (blocks) do not follow your model. They only return the variation keys, and BlockViewBuilder::viewMultiple() adds in the rest that's needed for a full CID. So with that as our example, my interpretation of CacheableInterface::getCacheKeys() is that it's de facto the same as what you conceptualize as CacheAffectorInterface::getCacheContexts(), and that it is the invoker of ->getCacheKeys() whose responsible for adding the rest of CID when wanting to cache the item in its own entry. Maybe we want to change that interpretation, but that seems like a separate issue.
No. Access checkers have no business validating request attributes. I think the ones we have that still do that (in e.g., quickedit) have @todos to fix that. Meanwhile, lumping cache info into the access() method means that we'll never be able to cache access check results on their own. For something like a per-user node access based on node grants, we could return the cacheability info faster than a full access check result. Maybe the DX gain of putting it all into access() as you suggest is worth giving up the ability to cache access checks on their own though. @catch, @msonnabaum: what do you think?
Comment #11
Wim LeersHah, great point! Let's consider this a non-problem then. Great insight, thanks!
I like the sound of that. I'd like to call out that AFAICT none or almost none of those that do this incorrectly then today have
@todo
s, and it's not limited to relatively unimportant access checkers like the ones for Quick Edit, it's also inEntityAccessCheck
,EntityCreateAccessCheck
,ConfigTranslationFormAccess
,ConfigTranslationOverviewAccess
,ConfigTranslationManageAccess
,FormModeAcessCheck
,ViewModeAccessCheck
, etc. There's quite a lot of them.And even in cases where no attribute validation is happening, like in
PathFieldItemList::defaultAccess()
orShortcutSetSwitchAccessCheck
, there is "phased access granting": if the user has a certain role with some sort of "admin powers" permission, then the result is cacheable per role, if not, then it's going to be cacheable per role and per user. In other words: there are still going to be scenarios where you need to duplicate the logic that determines what the access check result is (ALLOW/DENY/KILL) to generate the corresponding cacheability metadata.I don't see why this would be true? It'd always be up to the thing calling the access checkers to cache the results, and I don't see why doing
instead of
does not result in the same possibilities, while removing duplicate logic.
If you want access checkers themselves to cache the results, then I definitely don't see the problem. Also see the above "phased access granting" remark.
Since it sounds like we're discussing implementation details rather than the concept, I'm going to continue to work on this patch today.
Comment #12
Wim LeersStraight reroll. Next step: continue this work-in-progress to have a testable patch.
Comment #13
Wim LeersWhile working on this, I encountered existing test coverage being broken, impeding the work of this patch. Opened #2313115: AccessManagerTest is broken to fix that..
Comment #14
Wim Leers#2313115: AccessManagerTest is broken was committed already! :)
Just uploaded the first complete patch that should be mostly green. Did that over at #2280195-157: Ignore: patch testing issue, where I will continue to send it to testbot to make the patch green first, to avoid creating a lot of noise here.
Comment #15
Wim LeersPlease review the concept of this patch! Please disregard implementation details such as the specific signature of the
AccessCheckResult
andCacheability
objects. It can be changed. It can be improved.The concept & its consequences
What matters above all, is that all access checkers, and in fact anything that implements
AccessibleInterface
, plus on top of that all the entity access-related things (the hooks andEntityAccessControllerInterface
) has been converted to no longer return either anAccessInterface
constant or TRUE/FALSE/NULL, but is now without exception converted to returning anAccessCheckResult
object. Why an object? Because we can then associate cacheability metadata with the result.The consequence of this is that when we check access for something, we know whether the result applies to all users of this role ("cacheable per role"), or just this single user ("cacheable per user"), per language ("cacheable per language") or perhaps applicable to all contexts ("globally cacheable"). But also whether it's dependent on something that might change ("cacheable until this entity is modified", which is a fancy way of saying "this access check result is valid until the specified cache tag (the one for this entity) is invalidated). And, of course, the combination: "cacheable per role, until the entity is modified" (e.g. all anonymous users may see this node, until it's modified, for example because it's unpublished).
This cacheability metadata is what this issue is all about. It will allow us to e.g. cache rendered menus effectively. But that's not the only area where it will be of use. I can describe other use cases, but I'll do that in another comment, to not make this unnecessarily long.
From a syntax error (#157 in the patch testing issue), to 950 fails, to 64 fails, to 22 fails to finally zero fails. (There is still a single exception in there that I have to finish debugging for, but it doesn't stand in the way of reviewing the concept. Hence I commented out the line triggering it and added a @todo that will need to be resolved before this can be committed.)
Comment #17
Wim LeersAnd sadly, I didn't test whether commenting out that one line would indeed cause the test to pass. Turns out that I needed to comment two lines. Stupid! This one should be green. This is the patch I should've uploaded in #15.
Comment #18
dawehnerGeneral review:
Given that this makes the DX way of writing access related functions way worse we should try to improve it. At least
adding methdos like isAllowed() would also save you to include the access interface all the time.
On top of that I am confused about the cacheability because some of the checkers depend on the request so I don't see how they could be cacheable (toolbar is one example).
This is a mix of detail/nitpicks/highlevel/random thoughts.
I wonder whether we could theoretically let us just mark specific instances as CacheabilityAffectorInterface given that this adds a dependency for our routing component to our cache component. Otherwise add a CacheableAccessInterface which includes both or something else.
<3 value objects! We have to open up a beer on value objects at drupalaton
Meh, can we describe that just a couple of constants are allowed here?
It would be great to have some definition somewhere which explains the concept of cacheability.
Nothing against the usage of map and reduce but can we try to explain a bit better what the code does at the end?
Does that mean we cannot longer lazy-evaluate multiple access checkers? Just like if we would have overridden the && operator in cpp.
This is btw. NOT a value object ... as it contains quite some logic. Maybe the logic should live somewhere else.
Well, routes are not only provided by extensions but also my actions in the UI: create a new page manager display/view. So is this really cacheable that way?
Okay we seem to still be able to lazy-evaluate stuff.
It is a little bit odd to have one bit set via the constructor and one on the property, but yeah this is not horrible just odd.
I am confused. Let's assume $this has max age 10 and $other has permanent. Given that permanent is -1 it will be set. I would have expected 10 to be lower
Isn't that the exact same code as in the merge method of AccessCheckResult?
I wonder whether we could add methods like isAllowed, isDenied and isKilled on the AccessCheckResult object. There are a couple of places where we do such an comparison
It is quite sad how much worse the DX is. We worked as hard to make access as simple as possible, using service tags instaed of appliesStatic() and what not.
I don't see changes to the interface ... This also not sounds like a proper is*() method anymore, given that it returns a value object.
Why is that not cacheable? Nothing changes beside the Plugin ID. In general I really wonder whether it is worth to cache that admin listing, given that it contains potentially nodes/entities which aren't cacheables again. I guess your plan is to cache /admin menu?
Can we add a todo + followup?
This seems to assume book manager internals ... but I guess this is fine.
Can't we just get the commented_entity_access and set the cacheablity tags on there? This merging feels more complex tbh.
I try to understand where the $entity->id() condition is taken into account for cacheability.
Maybe just create this object right before returning it. It is not used somewhere else.
didn't you changed that to a value object? This will always be true.
Same as before, why do we not set the cacheablity directly here. this any-ness is confusing.
Just move the $access into the if()
TODO + followup.
Let's use $this->willReturn instead.
ha, it is elseif (I don't know why though)
NULL is cacheable per role? maybe we should use a hasPermission() as example
Just a general question: You add a new entry to a menu, which will change the hash. How will this be reflected here? The url might change, so the access might be granted even the actual hash could be already different.
So what is different between settings and config?
sometimes I see two spaces.
Comment #19
pounardSo you're adding cacheability information (throught the
Cacheability
value object) onto theAccessCheckResult
object, but at which point are you actually using theCacheability
object to cache anything ? Will that be in specific usages such as in menu links caching ?While this seems to be thoroughly thought, my first reaction while reading this patch was it seems that there is so many value object indirection here. That thought put aside the only thing that bothers me is the fact that
AccessCheckResult
knows aboutCacheability
value object; Eventually it couples the access check system to the cache system somehow, at the interface level of theAccessManagerInterface
and that's really inconvenient and probably unnecessary.The cacheability is supposed to be used only the menu links to be cacheable, and only for the ones that would need later display (tabs, contextual or action links, and menu blocks). Isn't there a way to keep the cacheability information at this level ? It would be great to see this not bleeding over the raw access system.
Comment #20
effulgentsia CreditAttribution: effulgentsia commentedI'm currently torn on this. I think I see the value of it (as opposed to separating in the way I suggested in #8), but I think the top of #18 and #19 raise some good concerns about the DX + complexity of intertwining the result value and its cacheability info. I need to think about this some more before giving substantive feedback. But I'm very happy to see so much (i.e., all of the) conversion done already, since looking through all those use cases will inform the benefit / weight analysis + options for alternative formulations.
Comment #21
catchI'd like to see stub usage of this even if it's pseudo code, because I'm not sure how it resolves the following;
The point at which you need to know cacheability metadata is when determining a cache ID. At that point no access checkers can be run, because you don't know what they will be yet.
#2099137: Entity/field access and node grants not taken into account with core cache contexts is open for this and the overall idea so far for entity render caching was just to allow access modules to interact at the point the cache ID is created, that doesn't at all appear to be the case here.
Comment #22
Wim Leers#18:
I disagree that this makes the DX "way worse". The current DX is bad, because sometimes you have to return one of the
AccessInterface
constants, and sometimes it's TRUE/FALSE and yet other times it's TRUE/FALSE/NULL. This makes it consistent. That being said…This I of course agree with :) It's also why I repeated *several* times that this should receive a concept review, not an implementation review. It's easy to make the DX of the implementation better.
RE:
CacheabillityAffectorInterface
: that's gone now, I just haven't updated the issue summary yet.Since it's too early to do an implementation review, and most of the review is in fact just that, I'm only responding to concept-related parts of the review for now.
Haha :D :D
AccessManager::check(All|Any)()
and you'll see that this optimization is still in place. TheAccessCheckResult
class merely is the new home for this logic, because it's also being used elsewhere.new AccessCheckResult(TRUE|FALSE)
pattern is quite handy, because it makes it very easy to grep the code base for access check results that are not cacheable. Plus, it forces the developer to think about the cacheability of his/her access checks (this was the main reason, the grepping was a consequence that I found to be quite handy).AccessCheckResult::merge()
.#19:
Correct. This will be used in #1805054: Cache localized, access filtered, URL resolved, and rendered menu trees.
I think that by "this level", you mean at the menu links level rather than the access checker level?
Assuming that I understood/interpreted that correctly, the answer is: because menu links themselves don't know which access checkers will apply to them. A menu link is essentially a reference to a route (or external URL), plus a title/label, plus tree metadata, plus less important metadata. From a performance POV, it'd be awesome if we could say that a given menu link was visible if the user had a certain permission. But that's not how it is designed currently. We currently want menu links to "automagically" adapt to the current user, to prevent showing any links that may result in a 403 (which is a valid reason and worthy goal, I think).
Consequently, given a menu link, we look at the route and see if the current user has access to it. Which means running all the access checkers. Which means the cacheability metadata must live there. Since the accessibility of a route is also determined by access checkers, it's only logical to put the burden of providing the cacheability metadata on the access checkers as well.
Finally, we should be able to use the cacheability metadata for other use cases as well.
#20: I understand. I think you must keep one important fact in mind: the logic that determines the access check result (ALLOW/DENY/KILL) is also the logic to determine the cacheability metadata. That may sound rather abstract, so let me give you one clear example. From
UserAccessController::checkAccess()
:As you can see, there are several "phases" or "levels of access" contained within this access checker:
KILL
. This is a globally cacheable result, because the answer is the same for everyone, always. If this is not the case, go to the next step.ALLOW
. Since it depends on a permission, it's cacheable per role. If this is not the case, go to the next step.ALLOW
. Since this depends just on the user that's looking at the profile, this result is cacheable per user.ALLOW
if the user is not blocked, andDENY
if blocked. (DENY
, notKILL
, so that some contrib module may add a "access blocked user profiles" permission.) Since this depends on a permission and on the status of the entity, it's cacheable per role, but only until the profile's user entity is modified (because that might block or unblock that user).DENY
, which means as much as "no opinion".I think it's intuitively clear that these different phases or levels come with different cacheability associated with them.
The only way that I can think of to simplify this, is by getting rid of the phases. This would imply that every access checker can only perform a single if-test, i.e. something like
$access->value = ($condition) ? ALLOW : DENY
. This if-test would then only have one possible piece of cacheability metadata. This would allow the DX to become simpler, at the cost of having more access checkers. But having written that, I just realized one would have to always use "OR" (the "any" access mode) to combine the different access checkers… whereas we almost always use "AND". So that'd be problematic too.#21: Hah :) I was expecting this question. Great question!
The answer is both simple and complex.
The simple version is: we don't need access cacheability metadata to determine the cache ID for the "menu links" use case. But you're right that this does not solve #2099137: Entity/field access and node grants not taken into account with core cache contexts.
The complex version is: #1805054: Cache localized, access filtered, URL resolved, and rendered menu trees allows you to set which cache contexts it should cache by (i.e. you can configure the
Vary
header). By default, it caches per role (Vary: user-role
). Any menu link that depends on anything else (let's say menu link A is cacheable per role and country) will be rendered into a render cache placeholder. If that didn't make sense yet, here is a longer version.Because any menu link may depend on any number of access checkers, and all those access checkers' results may have any cacheability that you can think of, and that cacheability may differ depending on the request (as explained in the above example: the same access checker can result a result that's globally cacheable, cacheable per role, cacheable per user, cacheable per role until an entity is modified, and so on), it is by definition impossible to let access checkers affect the cache ID. Unless of course we run all these access checkers on each request, but that will in some (or maybe many) cases defeat the purpose of caching in the first place.
Therefore, it is impossible to automatically determine a cache ID that takes into account the contexts by which we should cache in order to cache a chunk of HTML that we can return immediately. Since that is impossible, the next best thing is to configure the cache contexts that become part of the cache ID for the rendered menu. We default to "per role". We can then effectively render cache all those menu links whose access check result's cacheability metadata only contains the "per role" cache context. We can also still render cache them if they have any cache tags ("cache until user X is modified"). Any menu links whose access check result depends not only on role, but also on e.g. language will be rendered into a render cache placeholder. This means that the thing we store in the cache is "blob of HTML with placeholders plus metadata to render those placeholders". This allows us to comply with the default of "per role".
This will be sufficient for most sites. Especially now that this patch will effectively make most entity access checks cacheable per role.
However… sites for which this not sufficient, i.e. sites that also have many menu links that are not cacheable per role, but "per role and country and language" (or whatever else), will be able to optimize this! They can change the default of "per role" to "per role and country and language", and voila, fewer render cache placeholders will be generated, and more cache items will be created (because there will be more combinations).
So, there you have it. It's essentially an explanation of what #1805054 will provide, once (something like) this patch lands.
(While working on this, I also thought for some time that this would help solve #2099137: Entity/field access and node grants not taken into account with core cache contexts. But it won't. Because we allow access checkers to have any logic. Hence the access check results may also cache by any context (role, user, language, etc.). And it will always be possible to create uncacheable access checkers. If we'd automatically render cache entities by the complete range of possible cache contexts, we'd end up render caching each entity for every user, for every group, for every language, and so on. It wouldn't work. We could choose to go for a similar solution to menu render caching (create render cache placeholders for those fields whose access check results depend on a superset of the "default" contexts to cache by, then render those on the fly), or we could go for the "hook to alter the cache ID" approach. But the "hook to alter the cache ID" approach is not universally applicable, as I just explained: it'd end up meaning "cache per *everything*".)
Comment #23
catchHow do you find this out without running the access checkers?
Comment #24
Wim Leers#23: You don't; you do run access checkers for that. But that's not any different from today? When rendering menu links today, you run their access checkers, to determine whether they're visible or not.
What this patch plus #1805054: Cache localized, access filtered, URL resolved, and rendered menu trees give us on top of that, is knowing whether the result we get back is cacheable per role/user/….
Combined with the menu rendering logic's awareness of what contexts the rendered menu will be cached by ("per role" by default), we can then choose to only render those menu links that whose access check result is globally cacheable, or cacheable per role. If a menu link's access check result is "per user", then we render it into a render cache placeholder.
Comment #25
catchLet's take your example:
Let me see if I've got this right:
Empty cache.
User A visits with 'administer users', the menu link is cached as HTML because it's treated as per-role. The cid of the render-cached menu includes the role hash so that's fine.
User B visits without that permission, it gets to phase three and returns per-user, because it's per-user, we stick it in placeholder instead.
This means that for some roles the item is a placeholder, and for others it is directly cached in the HTML. Whether it's cached in the HTML or put into a placeholder depends on whether the granularity the access checker returns matches the configuration of cache granularity. Since the overall granularity is known up-front, it's fine to compare this and treat individual links differently.
If that's right, I have another question, but I'll make sure we're on the same page so far before continuing :)
Comment #26
pounardJust as a note, I'm quite convinced the cache metadata should not bleed and hard-couple the access check system to it. There are probably a lot of other layers in there where we can act upon to add the caching metadata. For example I'd say that menu links could use an additional interface over the controlers with an simple
getCacheabilityInformation()
(stupid name) method in order to build the menu links cache. Ideally I think the cacheability here solves a very specific problem and should live in the business layer where it solves it.Comment #27
Wim Leers#25: That's completely correct!
Comment #28
Wim LeersChasing HEAD. The context of two hunks had changed. Hence no interdiff.
Comment #29
catchOK so the next question would be this:
The number of access checks where there are multiple potential cacheabilities is fairly low (although entity access accounts for a lot in practice).
Entity access checks tend to only return early for users with administrative permissions. While having granularity as part of the return value allows for optimizing the case where an access check returns early for some users, I'm not sure there's enough of a trade-off there to justify the complexity (and where there is, it could probably be solved by splitting some of these up anyway). Non-administrative users will generally end up with a placeholder.
So that leaves me wondering why we can't (similar to pounard's suggestion) have a method on access checks that returns access check granularity - that could also be optional and default to the highest granularity. In core the permission access check would be per-role and any router item with just that can be cached per role.
This would massively simplify what access checkers have to do (just return the most pessimistic granularity they need from a method, or do nothing to get the most pessimistic granularity possible).
Additionally I think it'll be faster on cache misses:
In the current approach, if an access check has a granularity of worse than per-role, we'll run it once to determine the granularity. Then the placeholder gets added to the menu. Then #post_render_cache is going to run the access check again to determine if the link is actually renderable or not. So all the worst access checks get run twice on that request.
If we have just a method, then we only need to run that method when generating the HTML string since we can get granularity without running the check, then the actual access checks only get run in #post_render_cache.
That also means that the render-cached portion of the menu will get into the cache quicker, since potentially expensive access checks don't need to be run before the HTML can be set - would help slightly in a cache stampede situation.
Comment #30
catchAlso even if it's in a separate method, it'd potentially be feasible to do something like return per role if the current user is an admin else something else. Can't use context from route parameters of course.
Comment #31
moshe weitzman CreditAttribution: moshe weitzman commentedAt a conceptual level, this patch makes a lot of sense to me. Access checks have dependencies (i.e. contexts) and thus need to affect the granularity of caches. Having them return those contexts is a step forward.
Comment #32
effulgentsia CreditAttribution: effulgentsia commentedOk, thinking about it more, I agree with that much. Meaning, I agree with changing AccessibleInterface::access() and *AccessCheck::access() methods to return an object instead of a string, in order to allow (but not require, see the end of this comment) said object to be capable of returning cacheability metadata.
First, I'll clarify why I agree with that by responding to catch's question:
While a method on access check services might be sufficient in some cases, it doesn't cover some of the most important ones. Take for example, a menu with 100 links in it, all to nodes, where 1 of the nodes is unpublished. Assuming the site isn't implementing per-user node grants, then 99 of those access check results can be cached per-role, and only 1 (the unpublished one) needs to be per-user (because of the "view own unpublished content" permission). So the goal is to be able to render-cache the entire menu per-role, and have a #post_render callback for just the one link to the unpublished node. That way, on a (per-role) cache hit, we don't even need to load the 99 published nodes, let alone invoke
->access()
on them. Neither the EntityAccessCheck service (which is a single object for all entities, not just nodes) nor the NodeAccessController (which is a single object for all nodes) can be the ones to implement agetCacheContexts()
method that solves this use case of the cache context being different per node. If we were to settle for the most pessimistic answer (per-user), then we lose out on the benefit of being able to cache much more effectively per role for the vast majority of instances.Secondly, in response to pounard:
We already have a concept of routes having access checkers. I don't see it as being desirable in any way to invent a separate concept of link access checkers. I think any use case where you want link access to be different than route access is sketchy (we have one in core, MyAccountMenuLink::isHidden(), and I think it's sketchy, and it has a @todo about that), and needing to maintain two separate access check levels that should always be consistent with each other seems fraught with problems.
But, where I agree with pounard (at least if I'm understanding his concern correctly) is:
Yes, I see the hard-coupling of AccessibleInterface::access() to have an @return of
AccessCheckResult
to be problematic in two ways:@catch, @pounard, @dawehner, @Wim: thoughts on the above?
Comment #33
dawehnerYeah, we worked hard to use routes for access checking in various "links" related places. For now they work really fine, no reason to come up
with a new concept.
I do like this idea. This would allow us to keep a separated subsystem and share it with the world at some point. More important this keeps
us flexible as maybe someone has more ideas than just cacheable, maybe lazy-evaluation etc.
Personally I don't see this per defition as a big problem though just for usage inside a limited scope of code though this isn't really the case here.
If we want to optimize those kind of cases I would rather suggest to optimize on the tree level, see #2250315: Regression: Bring back node access optimization to menu trees
Comment #34
catchI don't think this is one of the 'most important' cases, we just shouldn't be supporting this all the time by default. That permission should be opt-in for listings, and it already is for things like the front page view that check published as opposed to not published || author = user (unless you especially configure that via Views).
I'm not sure that's even desirable during normal operations - i.e. would you want unpublished handbook items to show up in the book navigation during normal browsing on Drupal.org? On the front page?
#32 doesn't answer the 'double check' issue on cache misses mentioned in #29, that feels like a showstopper to me unless I've got that wrong.
Comment #35
Wim Leers#29/#30/#34:
That's also what I thought, but in practice, there are a lot of these. In fact, in most scenarios in Drupal core, entity access check results will be cacheable per role!
It's surprising how many entity access checks actually are cacheable per role. Hence the above statement is not actually a general truth. It's true for certain scenarios, definitely, but it's not a valid generalization. Many entities, and hence links to many entities are actually accessible with a "per-role" cacheability (but quite often with a "until the entity is modified" requirement, which means the resulting render cached menu gets the cache tags of the entities that are linked from the menu).
But the method I've described also doesn't require access checkers to be run to get the cache granularity for the render cached menu to retrieve (cache hit) or build (cache miss). The cache granularity (and hence the affected cache ID) of rendered menus is configured. As explained earlier, it's in fact undesirable to let access checkers affect the cache granularity of rendered menus, because that'd imply that almost every menu would be cached per user (plus per language, etc.), which is not at all desirable, because it'd lead to extremely low cache hit ratios. That's why render cache placeholders are used to deal with links that have "more granular" (than the default configuration, which is "per role") access checkers.
I'm also generally confused by this statement. The only way we can move all invocations of access checks to
#post_render_cache
, is by always rendering every menu link into a post-render cache placeholder. To actually render a menu link, it's by definition necessary to run its access check, otherwise we can't know if we should display it to the current user or not. But if we render every menu link into a post-render cache placeholder, then we're not really render caching anything… except for a very "bare skeleton" that doesn't buy us much. We still have to do almost all the work on every page load in that scenario!The cost is not in retrieving the granularity. The cost lies in running all those access checks and rendering all those menu links on every single page load. To render cache more than a "bare skeleton" of a menu, we need to run access checks when rendering a menu, so we can cache those menu links we know to be as granular or less granular than the default "menu rendering granularity" ("per role"). Hence there is no way to avoid running access checks when rendering menus, but to know whether a given link's access check result is cacheable, we need the granularity associated with the access check result… and that's all this is about.
First of all: it's not because an access checker has worse cacheability (i.e. "per user" is objectively worse than "per role") that it also is expensive to run. It very often amounts to comparing the entity's owner ID to the current user ID, both of which are already loaded into memory anyway, so these "worst access checks" (with worse (more granular) than "per role" cacheability) usually amount to a function call in which very simple comparisons are performed.
Second: when rendering menus, the
#post_render_cache
does not need to run access checkers for all of those whose cache granularity is more granular than the default ("per role"). Just like you and I discussed (~two months ago now), it's okay to cache the access check results for those access checks that are not "per role" on a per-user basis. i.e. we assume it's okay that access checker results remain the same for the same user, because that already is the "worst case" scenario. See #1805054-96: Cache localized, access filtered, URL resolved, and rendered menu trees, the part at the bottom.(BTW, these questions are largely about the approach that we brainstormed together! :P Of course, it's more detailed now, since I've actually fleshed it out, so it's to be expected that you have questions. But it's fundamentally the same approach.)
I think the answers above already answer this. But to make it abundantly clear, I'll give you a point-by-point overview:
The expected benefits are upwards of the 12–20% of the total page generation time improvement I saw at #1805054-97: Cache localized, access filtered, URL resolved, and rendered menu trees (which contains benchmarks and profiling), because there all entity access checks were considered uncacheable, whereas with this patch, many will be cacheable per role). Absolutely speaking, this will allow us to remove 12–20K function calls per page load (out of 80–90K calls per page load).
So: yes, on a "full" cache miss (i.e. neither the rendered-menu-with-placeholders is cached nor the per-user-cache-of-access-check-results already exists), we will run some (a minority) access checks twice. But that only happens extremely rarely. The typical case will be that both are cached. The atypical case will be that a user visits whose cache entry does not yet exist, in which case only a few access checks will run.
Compare that with the situation today, where all access checks run on every page load.
What you are asking for AFAICT, would imply caching a rendered menu for every user. That would lead to much lower cache hit ratios, much bigger caches, much more frequent cache invalidation of the most expensive thing (the rendered-menu-with-placeholders) rather than only the cheaper thing (the per-user-cache-of-access-check-results of access check results for a cached rendered-menu-with-placeholders)
All of the above is informative for this discussion, but it's actually independent, because it's about a complex use case for actually using this metadata. I understand that it's necessary to understand why this cacheability metadata is actually needed. But let's not get lost in the weeds of something that's in fact independent of this issue :)
#32:
I agree with this in general. But it does not always make sense to override implementations; quite often it's unnecessary in the case of value objects. However, I can see that being useful in this case, however unlikely it may be that somebody may want to do that.
So: sure, let's fix that, let's just create an
AccessCheckResultInterface
, and voila, this problem is solved.I'm fine with that.
#33:
I agree we should do #2250315, and I'm helping on that issue.
But doing #2250315 is orthogonal to what's being discussed here. #2250315 will make it more efficient to render menus. That's a laudable goal. That's why we're doing it. We'll get that done. No questions about that.
What this issue is about, though, is being able to cache those rendered menus at all. Currently, it's utterly impossible, because we simply don't have enough information to know whether it's safe to cache a rendered menu. This issue/patch provides that necessary metadata to make that possible.
Just to be clear: this issue and caching of rendered menus in general is no excuse for letting menu rendering be slow. We want that to be fast. But that doesn't mean we should continue to stupidly re-render something expensive on every single page load.
P.S.: Sorry for having written so much, hopefully it contains all the needed answers.
Comment #36
catchBy this, I mean the granularity of the access check/link, not the granularity of the entire menu.
Comment #37
catchI think you've misunderstood what I was asking for, here's a rough idea of how I think it could look:
A method gets added to each access checker which returns the granularity needed for the access check (not the menu).
This granularity can be based on information in the current request (so it would allow for entity access to return per-role if the user has certain permissions), but it would not be allowed to depend on route parameters.
Based on that return, links are rendered directly in the menu (and access checked there), or put into render cache placeholders.
The only difference from the proposal is that the determination is put into a separate method rather than the return value of the access check, and that the method does not get the route parameters as context. Should still cover most cases though where the granularity requirements are different.
We should be aiming for a situation where the entity is not actually loaded into memory - exactly because some of these access checks only need to check permissions, not any property of the entity. Issues like #1237636: Entity lazy multiple front loading and #638070: Router loaders causing a lot of database hits for access checks are related to this. The situation has got a bit better since #638070: Router loaders causing a lot of database hits for access checks since for example views no longer need to be instantiated to run access checks afaik.
Comment #38
pounardBy adding a new method instead of including it in the access result, we avoid being intrusive with code that absolutely don't care about this (for exemple base implementation will be to always return a granularity per role). And we also remove the need to have the granularity at the AccessManagerInterface level, and can be used only when necessary directly by the business code that needs it. This also would make the whole patch a lot less invasive since you won't have to modify all access checks to return something different that it already is in HEAD, and just focus on the optional granularity method on its own (the whole patch would be a lot simpler, and architecturally things would be a lot less massive).
Comment #39
Wim Leers#37:
I indeed misunderstood. Thanks for the clarification, that really helps!
That would be highly problematic, because then you wouldn't be able to indicate that e.g. a node is accessible, this applies to all users of this role, until the node is modified (e.g. because it is unpublished).
I understand what you mean now — this was the crux of what you were getting at, of course.
Now here's the fun fact: what you're asking for, is exactly what the original patch in #1 did! The caveat with that approach is (copied from the current — outdated — issue summary):
That's also exactly what I said is problematic about not letting cacheability metadata depend on the route parameters. But effulgentsia asked me in #8 to not do this in a follow-up issue, but in this issue.
The problems with returning cacheability metadata in a method instead of a value object are:
access()
method (mentioned in #9, point 2 at the bottom) — unless you can live with incorrect (imprecise) cacheability metadata, which is what you're proposingEntityAccessCheck
and other entity-related access checkers because entity access hooks don't have a way to provide the cacheability metadata associated with the "is this entity accessible" logic they contain — unless you allow value objects to be returned by the entity access hooks, but NOT by the access checksSo, your proposal is an option (and it's implemented in #1), but it is IMO flawed because it makes it impossible to associate correct cache tags with access check results.
That sounds wonderful. But I think that's 1) out of scope, because this patch does not cause regressions, 2) impossible to achieve for many access checks, because they (need to) invoke
$entity->access()
, which by design occurs on an already-loaded entity. Changing that API at this point is likely impossible.You're right that it's possible for the "if own entity" type check, of course.
#38:
If this is the concern, we can make
AccessCheckResult
's cacheability default to "per role". That's trivial to do.But a specific access check result (ALLOW/DENY/KILL) inherently depends on certain information, and the cacheability information describes those dependencies. That's why we need it to be tightly associated, otherwise we get incorrect/imprecise cacheability metadata, which is significantly less useful.
Furthermore, somebody might want to change the default
AccessManager
to perform caching of access check results.If the cacheability metadata was being unrelated to the access check result (i.e. unrelated data was being forced into the same data structure), then I'd agree with you. But because that's not the case, I disagree.
… and also a lot less useful. We'd literally be back to square 1: the patch in comment #1. Also see the background information in my reply to #37 :)
Comment #40
catchNot sure it prevents that. You won't get the actual node, but you'll know that the access check takes a node as a parameter, so it should be possible to indicate that. In terms of an implementation that adds cache tags to access check caches, that should also be possible by adding a separate method rather than baking it into return values no?
Comment #41
Wim LeersI interpret this as
.If that's indeed what you meant: that's better but still very problematic, because many access checks accept an entity but only use it optionally, i.e. they only use it (and hence depend on it) if several conditions don't apply (e.g. the user doesn't have some sort of "administer taxonomy terms" permission). So then we're still associating incorrect/imprecise cacheability metadata with the access check result.
Comment #42
effulgentsia CreditAttribution: effulgentsia commentedSo we have 3 situations:
To address the first case, what if we allow, but don't require, access checkers to also implement CacheableInterface? Access checkers choosing to do this would be announcing to the system that their cache information doesn't depend on route parameters, so they prefer to provide their information directly rather than via the access result object.
To address the third case, I still don't see a way to avoid making the access result object be the one to implement CacheableInterface.
To address the second case, we could choose to create some additional method on access checkers, other than the access() method, to which we pass route parameters, but I don't see the benefit of that (I suggested something like that in #8, but the more I think about that, the more I think it doesn't help either performance or DX). If we choose to solve the 3rd case by allowing the result of the access() method to potentially implement CacheableInterface, then I think that approach is equally appropriate to use in the 2nd case (where only tags depend on the route parameters), because in most cases, it's the preparation of the route parameters that's the expensive step.
In the cases where the access check itself is potentially expensive beyond the loading of the route parameters (e.g., node access that relies on node grants), the access check result object returned by the access checker can be lazy. In other words, the object implementing both AccessCheckResultInterface and CacheableInterface just means that it needs to have a getValue() method (or perhaps instead isAllowed() and isDenied() methods, but that can be bikeshed later) and getCache*() methods. An access checker where all the logic is cheap could use something like the AccessCheckResult value object implementation in Wim's patch, but something like NodeAccessController could return a custom implementation that has a fast implementation of getCache*() and a slow implementation of getValue() that would only be invoked once. @catch: would that address your 'double check' concern?
Just clarifying that my proposal in this comment would still be very invasive, because it would still require changing all access checkers to return an object of type AccessResultInterface rather than a string constant. Unless we want to allow either a string or object return value, but I think that makes for ugly @return documentation and ugly logic in the callers to deal with return values that can be of different primitive types.
Comment #43
Wim LeersChasing HEAD because #2154435: Rename EntityAccessController to EntityAccessControlHandler, #2307681: getEntityTypeFromStaticClass - poor performance and #2098355: Missing default access for all node fields — all of which caused a lot of changes that forced manual updating of this patch.
This patch does not yet take all the feedback into account. A patch that does is coming later today.
Comment #44
Wim Leers@effulgentsia and I had a long chat about how to move this issue forward. I think the main concerns in this issue that have been raised so far are 1) DX, 2) tying the access system to the cache system, which we don't want.
This reroll should address that.
It renames
AccessCheckResult
toAccessResult
.AccessResult
implementsAccessResultInterface
. Everything is typehinted toAccessResultInterface
. But, the default implementation of it (AccessResult
) also happens to implementCacheableInterface
, hence allowing for cacheability metadata. But it's no longer required. It's just strongly encouraged.This will also allow for the pattern that effulgentsia described in #42 to perform lazy evaluation of access checks: e.g. for node grants, we could choose to use a different implementation of
AccessResultInterface
that only executes the DB query if its value is actually requested. I will post that tomorrow, then you'll also have a clear interdiff showing what that'd look like.Overall, this reroll significantly enhances the DX compared to all previous patches. In most cases, nothing really changes, it's just as legible as before, just a bit richer (because cacheability metadata is added).
Looking forward to your reviews!
Comment #45
Wim LeersChasing HEAD; #2285083: Rename contact category to contact form broke this.
Comment #46
Wim LeersJust to make sure that this doesn't cause a performance regression, I did some profiling.
Setup:
As you can see, essentially status quo:
PDOStatement::execute()
) and same files being loaded (Composer\Autoload\ClassLoader::findFileWithExtension
), it takes 2 ms more in the latter — this is the I/O variability skewing the results. Which means the difference is really only 2 ms, and that's without looking at other I/O. Therefore the difference is around the 0.5% mark in XHProf, which is less than its margin of error. Similarly small differences when benchmarking withab
.As said earlier, the original patch in #2273277: Figure out a solution for the problematic interaction between the render system and the theme system when using #pre_render that worked saw 12–20% performance improvements. Since this patch now provides better cacheability metadata (entity access checks also have cacheability metadata, and almost all node links will be cacheable per role), I expect that number to be even better once I reroll it on top of this patch!
Comment #47
tim.plunkettOverall I like the flexibility and power this gives us. The DX hit isn't too bad.
The patch doesn't apply cleanly right now.
Why are these public?
Why aren't these methods on the interface? I understand maybe the cache ones are separate, but allowIf and friends should be somewhere
Missing a full docblock
Taking this as an example of the patch as a whole:
I really dislike the verbosity of "AccessResult::create()->allowIf()" since it is used so commonly. Was there any thought given to having several factory methods, like AccessResult::allowIf()?
Comment #48
Wim LeersThanks for the review!
AccessResultInterface
only need the four that are already on the interface:isAllowed()
,isForbidden()
,orIf()
andandIf()
.allowif()
be both a static method *and* an instance method. So then I'd have to call itcreateAndAllowIf()
… which only yields a 1-character win:I think we want the instance method because otherwise we lose the ability to "gradually build" an access result (i.e. create an access result, perform if-tests, and depending on which logic branch we end up in, invoke different instance methods on the
AccessResult
object).It is possible though to get rid of that possibility, which would allow us to introduce
AccessResult::allowIf()
. You're right that it would simplify the most common cases.What do you think?
Rebased on top of HEAD (and after I finished, I had to rebase it again :P — HEAD moves fast!).
Comment #49
Crell CreditAttribution: Crell commentedI can't say I like the complexity this adds, but I can't think of an alternative. That said, we could simplify the DX a little here by having return AccessResult::allowed() and AccessResult::kill() and AccessResult::deny(), which simply return a new result object set to that result. It's a bit less typing and I think more natural for people writing access checkers.
Comment #50
Crell CreditAttribution: Crell commentedAnd that's what happens when I go have dinner between looking at an issue and commenting on it. :-P
Wim, can you clarify for my late-night brain what you mean by "It is possible though to get rid of that possibility, which would allow us to introduce AccessResult::allowIf()"? Get rid of which possibility, and how? It sounds like you mean get rid of the possibility of building up the object gradually, which seems like not something to lose...
Comment #51
Wim LeersChasing HEAD again (broken twice while rerolling — by #1976158: Rename entity storage/list/form/render "controllers" to handlers, #1837388: Provide a ParamConverter than can upcast any entity. and #2314599: Use title/url instead of l() for building breadcrumb this time).
I had a chat about #47.4/#48.4 with tim.plunkett & effulgentsia yesterday. We concluded:
But… I think the suggestion in #49 is even better. So I went with that instead. I kept the existing
AccessResult
methods the same, and went with::allowed()
(convenience method for doing::create()->allow()
),::allowedIf()
(::create()->allowIf()
),::forbidden()
and::forbiddenIf()
. I checked with effulgentsia and tim.plunkett and they're also both on board with that.This patch now has 100% test coverage for
AccessResult
.All PHPUnit tests pass, hopefully all other tests also pass.
Comment #52
tim.plunkettOkay, now this looks great! I didn't go through the whole thing again, but I have no more DX concerns.
Comment #54
Crell CreditAttribution: Crell commentedMuch better! Thanks, Wim!
Comment #55
Wim LeersAll of the fails in #51 were due to not having updated the newly added
core/modules/node/src/Access/NodePreviewAccessCheck.php
access check yet. This will be green.Comment #57
Wim LeersAssuming HEAD wouldn't have changed too much in the past 14 hours was too big of an assumption.
However, this time, the power of git's automatic 3-way merges saved the day: automatic rebase attached :)
Comment #58
effulgentsia CreditAttribution: effulgentsia commentedInitial review (nits) before dreditor crashes :)
Remove "highly".
Can we remove all mention of DENY, KILL, and ALLOWED constants throughout core? Everything should be explained in terms of AccessResultInterface::isAllowed() and isForbidden(). How those are implemented is an implementation choice. Related, let's move the definition of the constants into AccessResult.
What about just
$result->andIf($other);
to not hide from people that the method mutates the subject."... Access services must return an object that implements AccessResultInterface".
Everywhere in this class: s/AccessResult/static/
This should either also invoke the other reset*() methods, or else be renamed to resetValue(), resetAccess() or similar.
Do we want to overwrite $this->value if $condition is FALSE? Same q for forbidIf().
What does this mean?
individual access results
Adds the 'cache_context.user.roles' cache context.
Adds the 'cache_context.user' cache context.
Adds the entity's cache tag.
Does this mean order of operations will affect cache metadata (i.e., we end up with a more aggressively cleared cache than we need if the forbidden one is the argument rather than the subject)?
Since we're checking !isAllowed(), do we need the isForbidden() check?
Comment #59
effulgentsia CreditAttribution: effulgentsia commentedRe #58.7: Or, what about removing the instance-level allowIf() and forbidIf()? There are few callers of them, and I don't think they'd be made worse by having their own
if
statement.Comment #60
effulgentsia CreditAttribution: effulgentsia commentedGot through core/lib. Still need to review core/modules and core/tests, if no one else does.
If we enter the if statement, shouldn't we still ->orIf() instead of overwrite, since the lack of module opinion might be conditional on something dynamic?
Where is it communicated that the result also depends on $entity->id()? Or is it up to the caller to know that (e.g., for menu links, not a problem, but not sure about other use cases). Same question might apply to other access checkers and their various arguments, this is just the one that caught my attention.
Same as above re needing an ->orIf() instead of overwrite.
Is the setCacheable(FALSE) because we don't have cache tags for non-entity config objects? If so, a @todo would be nice. Related: #2040135-4: Caches dependent on simple config are only invalidated on form submissions.
Comment #61
dawehnerupdate the IS to include the actual outside API change.
Comment #62
Wim Leers#58 & #59: All done!
Details where necessary below.
#58.2: perhaps we do want a
AccessResultInterface::hasNoOpinion()
method, which would be a convenience method replacing!isAllowed() && !isForbidden()
? Similarly, perhaps we should renameAccessResultInterface::reset()
to::resetOpinion()
?#58.6: done, changed to
::resetAccess()
.#58.7: Excellent question! I think you're right, we don't want to overwrite
$this->value
, because that makesallowIf()
easier to understand. Then it's literally about conditionally callingallow()
, and nothing else.Test coverage updated.
#58.8: Clarified; also updated
CacheableInterface::getCacheKeys()
's docblock — thereby making the language consistent. Any further updates in this area should happen in a follow-up, because it's scope creep. I've opened up #2329101: CacheableInterface only has a getCacheKeys() method, no getCacheContexts(), leads to awkward implementations to refine that further.#58.13: Yes, order matters. And for good reason: if a decision has already been made, then any subsequent
orIf()
calls don't affect the access result. This is nothing new.#58.14: Yes, we do need that check. The
andIf()
check can (and must) terminate if the current access result's value is notALLOW
, since all access results must beALLOW
for the combined end result still to achieveALLOW
. Therefore we want to return early if either the current access result is not explicitly allowed (!isAllowed()
, i.e.DENY
) or if it's explicitly forbidden (isForbidden()
, i.e.KILL
), because the combination of those checks allow us to know that it is indeed explicitly allowed (ALLOW
), in which case we can apply the logic to combine the other access result's value.Try changing it and you'll get a test failure.
#59.1/3: Excellent catch! Fixed.
#59.2: I don't understand the question. An access result cannot depend on an entity's ID. It can depend on an entity. And if that's the case, we use
::addCacheTags($entity->getCacheTag())
, for which::cacheUntilEntityChanges($entity)
is the shorthand/convenience method.#59.4: Wow, amazing catch! You're right. Added a @todo.
Comment #65
Wim LeersQueued for re-testing because I cannot reproduce the 4 test failures locally. Fingers crossed.
Comment #67
dawehnerTo tackle the accidental boolean issue we should keep the existing methods (and return boolean), the existing access manager interface
but add a new one AccessManagerWithCacheContext which adds methods which returns the access objects. This ensures that we don't accidentaly leak information
Comment #68
effulgentsia CreditAttribution: effulgentsia commentedFWIW, I get the same 4 failures when running locally via the Simpletest web UI when applying #62 on top of core commit 2191f864ab27e8457c25cbeb4e0fe2a5b644a263. I didn't try via run-tests.sh.
Comment #69
Wim Leers#2323721: [sechole] Link field item and menu link information leakage broke this, rerolled. On
68755a3bbfaef7b2507bb5c053a631e9f3b320cc
now (latest as of right now), and still can't reproduce this. Here's hoping that it will now pass.Note: I'm on PHP 5.5.11.
Comment #70
Wim LeersIgnore #69. It's rerolled on top of #57 rather than #62. Sorry for the noise.
Comment #71
Wim Leers4 test failures reproduced now. Was all my fault; I'd been testing with #57 all along… stupid! :(
Should be green now.
Comment #74
Wim LeersLOL, yet another commit that broke this: #2188675: "Translate" local task always visible, leads to WSOD.
Comment #75
effulgentsia CreditAttribution: effulgentsia commentedDoes this need cache tags added for $block?
Before the switch statement, we checked a permission, so this should be cached per role.
Time to debug this.
Why the $condition intermediary?
Why add cache tags for $items->getEntity(). But, we will want cache tags for $field_definition: that can be a @todo followup though. Also, the allowed() needs to be per role too, right?
Rogue hunks from a different issue?
resetAccess()
Time to fix.
If depending on config, then not cacheable until config has cache tags.
Comment #76
effulgentsia CreditAttribution: effulgentsia commentedIssue summary should also be updated to not reference CacheabilityAffectorInterface, which isn't in this patch anymore. And it would be good to make sure the API changes section lists all interfaces affected (e.g., EntityAccessControlHandlerInterface, AccessibleInterface, AccessManagerInterface, what else?).
Comment #77
tstoecklerThanks for your persistence on this @Wim Leers!
Since this is a common pattern in this patch I wonder if we can improve the DX for this now that we have a rich object to work with anyway. Maybe
$access->allowIfEslseForbid()
?This is also common in this patch. Also I think this will be one the most common use-cases in contrib, so I think we shouldn't make people ponder about cacheability when we already know that permissions are tied to roles anyway. Can we add a
$access->allowedIfHasPermission($account, $permission)
or similar which just performs the two function calls internally?Comment #78
Wim Leers#75/#76: thanks, will reroll!
#77:
Comment #79
Wim Leers#75:
KernelTestBase
test that doesn't have routing stuff installed. So I just disabled the "linking" setting, and now it passes. I wonder how this used to pass before though…#76: Done.
This reroll does not yet address #78.
Comment #80
Wim Leers::allowIfHasPermission()
as a convenience instance method::allowedIfHasPermission()
as a convenience static method(consistent with other convenience methods)
I also made one mistake in #79 —
DbUpdateAccessCheck
allows for asettings.php
override, and of course if that's the case, the access result should not be cacheable, becausesettings.php
can change at any time. So I marked it as uncacheable now.Comment #83
Wim LeersAnd green again.
Comment #84
effulgentsia CreditAttribution: effulgentsia commentedI disagree with this. The issue summary now lists all the different methods/functions that return an access check result. Why would we want to single out AccessManager::check()/checkNamedRoute() as following different rules than the others? I think it's cleaner for everything that returns an access check result to return an AccessResultInterface, which this patch does. Yes, that does mean that callers will need to explicitly call
isAllowed()
, sinceif ($access_result)
always returns TRUE, but I think it's more likely that developers will remember to callisAllowed()
if they learn that they always need to do that, rather than needing to learn when they do and when they don't.I think the patch is RTBC other than reaching consensus on this point. @dawehner: thoughts?
Comment #85
dawehner@effulgentsia
Well, your argument would certainly be 100% fine in case there would be just Drupal and we would just develop for Drupal, but well, in case we would
try to write generic components, which we don't do, the code would no longer follow the rule of clean code:
Well, let's hope developers read documentation before they write code, examples clearly show that already, right ;) ? I hope though that we still have security first
as our concept, over performance/cachablity, in case no, good luck!
So this interface checks access, really? What about the following change:
Well, anyway, don't feel blocked on me, just want to ensure that we really don't treat some values over others.
Comment #86
chx CreditAttribution: chx commentedStop this madness now. There was a saying in a usability book (by Jef Raskin): if a control always needs to be operated do not provide that control!
If you always need to call a method then do not provide that method.
This is absolutely not acceptable and dawehner is right.
Comment #87
effulgentsia CreditAttribution: effulgentsia commentedLet's say I want to do some logic based on if I have any nodes. This won't work:
Because that if will always evaluate to true. Instead, I need to do:
Why do I need the explicit call to
execute()
? Because there's something else I might want to do with a query object: execute() isn't the only method. Even though in practice, what do I usually want to do with a count query other than execute it?Same thing here. What this issue is about is making access check results not just booleans (or tristate constants), but objects, so that callers can know more than just whether access is allowed or not, but also what that result depends on (i.e., cache contexts and tags). Unfortunately, PHP doesn't allow an object to implement a magic __toBoolean() method. Per the issue summary, there are many interfaces/methods that return an access result. Does that mean we want all of them to have a "return as boolean" and "return as object" variants? Or, can we trust that developers can learn that access results are objects on which you can call
isAllowed()
, just like they learned that queries are objects on which you can callexecute()
?Comment #88
chx CreditAttribution: chx commented> Why do I need the explicit call to execute()? Because there's something else I might want to do with a query object
And there's nothing you can even do to the AccessResultInterface. The query object has a large amount of methods that significantly change its behavior and so it needs an execution. There is nothing like that here.
You guys long, long ago have lost any connection with reality. I am trying to drag back Drupal 8 to the ground but it's hard and if this goes in it'll be even harder.
And yet, I know there's nothing I can do. So resetting to code needs review as I know there is nothing I would say or do would change it. D8 is a lost cause. There was a brief hope when I was allowed to make it secure. Apparently that window is closed.
Comment #89
effulgentsia CreditAttribution: effulgentsia commentedWe need both. The most secure thing an internet user can do is unplug their computer from the internet. But we don't really tell people that they can't use the internet because there are security dangers out there. Similarly, we shouldn't have to tell people that their sites must be slow, because we're afraid to implement a performance improvement that requires developers to start dealing with an object where once they dealt with booleans, because of a fear that dealing with objects has a pathway by which it's dealt with incorrectly.
But, if we really think it's dangerous DX to return an object from a method that for some reason people expect a boolean from, then I'm fine with something like #67 where we have separate methods, or an extra parameter (e.g., $return_as_object) that defaults to FALSE. But then my question is which of our many access methods do we provide that variant on? Is there any reason to do it on AccessManager::checkNamedRoute() and not do it on Entity::access()? And if we do it on those two, which other ones do we need to add to the "important to babysit callers" list?
Comment #90
chx CreditAttribution: chx commentedI see two resolutions: either #67 or won't fix the issue. Anything else results in an insecure Drupal 8. I am astounded people don't see this.
Comment #91
effulgentsia CreditAttribution: effulgentsia commentedIt's not just about AccessResultInterface. It's that an object that implements AccessResultInterface can also implement other interfaces (such as CacheableInterface), so callers that receive objects can do a lot more than if only receiving a boolean.
Comment #92
chx CreditAttribution: chx commented>> I hope though that we still have security first as our concept, over performance/cachablity
> We need both.
NO WE DONT. Security is our number one concern trumping everything, including performance, usability and everything else. Drupal's history is full of these: db_select is a Drupal-only slow query builder. The form API CSRF token made AJAX hard. And so on and so on. We need to stay on this road. If we don't it's the end of Drupal and I am not kidding. What's your value proposition over Wordpress or Symfony if you are not secure?
Comment #93
effulgentsia CreditAttribution: effulgentsia commentedOk, here's my suggestion then:
Add an optional
$return_as_object = FALSE
parameter to:- AccessManagerInterface::check()
- AccessManagerInterface::checkNamedRoute()
- AccessibleInterface::access()
- EntityAccessControlHandlerInterface::access()
- EntityAccessControlHandlerInterface::createAccess()
- EntityAccessControlHandlerInterface::fieldAccess()
And that's it. In other words, not the hooks, individual route access checkers, or lower level APIs like NodeGrantDatabaseStorageInterface, since the casual callers I think we're most worried about shouldn't be calling those directly, but via one of the above high-level APIs. It means each of the above implementations will need a:
line at the end, but that's not so bad.
@dawehner, @chx, @Wim: how does that sound?
Comment #94
chx CreditAttribution: chx commentedI like that! Especially with the FALSE default... Thanks Alex!
Comment #95
dawehnerIt is certainly a worse DX but security is king.
Comment #96
Wim LeersDX/complexity
The last few comments consider this patch as introducing only complexity. As worsening the DX. The opposite is true.
HEAD uses three kinds of tri-state values! If it were using only booleans, I'd agree with the DX regression. But it's not.
All access hooks and some interfaces — most notably
Drupal\Core\Access\AccessibleInterface::access()
(which is the interface for$entity->access()
) — are returning those three kinds of tri-state values:AccessInterface::(ALLOW|DENY|KILL)
NODE_ACCESS_ALLOW
/NODE_ACCESS_DENY
/NODE_ACCESS_IGNORE
TRUE
/FALSE
/NULL
This patch simplifies and unifies all of them (plus the ones that do return booleans) to the same single concept: returning
AccessResultInterface
.Compromise
I've implemented #93 — it looks like it's the only possible compromise (#67 would be incomplete/inconsistent, #83 (latest patch) clearly is not acceptable, #93 addresses both).
This definitely does improve the DX in the calling code (mostly
$entity->access()
calls), at the cost of less consistency, which I think is justified by the gains in this case.In this reroll, I also addressed dawehner's rightfully critical feedback in #85 regarding the wrong one-line sentence in the new interface's docblock.
I provided three interdiffs to simplify reviews:
interdiff.txt
contains all changesinterdiff-interface_and_implementation_changes.txt
contains the changes described by #93: the changes in the interfaces and the consequent changes in the implementations (plus of course test coverage updates, to test both$return_as_object === TRUE
and$return_as_object === FALSE
)interdiff-callers_updates.txt
contains the changes to callers, because the changes in #93 now allow me to remove a lot of changes in earlier patches; the resulting patch now has to change fewer filesSo: interdiff 1 = interdiff 2 + interdiff 3. Hope that helps.
Comment #97
effulgentsia CreditAttribution: effulgentsia commentedLooks good. Let's see what catch thinks.
Comment #98
chx CreditAttribution: chx commentedI am so happy about this change. I so wish I didn't need to throw a hissy fit to get it and I am sorry I did.
Comment #100
chx CreditAttribution: chx commentedRerolled because Wim complained in another issue that this one is a pain to reroll. I like to save Wim some pain :)
Comment #101
Wim LeersThank you very much chx! :)
Comment #102
catchWasn't sure about this at the start, but the overall architecture makes sense now and the API itself I quite like.
Haven't got through the entire patch, few small things I noticed going through:
Should we look at splitting the cache bin method to an additional interface somewhere? This has come up before as well.
Not sure the neither nor is grammatically correct here. Isn't it "not explicitly allowed, or explicitly forbidden"?
Looks like a copy and paste error - this is for the and method.
Comment #103
Wim LeersGlad you like it! :)
Comment #104
alexpottThe @todo needs an issue
$entity does not exist
Comment #105
Wim LeersFixed:
$return_as_object
parameter could be clearer. So clarified that.Comment #106
Wim LeersWhile reviewing, alexpott noticed that in some cases,
return FALSE
has been replaced withreturn AccessResult::create()
.But that doesn't really change the logic. ALL of these are the same case:
. Many (if not most) people didn't know about the tri-stateness of access checks: they thought you'd have to return eitherTRUE
orFALSE
— they didn't know aboutNULL
. This is new in Drupal 8 after all, and wasn't very well documented. Plus, any existing access checking code already used booleans, and many (most?) were not updated.By changing that to
AccessResult::create()
(instead ofAccessResult::forbidden()
, we effectively say "we don't forbid but we also don't allow", and hence allow contrib modules to interject and make decisions that core modules don't make because they don't care about certain use cases.E.g.
FileAccessControlHandler::checkAccess()
looked like this:This prevented contrib modules from implementing access checking logic for files for operations other than "download": the default controller simply always forbade access. This patch changes that
return FALSE
at the end toreturn AccessResult::create()
, which gives contrib modules the freedom to implement access checkers for operations core doesn't care about.This is the case for the following:
LanguageAccessControlHandler
shortcut_set_edit_access()
shortcut_set_switch_access()
ShortcutSetAccessControlHandler
UserAccessControlHandler
FileAccessControlHandler.
MenuLinkContentAccessControlHandler
IS and CR (https://www.drupal.org/node/2337377/revisions/view/7621951/7634851) updated to clarify this.
alexpott is fine with the above, so with that, plus the tirival changes in #104, #105 and this patch, we can move this back to RTBC as soon as testbot confirms there are no test failures.
(Two more cases were also changed but are reverted in this reroll: one was accidental, the other one needs further discussion.)
Comment #107
Wim LeersComment #108
alexpottI read the patch over twice once for nits and to get a feel for it and once with a security head on to ensure that everything that has changed is acceptable. Thanks @Wim Leers for your patience on IRC :)
The patch looks really good to me and the DX win of AccessResultInterface is huge. Let's get this in to unblock several other changes.
Comment #109
catchOK I'm happy with this as well now. AccessResultInterface is a good improvement in itself, then we just add the caching information on top.
Committed/pushed to 8.0.x, thanks!
Comment #111
Wim LeersWoot, this unblocks #1805054: Cache localized, access filtered, URL resolved, and rendered menu trees and #2232609: Cacheable breadcrumbs block, and fix breadcrumb builders! :) And I'm sure more things will come up in the future.
So glad this landed, because it's indeed a big improvement for both the cacheability and the developer experience of Drupal 8! :)
Comment #112
fagoWhile looking through the changes I noticed a few glitches and small DX issue, thus I created a quick follow-up: #2341399: Follow-up: entity.api.php documentation fixes.
Comment #114
olli CreditAttribution: olli commented#2373017: No delete link when editing a menu, only reset links