Summary

I've been talking to Wim Leers about cache contexts and during our discussion I came to the conclusion that user.permissions and user.roles aren't really sibling cache contexts. Instead, user.permissions is a logical child of user.roles and should thus become user.roles.permissions.

Explanation

Suppose you have a block on the page that shows a list of your permissions. This block obviously varies by the current user's permissions. Keep in mind that there could be two users with totally different roles that end up with the same permissions because the combination of their roles so happens to add up to the same permission set.

Great, so now we want to add a block to that page that shows a list of your user roles. Again, this block obviously varies by the current user's roles. Now here's the kicker: Any two users with the same roles automatically have the same permissions. In light of that, user.roles is a logical parent to user.roles.permissions because it's more granular.

Finally, for argument's sake, let's add a block showing your user ID to the page. This block naturally varies per user. Should the previous two blocks now also vary per user, then that would be fine as they would most definitely show the right roles and permissions. So user is still a logical parent to both user.roles and user.roles.permissions

What needs to be done?

The permission cache context needs to move to user.roles.permissions

That's it! Permissions already correctly set the roles' metadata when optimized away, so in a sense it already knew it was a more logical child of roles than it was of users.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

kristiaanvandeneynde created an issue. See original summary.

Wim Leers’s picture

Title: Refactor the user.permissions context to become user.roles.permissions » Rename the 'user.permissions' context to 'user.roles.permissions', to ensure correct cache context optimization
Priority: Normal » Major
Issue tags: +D8 cacheability, +Needs change record, +needs documenting

Confirming!

Thanks for opening this issue, with your excellent explanation!

This would need a CR and https://www.drupal.org/developing/api/8/cache/contexts would need to be updated.

catch’s picture

Status: Postponed » Closed (works as designed)

Permissions is handled separately, because we don't invalidate a cache tag for the permissions cache context when permissions are added to or removed from a role. See the discussion in #2428703: Add a 'user.permissions' cache context (was: "Should cache contexts be able to associate a cache tag?"). So collapsing permissions down to roles would result in stale cache items, if permissions are added to or removed from roles.

kristiaanvandeneynde’s picture

Status: Closed (works as designed) » Postponed

Wait what?

Permissions is handled separately, because we don't invalidate a cache tag for the permissions cache context when permissions are added to or removed from a role.

You don't need to invalidate any tags for a cache context to work? The whole point is that it checks your unique identifier at runtime; in this case permissions hash. It then uses that ID to find a cache item and serves it if found.

When you update permissions on a role, the role is saved and as such the role's cache tags are triggered. The permission cache context already adds the user's roles' cache tags in getCacheableMetadata(), so we should be fine?

P.S. I'm currently already using this approach in the Group module if you want to see an example:

  • group_membership
  • group_membership.roles
  • group_membership.roles.permissions
catch’s picture

Status: Postponed » Closed (won't fix)

Here's an example:

User A has roles 'authenticated' and 'editor'.

If a permission is added to the editor role, the permissions hash will change - works 100% with cache context, no need for cache tag here.

The roles of the user do not change - the user still has the same roles.

Therefore, content that is cached with the user's roles but not permissions will be stale.

The roles cache context doesn't add cache tags for the role, just for the user.

kristiaanvandeneynde’s picture

Status: Closed (won't fix) » Postponed

Oh snap, good catch (no pun intended)!

Although, wouldn't that indicate wrong use of the user.roles cache context? If you are varying by roles instead of permissions, you should not rely on the roles' permissions. If you are, however, varying by permissions you should use the user.roles.permissions context instead.

Example: A block that shows your roles would still be valid in the above scenario. Even if you changed the permissions on said roles.

Can we also keep the ticket open until we've reached a consensus? It tends to not show up on my radar when closed.

catch’s picture

If you are, however, varying by permissions you should use the user.roles.permissions context instead.

We optimize cache contexts, so that if user.roles is used, user.roles.permissions would not be used. This is to ensure the minimum calculation is performed when resolving them. It's also why the user.roles cache context has the user cache context, because it gets collapsed down to just user if that's used.

By doing this issue, anything on the site using user.roles would result in everything using user.roles.permissions getting cached incorrectly. This is the exact reason why they're side by side instead of nested.

I keep closing this because consensus was reached in #2428703: Add a 'user.permissions' cache context (was: "Should cache contexts be able to associate a cache tag?") 18 months ago via the same discussion we're having here.

kristiaanvandeneynde’s picture

By doing this issue, anything on the site using user.roles would result in everything using user.roles.permissions getting cached incorrectly. This is the exact reason why they're side by side instead of nested.

No it wouldn't? Because then user.roles.permissions would be optimized away, setting the cache tags of the user's roles. If that happens, the cache is invalidated whenever a role (or its permissions) are updated.

catch’s picture

Because then user.roles.permissions would be optimized away, setting the cache tags of the user's roles.

OK that's fair, but it's not ideal for cache longevity:

Authenticated users have access to 'view user profiles'.

Someone adds that permission to the 'Editor' role.

Now the editor role cache tag gets invalidated, but the permissions hash for users with the editor role wouldn't change - they'd have exactly the same permissions.

kristiaanvandeneynde’s picture

Now the editor role cache tag gets invalidated, but the permissions hash for users with the editor role wouldn't change - they'd have exactly the same permissions.

That's the case for all optimized cache contexts. It's not an ideal system, but it's the only way to assure that the cache will still be valid.

Example: If we optimize away the user.roles context, we set the user's cache tags. If we then save the user, any cache that had user.roles optimized away would be flushed, even though we may have only changed the user's name and not their roles.

catch’s picture

Version: 9.x-dev » 8.3.x-dev

This could be added in 8.3.x if we wanted to do it.

kristiaanvandeneynde’s picture

I guess you're right.

It would require two large warnings though:

  1. If you are using user.roles where you should be using user.permissions, refactor your code!
  2. If you are using user.permissions, replace all occurrences with user.roles.permissions if you want to benefit from the optimized contexts.
Wim Leers’s picture

No it wouldn’t? Because then user.roles.permissions would be optimized away, setting the cache tags of the user’s roles. If that happens, the cache is invalidated whenever a role (or its permissions) are updated.

OK that’s fair […]

This is the key insight of this issue.


OK that’s fair, but it’s not ideal for cache longevity:

Roles config entities aren't modified very often (i.e. the permissions they are granted are not frequently modified). And #10 is exactly right: That’s the case for all optimized cache contexts. It’s not an ideal system, but it’s the only way to assure that the cache will still be valid.


#12:

  1. If you are using user.roles where you should be using user.permissions, refactor your code!

This is already the case.

  1. If you are using user.permissions, replace all occurrences with user.roles.permissions if you want to benefit from the optimized contexts.

That's the thing we need to discuss the most if we're going to do this in 8.3.

  1. Do we let all existing core code continue to use user.permissions, or do we update it? (This might break some contrib code and will break some contrib tests.)
  2. Do we keep the default required cache context user.permissions, or do we update it? (This will break some contrib tests.)

The one thing we know we'd have to do regardless of the answers to those questions: we just add cache_context.user.roles.permissions and let it effectively be an alias of cache_context.user.permissions. This lets all code continue to work.

kristiaanvandeneynde’s picture

The one thing we know we'd have to do regardless of the answers to those questions: we just add cache_context.user.roles.permissions and let it effectively be an alias of cache_context.user.permissions. This lets all code continue to work.

Wouldn't it be more ideal to move the code into the new class and have the old class extend the new one? Then add a @deprecated notice in its docblock to indicate that people should start using the new one.

Other than that, the two could indeed coexist perfectly without affecting one another. So that's a nice bonus for a change :)

Wim Leers’s picture

+1

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

kristiaanvandeneynde’s picture

Status: Postponed » Active

Setting to active as it would be rather easy to create a patch for #14

kristiaanvandeneynde’s picture

Status: Active » Needs review
FileSize
5.9 KB

Curious to see what tests will say about this.

Berdir’s picture

As commented in #2453835: Allow non-intrinsic (implementation-dependent) cache context services to specify their parents via DynamicParentContextInterface::getParents(), this proposes kind of the opposite of that issue (with a different approach).

I've been thinking about this for a while now and went from disagreeing with you, to kind-of-agreeing with you to disagreeing with you again :)

First, yes, thanks to cache tags that we add when optimizing, user.permissions can indeed be optimized away by user.roles. *Only* when optimizing, but the same is true for user and user.permissions/user.roles.

But, user.roles can also be optimized away by user.permissions. Because the permission hash we generate is not just based on the permissions but also the role ids. So two sets of roles with the same permissions between them *result in a different hash*.

Given that, we can optimize away in both directions, which IMHO makes neither a child of the other, but overlapping siblings, which is what they are now.

Also, even though we *can* optimize in both directions, it is more efficient to optimize away user.roles because of at least two reasons:

A) Parameters. user.roles has a role parameter, which is for example very useful to have something vary by whether the user is anonymous or not (user.roles:authenticated). user.permissions can be optimized away in favor of user.roles but not user.roles:authenticated. However, we can optimize user.roles:authenticated away in favor user.permissions because even if anon and authenticated users have the same permission, they still have a different hash.

B) At least as long as user.permissions is a default cache context, it is more efficient to optimize away user.roles. Among other things, it results in fewer cache redirects as explained in the issue linked above. Given the scenario that someone adds user.roles as a cache context in hook_node_view(), right now, that results in a cache redirect because the contexts at the end are different, so we have to do two database queries. With this proposal, we would optimize user.permissions away, which still results in a different set of contexts and a redirect. With my issue, we optimize user.roles away, resulting in the same set of contexts and have no redirect, which saves us a query/redis get.

kristiaanvandeneynde’s picture

But, user.roles can also be optimized away by user.permissions. Because the permission hash we generate is not just based on the permissions but also the role ids. So two sets of roles with the same permissions between them *result in a different hash*.

Given that, we can optimize away in both directions, which IMHO makes neither a child of the other, but overlapping siblings, which is what they are now.

I would argue that, conceptually, permissions are still a child context of roles because of granularity. The fact that they are currently overlapping siblings is due to our implementation, namely the generation of a hash based on role IDs instead of the actual permissions.

If we want to have cache contexts that can be optimized away following the concept behind the context instead of the logic we chose to implement the context with, we should change the way the hash is generated.

Also, even though we *can* optimize in both directions, it is more efficient to optimize away user.roles because of at least two reasons:

These do seem like benefits, but they still break rank with the concept of the permissions cache context. I would argue that varying said context by a roles hash is actually a bug.

If Bob with roles X and Y has the same permissions as Alice with role Z, then they will not see the same cache object because of role IDs differing. Even though that has nothing to do with their actual list of permissions and they should ideally be seeing the same cache object.

The only part I'm not up to speed on is the cache redirects (part B). Are you saying that by default, we only have the 'user.permissions' context and if 'user.roles' were optimized away into the former, we'd end up with the same contexts we started with? Somehow meaning we don't redirect? Because if that's the case, then that will only be true for 'user.roles', a rarely used context. As soon as someone adds the more common 'user' to vary by account, we'd still end up with a different context ('user') than the one we started with ('user.permissions').

So while I absolutely appreciate the very thorough feedback and honesty in #19 and I definitely see and understand your point on why reversing the process would be more efficient, I still think we need to optimize permissions into roles.

I think clarity and consistency trumps performance here. Having a context that promises to do X and then have it do Y behind the scenes just doesn't seem right to me. In its current state, 'user.permissions' is actually 'user.roles_hash'. We should definitely fix that in a separate issue first before we can even begin to discuss whether we should commit the patch in #18.

kristiaanvandeneynde’s picture

For reference: PermissionsHashGenerator::generate() is right to internally cache the generated hash by user role IDs, but is wrong by making the hash contain role data in ::doGenerate().

The part that's wrong is this:

// Note that for admin roles (\Drupal\user\RoleInterface::isAdmin()), the
// permissions returned will be empty ($permissions = []). Therefore the
// presence of the role ID as a key in $permissions_by_role is essential
// to ensure that the hash correctly recognizes admin roles. (If the hash
// was based solely on the union of $permissions, the admin roles would
// effectively be no-ops, allowing for hash collisions.)
$permissions_by_role[$role] = $permissions;

We should not account for that during, but before the hash generation. Same goes for this part in ::generate().

    if ($account->id() == 1) {
      return $this->hash('is-super-user');
    }

The generator should do what it promises to do: return a hash of permissions. Not special keys, not tainted permissions. How we choose to implement and optimize that is up for discussion, but the end result should always be a permissions hash. Fix that and you automatically fix the user.permissions context to actually do what it advertises.

Edit:
--------
For instance, we could hash the list of all permissions and return that hash when UID === 1 or the user has an admin role. We'd need to internally cache that "super hash" with the right cache tags so that a newly enabled module will purge said cache to make sure the list is rebuilt next time around to show a different permission list.

Right now, if we show a block with "Your permission" to user 1 and enable a new module, the list would be out of date because the generated hash remains the same ('is-super-user').

Wim Leers’s picture

Given that, we can optimize away in both directions, which IMHO makes neither a child of the other, but overlapping siblings, which is what they are now.

This is wrong. I think you only came to this conclusion because your reasoning is based on permissions.

It's totally possible to do if (in_array($this->currentUser->getRoles(), ['foobar'], TRUE)) { … }. The only correct cache context then is user.roles.

However, we can optimize user.roles:authenticated away in favor user.permissions because even if anon and authenticated users have the same permission, they still have a different hash.

This definitely contains a gap in its reasoning. What you say would be correct. But it would unfortunately also be suboptimal. Because if something varies by user.roles:authenticated, then all authenticated users get the same CID. If you'd use user.permissions instead, then authenticated users that have different combinations of roles would get different CIDs.

At least as long as user.permissions is a default cache context

I want to call out that this only applies to the render system. It's a required cache context in renderer.config. Not globally. So it does not apply to API-first data.

[…], it is more efficient to optimize away user.roles

This is true though!

If Bob with roles X and Y has the same permissions as Alice with role Z, then they will not see the same cache object because of role IDs differing. Even though that has nothing to do with their actual list of permissions and they should ideally be seeing the same cache object.

Agreed! We should not make optimizations based on an imperfect concrete implementation, we should make optimizations based on the semantics.

Are you saying that by default, we only have the 'user.permissions' context and if 'user.roles' were optimized away into the former, we'd end up with the same contexts we started with? Somehow meaning we don't redirect?

Yes, that's what @Berdir is saying. Look for example at the cache_dynamic_page_cache DB table, and observe how cache redirects work. If you want to understand it in full detail, see \Drupal\Core\Render\RenderCache. Both ::get() and ::set(). Very detailed comment explaining the process in ::set().

For reference: PermissionsHashGenerator::generate() is right to internally cache the generated hash by user role IDs, but is wrong by making the hash contain role data in ::doGenerate().

Agreed. And I'm the one who wrote that code. Blame me :) It was the naïve/simple implementation. But not the most optimal one. Optimizing this will solve the "Bob (X+Y) vs Alice (Z), but X+Y==Z" case you mentioned above, and result in fewer variations (fewer CIDs).

if ($account->id() == 1) {
      return $this->hash('is-super-user');
    }

The generator should do what it promises to do: return a hash of permissions.

On this one, I disagree. The root user is a very special case. It does not have any permissions assigned. It simply is granted any and all permissions always. It's the exception to the rule; it should have its own special value. Otherwise you also run into problems of "oh, new permissions appeared, so all cache items for user 1 were using the old cache context value, and now they need the new one".

Berdir’s picture

Given that, we can optimize away in both directions, which IMHO makes neither a child of the other, but overlapping siblings, which is what they are now.

This is wrong. I think you only came to this conclusion because your reasoning is based on permissions.

It's totally possible to do if (in_array($this->currentUser->getRoles(), ['foobar'], TRUE)) { … }. The only correct cache context then is user.roles.

Not wrong, just unclear I think. By overlapping, I mean partially overlapping, obviously if they would be identical we wouldn't have ever added user.permissions in the first place. I'm talking about the situation of optimizing cache contexts *when both contexts are added to a thing.*.

However, we can optimize user.roles:authenticated away in favor user.permissions because even if anon and authenticated users have the same permission, they still have a different hash.

This definitely contains a gap in its reasoning. What you say would be correct. But it would unfortunately also be suboptimal. Because if something varies by user.roles:authenticated, then all authenticated users get the same CID. If you'd use user.permissions instead, then authenticated users that have different combinations of roles would get different CIDs.

Again, not wrong IMHO, just apparently not clear enough. As I said above, I'm talking about optimizing contexts in a scenario where both contexts *are* provided. And not about what you should use as a developer. Of course you should use the user.roles with or without parameters based on what is the optimal and correct context for what you are doing.

I want to call out that this only applies to the render system. It's a required cache context in renderer.config. Not globally. So it does not apply to API-first data.

And I want to call out *again* :) that we are talking about optimizing *if both contexts are given*. If you are in an API scenario and you only have user.roles, then there is nothing to optimize anyway. And if you only have user.permissions, it also doesn't matter how we optimize.

Agreed! We should not make optimizations based on an imperfect concrete implementation, we should make optimizations based on the semantics.

That's nice in theory, but if the sematically correct optimization is not an optimization in real life with the actually used implementation then it is not an optimization at all.

Agreed. And I'm the one who wrote that code. Blame me :) It was the naïve/simple implementation. But not the most optimal one. Optimizing this will solve the "Bob (X+Y) vs Alice (Z), but X+Y==Z" case you mentioned above, and result in fewer variations (fewer CIDs).

I'm not sure about naive/simple, IMHO we did it like this on purpose and there was a lot of discussion in #2417895: AccountPermissionsCacheContext/PermissionsHashGenerator must special case user 1, since permissions don't apply to it about it. And answering whether being able to optimize user.roles away in favor of user.permissions or having fewer variations in case of overlapping sets of permissions is more performant depends on the specific site, if you don't have any roles like that then it doesn't really matter to have more variations (I guess having admin role + various other roles is something which would be better).

And not sure if that's something that we can actually *change*?

kristiaanvandeneynde’s picture

Thanks for weighing in, Wim.

Regarding the super user I somewhat agree with your logic but would like to point out the system of admin roles that was introduced in D8. They essentially turn any account into a super user.

If we choose to have this special case for user 1, we generate a cache object for one user only. If we instead load all permissions and hash those, we generate a cache object that can not only be served to user 1 but also all accounts that have at least one admin role.

So the logic could be something like:

  • Are we dealing with a superuser or does the user have an admin role?
  • If yes: Do we have the all-permissions-hash cached? Yes: Return that; No: continue.
  • Load all defined permissions and hash those
  • Cache the hash and add a 'core.extension' config cache tag, then return it

With the above, we'd make much more use out of the super user cache object.

Berdir’s picture

Disagree on loading permissions, we don't cache that at the moment and only use it on the permissions page, plus, admin role and user 1 even return TRUE for permissions that are not defined and it's even possible to grant a role a permission that's not actually defined (doesn't really make sense of course, but who knows what kind of sites are out there).

I don't see what a hash on all permissions would buy us. Having the same key for admin role or uid 1 might be possible, but again, this could be a behavior change for existing sites that currently only work because of the user.permissions hash (which is wrong, but that's kind of exactly why we made user.permissions a default cache context in the first place, to prevent security issues on sites that are doing it wrong) and again, having the roles in there might not be correct semantically, but it allows for better optimizations.

kristiaanvandeneynde’s picture

Disagree on loading permissions, we don't cache that at the moment and only use it on the permissions page,

So let's cache them? It should be as simple as adding the 'core.extension' config cache tag, unless I'm missing something. I also don't see the harm in caching them as they might be expensive to gather anew all the time.

admin role and user 1 even return TRUE for permissions that are not defined and it's even possible to grant a role a permission that's not actually defined (doesn't really make sense of course, but who knows what kind of sites are out there).

Although I don't think we should cater to poorly written modules that check for permissions that don't even exist, a valid use case did just jump to mind. Role X has permissions A, B and C. The module that defined permission C is uninstalled, leaving role X with permission C that is no longer defined.

This would not affect the gathered permissions of the superuser or people with an admin role as they would get the same permission set regardless of the permissions belonging to the defined roles. This would only "break" in two ways:

  1. Alice and Bob should have the same permissions even though their roles vary. If one of the roles has a "ghost" permission, their hash would differ.
  2. Joe has several roles which, when added to each other, result in him having all permissions. If one of his roles has a "ghost" permission, his hash would differ from the one we use for super users.

There's an easy fix for both of the above: First gathering all of the user's permissions and then do an array_intersect with the "defined" permissions. This will rule out ghost permissions.

this could be a behavior change for existing sites that currently only work because of the user.permissions hash

Changing the logic the way I proposed should do no harm as it will show the exact same thing super/admin users are already seeing, but instead of it coming from either of two cache objects, they'd be served the same cache object. If any site has a different result here, they should break as it will alert them of a misconfiguration that might lead to security issues.

having the roles in there might not be correct semantically, but it allows for better optimizations

I don't argue the benefit of better optimizations. But I don't think a performance gain should come at the cost of writing semantically incorrect code. If the permissions cache context does not actually vary by permissions, we might as well call it the magical access unicorn context :)

kristiaanvandeneynde’s picture

Having read through the latest comments again I have changed my mind about the super user hash. It does make more sense to use a special key here because newly introduced permissions (by enabling a module, creating a content type, ...) do not affect super users or admins. So it would not make sense to invalidate their cache objects when permissions are added.

This bit should still be optimized, IMO:

// User 1 is the super user, and can always access all permissions. Use a
// different, unique identifier for the hash.
if ($account->id() == 1) {
  return $this->hash('is-super-user');
}

We need to loop over the user's roles and see if any of them are admin roles. If they are, the user effectively becomes a super user, so we should also return the 'is-super-user' hash.

Wim Leers’s picture

We need to loop over the user's roles and see if any of them are admin roles. If they are, the user effectively becomes a super user, so we should also return the 'is-super-user' hash.

Why? "admin role" !== "uid 1", because \Drupal\user\Entity\User::hasPermission() looks like this:

  public function hasPermission($permission) {
    // User #1 has all privileges.
    if ((int) $this->id() === 1) {
      return TRUE;
    }

    return $this->getRoleStorage()->isPermissionInRoles($permission, $this->getRoles());
  }
Berdir’s picture

An admin role has basically the same code two levels down: \Drupal\user\Entity\Role::hasPermission().

But it is possible that there is some code that specifically checks for UID 1 as that has always been special. Another possibility would be roles-based access checks/visibility conditions.

Example situation, you have 3 users:
* user 1 (no role other than authenticated, still has all permissions, but role-based access checks don't work for him)
* user A (Administrator, Editor roles)
* and user B (only Administrator role)

Lets consider all possible implementations for the hash:

1. HEAD: all 3 users are guaranteed to have different user.permissions hashes, which means it is safe to optimize user.roles away in favor of user.permissions. But it is *not* safe to optimize user.permissions away in favor of user.roles because uid 1 would otherwise have the same roles list as an authenticated user without any roles.

2. *only* a hash of all permissions of all roles the user has: again, user 1 would have the same hash as an authenticated user, so that's not an option, at least not alone. And returning all permissions for admin roles is problematic and not easy to cache because cache invalidations (many things result in new roles, like new node types and text formats).

3. Same special value for both uid 1 and admin role: Now all 3 users would have the same hash, which would be OK for permission based checks as hasPermission() but it is still not ok to optimize permissions away for roles, as only calculating the roles would result in the same problem as case 1. And we can now also no longer optimize away roles as we have users with different roles having the same permission hash.

4. Different special value for uid and admin role. Same situtation like case 3 betwen user A and user B, both have the same permission context value, but different roles.

Any objections?

I remain convinced that the implementation in HEAD is the best/only correct approach and that it is only possible to optimize away roles in favor of permissions but not the opposite, which this issue is suggesting. The code in #28 shows perfectly that the permissions that a user can have does not (only) depend on his roles, so it is not a children of roles and can not be optimized away in favor of it. Not as long as the uid 1 special case exists at least.

Wim Leers’s picture

An admin role has basically the same code two levels down: \Drupal\user\Entity\Role::hasPermission().

HAH! Good to know :)

But it is possible that there is some code that specifically checks for UID 1 as that has always been special. Another possibility would be roles-based access checks/visibility conditions.

Indeed.

  1. In favor of user.roles, you mean. But, yes, agreed.
  2. User 1 would have the same hash as an authenticated user because it has no permissions assigned, right? If so, agreed.
  3. Agreed.
  4. Agreed.

#29 seems convincing. But it's getting late, and it's been a long time since I dealt with this stuff. So I'm wondering if @kristiaanvandeneynde has a convincing counterargument :P

kristiaanvandeneynde’s picture

Re #29:

user 1 (no role other than authenticated, still has all permissions, but role-based access checks don't work for him)

And they rightfully shouldn't! :)

  1. Indeed user 1 would have the same roles list as an authenticated user without any roles. But that's the whole point: You should only use the user.roles cache context when you actually need info about the roles. Anything access-related has always been and should always be varied by permissions.

    The easiest example is always to display that which you vary by, so let's assume a block that lists the names of your roles. It would be perfectly acceptable to have that block only show "Authenticated user" for user 1, because that is actually the only role he has (by default). I see no problem in that whatsoever, it's actually the simple truth :)

  2. As you have both pointed out earlier and I agreed to in #27, this is not a valid option. We'd have to invalidate the cache objects for user 1 and admin roles every time new permissions are added even though they are very unlikely to actually change that which was cached.
  3. This basically proves my point. Roles are more granular than permissions and thus the logical parent.

    In your example: User 1, 2 and 3 have the same effective permissions, but different roles. So when varying by permissions, they should all get the same cache object. But as soon as something on the page varies by roles (e.g.: the roles block mentioned in 1.), they should see different cache objects. So it would be fine to optimize permissions away in favor of roles.

  4. See answer 3.

Basically we are facing only one real difficulty here: Implied permissions. See how I used effective permissions in answer 3.? The whole system we currently have would work perfectly under user.roles.permissions, but special care needs to be taken because of implied permissions for UID 1 and admin roles.

Without implied permissions, we could actually even implement Berdir's second statement. Think of the pure, simple blocks I mentioned earlier: A block listing your permissions (varied by user.permissions) would show an empty list for admins and UID 1. But that's actually the truth.

Edit, flawed suggestion, see below: Problem is we need implied permissions to make Drupal usable for admins, which is why we had to come up with the 'is-super-user' hash, which should actually be named 'indirectly-has-all-permissions' and used for both UID 1 and people with admin roles.

kristiaanvandeneynde’s picture

It's basically how we treat the aforementioned implied permissions.

Option 1: Treat them as effective permissions

  1. We expect a block of "Your permissions" to show all possible permissions on the site for UID1/admin
  2. We can therefore drop the special hash and simply grab the hash of all permissions for UID1/admin
  3. Yes this means that any time a new permission is added, we need to invalidate the cache for UID1/admin
  4. However, when we treat implied permissions as effective permissions, this is the only right choice. Otherwise the "Your permissions" block would contain stale data.

Option 2: Treat them as implied permissions (HEAD)

  1. We expect a block of "Your permissions" to show all permissions you actually have
  2. For UID1 this would probably be nothing and for admins this could list some permissions. User A in Berdir's example would show Editor roles.
  3. Because of this, User A should see a different cache object than UID1, which would not be the case with my 'indirectly-has-all-permissions' suggestion. So consider that one flawed.
  4. But even 'is-super-user' is flawed because of this. Suppose UID1 has got the Editor role added to their account. We'd have two cache objects (one for UID1, one for User A) even though they'd see the same result.

So basically our current implementation is flawed. We should either treat permissions as effective or implied, but not a mix of both (which is what we're doing in HEAD).

Just to prove my point, suppose we have user.actual_roles and user.actual_roles.actual_permissions. Two rather useless contexts you'd only use when showing a block of your roles or your permissions. Would you both not agree that in this case user.actual_roles.actual_permissions should definitely be a child of user.actual_roles?

If yes, then this is the real problem: We name our contexts user.roles and user.permissions, but they are in fact not representatives of their names. We currently have user.roles_or_uid_1 and user.permissions_as_seen_by_the_access_layer :D

Berdir’s picture

Cache contexts have to match how the roles/permissions/access checking actually works and how you think it should work.

> And they rightfully shouldn't! :)

Yes, I didn't say otherwise.

But that's the whole point: You should only use the user.roles cache context when you actually need info about the roles. Anything access-related has always been and should always be varied by permissions.

Nope, that is wrong. Yes, code shouldn't hardcode role checks, but it is sometimes done in custom code and we have to support that. And for views and block configurations it is perfectly valid to have access/visibility checks based on roles.

This basically proves my point. Roles are more granular than permissions and thus the logical parent.

In your example: User 1, 2 and 3 have the same effective permissions, but different roles. So when varying by permissions, they should all get the same cache object. But as soon as something on the page varies by roles (e.g.: the roles block mentioned in 1.), they should see different cache objects. So it would be fine to optimize permissions away in favor of roles.

If user 1 wouldn't exist, yes. But user 1 does exist and will not go away any time soon and as long as that is true, we can not do what you are saying. Maybe your roles block also has edit link for the roles that only users with administer permissions can see. And because uid 1 and authenticated user have the same roles, they would share the same cache.

So again, as long as the UID 1 special case exists (and changing that is IMHO not likely to happen any time soon):
* user.permissions can not be optimized away in favor of user.roles
* While maybe not in semantically correct, with the current permission hash implementation, it *is* possible to optimize user.roles away in favor of user.permissions.

As long as the uid 1 special case exists, this is IMHO a won't fix.

kristiaanvandeneynde’s picture

Nope, that is wrong. Yes, code shouldn't hardcode role checks, but it is sometimes done in custom code and we have to support that. And for views and block configurations it is perfectly valid to have access/visibility checks based on roles.

Visibility? Agreed. Access? Please for the love of all that is holy, no.

I've always taught people to create a custom permission, assign it to a role and then check for that permission. This is easy to pull off with Views, by the way. Roles change, permissions don't. If your entire access layer depends on what roles someone has, you're bound to have a very bad time when those roles change.

Why we still don't have a permission-based access system in the block UI is beyond me. I get that filtering by roles is a nice feature for visibility. But let's please not use that system for access control. The whole section is even named "Visibility".

So my opinion is that we do not cater to people who define access by roles. It's wrong, plain and simple.

Maybe your roles block also has edit link for the roles that only users with administer permissions can see. And because uid 1 and authenticated user have the same roles, they would share the same cache.

This is a very good example of why mixing implied permissions with effective permissions is bad. You're absolutely right that this is a problem. The only reason it currently does not break that way is because of a UID1 check in the user.roles context.

So again, as long as the UID 1 special case exists (and changing that is IMHO not likely to happen any time soon):
* user.permissions can not be optimized away in favor of user.roles
* While maybe not in semantically correct, with the current permission hash implementation, it *is* possible to optimize user.roles away in favor of user.permissions.

As long as the uid 1 special case exists, this is IMHO a won't fix.

Completely agree except for the won't fix. My approach would be to normalize the special case. I.e.: For all intents and purposes refactor user.roles and user.permissions to deal with effective roles and permissions:

  1. For cache objects varying by user.roles: Still treat UID1 as an exception. This is something we can't fix unless we introduce a hidden Superuser role and forcibly assign that one to UID1. That alone would fix the access system on so many levels.
  2. For cache objects varying by user.roles.permissions: Retrieve all defined permissions for admins/UID1 and return that hash. As a side-effect: Eat the often useless cache invalidations experienced by admins/UID1 when new permissions are introduced.

P.S.: This is exactly how I fixed it in Group. I removed all implied things (by removing a hook that allowed you to alter permissions) in favor of using things that are actually stored in the database. If we store the fact that UID1 has a role Superuser in the database, we could optimize a lot of access checks all over Drupal.

kristiaanvandeneynde’s picture

Actually, the more I think about it, the more I think a Superuser role would fix everything. It would even maintain backwards compatibility.

Edit: And we wouldn't even have to store it in the DB because of AccountInterface::getRoles()

  /**
   * Returns a list of roles.
   *
   * @param bool $exclude_locked_roles
   *   (optional) If TRUE, locked roles (anonymous/authenticated) are not returned.
   *
   * @return array
   *   List of role IDs.
   */
  public function getRoles($exclude_locked_roles = FALSE);
kristiaanvandeneynde’s picture

With a "Superuser" locked role, we could gradually get rid of a few $user->id() == 1 checks in core in favor of RoleStorage::isPermissionInRoles().

User.roles.permissions would be a possible cache context which would do the following:

  1. Loop over roles
  2. If any role is an admin role, retrieve the all-permissions-hash (from cache?) and break the loop
  3. If not, gather all of the roles' permissions and hash them

Its getCacheableMetadata() would remain unchanged. If new permissions are added, the all-permissions-hash would also change, meaning admins would have their cached objects flushed. In most cases this isn't ideal performance-wise, but it would be right in the purest sense of the cache context.

User.roles would become a lot cleaner:

  1. We could get rid of the 'is-super-user' bit altogether as we can now be sure UID1 actually has a role that reflects their admin powers.
  2. Its getCacheableMetadata() would also remain unchanged.

It would simplify both cache contexts a lot, introduce more consistency in D8 regarding the UID1 special use case and allow us to gradually remove $user->id() == 1 checks in core in favor for the API we have for all other users.

kristiaanvandeneynde’s picture

I have revived the following issue: #540008: Add a container parameter that can remove the special behavior of UID#1. If that one lands, we can actually fix this issue properly as suggested above.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

apaderno’s picture

Issue tags: -needs documenting +Needs documentation

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

needs-review-queue-bot’s picture

Status: Needs review » Needs work
FileSize
2.56 KB

The Needs Review Queue Bot tested this issue. It either no longer applies to Drupal core, or fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".

Apart from a re-roll or rebase, this issue may need more work to address feedback in the issue or MR comments. To progress an issue, incorporate this feedback as part of the process of updating the issue. This helps other contributors to know what is outstanding.

Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

kristiaanvandeneynde’s picture

With #3371246: [Meta, Plan] Pitch-Burgh: Policy based access in core around the corner (hopefully?), this issue no longer makes sense.

Current situation: Your permissions are not calculated based merely on your role, but also on whether or not you are user 1, as rightfully pointed out in #33. So until that was fixed, we could never resolve this issue.

Future situation (again, hopefully): Your permissions are calculated based on an unpredictable amount of access policies, some of which being your user roles and your UID1 status. So as soon as we have access policies in core, we will never be able to fold the user.permissions cache context into user.roles and for good reason.

So if everyone agrees, we can close this issue as won't fix as it's looking more likely that access policies will turn this into a moot point. Furthermore, access policies will allow us to land #540008: Add a container parameter that can remove the special behavior of UID#1 after ~14-15 years :)

catch’s picture

Status: Needs work » Closed (won't fix)

Yes agreed, let's won't fix it. user.roles is not often used so even if we hadn't broken the optimization possibility, it would have been a relatively marginal one.