Problem/Motivation

Part of #3371246: [Meta, Plan] Pitch-Burgh: Policy based access in core

After the API has been added to core, we need to implement it.

Steps to reproduce

Proposed resolution

Take our current 2 access policies:

  • User 1 has full access, all the time
  • You gain access based on what permissions are part of the roles assigned to you

and convert them into SuperUserAccessPolicy and UserRolesAccessPolicy. Furthermore, we need to change the current centralized PermissionChecker service to call for the new API for access checks. The user.permissions cache context will also need to get its value (and inherently hash value) from the new API. We can copy and adjust code from the Group module's implementation to quickly achieve this.

Remaining tasks

  • Change the PermissionChecker to use the new access policy API
  • Change the PermissionHashGenerator to mimic GroupPermissionHashGenerator
  • Change AccountPermissionsCacheContext to mimic GroupPermissionsCacheContext, namely using cacheable metadata from the hash generator
  • Write a UserRolesAccessPolicy (name pending) class that calculates permissions based on which user roles you have
  • Write a SuperUserAccessPolicy (name pending) class that grants all permissions to user ID 1
  • Check if all tests still go green and adjust where needed

User interface changes

API changes

Data model changes

Release notes snippet

Both permissions coming from user roles and the special case of user 1 are now managed by the Access Policy API. This new API was introduced in 10.3.0 and Drupal core now uses it for checking permissions. This was a project funded as part of the PitchBurgh Innovation contest.

Issue fork drupal-3376846

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

kristiaanvandeneynde created an issue. See original summary.

kristiaanvandeneynde’s picture

Status: Active » Needs review

MR incoming with the implementation. Haven't adjusted any tests yet nor have I added tests for the 2 new policies.

Note: Need to properly handle the new dependency and deprecation of 2 old dependencies in Drupal\Core\Session\PermissionsHashGenerator

kristiaanvandeneynde’s picture

Hmm, I'll look into this on Tuesday, not sure why it says "Build complete" but I'll double-check the obvious stuff. Seems like a lot of FE tests are failing

kristiaanvandeneynde’s picture

Well this is a success. Swapped out how we hand out permissions and only 30 tests fail. Will dive into the test fails.

kristiaanvandeneynde’s picture

Drupal\KernelTests\Core\Render\RenderCacheTest::testUser1PermissionContext

Fails because it wrongfully tests side effects of user.permissions (more specifically the hash generator) and user.roles even though we have user.is_super_user for that. It tries to get a different render array to show up for UID 1 even if both UID 1 and Authenticated user have the same roles (none in this case).

The reality is that this test is now pointless because the permission hash now looks the same if you have the same permissions as user 1, i.e.: if you have a user role that is flagged as admin.

The other method on that test, ::testUser1RolesContext(), is still green but a really bad example of a test. We should never expect the name "user 1" to show up when varying by user.roles. We have "user.is_super_user" or "user" for that.

Conclusion: This test should simply be removed as a whole.

UserTest / UserSessionTest

Fail on line 124 because it mocks the PermissionChecker service with the wrong dependency. I changed the dependency of said service because it hasn't been released yet (it was also added in 10.2.x).

Conclusion: This should be an easy fix.

PermissionsHashGeneratorTest

This runs into the swapped out dependencies we have for PermissionsHashGenerator. I can fix this by changing the test and properly deprecating the old dependencies as seen in this example.

Conclusion: This should be a relatively easy fix.

UserAccountFormPasswordResetTest / MediaEmbedFilterDisabledIntegrationsTest / MediaEmbedFilterTest / MediaEmbedFilterTranslationTest /

Fails because it sets the current_user service to an instance of User, even though it should be an instance of AccountProxyInterface. Now that permission checks inevitably cause the account switcher service to be fired up (it's a dependency of AccessPolicyProcessor), we get the error seen here.

Conclusion: This should be an easy fix. Could be done in a separate issue to reduce noise in this MR.

AccessDeniedTest

I need to look into this one further

Other fails

Seems to be Rest/jsonapi related, can probably fix all of them once I find out why they're failing.

kristiaanvandeneynde’s picture

I fixed UserSessionTest, yet the extending UserTest is still failing even though it doesn't change anything about the thing that's failing. Must be something to do with the mock classes it creates vs the actual classes the parent creates. I'm lost, will focus on other failures :/

kristiaanvandeneynde’s picture

Okay so all of the tests extending NodeResourceTestBase are failing in testPatchPath because that one hands out extra permissions to the "Authenticated user" role during the test.

Normally, this would trigger the config:user.role.authenticated cache tag to be invalidated (and it does) and therefore clear the calculated permissions cache, because UserRolesAccessPolicy properly adds that tag to the CalculatedPermissions object. However, when inspecting CacheTagsChecksumTrait, it shows that the role's cache tag is already part of invalidatedTags and does therefore not invalidate again.

I have no idea why this is happening, but manually clearing 'access_policies' makes the test go green, so the problem really is that Drupal is trying to clear the role's cache tag but does not get the desired result because it got cleared before and somehow wasn't removed from CacheTagsChecksumTrait::invalidatedTags when the role was saved again.

kristiaanvandeneynde’s picture

Talked about it on Slack with Berdir and he pointed out that this is exactly why we have refreshVariables() in browser tests. To reset the checksum thingy so that it doesn't get out of sync when multiple browser requests are fired during the same test. Should see some more tests go green now.

kristiaanvandeneynde’s picture

MediaJsonCookieTest::testPost and other sibling classes are failing a bit weirdly. They are instantly getting to the 400 part in the code below, leading me to assume they are already authorized. Now the question is, why did this not fail before. Using the same trick as the other rest/jsonapi tests does not work here and that seems logical as we are actually trying _not_ to be authenticated here.

All in all, I am seeing a lot of rest/jsonapi tests go red because of how they were written, not because of the changes I am making to core. Meh :/

From EntityResourceTestBase.php

    // DX: 403 when unauthorized.
    $response = $this->request('POST', $url, $request_options);
    $this->assertResourceErrorResponse(403, $this->getExpectedUnauthorizedAccessMessage('POST'), $response);

    $this->setUpAuthorization('POST');

    // DX: 400 when no request body.
    $response = $this->request('POST', $url, $request_options);
    $this->assertResourceErrorResponse(400, 'No entity content received.', $response);
kristiaanvandeneynde’s picture

These browser tests are such a pain :/ Now AccessDeniedTest is somehow getting a cached response where it shouldn't. It does not contain drupalSettings and therefore fails. If I invalidated the 'rendered' cache tag, it all works. Trying to figure out what is going wrong here.

kristiaanvandeneynde’s picture

Okay so I figured out why that dreadful test goes red:

  public function testAccessDenied() {
    $this->drupalGet('admin');
    $this->assertSession()->pageTextContains('Access denied');
    $this->assertSession()->statusCodeEquals(403);

With both the old hash generator and the new one does this page not contain drupalSettings. Which is weird because when I reproduce this test in the browser by clicking around, I do see the drupalSettings being added to the footer of the page. This seems to only happen for anonymous sessions.

// Ensure that users without permission are denied access and have the
    // correct path information in drupalSettings.
    $this->drupalLogin($this->createUser([]));
    $this->drupalGet('admin', ['query' => ['foo' => 'bar']]);

With the old hash generator this page contains the drupalSettings object, with the old hash generator it does not. Because the old hash generator used to include the role names and I don't, the page gets a new hash (anonymous became authenticated) but the new hash generator still returns the same hash as both anon and auth have the same permissions.

This leads me to conclude that whatever is causing drupalSettings to not show up for anon forgot to add the user.is_anonymous cache context. The fact that my hashes now (rightfully) collide leads to the auth user seeing the anon page, which does not contain drupalSettings.

Now if anyone could point me in the right direction on why drupalSettings does not show up on the anon page? Can be reproduced by changing the first few lines of the test to:

    $this->drupalGet('admin');
    $this->assertSession()->pageTextContains('Access denied');
    $this->assertSession()->statusCodeEquals(403);
    $settings = $this->getDrupalSettings();
    // This breaks the same way, but for ALL hash generators (old and new).
    $this->assertEquals('admin', $settings['path']['currentPath']);

This cost me almost 8 hours to debug already. What a nightmare.


Edit: You can reproduce this in any test. Add the following lines to the top of any browser test that has no user logged in in the setUp and it will fail with the same error message as seen here.

    $this->drupalGet('');
    $this->assertArrayHasKey('currentPathIsAdmin', $this->getDrupalSettings()['path']);
kristiaanvandeneynde’s picture

For anonymous users in tests, HtmlResponseAttachmentsProcessor::processAssetLibraries() hands an empty array to the JsCollectionRenderer and sets that to $variables['scripts_bottom']. When doing the same in a browser, it correctly contains drupal settings.

This has now been proven to be a core bug that I do not have time to investigate. Will file a separate issue for this tomorrow, adding back the failing test code that I am removing from the failing test here.

kristiaanvandeneynde’s picture

Re 12/13, I reported it so we can definitely move this issue along: #3379220: system_page_attachments() varies by authenticated user role but does not add said cache context

Edit: FYI this was the culprit (from system_page_attachments())

if (\Drupal::currentUser()->isAuthenticated()) {
    $page['#attached']['library'][] = 'core/drupal.active-link';
  }

So now we need to add user.roles:anonymous to the renderer.config.required_cache_contexts

kristiaanvandeneynde’s picture

  1. Cleaned up user test to be an actual unit test rather than feeding data that the deeper implementation used to require. Now it just mocks the permission checker.
  2. Then redid the deprecation in the hash generator and rewrote that test.

All that is left now is add tests for the individual policies.

kristiaanvandeneynde’s picture

All green 🎉

needs-review-queue-bot’s picture

Status: Needs review » Needs work
StatusFileSize
new13.49 KB

The Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".

This does not mean that the patch needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

kristiaanvandeneynde’s picture

Status: Needs work » Needs review
smustgrave’s picture

Status: Needs review » Reviewed & tested by the community

Since #3376843: Add the access policy API is in RTBC think this should be reviewed with it, could be wrong.

kristiaanvandeneynde’s picture

The only test failure here is after I added the "do not switch if user is current user" logic. Trying to figure out why only that single core test (UrlIntegrationTest) trips over this recent change. It goes green if I revert that single change.

kristiaanvandeneynde’s picture

Found it! In:

      if ($cache_context_root === 'user' && $this->currentUser->id() != $account->id()) {

Current user ID was 0 and $account ID was NULL. I specifically had a non-strict check in there because I thought users returns strings as IDs, but apparently that's not the case. So I went ahead and made the check strict and that fixed it.

However, this means we allow unsaved user's permissions to be calculated and the question is whether that is something we should allow. Arguably, UrlIntegrationTest should use the UserCreationTrait and properly create users that way.

Testing access with an unsaved account is asking for trouble as caching per user permissions would also not work properly in that scenario, even without this whole new access policy API.

Will push a fix, but please advise whether we:

  • Update UrlIntegrationTest to use UserCreationTrait
  • Update AccessPolicyProcessor to throw an exception or something if $account->id() is NULL
  • Do something else than the two above options
larowlan’s picture

Status: Reviewed & tested by the community » Needs work

Left a review on the MR, Will be easier to review once the API issue is in, but given I'd just reviewed the other one, could tell what the changes were above that issue.

kristiaanvandeneynde’s picture

Status: Needs work » Needs review

Addressed all questions. Setting to NR rather than RTBC so that my answers can be evaluated first.

smustgrave’s picture

Status: Needs review » Reviewed & tested by the community

Haven't been involved but floating around thanks to the follow feature.

Believe all feedback has been addressed so reRTBC.

kristiaanvandeneynde’s picture

Merged a recent last-minute change that drops the new cache factory. Should still go green but let's see what the test results have to say.

kristiaanvandeneynde’s picture

Status: Reviewed & tested by the community » Needs work

Setting to NW because the new cache approach fails in UserPermissionsTest but only because of how we test. So the implementation still works, it's just that our tests fail because of how we refresh variables. Will need to figure that one out here.

Cross-posting from #3376843-57: Add the access policy API:

UiHelperTrait::submitForm() and UiHelperTrait::drupalGet() both call $this->refreshVariables();, but if we look into that:

protected function refreshVariables() {
    // Clear the tag cache.
    \Drupal::service('cache_tags.invalidator')->resetChecksums();
    foreach (Cache::getBins() as $backend) {
      if (is_callable([$backend, 'reset'])) {
        $backend->reset();
      }
    }

    \Drupal::service('config.factory')->reset();
    \Drupal::service('state')->resetCache();
  }

It's clearly only resetting cache.bin tagged services, which is why we are seeing this testfail. If I manually add in this line:

\Drupal::service('cache.access_policy_memory')->reset();

Then the tests go green again.

So it's purely a test fail because refreshVariables() doesn't pick up on cache tag invalidators that also support a reset() method.

larowlan’s picture

Needs reroll now the access policy API is in

kristiaanvandeneynde’s picture

Opened #3402850: Fix MemoryCache discovery and DX to fix what i wrote in #26 "the right way"

kristiaanvandeneynde’s picture

#3379220: system_page_attachments() varies by authenticated user role but does not add said cache context goes green now. If we commit that along with #3402850: Fix MemoryCache discovery and DX then this patch becomes a bit smaller and no longer has to adjust that many tests to work around the actual bugs. Will revert the bugfix for system_page_attachments() in this MR.

kristiaanvandeneynde’s picture

Was out sick for a week, but just replied to both blocker issues, minor as they are, so hopefully we can get those in soon and then this issue will be unblocked again.

kristiaanvandeneynde’s picture

One blocker is now multi-RTBC and I addressed notes in the other, making that one NR again, and probably RTBC soon.

Re my latest commit changing a typehint in the code that was added as part of the API, does that need a new issue or can we simply incorporate that in this patch as we're still on 10.3.x, which is the branch that added the new API to begin with?

kristiaanvandeneynde’s picture

#3402850: Fix MemoryCache discovery and DX got committed, will try to adjust the MR to incorporate those changes tomorrow.

kristiaanvandeneynde’s picture

Okay so now only AccessDeniedTest fails, which will be solved by #3379220: system_page_attachments() varies by authenticated user role but does not add said cache context

It's also failing in the newly introduced StandardPerformanceTest, which is to be expected because it hard-codes how many cache gets are expected and that obviously changes with the work being done here. Results of those changes:

  • testAnonymous: Failed asserting that 128 is equal to 129 or is greater than 129. So we actually have one or more cache gets fewer.
  • testLogin: Failed asserting that 30 is identical to 28. So we actually have two cache gets more.
  • testLoginBlock: Failed asserting that 32 is identical to 30. So we actually have two cache gets more.

Overall slight performance boost for anonymous, slight hit for login process. But that is merely based on reporting of this test.

The fact that your permissions are now cached based on the cache contexts provided by the access policies could actually translate into an overall performance gain because we no longer need to load your user role entities on every request (to gather their permissions).

Would it be fine to tweak the numbers in StandardPerformanceTest to match the new reality?

partdigital’s picture

kristiaanvandeneynde, I was attempting to try out the Access Policy API in Drupal 11 as part of an investigation into integrating it into the Access Policy module. Even though the service collector is defined in core.services.yml I noticed that addAccessPolicy is never actually called. It looks like this is because it's never retrieved by the container except in the automated tests. I assume that's what this issue is for?

Is this API currently dormant in Drupal 11? If so, would it be worth mentioning that in the documentation This way developers don't go down a rabbit hole only to discover it's not ready?

Unless of course I'm missing something entirely which in that case, open to any suggestions.

kristiaanvandeneynde’s picture

@partdigital you're right that it's never called yet. That's what this issue is for :)

Previous issue added the API, this issue is for making core use it.

kristiaanvandeneynde’s picture

Status: Needs work » Needs review

Merged latest core now that all blockers are resolved and updated StandardPerformanceTest to match #33

kristiaanvandeneynde’s picture

All tests go green once more 🎉

Here's a reviewer guide for all the changes:

  1. AccountPermissionsCacheContext: We now get the cacheable metadata from the hash generator as we do not know what data the permissions were built with and therefore vary by.
  2. AccessPolicyBase: Changed typehint to match what's actually going on, this is an oversight from #3376843: Add the access policy API Can do this because API got added in 10.3 and is not released yet
  3. AccessPolicyInterface: See 2.
  4. PermissionChecker: Now checks your permissions coming from access policies
  5. PermissionHashGenerator: Now hashes the calculated permissions
  6. PermissionHashGeneratorInterface: Includes new method for getting cacheability, follows 1:1 rule
  7. SuperUserAccessPolicy + test: New code, grants permissions for user 1
  8. UserRolesAccessPolicy + test: New code, grants permissions based on user roles
  9. CommentResourceTestBase, MediaResourceTestBase, MediaOverviewPageTest, NodeResourceTestBase: Need to call refreshVariables or certain cache tags do not get flushed due to how these tests span multiple requests.
  10. MediaEmbedFilterDisabledIntegrationsTest, MediaEmbedFilterTest, MediaEmbedFilterTestBase, UserAccountFormPasswordResetTest: Were incorrectly setting current_user service
  11. StandardPerformanceTest; Small changes reflected in test
  12. RenderCacheTest: Deleted because it no longer makes any sense, we can now have potentially millions of access policies aside from UID 1 being special
  13. AccessPolicyProcessorTest: See 2.
  14. PermissionCheckerTest: New code
  15. PermissionsHashGeneratorTest: Adjusted to match new code
  16. UserSessionTest: Adjusted to match new code
  17. core.services.yml: Adjusted for new services and dependencies

Hope that helps

smustgrave’s picture

Status: Needs review » Needs work

So have not reviewed yet but see there are some failures in the MR. Seems to have got bit by #3401988: Spell-checking job fails with "Argument list too long" when too many files are changed

kristiaanvandeneynde’s picture

Merged in 11.x, which now has #3396196: Separate cache operations from database queries in OpenTelemetry and assertions so we should now be able to get exact performance data. Curious to see how much difference there is. Also, will see if I can reduce the size a bit more.

kristiaanvandeneynde changed the visibility of the branch 11.x to hidden.

kristiaanvandeneynde’s picture

Status: Needs work » Needs review
kristiaanvandeneynde’s picture

Okay so new performance values are in:

  • OpenTelemetryAuthenticatedPerformanceTest::testFrontPageAuthenticatedWarmCache() query count goes up by 1
  • StandardPerformanceTest::testAnonymous() query count goes up by 1
  • StandardPerformanceTest::testLogin() query count goes up by 2
  • StandardPerformanceTest::testLoginBlock() query count goes up by 2

Cache get count stays the same across the board

catch’s picture

@kristiaanvandeneynde do you know what the additional database queries are?

kristiaanvandeneynde’s picture

The calculated permissions are stored in the DB using VariationCache so that we can actually have access policies (impossible without cache contexts). Before, we'd load your user roles and be done with it, which was always one query. Now we load your calculated permissions and, depending on how variable those are, follow a cache redirect or two to get to the final ones.

So I'm almost 100% sure the extra queries are cache redirects being followed.

catch’s picture

Cache redirects being followed should only be cache gets/sets, except for cache tags though, so if there are different cache tags being requested it could be that.

kristiaanvandeneynde’s picture

Oh, right forgot the new performance testing now disregards queries coming from cache sets/gets. Could definitely be cache tags, yeah. If we want a definitive answer we'd have to compare the current output vs output when running this on ddev with your tools.

Either way, the main difference between old code and new code is what I wrote in #44. Your permissions now come from potentially many sources and we need to gather them from said sources at least once until we cache it. At that point we have cache redirects to follow, depending on the complexity.

So I'd start looking there to see what changed, but then again your tool might simply tell us which queries are gone and which are new.

kristiaanvandeneynde’s picture

kristiaanvandeneynde’s picture

Hmm cache get count went down by one all of a sudden. Adjusted MR but this might spell trouble. Ignore the commit message mentioning query count, that was me not being properly awake yet.

kristiaanvandeneynde’s picture

All 3 spin-off issues from #47 have a green MR and are easy to review and commit.

kristiaanvandeneynde’s picture

Okay so most recent performance values are in:

  • OpenTelemetryAuthenticatedPerformanceTest::testFrontPageAuthenticatedWarmCache()
    • Query count goes up by 1
    • Cache get count goes down by 1
  • StandardPerformanceTest::testAnonymous()
    • Query count goes up by 1
    • Cache get count goes down by 1
  • StandardPerformanceTest::testLogin()
    • Query count goes up by 2
    • Cache get count goes down by 1
  • StandardPerformanceTest::testLoginBlock()
    • Query count goes up by 2
    • Cache get count goes down by 2

Not sure why this is different from #42 but I get the same values locally so I guess they're correct.

kristiaanvandeneynde’s picture

Only fail is because of #3417721: Adjust performance tests to avoid random failures, currently working on comparing the telemtry data from pre-patch and post-patch to get an accurate description of the performance changes.

kristiaanvandeneynde’s picture

Okay so tests results are in.

On my shitty laptop, performance was as follows:

  • StandardPerformanceTest::testAnonymous(), part 1: 2.32s -> 1.59s (31% faster)
  • StandardPerformanceTest::testAnonymous(), part 2: 446ms -> 419ms (6% faster)
  • StandardPerformanceTest::testAnonymous(), part 3: 319ms -> 245ms (23% faster)
  • StandardPerformanceTest::testLogin(): 227ms -> 226ms (same)
  • StandardPerformanceTest::testLoginBlock(): 255ms -> 332ms (30% slower)

So performance for anonymous users went up across the board and was worse for one login test case out of two.

Here is the difference on StandardPerformanceTest::testLoginBlock, because that one was slower.

Post-patch Pre-patch
Starts off similarly
get access_policy, CID "access_policies:drupal:[user.is_super_user]=0:[user.roles]=anonymous get discovery, CID "entity_type"
SELECT "tag", "invalidations" FROM "test51561594cachetags" WHERE "tag" IN ( "access_policies", "config:user.role.anonymous" ) SELECT "tag", "invalidations" FROM "test99820655cachetags" WHERE "tag" IN ( "entity_types" )
getMultiple config, CID "user.role.anonymous"
get bootstrap, CID "module_implements"
A bunch of similar stuff
get discovery, CID "entity_type" getMultiple config, CID "views.view.frontpage"
SELECT "tag", "invalidations" FROM "test51561594cachetags" WHERE "tag" IN ( "entity_types" )
getMultiple config, CID "views.view.frontpage"
get bootstrap, CID "module_implements"
A whole lot of similar stuff
SELECT "name", "value" FROM "test51561594key_value" WHERE "name" IN ( "system.private_key" ) AND "collection" = "state" get bootstrap, CID "user_permissions_hash:anonymous"
SELECT "tag", "invalidations" FROM "test99820655cachetags" WHERE "tag" IN ( "config:user.role.anonymous" )
Then a bunch of stuff where 5 rows around firstByte were the same but in different order
get access_policy, CID "access_policies:drupal:[user.is_super_user]=0:[user.roles]=authenticated" get discovery, CID "entity_type"
SELECT "tag", "invalidations" FROM "test51561594cachetags" WHERE "tag" IN ( "access_policies", :"config:user.role.authenticated" ) SELECT "tag", "invalidations" FROM "test99820655cachetags" WHERE "tag" IN ( "entity_types" )
SELECT "name", "value" FROM "test51561594key_value" WHERE "name" IN ( "system.maintenance_mode" ) AND "collection" = "state" getMultiple config, CID "user.role.authenticated"
get dynamic_page_cache, CID "response:[request_format]=html:[route]=view.frontpage.page_19c08cb355eb3b07aca692af8c68cc11a934f5e596bf92a0449a69ff3afcd6d3f" get bootstrap, CID "module_implements"
get bootstrap, CID "module_implements" SELECT "name", "value" FROM "test99820655key_value" WHERE "name" IN ( "system.maintenance_mode" ) AND "collection" = "state"
get discovery, CID "entity_type" get dynamic_page_cache, CID "response:[request_format]=html:[route]=view.frontpage.page_19c08cb355eb3b07aca692af8c68cc11a934f5e596bf92a0449a69ff3afcd6d3f"

So if we reduce that to just the entries that are unique, we get (cache gets):

  1. get access_policy, CID "access_policies:drupal:[user.is_super_user]=0:[user.roles]=anonymous"
  2. get access_policy, CID "access_policies:drupal:[user.is_super_user]=0:[user.roles]=authenticated"

Compared to:

  1. getMultiple config, CID "user.role.anonymous"
  2. get bootstrap, CID "user_permissions_hash:anonymous"
  3. getMultiple config, CID "user.role.authenticated"

And query-wise:

  1. SELECT "tag", "invalidations" FROM "test51561594cachetags" WHERE "tag" IN ( "access_policies", "config:user.role.anonymous" )
  2. SELECT "name", "value" FROM "test51561594key_value" WHERE "name" IN ( "system.private_key" ) AND "collection" = "state"
  3. SELECT "tag", "invalidations" FROM "test51561594cachetags" WHERE "tag" IN ( "access_policies", :"config:user.role.authenticated" )

Compared to:

  1. SELECT "tag", "invalidations" FROM "test99820655cachetags" WHERE "tag" IN ( "entity_types" )
  2. SELECT "tag", "invalidations" FROM "test99820655cachetags" WHERE "tag" IN ( "config:user.role.anonymous" )
kristiaanvandeneynde’s picture

Summary of the above: Seems like it's significantly faster on anonymous sessions and sometimes slower on authenticated ones.

Also, not going to comb over the other result sets as it takes a long time to do so. But I can post the exports here for someone else to take a look at.

catch’s picture

OK that makes a lot of sense. The extra queries are for cache tags and state for the private key. Makes me wonder if we should split cache tags out from database queries too in the assertions so it's easier to see when that's the thing that changes.

kristiaanvandeneynde’s picture

Updated MR to what I think is right after merge.

Also, the small issues from #47 are all RTBC now, so would be great if we could commit those to make the MR here smaller.

kristiaanvandeneynde’s picture

Makes me wonder if we should split cache tags out from database queries too in the assertions so it's easier to see when that's the thing that changes.

I think that would be a great quality of life improvement.

kristiaanvandeneynde’s picture

Okay so kind of chasing performance tests now as they are sadly not as exact as we would like them to be. We'll fix this for sure, but in the meantime please focus on the actual code and ignore the performance tests counts.

kristiaanvandeneynde’s picture

2 out of 3 side quests from #47 got committed, so this MR just got smaller :)

Merged in 11.x and I think I adjusted performance values correctly, will keep an eye on results. If we could now get #3417362: Call refreshVariables() where needed in various tests committed, then I think this MR would become roughly as small as it can be.

smustgrave’s picture

Status: Needs review » Needs work

Seems to need a manual rebase. Will keep an eye out though

kristiaanvandeneynde’s picture

Status: Needs work » Needs review

Updated the MR, #3417362: Call refreshVariables() where needed in various tests is still pending commit to make the MR smaller. Also the performance impact as to what queries changed will be far more visible once we commit #3421164: Log every individual query in performance tests.

Still in favor of deleting RenderCacheTest as a whole because the user.permissions test case is no longer valid and the only other test case (user.roles) will soon become invalid too after a recent discussion we had where said cache context should not be returning a magic string for UID 1. Issue is pending for that one.

needs-review-queue-bot’s picture

Status: Needs review » Needs work
StatusFileSize
new90 bytes

The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

kristiaanvandeneynde’s picture

Status: Needs work » Needs review

Well yes mr. bot I was in the middle of things here...

kristiaanvandeneynde’s picture

Starting to wonder how much value there is in chasing the performance tests here until #3421164: Log every individual query in performance tests lands.

needs-review-queue-bot’s picture

Status: Needs review » Needs work
StatusFileSize
new90 bytes

The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

kristiaanvandeneynde’s picture

Status: Needs work » Needs review
needs-review-queue-bot’s picture

Status: Needs review » Needs work
StatusFileSize
new90 bytes

The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

nod_’s picture

bot is not happy since patch doesn't apply from the MR, can someone merge 11.x in the issue branch so it stops complaining? (for commit we still use the patch, not the MR so it's still important to have the MR working as a patch).

kristiaanvandeneynde’s picture

I've been doing that for the past few days :D

It keeps complaining because we're touching the performance tests here and both @catch and I have been hard at work optimizing those tests. So it keeps throwing merge conflicts here whenever one of those issues is committed.

kristiaanvandeneynde’s picture

Status: Needs work » Needs review
nod_’s picture

ah yes sorry, you can always add the no-needs-review-bot tag to make him quiet if necessary :)

kristiaanvandeneynde’s picture

Issue tags: +no-needs-review-bot

That's a great tip, thanks. I'll remove it once we're fully done with the performance stuff.

kristiaanvandeneynde’s picture

After merging in #3421164: Log every individual query in performance tests locally, I've noticed the queries being added are really simple. It's always:

'SELECT "name", "value" FROM "key_value" WHERE "name" IN ( "system.private_key" ) AND "collection" = "state"',

Right after:

'SELECT "name", "value" FROM "key_value" WHERE "name" IN ( "system.maintenance_mode" ) AND "collection" = "state"',

With one exception in StandardPerformanceTest::testLoginBlock() where the above happens, but the same system.private_key is also added early on and right after:

'SELECT "name", "value" FROM "key_value" WHERE "name" IN ( "views.view_route_names" ) AND "collection" = "state"',

So every test gets one extra call to the key value store and one of them gets two.

kristiaanvandeneynde’s picture

The above makes perfect sense because the PermissionsHashGenerator used to find the calculated hash as follows:

    $cid = "user_permissions_hash:$role_list";
    if ($static_cache = $this->static->get($cid)) {
      return $static_cache->data;
    }
    // Also tries DB cache below.

But now, we do not store that hash in the DB anymore because we'd be repeating how it's stored in the AccessPolicyProcessor's cache. Meaning we always have to calculated a hash live at least once per request before we store it in the static cache. This is the extra query we're seeing for the private key and frankly is a slight false positive in a sense that on a cold cache we also need to request the private key at least once.

The fact that it's happening twice with the login block might be because your account ID changes and the following static cache logic doesn't find anything for you then:

    $cid = 'permissions_hash_' . $account->id();

    // Retrieve the hash from the static cache if available.
    if ($static_cache = $this->static->get($cid)) {
      return $static_cache->data;
    }
catch’s picture

The login block test does both the POST request for the form submission and also the subsequent GET request after the redirect, so that explains the two queries there I think. The extra key value query will go away aftr #2575105: Use cache collector for state so no concerns about that.

kristiaanvandeneynde’s picture

Individual query tracking issue got merged, so now you can see the changes for yourself here.

quietone’s picture

Issue summary: View changes

I was doing triage on #3417362: Call refreshVariables() where needed in various tests which I came to understand is a child of this issue from reading comment #47. I then added the other two issues mentioned in that comment as children. Then, I converted to the standard issue template so it is easier for more people to find information and know what still needs to be done here etc. And finally, I hide all the files.

kristiaanvandeneynde’s picture

Thanks @quietone, the only sidequest that still needs committing is the one you mentioned.

That will reduce the MR here to:

  • 6 code file changes (4 changed, 2 new)
  • 8 test file changes (4 changed, 1 removed, 3 new)
  • 1 core.services.yml change

Which is actually really small compared to how massively this changes the access layer in core.

kristiaanvandeneynde’s picture

#3417362: Call refreshVariables() where needed in various tests got committed, merged it in.

The MR is a lot easier to review now than it was a few weeks ago.

smustgrave’s picture

Status: Needs review » Needs work

This looks good and hopefully can get into 10.3

Left a few comments, all nitpicky stuff but will keep an eye out for this

kristiaanvandeneynde’s picture

Thanks so much for the reviews and changes @smustgrave and @larowlan.

I've gone over the threads and have the following left:

  • Ping @catch regarding the method change on the permissions hash generator: DONE
  • Check the 2 test cases that don't test much: Left a comment, you can decide what we do here.
  • Write a kernel test for user roles access policy: TO DO, will do so next week
kristiaanvandeneynde’s picture

Added change record here: https://www.drupal.org/node/3435842
All that remains now is the extra kernel test, on it now.

kristiaanvandeneynde’s picture

Status: Needs work » Needs review

All threads addressed, tests still go green.

smustgrave’s picture

Status: Needs review » Reviewed & tested by the community

All feedback appears to be addressed. Did see the slack thread and didn't seem like the tests were a stopper for 10.3

kristiaanvandeneynde’s picture

That's awesome, thanks for the review 🎉

larowlan’s picture

Status: Reviewed & tested by the community » Needs review

I announced my intention to commit this to other committers and was taking one last pass when I realised that there were two tests in RenderCacheTest, one for user.permissions (which is redundant now) but another for user.roles.

I've reinstated that tests with adjustments for the new code here. I also merged 11.x. Can you review and confirm that my changes to reinstate that test make sense. Fine to self RTBC if so.

larowlan’s picture

Issue credits

kristiaanvandeneynde’s picture

Status: Needs review » Reviewed & tested by the community

From #60:

Still in favor of deleting RenderCacheTest as a whole because the user.permissions test case is no longer valid and the only other test case (user.roles) will soon become invalid too after a recent discussion we had where said cache context should not be returning a magic string for UID 1. Issue is pending for that one.

The reason I mentioned it would be removed soon is because, and I can now say this, the user roles cache context has a security issue. It was, however, deemed minor enough that a public follow-up could be created. In essence, checking for user 1 in the roles cache context is a relic that has no right to be there. Because this issue fixes the same thing for the user.permissions cache context, I thought outright deleting RenderCacheTest would be fine.

But you're right that this is perhaps not the place and we should delete the rest of the test in that soon-to-be-created public follow-up.

Your test changes look good to me and as discussed on Slack that warrants setting it back to RTBC. Thanks for all the reviews!

kristiaanvandeneynde’s picture

For posterity, this is the public security issue I was talking about: #3436395: UserRolesCacheContext can lead to poisoned cache returns for user 1

  • larowlan committed 9bb46296 on 10.3.x
    Issue #3376846 by kristiaanvandeneynde, larowlan, smustgrave, catch:...

  • larowlan committed 8a2a2fe2 on 11.x
    Issue #3376846 by kristiaanvandeneynde, larowlan, smustgrave, catch:...
larowlan’s picture

Status: Reviewed & tested by the community » Fixed

Committed to 11.x and backported to 10.3.x🎉

Published the change records.

Congrats @kristiaanvandeneynde on a mammoth effort, its been a pleasure working with you and I look forward to further collaboration in your new role as User subsystem maintainer.

larowlan’s picture

Issue summary: View changes
Issue tags: +10.3.0 release highlights

Added release notes snippet, please review

baluertl’s picture

Woo-hoo! 🎉🎊

Thank you @kristiaanvandeneynde, @larowlan and all contributors for their hard work on this era-shifting feature.

larowlan’s picture

Version: 11.x-dev » 10.3.x-dev
kristiaanvandeneynde’s picture

Issue summary: View changes

This is awesome 🚀

Thanks for the many reviews to everyone involved!

I've update the release note snippet to indicate user roles are also an access policy.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

damienmckenna’s picture

Assigned: kristiaanvandeneynde » Unassigned
nitesh624’s picture

This changes caused out Functional test to failed
More details on https://www.drupal.org/project/drupal/issues/3463722

claudiu.cristea’s picture

Created #3476498: What happens with user.is_super_user cache context? as a followup to decide the fate of user.is_super_user cache context

danflanagan8’s picture

This broke a site where we extend PermissionsHashGenerator.

When calling parent::__construct($private_key, $cache, $static, $entity_type_manager); it goes WSOD because the PermissionsHashGenerator constructor got changed to accept three arguments instead of four. Beyond that, the `$cache` property is removed from PermissionsHashGenerator. Isn't that pretty big BC no-no kind of stuff?

I see there was a conversation about BC breaks (https://git.drupalcode.org/project/drupal/-/merge_requests/4494#note_284734) in contrib in at least on context, but I don't see any conversation around several other potential BC problems.

kristiaanvandeneynde’s picture

The changes followed the BC dance:

  1. We used to have __construct(A $a, B $b, C $c, ?D $d = NULL)
  2. Now we have __construct(A $a, B $b, C|E $e)
  3. , where extra code runs in case someone is still providing ABC or ABCD

Obviously in D11, the BC layer has been removed, but in D10.3, you should be fine.

danflanagan8’s picture

Thanks, @kristiaanvandeneynde.

Am I really allowed to pass four arguments to a constructor that accepts only 3 arguments? We were passing the EntityTypeManager as the fourth argument rather than relying on the already existing BC layer for that fourth argument (lots of work on this class lately!).

I'm still curious about removing a property from the class:

Beyond that, the `$cache` property is removed from PermissionsHashGenerator. Isn't that pretty big BC no-no kind of stuff?

My extension of PermissionsHashGenerator was using that property and therefore completely broke.

kristiaanvandeneynde’s picture

In php, you can pass as many arguments to a function even if it has only a few parameters. This is where some of this func_get_args() voodoo comes from in functions that accept any amount of arguments.

As to the $cache property, that's a tough one. Technically it's a BC break and at the very least we should have added a second trigger_error with a "there is no replacement" type of message for $cache like we did in doGenerate().

We could have kept assigning a cache backend to the $cache property, but then the question arises: "What's the point?" The actual class you were extending no longer makes use of it, so does it make sense for the extending class to still make calls to it?

Either way, this seems like an oversight and I apologize for the inconvenience. Personally speaking, I think this shows why we should have more classes in core marked as internal and/or final. One could argue the permission hash generator should not be considered API, as it's only consumed by one cache context and sent to the frontend for external caches.

I'd rather make it internal then and if people really want to change the outcome, encourage them to decorate it instead. This would retain our freedom to completely overhaul the internal hashing logic (like we did here), without risking people ending up in a situation like yours.

danflanagan8’s picture

I think you nailed it, @kristiaanvandeneynde:

I think this shows why we should have more classes in core marked as internal and/or final.

Agreed! I was kind of surprised wasn't internal.

Thanks for your comments to me and your work on the issue. Cheers!