Problem/Motivation
We have list_cache_contexts for entities, but this needs to be applied whenever a list of entities is created and can be forgotten. That's an issue on sites using the node grants system since it could lead to information disclosure.
Proposed resolution
Add the cache context automatically in node_query_node_access_alter(). This ensures that any entity query listing nodes will automatically bubble the user.node_grants:$op cache context.
Remaining tasks
None.
User interface changes
None.
API changes
No changes, only additional code an an existing alter hook.
Data model changes
None.
Beta phase evaluation
| Issue category | Task because not a bug, but a security improvement. |
|---|---|
| Issue priority | Major because it is not a release blocker, but is very important to protect against common mistakes/oversights that could lead to information disclosure security vulnerabilities. |
| Disruption | Zero disruption. |
| Comment | File | Size | Author |
|---|---|---|---|
| #90 | node_access_cache_context_bubbled_automatically-2557815-90.patch | 6.55 KB | wim leers |
| #16 | node_access_cache_context_bubbled_automatically-2557815-16.patch | 10.45 KB | wim leers |
Comments
Comment #2
wim leersUploading the PoC I posted at #2556889-40: [policy, no patch] Decide if SmartCache is still in scope for 8.0 and whether remaining risks require additional mitigation.
Comment #3
catchContext!
Comment #4
berdirMakes sense!
Comment #5
wim leersAnd here's test coverage.
Comment #8
wim leersGreen after re-testing, so PIFR & DrupalCI agree again.
Comment #9
effulgentsia commentedI haven't reviewed the tests yet, but the node.module change looks great. Meanwhile, how's this for docs explaining the rationale for having this in 8.x, but removing in 9?
Comment #11
effulgentsia commentedOh, Macs and your silly files. Removed.
Comment #12
wim leers+1, thanks for writing that comment!
Comment #13
dawehnerFor all those automatic magic ... is there a way that we can toggle them off, so we can find problematic parts of our code easier?
missing proper doc
Comment #14
effulgentsia commentedRe #13.1, we have:
'user.permissions'added as a required cache context incore.services.yml.MetadataBubblingUrlGeneratorused as theurl_generatorimplementation incore.services.yml.node_query_node_access_alter().Should we move all of that to something like a cache_context_safeguards.module? And if so, should we do something to prevent it from being disabled without massive warnings? Or are you thinking of some other approach to toggling off?
Comment #15
wim leersWow, #14 is a very interesting idea! We could even make that module hidden.
Comment #16
wim leersHere's a reroll to address #13.2.
I think addressing #13.1 will need broader changes. I like the idea in #14. But to do that for this particular patch (which changes
node_query_node_access_alter()), I think that's going to be difficult. Fortunately, we can apply the same tactic as in other places: add a decorating service that does the cacheability bubbling.Given those changes that I made in this reroll, I think the answer to #13.1 then becomes "service aliases", and we can do that in a separate issue, just like #14 proposes.
Comment #17
wim leersHere's a PoC patch implementing #14. Incomplete, but shows that moving the URL + node access bubbling can indeed be moved to a separate module that can be disabled.
Comment #19
wim leersWim--
Comment #20
wim leersAgain, Wim--
Comment #22
wim leersI'm making mistake after mistake. Clearly I can't multitask. Sorry for all the noise.
Per #2556889-65: [policy, no patch] Decide if SmartCache is still in scope for 8.0 and whether remaining risks require additional mitigation, this should not handle
MetadataBubblingUrlGenerator. #20 partially reverted that, so that's why #20 should also fail badly.Comment #23
wim leersThis makes the module required and adds an update hook, per #2556889-65: [policy, no patch] Decide if SmartCache is still in scope for 8.0 and whether remaining risks require additional mitigation.
Comment #25
wim leerscache_context_safeguardsimplies it's only about cache contexts. But in case of #2555665: When index is added for content_translation_uid, the corresponding stored schema definition is not updated, we know that it's also about cache tags (the current user's cache tag). So, renaming tocacheability_safeguards.Also moved the test coverage out of the node module and into this module.
Comment #27
wim leersThe failures that started appearing in #23 are because they're kernel tests, which means they don't get the
cacheability_safeguardsmodule, which means they no longer get theuser.permissionscache context (unless they'd installcacheability_safeguards).Comment #29
tstoecklerI don't think the module should be required. It's fine that we want it to be un-disable-able in Standard profile - or even in Minimal, that's fine with me - however, Drupal as a framework should not be making the assumption that you need such a special-purpose module in every single application. Instead we can implement
hook_system_info_alter()in Standard profile. In a perfect world we would have actual dependencies for installation profiles, but lacking that I think that would be the best option.Comment #30
wim leers#29: it needs to be required, because otherwise it's too easy for custom distributions/install profiles to forget to add this module and hence not have these safeguards. That concern was explicitly raised by Berdir. See #2556889-65: [policy, no patch] Decide if SmartCache is still in scope for 8.0 and whether remaining risks require additional mitigation.
This reroll allows sites to opt out by specifying a
cacheability_safeguards: falsecontainer parameter. A "no safeguards" module that undoes everything doesn't work: what if something else already is adding theuser.permissionscache context? We'd then be incorrectly removing it in the "no safeguards" module. Therefore, we must have some way to choose to not enable all these safeguards. An empty module to do so would be rather silly. Hence a container parameter seems the logical choice.This concludes this PoC. It's a working PoC, just needs some polish and more tests. This is now blocked on reviews. This now blocks #2558599: Automatically assign user cache contexts/tags when using current_user service.
Comment #31
wim leers@Fabianx remarked in IRC it'd be better if those kernel tests simply used the
required_cache_contextsspecified in the container. Done.Comment #32
almaudoh commentedModule name needs to be updated.
Comment #33
wim leersYes, and tests that would've caught that. Before continuing I want a +1 or -1 on this direction.
Comment #34
tstoecklerRe #30: Meh, yeah I guess that's valid. We really need inheritance for profiles and then this could be part of a base install profile....
Comment #35
moshe weitzman commentedIs an $op cache context sufficient? Doesn't the content vary by the grants for the current user (which is determined dynamically - ugh).
Comment #36
berdirThe context cache token will then be replaced with an actual value that represents the node access grants for that user. yes, it works.
Comment #37
wim leersIndeed. And we have test coverage for that:
\Drupal\node\Tests\NodeAccessGrantsCacheContextTest.Comment #38
effulgentsia commentedI like the patch's use of a container parameter for toggling the safeguards. If we have that, then I don't think we need it to also be a separate module. Perhaps the provider that is being safeguarded can also provide the safeguard? I.e., if
\Drupal\Core\Cacheis the provider of'user.permissions', perhaps it should also contain the classes needed to safeguard it, and same for\Drupal\nodewith respect to'user.node_grants'? Although that would mean the safeguard implementations aren't all centralized in the same namespace, it would save on needing a hidden required module, and it would setup a pattern for contrib: e.g., a contrib LDAP integration module could provide both a cache context for some LDAP info and a safeguard for it.Comment #39
effulgentsia commentedI struggled for a while on whether to promote this issue to Critical, either as a blocker for #2429617: Make D8 2x as fast: Dynamic Page Cache: context-dependent page caching (for *all* users!) or a blocker for RC once that issue lands. But I don't think it should be either. I think it's a great security hardening, so tagging as such, and would be very valuable to land near the same time as SmartCache itself, so tagging as "sprint", but I also think it would be acceptable to commit after 8.0.0 (either in a minor or patch release).
If there are concerns about the security of Drupal without this patch, then I think those security concerns should already exist with HEAD, independent of SmartCache, because of node_access queries performed within non-controller code that is already being render cached in HEAD: blocks, hook_node_view(), Views plugins, etc.
If, however, we find real security vulnerabilities related to this either before or after committing SmartCache, then we might need to promote this to Critical.
Comment #40
wim leersDiscussed yesterday with webchick and effulgentsia. The conclusion was that this should not be a separate (and hidden+required) module. The decorators should live in the modules providing this functionality, and should simply be possible to disable by setting a container
cacheability_safeguards: truecontainer parameter tofalse.This patch does that.
This direction also means that AFAICT no additional test coverage is necessary, beyond the (already existing) tests that ensure that the automatic bubbling of the cache context for using the node access query alter.
EDIT: only after writing this, I saw #38, which says pretty much the same.
Comment #41
catchYes in a previous discussion we realised we'd need a module that disables the container parameter, rather than just being able to uninstall this one, so no new module for the safeguards themselves is good.
Comment #42
fabianx commentedI think it would be better to have this fine-granular for each thing:
e.g.
cacheability_safeguards:
node_grants: true
user_permissions: true
etc.
On the other hand Symfony makes it hard to extend a parameter, so contrib/ could not use it ...
So I guess this is fine :/
The non_bubbling version should always be available, so one can depend on it in an edge case.
In that case it would just be an alias.
$container->setAlias('node.grant_storage.non_bubbling', ...);
( Syntax not sure, but pseudo code. )
Comment #43
wim leersGood point! Let's hear if others agree with that. Will do that in my next reroll, but looking for more feedback in the mean time. (I will likely be able to reroll late tonight, i.e. within 3–5 hours.)
Comment #44
effulgentsia commented+1 to #42.2. I don't know what the best implementation is (e.g., what should be an alias of what, see point one below), but I agree that services that know what they're doing should be able to ask for the non_bubbling service as a dependency. For node grants, the use cases for that might be edge case, but as a general pattern, it's not so edge case. For example, in #2558599: Automatically assign user cache contexts/tags when using current_user service, any module that wants to provide a context for a specific field of the current user will need to have a way to get that value without triggering the autobubbling of the 'user' context, and doing that via a dependency on
current_user.non_bubblingseems like the likely way to do that.A plus side of this approach (as opposed to defining
node.grant_storage.non_bubblingin node.services.yml) is that existing modules that implement providers that alternode.grant_storage(e.g., with a MongoDb implementation) can continue to work without a BC break, and then NodeServiceProvider just comes in and takes the altered definition, renames it tonode.grant_storage.non_bubbling, and then decorates it. I like this over breaking existing contrib modules that alternode.grant_storage. However, this seems like it would be sensitive on module list order. I.e., contrib modules that have service providers that run after NodeServiceProvider will then end up replacing the decorator, which is not what we want. I don't know if there's a way to solve this cleanly. If not, perhaps we'll be required to make the BC break of puttingnode.grant_storage.non_bubblinginto node.services.yml and publishing a CR for contrib modules to alter that one instead ofnode.grant_storage?s/!nids/@nids/. Also, should the test be expanded to test for this message (in particular, which nids)?
Comment #45
effulgentsia commentedNot sure about this name. It looks naked without a namespace. Perhaps
renderer.cacheability_safeguards?Comment #46
wim leers#44:
Exactly.
Ugh :( Excellent point!
I don't see an alternative. :/
#45: Works for me.
Keeping at NR particularly for #44.1. Will reroll tomorrow.
Comment #47
wim leersAddressed everything except the ordering problem pointed out by @effulgentsia in #44.1.
Comment #48
wim leers@dawehner mentioned http://symfony.com/doc/current/components/dependency_injection/advanced.... in IRC. Currently investigating that, it may address #44.1 :)
Comment #49
fabianx commented#48: Yes, indeed that is a great way to decorate services as you can always depend on the inner state. Should have thought about that.
This is a pure container builder thing, but defines this in a way nicer way.
Comment #50
wim leersSo, sadly, Symfony has used a very strange interpretation of the word "decorator" (which has a very specific meaning in the CS world, and they assign a very different one). They're not actually decorating a service, but .
http://www.php-schulung.de/2014/replace-service-keep-reference/ explains it more accurately (and the URL already captures better what it does than http://symfony.com/doc/current/components/dependency_injection/advanced.... extremely misleadingly describes).
EDIT: I forgot to mention the worst part: when decorating a service, you have to give it a different name. So it's only for providing "extended versions of a service" (that's also what Symfony's test coverage indicates), which you must choose to explicitly access. Rather than actually decorating a service.
Comment #51
dawehnerAs always things are an ordering issue. These services would use the the storage mechanism anyway which is executed after the aliasing so no matter which module order we choose for the modifiers, the fact that we use
will result into issues.
This makes me quite sad. We can decouple that by providing a renderer proxy and inject that manually.
Comment #52
wim leersI think the solution to #50 is quite simple. Just introduce our own compiler pass that runs very late.
That also makes it much easier for contrib to provide additional cacheability safeguards:
(See the last line there.)
Next: addressing #51.
Comment #53
wim leersThe
renderer.lazything in #51 can be done in a follow-up, it'd be useful in other places too, and doesn't need to block this critical.This reroll just removes the container-awareness and injects the renderer. So @dawehner's concerns are addressed, it'll just make things a bit slower when
node.grants_storageis used in e.g. a JSON response. But that was probably premature optimization on my behalf anyway.Comment #54
fabianx commentedRTBC, looks great to me now!
All my concerns are addressed.
Comment #56
wim leers@effulgentsia and I pair-programmed on an improved implementation, that does use Symfony's decorators. This improves clarity for people familiar with Symfony decorators and ensures that the service name swapping happens at the expected time, for example before
ResolveReferencesToAliasesPass.We also added unit test coverage for the compiler pass :)
Please also credit @effulgentsia.
Comment #57
fabianx commentedGreat work! Back to RTBC!
Comment #58
effulgentsia commentedMy credit box is already checked. Also adding credit to Fabianx and dawehner for their thorough reviews.
Since I pair programmed with Wim on the latest patch, assigning to @catch for final review/commit.
Comment #59
wim leersAdded beta evaluation.
Comment #60
dawehnerAh so the pass ensures that it will be late, nice!
Wrong docs
We should document here why we use the pass ... because its later than a ServiceModifier ...
Comment #61
catchJust a quick review right now, leaving RTBC because these are just questions.
Why are language and theme not safeguards too?
heh.
So we keep the original name but append a hash. Is there any chance this makes errors less googleable etc?
Comment #62
effulgentsia commentedTheme is a required cache context because the Renderer itself calls out to theme(). I think a follow-up to move it from a required cache context to something added by the renderer only where it calls theme() would be a reasonable thing to explore, but out of scope for this issue.
Interface language is a required cache context because a massive amount of core code calls t(). In theory, we could decorate TranslationManager, but that would add several stack calls to every invocation of t() for little gain: we'd still end up varying every rendered cache item by language. And even if we did do that, we'd want it always in effect, not conditional on the cacheability_safeguards parameter, for the same reason as #22 gives with respect to url().
The 2nd "non-" should be removed.
Right, the way Symfony service decorators work, the original service ID is what ends up holding the decorator service. I.e., after the container is compiled,
node.grant_storagecontains the definition for the bubbling implementation. Unfortunately, Symfony requires an ID to be passed to$container->register()even whensetDecoratedService()will be called on it, and therefore, that ID never used. If there's a container compilation error, I suppose that could lead to an error message exposing the temporary ID. Perhaps we should make it$id . '.decorator_alias_do_not_use.' . strtolower($random->name())to help with googling?Comment #63
wim leers#60: addressed both points.
#61:
Keeping RTBC because only docs changes.
Comment #64
effulgentsia commentedTo clarify, what I should have said is "that ID is never legitimately used." But, if we don't randomize it, but make it something predictable, such as
node.grant_storage.bubbling, then people will be able to write *.services.yml files that have@node.grant_storage.bubblingas a dependency. But that dependency would fail whencacheability_safeguardsis turned off. So randomizing the ID helps make it clear that it's not part of the API.Comment #65
wim leersHaha, #62's reply to #61.3 was what I was writing first, and then changed, because I felt that was the less important rationale :) Now catch has both aspects! :) Also, #64++ for that additional clarification :)
Good catch, fixed.
I don't think that's necessary. Because how could you ever see that service ID? It's all automated transformation with well-defined inputs. The only additional input, and hence also the only way to trigger an error, is if you specify a non-existing class name. I tried doing that, and this is what I got:
… exactly the same kind of error you get when you specify a non-existing class for any regular service definition in a
*.services.ymlfile.Comment #66
fabianx commented#64: To be precise that random name never occurs anywhere, it is just a dummy as can be seen here:
https://gist.github.com/LionsAd/a10b7d8bb17cebe35d92#file-container-deco...
Maybe we should move making user_permissions a safe guard guarded thing to a follow-up - or at least leave the parameter change - the code itself does not hurt.
Reasoning:
- Since adding that we do not know if core is still secure if you turned safe guards off, so that change is more invasive.
Comment #67
wim leersAt first I disagreed, but then I saw what you meant: all of core now was written/updated/tested with the assumption that the permissions cache context is added automatically. But, the same is not true for the node access cache context. As this patch shows, no additional responses with test coverage were found to be missing the node access cache context.
That's why they're different.
So, yes, let's do that in a follow-up.
Comment #68
Crell commentedMaybe I missed this somewhere, but can someone explain why we'd ever want the safeguards off? I mean, this is the code that makes entity queries (or at least node queries) not be an information disclosure. Why in Druplicon's name would I ever want it off?
The only reason I even could turn it off is if I am manually adding the appropriate cache info myself... but I my node-futzing module has no idea if a node grants module is enabled. That's by design. And even then, how would I know to add the cache info if in "dev mode" or whatever it's being added for me, so I don't know I need to add it?
Either I'm completely missing something here, or everyone else is. I give it 50/50 odds. :-)
Comment #69
wim leersThis is exactly why.
It doesn't matter that no node grants hook implementations exist on a certain site. As long as you're doing a
node_access-tagged query, you must associate that cache context. The node access cache context ensures that in case no node grants hook implementations exist, it always returns the same value, because only a single variation exists. (Single variation, single return value ⇒ correct.) You can check the test coverage if you want :)Comment #70
catchComment #71
fabianx commented#68: I am pretty sure no sane person ever wants to turn off the safe guards in production (I would not want to do that) - except it is very useful for testing, because as dawehner stated earlier:
- Explicit is better than implicit
Yes, nagging the developer is the "holy grail" and tell you as developer exactly what you need to do, but that needs a (non-BC breaking, purely internal) refactor in Renderer first.
Comment #72
wim leersI think this is the aspect that catch felt was missing in the IS?
I also wrote a handbook page: https://www.drupal.org/developing/api/8/cache/safeguards
Comment #73
catchYes #72 is what I was after, thanks!
Comment #74
plachThis looks great! I updated the IS to mention why
MetadataBubblingUrlGeneratoris not a safeguard.Would it make sense to statically cache this, so bubbling is performed exactly once for each
$op?Comment #75
catchSo thinking about the cache safeguards discussion, I'm wondering if https://api.drupal.org/api/drupal/core%21lib%21Drupal%21Core%21Entity%21... should also be adding the node grants cache context - other than in the implementation here.
Argument being:
- when you write an entity query, it automatically handles the node access query tag unless you explicitly disable it, this is the opposite of using the database API directly
- therefore it is not reasonable to expect people writing entity queries to think about node grants - unless they were explicitly trying to avoid them.
- therefore we might want the grants cache context to be added by entity queries, whether or not safeguards are on or not.
- if we did that, can we additionally send some information via the entity query (and possibly Views), to disable the safeguard version from running? That gives us the opportunity to throw E_DEPRECATED in here one day.
Comment #76
fabianx commented#74: No, it must not be statically cached, the reason being:
Consider two blocks which both show something with node_grants, but without adding the necessary context.
If Block A and B are not cached then and Block A is executed first then Block A gets the cache context added to their render array, while Block B does not => Information disclosure.
While currently no new RenderContext is opened, one can still semantically think of it like that:
- Whenever Renderer encounters a #cache element then (at least) it will be in a new context and hence all data that is added is part of the #cache entry there.
So the beauty of the Render Context and Render Stack system is that as long as #pre_render is used, things automatically end up within the right render element automatically, even if they are added out-of-bounds.
Comment #77
wim leers#74 is fully answered by #76 — thanks Fabianx :)
#75:
It took some searching to find where this is happening:
Agreed. But the current patch already only does things to queries that have the
node_accesstag. Therefore, when developers are using entity queries, they will get the cache context. And when they explicitly opt out, they will not get the cache context. So, this is already working :)I just didn't realize that that tag was being added by default. So all that is necessary, is to expand the test coverage here to test the three possible cases:
accessCheck === TRUE=>node_accesstag added automatically)/li>accessCheck === FALSE=>node_accesstag NOT added)/li>accessCheck === TRUE || FALSE,node_accesstag added manually))/li>I don't think this is a good idea. Because that means:
Rendererinto the entity query classBoth of those seem like a terrible conflating of concerns. I think the above solution, where a query tag is added automatically, is already sufficient. The cache context concern should not be handled by the query logic.
If the previous doesn't follow, then neither does this.
Comment #78
plach@Fabianx:
Thanks for the explanation, makes totally sense :)
Comment #79
Crell commentedWim: I'm still not following. What is the use case for disabling auto-tagging? What's the User Story for that? That's what I still don't get.
Comment #80
wim leersWhile awaiting a reply from @catch in #77, here's a straight reroll of #67 against HEAD.
Comment #82
catchWe don't need to do either of those things, because we have hook_entity_query_alter().
Comment #83
wim leersThe failures in #80 were legitimate, caused by #2072945: Remove the $langcode parameter in EntityAccessControllerInterface::access() and friends.
Comment #85
wim leersI'll work on addressing #77 + #82 tomorrow.
Comment #86
wim leersNow really on this again.
First, a rebased patch.
Comment #87
wim leers#82:
By working on this, I sadly discovered that
hook_entity_query_alter()is dead. There's onlyhook_query_TAG_alter(). Which seems to have been sufficient, sincehook_entity_query_alter()has been dead for three years without anybody noticing. So, filed an issue to update the docs accordingly: #2611638: hook_entity_query_alter() is dead, remove it from entity.api.php.Comment #88
wim leersSo with @catch's great insight in #75, this becomes vastly simpler. In fact, this then doesn't need a cacheability safeguard. Instead, this then simply means that the necessary cache context for node access queries is added automatically.
Which means there is nothing to safeguard, because it is always safe.
Retitling accordingly.
Comment #89
wim leersThere were also some hunks in there that made sense in earlier iterations of the patch, but now are out-of-scope, nice-to-have clean-ups. Removing those, which cuts the patch size in half and makes it much simpler to review.
Comment #90
wim leersClean-up.
I think this is ready for final review.
Comment #91
catchThis looks great to me and exactly answers #75.
Comment #92
effulgentsia commented#75 also says:
I don't see that implemented in the patch. Was there a decision somewhere that we don't want that?
Comment #93
catch@effulgentsia the patch now adds the cacheability in node_query_node_access_alter() - so as long as there is node access being enforced, the cacheability is enforced too.
We came to the conclusion that this shouldn't be a safeguard/fallback, but is the actual behaviour we want, so that part of #75 is outdated.
Comment #94
wim leers@effulgentsia chatted with me about #92 yesterday, and I gave him the same reply as #93, so that's great :)
Comment #95
effulgentsia commented@catch and I agree on this being rc target, but let's update the issue summary to reflect #93.
Comment #96
wim leersIS updated.
Comment #97
effulgentsia commentedThanks. I reviewed the patch again and it looks great, so RTBC.
Comment #98
fabianx commentedRTBC + 1
Comment #100
catchCommitted/pushed to 8.0.x, thanks!