Problem/Motivation
We have entity render caching. But its cache keys & contexts don't take into account entity access, field access, nor node grants. Entities are only cached per role by default. Hence, any access checking that depends on something other than role will effectively be broken, and will be disclosing information to users that aren't allowed to see it.
Proposed resolution
#2429257: Bubble cache contexts provides the necessary infrastructure: it automatically bubbles cache contexts. Therefore this issue is merely a matter of ensuring that whenever we call $entity->access()
or $field->access()
, that we use the AccessResult
object we can get from there, and set the associated cache tags & cache contexts on the render array.
Remaining tasks
Ensure the right cache contexts & tags are set for:
entities (entity access) — see #49 + #61fields (field access)the entities rendered by entity reference field formatters
User interface changes
None.
API changes
None.
Internal (protected) API change: EntityReferenceFormatterBase::needsAccessCheck()
signatures modified, which affects all custom/contrib modules' Entity Reference formatter classes subclassing that class and overriding it (which is very rarely needed; in core itself, it only happens for file formatters).
Beta phase evaluation
Issue category | Bug because HEAD may show inaccessible information without this patch. |
---|---|
Issue priority | Critical because security bug. |
Prioritized changes | The main goal of this issue is security. |
Disruption | Disruptive for contributed and custom modules that subclass EntityReferenceFormatterBase and override its ::needsAccessCheck() method. |
Original report by effulgentsia
Opening this after discussions at DrupalCon this week.
Field access and entity access modules complicate entity render caching quite a bit, couple of examples:
1. An entity reference holds 3 referenced entities but the current user only has access to two.
2. A field access module is preventing two fields from being rendered at all.
Access modules can be based on arbitrary granularity - forcing it to be per-user, but also potentially by location or some other arbitrary criteria.
We could use JavaScript replacement for these in the same way as #1991684: Node history markers (comment & node "new" indicator, "x new comments" links) forces render caching to be per user but this moves actual content into js replacement, rather than more incidental items like new/updated/edit links etc.
Another option is to allow entity and field access modules to add to cache granularity themselves - since they know exactly what criteria is going to be needed. i.e. a per-role field access module needs to do nothing, because entity render caching is already per roie. But a per-organic group field access module needs a hash of the current user's organic group memberships - and only it knows that.
This may be possible with existing alter hooks (hook_entity_view_alter()?) but we ought to be able to test that it works correctly at least - and it'll need to be documented somewhere that entity/field access modules need to do this. If that's not enough we may need to add an extra step or move things around.
Comment | File | Size | Author |
---|---|---|---|
#178 | interdiff.txt | 850 bytes | amateescu |
#178 | 2099137-178.patch | 41.3 KB | amateescu |
#175 | interdiff.txt | 797 bytes | amateescu |
#175 | 2099137-175.patch | 41.29 KB | amateescu |
#173 | interdiff.txt | 1.52 KB | amateescu |
Comments
Comment #1
Wim LeersThis also affects the node-counterpart for #2090783: Run comment op links (delete, edit, reply, approve + contrib) through #post_render_cache to prevent render caching granularity being per-user: the "Add child page" node link that is added by book.module is added if these conditions are met:
The problematic part there is
node_access()
, which callsNodeAccessController::createAccess()
, which callsEntityAccessController::createAccess()
, which invokeshook_entity_create_access()
andhook_node_create_access()
, allowing for the arbitrary logic described in the issue summary.Now, because node links are affected, and node links are part of the rendered nodes they're related to, this also affects render caching. In other words: book module plus
crazyadvanced node access hooks = broken render cache.Comment #2
Wim Leers(For more context to #1, see #2090783-20: Run comment op links (delete, edit, reply, approve + contrib) through #post_render_cache to prevent render caching granularity being per-user.)
Comment #3
xjmComment #4
Dave ReidI don't believe using JS to solve all these problems, let alone the entity links is a good solution based on what's going on in #2090783: Run comment op links (delete, edit, reply, approve + contrib) through #post_render_cache to prevent render caching granularity being per-user so far. But I'm also not able to offer better solutions at this time.
Comment #5
Dave ReidJust wondering at what point some things are just better uncached vs trying to fit them into the puzzle of trying to make everything cacheable. I think we underestimate how much of a logged in request is personalized, and how much contrib uses that ability.
Comment #6
Crell CreditAttribution: Crell commentedIt's not just personalization. With entity access specifically, the instant you turn on any node-access module (which includes Organic Groups and Domain Access) any Entity Reference field, View, or edit link becomes potentially per-user. That leaves only three options that I'm aware of:
1) Screw caching. (This is what D7 did. It wasn't good.)
2) Developer, you're on your own, good luck. (This would be horrible for DX, and since we're talking about access control it becomes a security hole.)
3) Some fancy-pants way to derive meaningful cache context at an appropriate level, with an API that developers can at least wrap their head around enough that we're able to make caching "default on" rather than "default off". Likely involves Javascript at least sometimes.
A gaping security hole is a no-go. Drupal 7's authenticated user performance was nothing to brag about. That leaves option 3 if we want to do better in Drupal 8 unless someone has a better idea. So far, no one has. (If you come up with one, I'd love to hear it.)
I freely admit that I hate render caching and render API, and have argued many times more than Catch and Mark would probably like that we should give up on it and just use block-level caching. However, I have to grudgingly admit that once node access enters the picture, block-level caching is frequently useless. I don't see an alternative to "fancy pants" or "screw it, don't bother caching".
Comment #7
eaton CreditAttribution: eaton commentedJust to be clear, it works perfectly fine in a number of common situations. Specifically for small to mid-sized sites, or large ones where the divide was between heavily-cached anonymous users and logged in users with varied levels of access. This work is definitely useful for the previously underserved scenario where visitors have a widely varied set of roles and access levels, so traffic cannot be effectively cached.
I just want to make it clear that "It wasn't good" is FUD. The work here is not to fix something that's broken; rather, it's to better handle a use case that was previously difficult to cache.
Comment #8
catchI'm not sure this is the right thread to be discussing whether enabling render caching by default is a good idea or not...
Just to clarify though, I don't think this is a small/mid vs. large sites issue. Small to medium sites have just as many problems with pages taking a long time to build in PHP as large sites do if not more. They're well placed to throw hardware at performance issues, or development resources to customize core/contrib output to make it cacheable. There's also plenty of medium-scale sites with lots of entity content and/or a high proportion of authenticated users.
For node/entity access in particular, not caching was viable - at least this allowed some caching for other sites as opposed to just not caching ever, and we could still implement that approach if we really have to for 8.x - however for situations where field or entity access is by role, that will break render caching unnecessarily, since it's already cached by role.
Comment #9
eaton CreditAttribution: eaton commentedWell said, catch -- thank you.
Comment #10
moshe weitzman CreditAttribution: moshe weitzman commentedAFAICT, this comes down to making sure there are alter hooks where access modules can change the cache keys. That, or the cache strategy changes to use 'cache by query' like technique that automatically works in an access control situation. This is what the forum block did in D7, and still does.
This is going to be a hard one to solve.
Comment #11
tim.plunkettWe're using the DX tag for issues that improve DX, not worsen it.
Comment #12
xjmDiscussed this with effulgentsia who thought this should at least be a beta target, maybe even a beta blocker.
Comment #13
effulgentsia CreditAttribution: effulgentsia commentedI got started on this, but it's still rough. CNR for bot.
Comment #14
effulgentsia CreditAttribution: effulgentsia commentedThis still needs work before it's ready for human review.
Comment #16
effulgentsia CreditAttribution: effulgentsia commentedI split off a small side issue that would be nice to get in prior to this one: #2218157: Add $display parameter to EntityViewBuilder::getBuildDefaults()
Comment #17
Wim LeersComment #18
Wim LeersI think #1991248: Decouple field access checks from formatters and widgets is related.
Comment #19
xjmIt's been a long time since we looked at this issue and @effulgentsia pointed out that the patch was rough to begin with, but at one point we were considering the beta implications of this. The original proof-of-concept patch was at least adding a method to FormatterInterface, and there's definitely security implications for access control and cache invalidation expectations.
Let's check on this issue and its implications for the pre-AMS beta sprint.
Comment #20
catchHere's how it looks atm:
So build defaults are added, then hook_entity_build_defaults_alter() is called, and this can change the cache contexts.
https://api.drupal.org/api/drupal/core!modules!system!entity.api.php/fun... gives a hint for this, could possibly mention some uses cases for additional docs.
I don't think there's anything else to do here though?
Comment #21
catchComment #22
yched CreditAttribution: yched commentedThis @todo points here, and is about cache tags needed by formatters, regardless of entity/field access.
Still not too familiar with hwo cache tags are collected, but do we still miss a way for formatters to publish their cache tags ? Do they bubble up if the formatter places a '#cache' entry in the render array it returns ?
[edit: sorry, in fact the code pasted here does stuff about cache keys, not cache tags... Still - do we miss a way to let formatters add either cache tags or cache keys ?]
Comment #23
catchA formatter-providing module could implement this alter hook.
Fabian and Wim have been discussing bubbling of cache contexts from child elements, that would allow this to be handled automatically for formatters (if they specify cache contexts correctly), not sure if there's an actual issue for this yet. Also, hard to get right. I think bubbling of cache contexts would be a good API addition, but shouldn't necessarily block release.
Comment #24
yched CreditAttribution: yched commentedSure, but if your formatter needs to implement a hook to not be simply broken, this kind of defeats "formatters are self-contained classes implementating FormatterInterface". Not a necessarily a realease blocker if implementing the hook does make your formatter work, but not ideal.
Looks like adding a new method wouldn't be such a big deal - main obstacle is that the problem space (bubbling of cache tags and keys) is not too clear to me, so I don't really know what this method would look like nor when it would be called. But if we could define precise specs for what is needed, implementing them doesn't seem like the hardest part ?
Also, if we can't answer those questions, then we leave it to each formatter to have figure them out themselves (that is, after having to figure out there's a potential cache problem to begin with) - which doesn't look great either ?
Comment #25
yched CreditAttribution: yched commentedThen again, I'm focusing on the case of formatters that need to set cache info, while the issue title here is about "entity access and field access modules", so maybe that's not the right issue for this.
"entity access and field access modules" is about 3rd party code altering behavior, and it's only fair that 3rd party code may need to implement some hooks. Feels less right for formatters, which are "the native, primary citizens in charge of rendering fields".
It's just that the HEAD code outlined in #20 / #22 has a @todo about formatters, and it points here. But maybe that issue link is wrong, and formatters should be dealt with in another issue ?
Comment #26
catchBubbling of cache tags is fine and should work already.
Bubbling of cache keys is not implementable - since the specific cache key can only be determined by running the formatter each time on a cache hit - and it's exactly the formatters we don't want to run. Chicken and egg.
Bubbling of cache contexts is just about implementable - the entity system knows which displays are configured to use which formatters, and hence could figure out the required cache contexts for each formatter and add those if necessary assuming that information is available. That means you'd need to run a method on the formatter, but it'd just be one that returns cache contexts (i.e. CacheableInterface).
Once you get into nested rendering of entities though this gets a lot trickier. The entity reference formatter would probably need to handle that based on the configuration (view mode etc.) - possible but gets complex.
Yes exactly. A lot of formatters would be covered by the default cache contexts, but clearly some won't.
Comment #27
catchYeah it's slightly different. Entity reference formatters will be affected by entity/field access modules, however entity ref by itself doesn't know whether one of those is enabled.
Comment #28
Wim Leers#20: what's missing is that field formatters should be able to specify their cache contexts and have that be applied automatically, rather than requiring each field formatter-providing module implementing
hook_entity_build_defaults_alter()
, figure out if its field formatter is present in the given entity in the given view mode and then apply a certain cache context.#22: cache tag bubbling works automatically and is trivial, because it's only necessary for invalidation. Cache contexts — which you can look at as "dynamic cache keys" (they're placeholders that are replaced with the relevant value for the current context) — on the other hand cannot be bubbled like cache tags by definition, because you need them in order to determine whether there's a render cache hit or not, to avoid rendering (and the bubbling that rendering includes). At least, it cannot be implemented by the Render API, because the Render API just takes a dumb data structure and renders it. It can be done by an API that knows more about the structure of entities: Entity API knows enough to be able to talk to the relevant formatters, ask if a cache context is necessary, and if so, add it to the cache keys.
(I now see #26 already answered this.)
#24: Agreed with .
#27: yep, formatters != access, but it's definitely related. But… ever since #2287071: Add cacheability metadata to access checks, one could argue that they're one and the same problem… because access results now include cache tags and cache contexts :)
effulgentsia and I were planning to work on this again after the D8 upgrade path issues are done.
Comment #29
andypostThis should affect comments formatter not only ER
Comment #30
yched CreditAttribution: yched commented@catch, @Wim : thanks for the explanations, makes sense.
Should we open a separate issue for "FormatterInterface needs a method to expose cache contexts" then ?
Comment #31
catchHmm, we could possibly re-title this to be about formatters, it might just be that the title is bad.
With entity/field access, is implementing this hook enough? If we were able to collect all entity/field access handlers and collect cache contexts from them (as Wim says), then we could do centralized cache granularity for that too, skipping the need to implement any hooks. However in #2287071: Add cacheability metadata to access checks we made cache contexts the return value of the access checks themselves and I'm not sure how much of an option that is for entities/fields. For a start it absolutely enforces a loaded entity when figuring out cache contexts since the cache context return value of the access check will depend on the entity (i.e. whether the current user is the author or not).
Another question here - if there's an entity listing, how does the listing know to request the cache contexts necessary? Then by extension what happens if that entity listing is nested itself?
I guess it can still do that by entity type + bundle + view mode. The bundles that can be shown in a listing could depend on the results of the query though. This means either a double cache (one for the results, one for the rendering, similar to Views), or potentially we could combine the cache contexts of all the formatters on all the possible bundles rendered by a listing to save that double cache hit. That might handle the listing, but you still can't then bubble that up to a parent element.
Comment #32
Fabianx CreditAttribution: Fabianx commentedI think it is okay to load the plugin to get the cache contexts, but this information should be cacheable, so we don't need to load all field plugins just to get the cache contexts.
So the next time we have an entity we can figure out all the cache_contexts needed by the field formatters with one cache()->getMultiple() call.
Unless there is huge LRU expiration in e.g. memcache, the cache contexts field formatter cache will always be populated when the data cache is cached as well.
Ideal? No.
But better to have more cache_get()'s. (and with the #pre_render pattern there is less cache->getMultiple() anyway in core), than to have to load the plugins.
This would even work recursively by returning a two-fold array with:
array(
contexts => array(...),
field_formatters => array(...), // More field formatters to get information about.
);
Comment #33
yched CreditAttribution: yched commented@Fabianx: we already have cached static metadata about formatters, it's the formatter's "plugin definition", specified in the
@FieldFormatter
annotation, and it makes little sense to add a new one if we have one already. So #32 sounds like the cache contexts needed for a formatter should be exposed as a 'cache_contexts' entry in the annotation ?Comment #34
xjmRelated issue: #2390691: Expose node grants as cache context.
Comment #35
Wim Leers#2390691: Expose node grants as cache context landed and will help with this issue. We probably want to always associate that cache context in
NodeViewBuilder::getBuildDefaults()
, btw.Comment #36
BerdirNot sure if that the right place, that mostly affects lists of nodes, specifically places that filter/query a list of nodes with access checks.
Comment #37
Wim LeersWell, for showing individual nodes, we use
$node->access()
, and inNodeAccessControlHandler::checkAccess()
we call into the node grants access system. Hence, we should? It may not always be necessary, but it never hurts?EDIT: it feels a bit wrong to do it for single node views, but I'm only making sure this is part of the discussion and ends up getting considered when we work on this issue.
Comment #38
catchRe-titling this to make it clear what the bug is.
Comment #39
effulgentsia CreditAttribution: effulgentsia commentedDiscussed this with the branch maintainers, and they agreed that closing this issue is a release blocker, whether closing it is fixing something in core or deciding and documenting what contrib's responsibility is.
Comment #40
Wim LeersComment #41
catchPostponing on #2429257: Bubble cache contexts.
Comment #42
catchComment #43
effulgentsia CreditAttribution: effulgentsia commentedI continue to be very interested in this issue and am very happy with the work happening in #2429257: Bubble cache contexts. No need for this to still be assigned to me. Whatever happens in that issue will change the playing field of this one, and in a very good way.
Comment #44
Wim Leers#2429257: Bubble cache contexts included in earlier iterations changes to entity reference field formatters and taxonomy field formatters (which are a special case of entity reference field formatters). But those formatters do access checking themselves (so not , but ), which puts them in this issue's territory.
So, at #2429257-36: Bubble cache contexts, we decided to remove those changes, to keep that issue sufficiently focused. yched and catch agreed with that, so updating the IS to ensure that is not forgotten.
Updated the IS.
Comment #45
Wim Leers#2429287: [meta] Finalize the cache contexts API & DX/usage, enable a leap forward in performance landed, this is now unblocked.
Comment #46
catchComment #47
Wim LeersFirst: a straight port of #2429257-36: Bubble cache contexts, courtesy of
git apply -R
.Next steps: address yched's feedback/the conclusion from the discussion in comments 17 through 30 on that issue. We agreed that
EntityReferenceFormatterBase::getEntitiesToView()
was the crucial piece to feed the right information to all entity reference formatters, but we wanted to wait with changing it because #2405469: FileFormatterBase should extend EntityReferenceFormatterBase hadn't landed yet, but now it has. So now we can actually try a better approach :)Comment #48
BerdirI assume that if we implement cacheableinterface for entities, then we can actually move that into a method on them?
Also, as mentioned in the bubble issue, I don't think this is enough, because this should IMHO actually use a configurable/parameterized cache context based on the language that the entity is actually being displayed in, not the global current language.
why a different check for language here?
I also still need to read up on the discussion here and the referenced issue, I don't fully understand yet how this is supposed to solve problems like custom access things and query alters, how the node grants context comes into this and so on.
Comment #49
Wim Leers#48: the #47 patch is full of warts, I said I was cleaning them up :) And here is that clean-up!
Per the IS, there are 3 high-level areas to address by this patch. It must ensure that the render array's cache contexts reflect:
AccessAwareRouter::checkAccess()
.) That makes sense. But a consequence is that the render array never gets the associated cacheability metadata. That's a good thing, because the route may have additional access checking logic.However, I wonder whether we really don't want to apply
$entity->access('view')
's cacheability metadata to the entity render array? ATM I think we do. Feedback most welcome.interdiff-field_access.txt
.interdiff-rm_taxonomy_changes.txt
+interdiff-erformatters_entity_access.txt
This is what the patch in #47, which originates from that other issue, is all about. Changes:
EntityReferenceFormatterBase
, so we can reuse the same code. Not doing that yet until we have reviews of the direction so far.ERFormatterBase::getEntitiesToView()
to accept the field's render array by reference, so that it can set the cacheability metadata for the inaccessible fields.ERFormatterBase::getEntitiesToView()
to set$entity->_accessCacheability
, in analogy with$entity->_referringItem
.ERFormatterBase::setAccessCacheability()
to simplify the applying of$entity->_accessCacheability
to the field item (delta) render array.ERFormatterBase
only have to modify 1 LoC + add 1 LoC.interdiff.txt
= combination of all aforementioned interdiffs, representing the total interdiff.I think we'll need to block this on #2443073: Add #cache[max-age] to disable caching and bubble the max-age. See the interdiff for an explanation why (grep for that issue ID). Do others agree?
Comment #51
Wim LeersFixing all the
\Drupal\system\Tests\Entity\EntityCacheTagsTestBase
-based tests that are failing.Comment #53
Wim LeersDrupal\field\Tests\FieldAccessTest
failed, which together with the test coverage added by #2429257: Bubble cache contexts, is probably sufficient to prove that this issue addresses field access cache handling?To fix this one, though, we enter once again the frustrating land of entity and field access, which has a default generated from both an entity-level default, a field-level default, which is then combined with the access results from
hook_entity_field_access()
. The default is stored under the array key':default'
, the others are stored under a key of their module name.Initial problem: Because
:default
is the first element in the array, it is processed first. And usually, that default isAccessResult::allowed()
(which is also cacheable by default), which means that the cache contexts for neutral results are ignored. We can fix this by appending the:default
access result to the end of the array of access results, so that it's really the fallback rather than the starting point. This actually makes sense too.(However… what we actually want is the same pattern as for
checkAccess()
andcheckCreateAccess()
, which was introduced in #2204363: [sechole] Returning TRUE from hook_entity_access()/hook_ENTITYTYPE_access() must not bypass EntityAccessController::checkAccess(), but for some reason was not applied tofieldAccess()
. But that's out of scope here.)Actual problem: On top of that, it just happens to be that for
FieldAccessTest
'shook_entity_field_access()
, we useAccessResult::forbiddenIf()
, which means we're either returning a neutral result or a forbidden result. But that also means that we actually always want our cache context to be taken into account, because for another value of the cache context it might return "forbidden", and hence that should be reflected by the rendered field: it should always contain our cache context, so that the rendered field is varied by the cache contexts for which the access results vary.Now, this is not actually specific to this particular use case at all. It's actually a general problem. Fortunately, there's an easy fix: make sure that when ORing non-forbidden access results that are both cacheable, we merge all cacheability metadata, always. This is not even an API break, it's an API addition, as can be seen in the updated test coverage.
Comment #55
Wim LeersFilterFormatAccessTest
, which is why the number of failures stayed the same. Easy fix; the test assertion now actually makes sense unlike before :)RFormatterTest
failures.getEntitiesToView()
setting properties on the field-level build array (to capture the inaccessible entities' cacheability metadata). This causes fields that would otherwise be empty to still show up, hence causing a bunch of test failures. Fortunately, an easy fix inFormatterBase::view()
is possible :)This patch will hopefully come back green :)
Comment #57
Wim LeersGreener.
RSSEnclosureFormatter
andRSSCategoryFormatter.php
didn't comply with the interface.Comment #58
Wim LeersUpdating IS to clarify the remaining tasks.
Comment #59
catchFor node grants I was thinking of a 'list cache contexts' method on entity types - then that could be used wherever a list of those entity types is built in either views or manually with entity query.
Looks like Daniel's just opened #2445743: Allow views base tables and entity types to define additional cache contexts which suggests much the same thing - however while we might want a separate issue for this, it feels like a hard blocker for this to be completely fixed (at least given current title).
Pondering the 'apply entity view access cache contexts to entity rendering' question, not sure on that.
Comment #60
Fabianx CreditAttribution: Fabianx commentedUh, so doing the equivalent of the Drupal 7 code in Drupal 8 would not check access?
I don't think mixing route based access with other things is a good paradigm for a basic object like entities. I can see that we want to return a 403 and that is why it was mixed, but this is just a convenience IMHO and maybe a discussion for a follow-up ...
And again, yes we need cache contexts set, else we make several other things impossible to do.
So if you do:
you need to return the 'user' cache_context.
If however you check access for another related entity, which is however not displayed, you would need to return a cache tag, so its properly expired, when someOtherEntity changes.
Hence #access is very much related to viewing.
--
The patch already looks great! Thanks, Wim!
Comment #61
catchHere's an example where adding the entity access (or listCacheContexts()) to a single rendered entity could be harmful for caching with no benefit in terms of accuracy.
A site has 15 organic groups (not that many).
98% of nodes are public.
2% of nodes are private to specific groups.
Users can be members of anything between 1 and 15 groups. That allows for thousands of permutations of group membership (6, vs. [1, 2, 3] v [1, 2, 4] vs. [1, 7, 8, 10]).
If you have access to a node, it's rendered the same regardless of access. In real terms you won't get every permutation of membership but you could easily get 15-30 viewing the same entity in the same time period.
Any time we add something like an administrative link, or a 'join this group' on a teaser that depends on the user's membership of the group, then they should be adding cache contexts anyway, or using post_render_cache.
Comment #62
Fabianx CreditAttribution: Fabianx commented#61: That makes a lot of sense. In addition the AccessInterface already supports the cacheable meta data I asked for. Thanks!
Comment #63
catchSo for listings, I think we should use listCacheContexts() from #2445743: Allow views base tables and entity types to define additional cache contexts so apply that to both views and entity query-derived lists of entities.
There's a further problem though which I don't think we need to solve here, but should open a major-ish issue to look at.
Anything that's not nodes is going to return nothing for listCacheContexts()
However custom access modules like taxonomy access, or modules that do lots of query altering like CPS get left out of this.
Something ugly conceptually, but which might work well would be the following:
We already have https://api.drupal.org/api/drupal/core%21lib%21Drupal%21Core%21Database%...
We can have a convention whereby query alters add cacheability metadata in there.
Then when building a listing you can use https://api.drupal.org/api/drupal/core%21lib%21Drupal%21Core%21Database%... to apply the cache contexts to the render array generated by the query. This would work for access as well as use cases like CPS (domain module might be another one).
Comment #64
Wim Leers… so the conclusion is: my suspicion was wrong. Specifically, I said this in ##4:
The answer is: no, we don't!
That is: an entity is always rendered the same, it doesn't vary by entity access. If entity access says I can't view an entity, then it won't get rendered (and hence also not render cached). If an entity says I can view it, then I see it. It's a boolean, where in one case I see something and in the other I see nothing. (Speaking strictly about entity access now.)
When entity access does come into play, is in listings of entities, where the entities selected for the listing depend on entity access. That's where the
listCacheContexts()
comes in, as catch mentions in #63. That will only work for node listings for now, since we only have node grants (not entity grants), and hence only anode_view_grants
cache context.But that's actually totally fine for now, since listings of entity types other than nodes currently are not access checked, so no need to vary by a cache context either! Entity access checking for those entity types is done via entity query tagging + altering.
There are two ways to handle non-node entity types:
$query->addMetadata('cacheability', new BubbleableMetadata()
to be set on entity queries (this is an existing API!) and then consistently doing$query->getMetadata('cacheability')->applyTo($listing_render_array)
when rendering the listing associated with the entity query. This hence allows any query tagging/altering to associate the necessary cache contexts.But even if we do 1, 2 would still be useful on its own, precisely because having a generic entity grants system does not mean it's impossible or useless to do further entity query tagging/altering.
Comment #65
BerdirBut entity query tagging + altering can be doing some sort of access checks (and it is how node grants are actually implemented).
So we do need cache contexts, we just don't know what kind of context :)
Question is, how do we figure it out? Maybe it's as simple as saying that custom/contrib modules that for example implement access checks through query alter for taxonomy terms (just to pick one example) are required to implement hook_entity_type_alter() and add their context there, otherwise it will be a security issue. I assume it also means anyone that lists entities must add those contexts or, security issue :) (Sounds like we are going to have a lot of SA's about this in 8.x contrib or maybe even core (all those custom listings, like forum/tracker/comment...) ;))
Edit: Grml, it would help to read everything.
Also, 1. (entity grant system) is not going to happen, not in 8.0.x.
Comment #66
Wim Leers(#65: talked to Berdir in IRC, there's no actual question there, he just came to the same conclusion after partially reading #64 :))
This adds
EntityTypeInterface::getListCacheTags()
.Tests needed for the list cache tags stuff, so I can update
EntityCacheTagsTestBase
to have one example entity query-based entity listing updated and tested. Being able to write those tests is blocked on (the very simple) #2445761: Add a X-Drupal-Cache-Contexts header to aid in debugging and testing. For Views-powered entity listings, we have #2445743: Allow views base tables and entity types to define additional cache contexts. I propose we open a separate issue for all entity query-based entity listings in core; this issue aims to provide infrastructure, it doesn't aim to solve all problems at once.As per #61–#64: marking remaining task 1 as done, since we have a solution for it.
That means this is now mostly blocked on yched/somebody else reviewing the entity reference formatter handling logic.
Comment #67
Wim LeersOnce #2433513: [PP-1] TaxonomyFormatterBase should extend EntityReferenceFormatterBase lands, we won't have to worry about updating taxonomy term formatters at all :)
Comment #68
effulgentsia CreditAttribution: effulgentsia commentedWhy do we need to separate inaccessible from accessible ones like this, and thereby require each formatter to call setAccessCacheability()? Why not merge all $access_cacheability results (for both allowed and disallowed) and apply them to $build? And that way not require the formatters to do anything other than call getEntitiesToView() which they already need to do?
Comment #69
Wim LeersBecause:
But it is possible, yes. It'd only benefit the few people writing custom entity reference formatters though, at the cost of less granular debugging for all developers.
Comment #70
effulgentsia CreditAttribution: effulgentsia commentedTo add to #69:
3. (practical reason) it's possible that a template prints a specific item without printing the rest, or prints a specific item in a different container than the rest of the items. In this case, the cache contexts/tags for the access of the specific item needs to follow the item.
Comment #71
effulgentsia CreditAttribution: effulgentsia commented#61 gives reasons for keeping access-of-an-entity contexts separate from the-rendered-entity-itself contexts. While that comment might not have been intended to also cover field items or other render elements, I wonder if the reasoning should be applied here as well, instead of the above code which merges them. In other words, what if we apply that reasoning to all of drupal_render(). For example, create a '#access_cache' render property that's similar to '#cache' but just for access. Or else, change '#access' to allow an AccessResult object, from which drupal_render() can apply appropriate cacheability merging logic. Or does that all seem out of scope for this issue?
Comment #72
effulgentsia CreditAttribution: effulgentsia commented"inherit inherit"?
s/as/than/?
Why is this only done in this if statement?
Is #48.1 saying that there's problems with this, and if so, can this move to another issue, or is there a reason it's necessary in this one?
For all the RSS ones, looks like we're not adding cache info (because there's no render array in which to do so). Do we need a followup to ensure that either RSS is never cached or else that the cache info gets propagated properly?
We set $file->accessCacheability here, but looks to me like the parent method overwrites it, or am I missing something?
Comment #73
Fabianx CreditAttribution: Fabianx commented#72.4: I think language is out of scope here as it is one of the most ricky parts to get right, so better in a follow-up.
Comment #74
Wim Leers#66 said tests can't be written yet because #2445761: Add a X-Drupal-Cache-Contexts header to aid in debugging and testing isn't in yet. Now it is, so now we can start to write tests here :) That's for the next reroll.
#70: Correct!
#71: I like the sound of a lot! But, yes, that's out of scope, that's for #2444231: Fix CacheableInterface so it makes sense (was: "Make Config objects & Entities implement CacheableInterface + add BubbleableMetadata::createFromCacheableObject()"). (There, we're looking at Entity & Config in particular, but really any object that affects the cacheability of a render array.)
At worst, we could block this issue on #2444231: Fix CacheableInterface so it makes sense (was: "Make Config objects & Entities implement CacheableInterface + add BubbleableMetadata::createFromCacheableObject()").
#72:
language
cache context (which encompasses all language types: interface, content, URL), but to also have a more granular cache contexts per language type. Because the current pattern across core of associating thelanguage
cache context is too coarse; very often we actually meanlanguage:content
. Doing that is blocked on #2432837: Make cache contexts hierarchical (e.g. 'user' is more specific than 'user.roles') reaching a conclusion and maintainer approval first. Filed an issue already though: #2448823: Add languages:<type> cache contexts.Also, we probably want *every* render array to have the "theme" and "language:interface" cache contexts (because each render array, when rendered, varies by the current theme, and because
t()
is present in almost every render array).X-Drupal-Cache-Tags
header. IMO we don't need a new issue for it; it's a well-known problem.Comment #76
Wim LeersI resolved the conflict incorrectly.
Comment #78
Wim LeersHah! Failed because #2396333: BlockContentBlock ignores cache contexts required by the block_content entity landed :)
Comment #79
Wim LeersThis still needs:
I'll be working on those things tomorrow.
Comment #80
Wim Leers#2445743: Allow views base tables and entity types to define additional cache contexts is RTBC and includes the interdiff from #66, including the two test coverage @todos mentioned in #79. So this is soft-blocked on that issue. Once that patch lands, several K of this patch can be removed, and we'll be down to being blocked on #2443073: Add #cache[max-age] to disable caching and bubble the max-age.
Keeping at NR because reviews are still welcome in the meantime.
Comment #81
Wim Leers#2445743: Allow views base tables and entity types to define additional cache contexts landed, now just blocked on the RTBC #2443073: Add #cache[max-age] to disable caching and bubble the max-age.
Comment #82
Wim LeersRerolled by rebasing, because #2445743: Allow views base tables and entity types to define additional cache contexts got committed, which included the #66 interdiff. Consider that the interdiff.
Yay, slightly smaller patch! :)
Comment #84
Wim LeersI rebased incorrectly. I think there's still going to be a few failures in
ShortcutCacheTagsTest
, but this will fix the majority for sure.Comment #86
Wim LeersAnd green again!
Comment #87
Wim Leers#2443073: Add #cache[max-age] to disable caching and bubble the max-age just landed, this issue is now fully unblocked. Now fixing the last @todo.
Comment #88
Wim LeersEt voilà!
P.S.: I forgot to remove the obsolete @todos in #82, so did that too.
Comment #89
Wim LeersTwo remarks about this patch:
This default cache context for rendered entities will be removed by #2453059: Set default render cache contexts: 'theme' + 'languages:' . LanguageInterface::TYPE_INTERFACE. Then there will be zero default cache contexts for entities, which is exactly how it should be!
This overlaps with #2444231: Fix CacheableInterface so it makes sense (was: "Make Config objects & Entities implement CacheableInterface + add BubbleableMetadata::createFromCacheableObject()"). Ideally that patch lands first — it's a soft blocker.
Comment #90
Wim LeersUpdating the IS; this issue doesn't change any APIs; the changes were made in #2445743: Allow views base tables and entity types to define additional cache contexts and #2443073: Add #cache[max-age] to disable caching and bubble the max-age.
Comment #92
Wim LeersTrivial fix.
Comment #93
YesCT CreditAttribution: YesCT commentedThis still has the needs tests tag. Does it still need tests? I looked and some tests are updated and some are added already.
No api changes now, does that mean we do not need a change record?
Comment #94
Wim LeersI think we might want integration tests to verify that field access checks' cache contexts are indeed bubbling. OTOH, we have a lot of test coverage already for bubbling. That's why I left the "Needs tests" issue tag: because I'm not sure, and it depends on judgement.
Comment #95
yched CreditAttribution: yched commentedSorry for the delay. Lots of catching up to do here, I started looking at the patch, but it's likely that I won't be able to get to the ERFormatter parts before friday.
Meanwhile, a couple remarks on the parts I did read.
Wow, that's some tricky logic to follow along (despite the great comment :-)).
Could we maybe extract the last part of the condition, that is the same in both tests, to a variable, to avoid a sequence of &&s that force a vertical scroll ?
(also, such boolean combinations of method calls are not too friendly when stepping through a debugger trying to figure out why we go in an if branch or not :-p)
So the render array structure is now different depending on field access. I think we did that at some point (can't remember if it was CCK D6 or core D7), and IIRC that unpredictability was painful for people doing alters.
Possibly related thought : for performance reasons, we so far intentionally avoided providing per-formatter alter hooks (like we have hook_field_widget_form_alter() on the widget side), and the only way to alter formatters render array is hook_entity_display_build(), on the entity render array as a whole. And that is then specifically exposed to the unpredictability DX issue mentioned above.
That was before we had entity render caching though. Maybe we should reconsider that now (we mentioned that possibility in the longish issue that removed/renamed D7's hook_field_attach_view_alter()), and introduce a
per-formatter hook_field_formatter_view_alter().
Since that would only be called *if* we call $formatter->view() to begin with, the DX issue disappears for that use case (which IMO is the main use case for altering formatters render arrays)
Nitpick : BubbleableMetadata::createFromRenderArray($build_list[$id][$name]) feels a bit weird in the case access was denied and $build_list[$id][$name] is empty :-)
Any way we can prepare BubbleableMetadata::createFromAccessResult($field_access) upfront, and merge the formatter's metadata only if we do run the formatter ?
Like the following ?
A bit more costly, we have code in template_preprocess_field() that specifically avoids doing that - see the (stale) reference to element_children() in there. That was a measurable perf impact back in D7.
If we consider this a non issue now, should we add a @todo in template_preprocess_field() that we don't need that convoluted construct in there either ?
Nitpick :
- is this "access cacheability metadata" or rather "access and cacheability metadata" ? (just wondering)
- this is letting the metadata pass-through rather than actually setting them
- IIRC, we avoid contractions ("we're") in comments ;-)
Comment #96
Wim LeersPer #2433513-11: [PP-1] TaxonomyFormatterBase should extend EntityReferenceFormatterBase, the taxonomy term formatters actually are removed as part of #1847596: Remove Taxonomy term reference field in favor of Entity reference already. Updating IS and related issues.
Comment #97
rteijeiro CreditAttribution: rteijeiro commentedRe-rolled after #1847596: Remove Taxonomy term reference field in favor of Entity reference is in.
Comment #99
rteijeiro CreditAttribution: rteijeiro commentedAddressed @yched points 1,3 and 5 in #95. Not sure about 2 and 4.
Comment #102
yched CreditAttribution: yched commentedFinally looked into the ERFormatter parts - wow, this does complicate the overall logic :-/
As commented below, I think we can simplify that a bit, though.
The logic seems reversed ?
The needsAccessCheck() submethod was added in #2405469: FileFormatterBase should extend EntityReferenceFormatterBase as a way to avoid running useless access checks (and thus effectively grant access) if the handler *is* FileAccessFormatterControlhandler
Also, the code comment should be adjusted for the new semantics of the method. But see next point about that.
The current code only evaluates $entity->access('view') if needsAccessCheck() was TRUE, while the new code wil evaluate it no matter what, which is a big perf impactfor file / image formatters. We'd really need to keep the logic here in a way that allows us to conditionally avoid running $entity->access()
Not ideal that getEntitiesToView() is not just a getter anymore but also has a side effect on $build.
It's also not helping that the $build that needs to be passed here is not the same as the one that is passed to the setAccessCacheability() helper below (one is the render array for the whole field, the other is the render array for an individual item)
More thoughts below.
If that needs to be called on each of the $build[$delta] sub-arrays, then maybe it could be done automatically by having ERFormatterBase::view() do it after calling viewElements() ?
FormatterInterfaceview::view() is the common external-facing API that wraps the call to viewElements(), which is where each formatter does its specific job. No one currently overrides the view() implementation provided by FormatterBase, but this seems to be a good case ?
Also, if we put logic in ERFormatterBase::view(), then maybe it could also take care of the
$inaccessible_entities_cacheability->applyTo($build_of_the_entire_field);
thing, meaning getEntitiesToView() can go back to just receiving $items ?It would need to set $items->_inaccessibleEntitiesCacheability so that view() can read it back and unset it - not ideal, but it's not as if we didn't already do similar stuff already with ->_adhoc_properties :-)
Comment #103
yched CreditAttribution: yched commentedAlso, regarding @rteijeiro's patch in #99 (thanks !) :
- Re: my own #95.5 :
Looks like "access cacheability metadata" is a thing after all :-), so that part of interdiff #99 should be reverted. My bad.
- The $bothCacheable var is defined in within an if() branch, but is also used outside ;-)
Comment #104
rteijeiro CreditAttribution: rteijeiro commentedFixed @yched points 1 and 2 in #102 and also issues in #103.
Regarding point 3 and 4 in #102 I wonder if I have to revert all the changes in @wimleers interdiff
interdiff-erformatters_entity_access.txt
from #49.Comment #106
Wim Leers#97:
::cachePerRole() doesn't exist anymore since #2428703: Add a 'user.permissions' cache context (was: "Should cache contexts be able to associate a cache tag?").
Plus, the #97 interdiff doesn't make sense. And finally, the test failures indicate that the rebasing was done incompletely: several hunks were simply removed instead of rebased.
#99: I disagree with e.g. point 1 (explanation follows), plus we don't use camelCasing inside methods, only at the class and class method naming level.
Due to the above, and the fatal errors during test runs, it's very hard for me to continue where @rteijeiro stopped, so I'm restarting his work from #92. The work was not in vain though, it allowed me to e.g. see a problem with addressing @yched's #95.1 remark.
Starting with a fresh rebase, and with an interdiff relative to #92. The interdiff shows the changes that had to be made that weren't simply conflicts; they're changes necessary to follow recent changes in HEAD. In this case, they're all caused by #2428703: Add a 'user.permissions' cache context (was: "Should cache contexts be able to associate a cache tag?").
Next: addressing all of @yched's feedback.
Comment #107
Wim Leers#95:
hook_entity_display_build_alter()
, basically instead of doingisset($build[$field_name])
, you'd be doing!empty(Element::children($build[$field_name]))
.I'm sympathetic though wary about adding the hook you suggest: it's fine when you have a warm cache, and thus it can be considered fine for render-cacheable entities. But not every entity is necessarily render cacheable. So, if we choose to go down that path, we're almost saying that we're giving up on cold cache entity rendering performance.
RendererInterface::addDependency()
. So we'll be able to reply those 3 complex lines with this:Much nicer, isn't it? :)
Added a
@todo
for this.FormatterBase
, the expensive work intemplate_preprocess_field()
is now no longer necessary, right? Because AFAIS this is a net performance win, because we now avoid rendering an actually empty field using the 'field' Twig template, whereas in HEAD we don't avoid that.#102:
So you need to return
TRUE
to NOT bypass access checking. The problem with this approach is that::needsAccessCheck()
can have arbitrary logic, which means it's really a "default access result", but without the cacheability metadata. Assume for a second its logic depends on some config. When the config changes, so should the access result. But since::needsAccessCheck()
can't return cacheability metadata, this is impossible.Rather than coming up with very convoluted ways to fix this, the simplest possible approach is to not have a method (
::needsAccessCheck()
) that says whether access checking must run; instead we have a method that contains a default access result. That's what the patch does.So then HEAD saying "nope, I don't need access checking" becomes returning
AccessResult::allowed()
in the patch: defaulting to granting access is equivalent to not checking access at all.And HEAD saying "yep, I need access checking" becomes returning
AccessResult::neutral()
in the patch: defaulting to not having an opinion about access is equivalent to checking access, since access wasn't granted yet.So, the specific example you quoted is saying "I grant access and thus bypass field-level access checking if this is not a file-access formatter control interface subclass".
::orIf()
is carefully optimized to not retrieve the second operand's value if the first operand is already granting access (or forbidding access). Therefore, if::defaultAccess()
returnsAccessResult::allowed()
, then no extra work is done.EDIT: Oh, but the second operand is already a calculated access result :/ Hence my point is completely moot. Damn. Oh well, easy fix. See interdiff.
Good point about those two different
$build
s. The@param
docs do distinguish between the two, we could make that part of the variable name too to stress it more? Or change the method name ofsetAccessCacheability()
?TableFormatter
:i.e. it's not guaranteed that the field formatter has a
$items[$delta]
structure. It may be any structure. Thoughts?I like your proposal in the last paragraph a lot, but I don't think it makes sense without the first part, which the above seems to show is impossible.
Comment #110
yched CreditAttribution: yched commentedHaven't processed #107 yet, but :
I guess you mean $access = $access->orIf($entity->access()), right ? ;-)
Comment #111
Berdir@yched: No, the and/orIf() methods require an AccessResult object, not a boolean.
Comment #112
Wim LeersHah! #2453059: Set default render cache contexts: 'theme' + 'languages:' . LanguageInterface::TYPE_INTERFACE landed before the above patches could be tested, hence breaking this patch again :D
Comment #114
yched CreditAttribution: yched commented@Berdir: sure, but I mean $access already equals $this->defaultAccess(), so no need to call that method twice, right ?
Comment #115
Wim Leers#114: lol, yes, of course. Fixed.
Fixing the failures in #112. Should be green now. (Well, I'm assuming the
MigrateFileTest
was a random failure — I can't reproduce it.)Comment #116
YesCT CreditAttribution: YesCT commentedadding note about tests from #94 to summary so it does not get lost.
Comment #117
Wim LeersThanks, @YesCT :)
Comment #118
yched CreditAttribution: yched commentedre @Wim #107 :
re #95 1. 2. 3. : fair enough :-)
re #95 4. : What I mean is that this patch introduces an Elements::properties($field_render_array) (aka element_children()), which is something we strove hard to avoid back in D7 in field_preprocess() because it had a noticeable perf impact :-) That code explicitly avoiding element_children() is still in D8's template_preprocess_field(). So:
- if Elements::properties($field_render_array) is not a good thing over there, it's not a good thing to add earlier in the callstack either ?
- Or we don't care now because render cache, and then we should open a followup to make template_preprocess_field() use the straightforward code now that it's acceptable ?
re #102.1 : right, I was the one reading the patch logic backwards :-). But then I think it would be simpler and clearer to remove the split logic and move it all in one place into a protected checkAccess($item) helper method in ERFormatterBase :
- ERFormatterBase::checkAccess($item) just returns $item->entity->access('view'...)
- FileFormatterBase::checkAccess($item) does "if (FileAccessFormatterControlHandlerInterface) { return $item->entity->access('view'...) } else { return neutral }"
--> Avoids the "if you return this I'll react by doing that" dance, keeps the logic in one place, with a clear override mounting point for subclasses that need their own item access logic ?
re #102.3, .4 : Aw - right, some formatters munge several items into one single chunck of HTML / render array, I should know about that. Sad panda :-/
Mulling on that a bit more, cause the flow around getEntitiesToView() and cache metadata in the current patch is a DX/maintainance regression IMO.
Comment #119
Wim LeersRE: #95.4: Ah,
FormatterBase
in HEAD just checks if$elements
is empty, rather than checking if there are renderable children. If we don't do this, then inaccessible fields will still be rendered (but end up just showing the field label). I'd very much like to keep the optimization you made, but I don't see how we can do that while also not "bleeding" the labels of inaccessible fields.OTOH, this patch makes this change:
This means that:
Element::children()
), but we did have an extra recursion call to render (drupal_render()
) the inaccessible child, which we now avoid. That recursion call was ended very quickly, though, since the very first thing thatdrupal_render()
does is checking the value of#access
to try to abort early.So… what if we:
#access
#access == FALSE
, the cacheability metadata is bubbled (currently that doesn't happen, if#access == FALSE
, rendering just returns, end of story)OTOH, the answer is
, so we can actually do(Sorry for the long answer, just listing all options.)
RE #102.1: Wow! So very elegant. Why didn't I think of this? :D @yched+++++ Done!
RE #102.3, .4: agreed, would be great if we could simplify this. I guess we could do the
::view()
idea you had, but then also implement::view()
inTableFormatter
. The 90% case likely is going to follow the$items[$delta]
structure. So we can optimize for that, though then make the 10% case even harder?Thinking some more about this… you could say the current implementation strives for perfection in terms of the cacheability metadata: it sets the cacheability metadata of the access checking for a given referenced entity on the exact render subtree that will the rendered representation of the referenced entity. Generally speaking, this is important to allow for advanced optimizations (such as automatically rendering "too dynamic/too personalized" things using ESI/BigPipe). But I think preventing those optimizations to not go more granular than the field level is totally acceptable.
If that is the only consequence, then we can simplify all of this tremendously by just setting the cacheability metadata on the entire field. Then the ERFormatterBase-subclass-has-to-set-the-referenced-entity-cacheability-on-the-right-render-subtree problem goes away entirely. Let me try that in the next reroll.
Comment #120
Wim LeersRE #102.3, .4 as described in #119: done.
I also realized that this may actually be even better. Because it's essentially the same as what Views does. In a sense, an entity reference field is like a view: we're listing a certain set of entities, and in both cases, we perform access checking to determine which entities will be listed/rendered. But then the entities that are rendered, are rendered in exactly the same way as elsewhere. In the patches before this one, we were doing:
which means that we MAY effectively be modifying the cache contexts used to view the entity (if the access result has any cache contexts set), causing a different cache ID to be generated for rendering the entity, thus potentially causing a cache miss.
In other words: just like in a view, we're rendering entities. The entities themselves are rendered exactly the same as they are in other places. Just like a view, an ER field only cares about whether a referenced entity must be rendered, yes or no. Therefore the access cacheability metadata belongs at a level higher than the rendered entity. And this change does exactly that! Fixing the DX problems in the process.
Comment #121
Wim LeersFixing two nitpicks I noticed while working on this.
Comment #124
Wim LeersComment #125
Wim Leers#2444231: Fix CacheableInterface so it makes sense (was: "Make Config objects & Entities implement CacheableInterface + add BubbleableMetadata::createFromCacheableObject()") just landed, breaking this patch.
Comment #127
Wim LeersNext: resolving the two @todos in the patch that were blocked on #2444231: Fix CacheableInterface so it makes sense (was: "Make Config objects & Entities implement CacheableInterface + add BubbleableMetadata::createFromCacheableObject()").
Comment #128
Wim LeersSo, this is about resolving those two @todos, as mentioned in #127.
Darn. We didn't want every access result to be forced to provide cacheability metadata. As a consequence, we cannot actually use
RendererInterface::addDependency()
for access results. Because access results only need to implementAccessResultInterface
; it's specifically designed that implementingCacheableDependencyInterface
is optional for access results.So that basically means that
RendererInterface::addDependency()
(which was added in #2444231) still is only useful for entities & config objects. Because it may fail on an access result object, since it may not implementCacheableDependencyInterface
, despite that being the case 99% of the time.If we don't want to change that, then the best we can do is this:
… but at that point, we might as well just stick with
The options are:
RendererInterface::addDependency()
when possible, fall back toBubbleableMetadata
otherwise. Worst of both worlds.AccessResultInterface
extendCacheableDependencyInterface
, thus forcing every access result to provide cacheability metadataRendererInterface::addAccessResult()
, which must be used for access resultsI've attached interdiffs that show the consequences of the 3 possible choices.
Note: this actually belongs in a separate issue, but given the good examples in this issue, I figured we should start the discussion here.
Comment #129
Fabianx CreditAttribution: Fabianx commentedD) Allow to use BubbleableMetadata as an added dependency, e.g.
$this->renderer->addDependency($build_list[$id][$name], BubbleableMetadata::createFromAccessResult($field_access));
Not sure if that works, but that or B) would be my preference.
B) could provide a BaseClass that can be extended when no CacheableMetadata is preferred.
Comment #130
effulgentsia CreditAttribution: effulgentsia commentedWhat about E) change RendererInterface::addDependency() to not typehint on CacheableDependencyInterface, and instead make it check if the passed in object implements it, and if not, then merge in a maxAge=0 instead.
Comment #131
effulgentsia CreditAttribution: effulgentsia commentedI'm not sure about this and the corresponding interdiff of #120. See #70 for why. I like the code simplification but we should perhaps think through that situation more.
Comment #132
Fabianx CreditAttribution: Fabianx commented#130:
I am okay to support any object and determining if its a supported interface (CacheableDependencyInterface, AccessResultInterface, ...) on runtime as 'instanceof' is a really cheap operator. Mixed is not nice, but its the best DX and returning 'uncacheable' (max-age=0) also sounds good when its not supported, but that might need more thought.
#131:
I thought about that and I don't see a problem.
If you remove individual rows from the ER formatter, then you use an alter hook.
That is:
a) Either always the same
b) If its not the same, you need to specify cache context metadata on what you vary that decision
Could you elaborate some more on that?
How would we deal with an embedded view where you choose to skip some rows as well - if we need to treat this differently for ER?
Comment #133
Wim LeersWent with option E, described in #130, because Fabianx +1'd it, and I can't think of a better choice. I hate giving up typehinting, but in this case it's as justified as it gets.
In this reroll:
BubbleableMetadata::createFromObject()
to no longer have typehinting.BubbleableMetadata::createFromAccessResult()
, since it's now obsolete (thankfully!).Renderer
andBubbleableMetadata
.(Note: we could split this off into a separate issue, that'd make this issue more reviewable again.)
Comment #134
Fabianx CreditAttribution: Fabianx commentedYes, lets split the last interdiff off.
The interdiff itself would be RTBC immediately. E) looks great!
Comment #135
effulgentsia CreditAttribution: effulgentsia at Acquia commentedSo, the case I'm thinking of is:
field_er
) to Article.node--article.html.twig
override.{{ content }}
, printIn the above case, the cache contexts of the first (0th) item would not bubble up to the node, because with the current patch, those contexts are on the field (
content.field_er
) rather than on the item (content.field_er.0
) and the template is printing the latter only, without also printing the former. So, for example, if that first item is an unpublished node, then the context of per-user (due to "view own unpublished content" permission) doesn't bubble up and we get information disclosure if the node's author is the first to view the item and populate the cache for other users who shouldn't see that unpublished node.One could argue that this kind of template override is already not properly supported. For example, if
$content['field_er']['#access']
isFALSE
, then the above template would also end up printing an item of a field that shouldn't be visible. But, that seems a bit different to me, because it's one thing to skip the access check of the field when we're short-circuiting the field, and another to skip the contexts of what is logically part of the item, but that we attached to the field for expediency only and not for any logical reason.I haven't looked in-depth into the Views case, so I don't know, maybe the same problem exists there too. In which case, I'm fine with proceeding here with what's consistent, and having a followup to fix (or choose not to fix) both.
Comment #136
Wim Leers#133 + #134: created a separate issue for that: #2464877: Update RendererInterface::addDependency() to accept *any* object, not only CachableDependencyInterface objects. Please review!
#135: Woah! I wonder if digging down from the entity level (it's an entity template) all the way into the item level is really a thing? It makes sense that it's possible, so I guess it is… But it also breaks in-place editing, for example. For in-place editing to work, we need the field template (
field.html.twig
or overrides) to be used!It feels like a violation of separation of concerns to drill down so far. I know it's supported, but it's also something we've actually moved away from in D8: this is related to why we removed
hook_page_alter()
in D8 too.But it sure is an interesting problem!
Comment #137
Fabianx CreditAttribution: Fabianx commented#135: #136: That is indeed an interesting problem.
I wonder if we need to change the way rendering render arrays works, which would make an approach similar to #953034: [meta] Themes improperly check renderable arrays when determining visibility critical.
They way that I think about it is:
If we see x.y.z, do drill down to y, hide all except z, drill up to x, hide all except y, then show x and revert the hidings.
Not sure that would solve it (have to think more about it), but this is _technically_ feasible in Twig (and faster than what we currently have likely).
Comment #138
effulgentsia CreditAttribution: effulgentsia at Acquia commentedNot sure about that as the desired solution for the general case. Generally, if I print
{{ x.y }}
as its own thing, I am specifically not wanting the wrapper markup provided by the theming ofx
. However, I like the idea of such a drill-down implementation if it were opt-in, such as{{ x|only('y') }}
.I'm ok with field items printed outside the context of a field to not be in-place editable. However, if we had something like the "only" syntax proposed above, that would be a nice way to print a single field item with its full field context.
In any case though, this issue is a critical bug, and the patch solves 99% of it, with #135 being a bit of an edge case, so I opened a separate issue (#2465377: Long-standing architecture flaw: Directly printing a nested element skips important processing (including #access) of parent elements) for that that doesn't need to block this one.
Comment #139
Wim LeersOk, so with that being put in a non-critical follow-up issue at #2465377: Long-standing architecture flaw: Directly printing a nested element skips important processing (including #access) of parent elements, what remains to be done here is:
IS updated.
Comment #140
yched CreditAttribution: yched commentedre: #95.4 : ok, works for me. Can we add a @todo in template_preprocess_field() ?
re: #102.3 : Only assembling items metadata at the field level does indeed simplify the logic for each implementation. I can't say I'm fully at ease with the idea, though, because of the template manipulations @effulgentsia mentioned in #135.
If we do that though, then we could go back to the idea of doing that in ERFormatterBase::view(), thus removing the need for a $build modified by ref in getEntitiesToView(), as suggested in #102.4 ?
I kind of suspect there could be a way for that view() implementation to account for both "regular" formatters and "multiple formatters that munge several items in one single delta" like TableFormatter does. Like, if there's only one delta in the render array, put all the collected metadata on the field level, else put each item's metadata in $render[$delta] ?
Comment #141
Wim LeersFirst, updating for #2464659: Routes that are varied by the 'user.permissions' cache context for anonymous users must also get the anonymous Role's cache tag, which caused some test failures.
Comment #142
Wim Leers#140: RE: #95.4: actually, I wonder if we can replace
with
or even:
?
That'd pretty much address your performance concern there.
#140: RE: #102.3: what you propose makes sense to me. I'd like that.
While writing what's below, trying to figure out a way around the problems I was seeing, I saw a way out: much like
ERFormatterBase::prepareView()
already pre-loads entities, if it'd also do access checking already, then::view()
applying access result cacheability metadata and::viewElements()
only returning the render array for the accessible entities could peacefully co-exist.The real beauty: this now effectively means no API changes to
EntityReferenceFormatterBase::getEntitiesToView()
! Direct consequence: 7 KB smaller patch!My original reply, to be complete:
Comment #143
yched CreditAttribution: yched commentedProblem is, placing the entity in the correct language was explicitly deferred into getEntitiesToView() to avoid modifying the FieldItemList being viewed (aka Heisenberg effect :-))
getEntitiesToView() currently switches the language in the $entities that Formatter::viewElements() implementations work on and display, and leaves the incoming $items unchanged.
And yes, apparently access checking needs to happen *after* language switching (I asked back in #2374019-46: Cleanup the use of custom item properties in EntityReferenceFormatterBase), which is why it also happens in getEntitiesToView()
Comment #144
Wim LeersGAAAAAAAAAH!
Does this mean we have to retrieve the translation and do access checking in two places, and thus effectively do it twice for every referenced entity?
Any alternative ideas?
Comment #145
yched CreditAttribution: yched commentedWhat if :
- translation switching, access check, and metadata gathering stay in ERFB::getEntitiesToView($items), it places each item's metadata in $item->_metadata (or something)
- ERFB::view() then grabs those $item->_metadata and puts them in the right place in the render array (it can then unset $item->_metadata if it wants to leave no traces, not critical since it's just an ad-hoc property for internal use, and we'd re-set it anyway if the same $items is rendered again later)
?
Comment #146
Wim LeersHrm, yes, that works. It requires
::getEntitiesToView()
to be called first though, but that's not a problem. That's making a lot of assumptions/actively relying on side effects, but in this case that's probably justified, and the risk is definitely self-contained.Comment #147
yched CreditAttribution: yched commentedYes, that relies on the fact that ERFB::view() calls its parent first, but that seems a reasonably safe expectation ? if your job is to massage a render array, it's likely that you generate that render array first ?
Comment #148
yched CreditAttribution: yched commentedAnd right, that relies on side effects, but all the approaches so far have, and that one seems to be the least intrusive API-wise ?
(ERFB currently relies on side effects on a ad-hoc _loaded property in each $item between prepareView() and getEntitiesToView() too, so that's "just" a continuation of the current - not meaning that's ideal, but it appears to be the lesser of two/N evils)
Comment #149
Wim LeersAgreed on all counts :) Done. I had to update
AuthorFormatter
, because it was extendingEntityReferenceFormatterBase
without calling::getEntitiesToView()
.(In a way, this is actually an improvement over the current DX, in that it fails if
::getEntitiesToView()
is not called. So it prevents you from shooting yourself in the foot.)Comment #152
effulgentsia CreditAttribution: effulgentsia at Acquia commentedYes, I think we do. While we have lots of tests for bubbling in general, and I don't think we need to duplicate that, this patch adds the following line, which I think we need a test for:
Looks like this patch already includes some test coverage for item level (i.e., target-entity level) access checks in EntityReferenceFormatterTest, but only for the label formatter. Do we want similar test additions for the other formatters?
Additionally, HEAD has a FilePrivateTest, which tests some important access checks for viewing private files and their links. Probably a good idea to add some more assertions for cache metadata there.
Those are the only extra tests I can think of that would be useful. Anyone else have other suggestions?
Comment #153
effulgentsia CreditAttribution: effulgentsia at Acquia commentedOh, also, issue title and summary still mention entity-level access (for all viewed entities, not just the target of references), but I lost track of whether we already have coverage for that. If not, we need that too.
Comment #154
Wim Leers#149 had 12 test failures for two reasons:
ERFormatterBase::getEntitiesToView()
now receives$items
by reference, but the override of that method byImageFormatterBase
was not receiving them by reference. Obviously that doesn't work very well.AuthorFormatter
to use::getEntitiesToView()
. ButAuthorFormatter
was therefore implicitly bypassing access checking and I didn't realize this. Now made this explicit, using the same mechanism as everywhere else. (It makes sense: anonymous users cannot access user profiles by default, but they should still be allowed to view the author's name.)Funny that you wrote #152 — while on a run a few hours ago, I was thinking the same. Thanks for writing it in my place, with more precision than I would have :D
Will work on tests and IS update for #153 next.
Comment #155
yched CreditAttribution: yched commentedI'm confused - aren't objects always passed by ref ? Why do we need to eplicitly denote it in *this* method signature, as opposed to every other method that takes an object and does stuff on it ?
Comment #156
yched CreditAttribution: yched commentedre: my own #155: oh - that's because ImageFormatterBase takes $items and replaces it with *another* (cloned) object with the runtime-added "default image" , and we need that new $items to stick until the end of ERFB::view() - is that it ?
But then oops - if that $items is passed by ref, it is modified in the containing $entity, and we're back to "ImageFormatterBase modifies the actual items in the rendered entity", which is another Heisenberg effect ($entity->field_image is empty, $entity->view(), $entity->field_image is not empty anymore) that we got rid of in #2405469: FileFormatterBase should extend EntityReferenceFormatterBase.
If ImageFormatterBase clones the $items, it's precisely so that the original entity stays unaffected as the formatter mocks a runtime item for rendering :-/
Wondering if we *really* need access checks (and thus access metadata) on that default image that gets runtime added on an empty image field. If we don't, then we don't need its ->_accessCacheability to persist until ERFB::view() ?
Comment #157
Wim Leers#2464877: Update RendererInterface::addDependency() to accept *any* object, not only CachableDependencyInterface objects landed, which means we can remove 6 KB of off-topic changes, yay :) Just a straight rebase.
Comment #158
Wim Leers#155 + #156: you're right. Fixed.
And no, we don't need the access checking on the default image. The default image should be accessible by all. I already concluded that that myself before you wrote it, which is why I already removed that in #142 :) I was wondering myself why the hell I did that!
@yched++
Comment #159
Wim LeersSelf-review + clean-up.
Next:
Comment #160
yched CreditAttribution: yched commentedAwesome !
Will try to make a last in-depth review shortly :-)
Comment #161
yched CreditAttribution: yched commentedRe-read the whole patch, this is awesome, and way less intrusive than I feared initially :-)
Remarks below - only the first one deserves a "needs work", the rest is minor / nitpicks.
I think this should have been reverted, because of #143 ? Looks like a leftover, since interdiff #149 reintroduced the $entity->getTranslation() and access check in getEntitiesToView(), but did not remove them from prepareView() :-)
Which leads to a bigger issue :
The entity access check needs to happen, not on $item->entity, but on the $entity in the correct language, which is figured out only in getEntitiesToView() but is not available in $items
Meaning, my bad, the formulation of ERFormatterBase::checkAccess($item) is nice but doesn't work, it would need to receive the $entity instead of the $item. Slightly less elegant :-/ ("why would a formatter have a wrapper method to check the access of some $entity" ?), but well...
$item->_accessCacheability can only exist if $item->_loaded could be set to begin with - that's true, but a bit coupled with the current implementation logic and flow between prepareView() and getEntitiesToView().
Why not directly check if (!empty($item->_accessCacheability)) ?
s/uses/users
\o/
Uber nitpick : the epxlanation starts by talking of a $merge_other var that now only appears 15 lines of comments below. Moving the $merge_other = FALSE line above the explanation might make this easier to understand :-)
Also, wondering if we should split the 1. / 2. / 3. items closer to their respective "if"s ? (although I suspect those items in the comment do not actually match 1:1 the three if branches in the code ?)
Why not just "if (Element::properties($elements))" ?
Also, we could simplify this code a bit by doing :
As a side effect, we could accordingly s/$addition/$elements in the ERFormatterBase::view() override.
Wondering why we need to explicitly add those empty properties, since we don't test equality of the render arrays themselves, but equality of rendered result and computed cacheability metadata. Does BubbleableMetadata::createFromRenderArray() stumble if #attached and #post_render_cache are absent ?
(side note, ouch at this test still being in modules/field, but that's far outside the scope here. field.module's Tests folder is filled with stuff that doesn't belong there anymore. Opened #2469385: Move field.module's tests where they actually belong )
Aside from the remark above that we can't receive the $item, but need to receive an $entity - this logic here couples implicit knowledge that the parent implementation does an entity->access(), just wrapping logic to decide if we want to let that happen or not. Wondering if it would be best to inline that entity->access() call here, instead of calling the parent. No strong opinion.
Leftover from previous iterations I guess, but why not just "return []" now ?
This means hook_entity_field_access() cannot return nothing anymore, but have to at least return neutral() ? Feels like a slight DX regression - can't this be handled by the code that invokes the hook (null means neutral()) ?
No strong opinion though, maybe explcit is best.
Moving this up to EntityDisplayBase makes sense in itself I guess, but $this->renderer is currently only used in EFD, EVD still doesn't use it atm, right ?
Wondering why that is, BTW - #2463029: EntityFormDisplay should update $form with cache tags of FieldConfig, FieldStorageConfig, EntityFormDisplay config entities added logic about "add cacheability data of the $field and $field_storage" to EFD for entity forms, but wouldn't that be needed for displayed entities as well ?
Could move to a ternary ?
Comment #162
yched CreditAttribution: yched commentedAlso :
ERFormatterBase::view() is in charge of collecting the cacheability metadata in each item and assigning it to the render array.
I'm wondering if this logic actually doesn't belong straight to the base implementation in FormatterBase::view().
I mean :
- Any formatter on any field type could have associated cacheability metadata that needs to bubble up, right ? ER formatters are just the most prominent and critical example ?
- The logics of *putting stuff* in $item->__accessCacheability is up to each formatter, according to what field type it works on and what it does as a formatter - and that's what ERFormatterBase::getEntitiesToView() does for ER formatters.
- But the logic of *collecting* stuff from $item->_accessCacheability (if present), and making sure it's bubbled up correctly, could be moved to generic code in FormatterBase::view() ?
Just thinking out loud, this could totally happen as a separate step in a followup.
Comment #163
dawehnerWhile talking with yched I think we encountered another problem:
We are talking about the following part of the patch:
... Views doesn't use $build_list[$id][$name] for rendering but rather extracts the individual deltas in order to be able to bypass
field templates, which is a huge thing for themers, see
core/modules/views/src/Plugin/views/field/Field.php:735
for more information.It feels like we then also have no test coverage for those kind of access metadata for rendered entity referenced fields.
Comment #164
larowlanFirst a re-roll, didn't apply anymore
Comment #165
larowlanFixes 161 items 2, 3, 5, 6, 9 and 12
Leaving others until debate/consensus
Comment #167
amateescu CreditAttribution: amateescu for Drupal Association commentedI was working on this yesterday at Dev Days and discussed #163 in detail with @yched. Here's a patch that fixes all the points from #161 and #163. Interdiff is to #159.
#161.6:
Because
Element::properties()
only returns the properties of a render array (those that begin with #), not the element's children. But the cleanup you proposed can surely be done :)#161.10:
We don't need that for displayed entities because we're invalidating the
<entity_type>_list
cache tag when a EVD is changed.Comment #169
yched CreditAttribution: yched commented@larowlan / @amateescu : thanks !
Not sure how @larowlan's #165 and @amateescu's #167 relate to each other - was the latter rolled on top of the latter, or are they crossposts ?
(#165 has the wrong interdiff ;-))
Right, silly me, I meant Element::children().
if (count($elements) !== count(Element::properties($elements)))
should be equivalent toif (Element::children($elements))
, right ?@Wim explained the reason to me again, and that is the one mentioned in #107 (the part that replies to #102.1) - in short, a simple yes / no is not enough here since we are going to cache stuff, it needs to also have the *cachability info* of that yes / no (i.e "no at the moment, but that might change if that piece of config changes). That means a method that returns an AccessResult (which have APIs to specify associated cacheability) rather than a bool.
Long story short : the current ERFB::checkAccess($entity) is fine :-)
Comment #170
Wim LeersGreat cleanup :)
Nit: we could also move these together into one if-test.
+1 again for this s/$addition/$elements/. Much clearer.
Why this change? Inlining/performance? Clarity?
The parent method does exactly the same in any case.
This overwrites the existing cacheability metadata on the item, if any.
And you want to pass
$render_array
, not$render_array['#cache']
.So you want:
Comment #171
yched CreditAttribution: yched commentedre @Wim #170.4 : see #161.8 :-)
As mentioned there, no strong opinon though.
Comment #172
Wim Leers#171: ok, thanks :) It's fine, it's even slightly more efficient. I was just wanted to understand why, I felt like I maybe was missing something.
Comment #173
amateescu CreditAttribution: amateescu for Drupal Association commented#169:
Yep, #167 is basically a crosspost with #164/#165.
1) right, we can definitely use
Element::children().
.2) cool :)
#170:
2) we can not merge them because the else branch of the second if needs the first if check to be true ($item->_accessCacheability being available)
4) it was part of @yched's review, where he argues that doing the access check inline is more clear code-wise than relying on the implementation of the parent, and I agree :)
5) DOH, fixed
Comment #174
yched CreditAttribution: yched commentedLast nitpick : the fact that we have to specifically check for "are there actual children" instead of "is there anything in the render array" so far, is because of cacheability metadata, so I think we should keep the comment @Wim was adding in earlier iterations.
Something like "... otherwise, let access cacheability metadata pass through for correct bubbling" ?
Other than that, RTBC :-)
Comment #175
amateescu CreditAttribution: amateescu for Drupal Association commentedSure, makes sense to have that extra comment in there.
Comment #176
yched CreditAttribution: yched commentedYeeeeehaa !
Comment #177
alexpottWhy are we not testing what the result is? I guess that it should be 3600.
Comment #178
amateescu CreditAttribution: amateescu for Drupal Association commentedHah, that makes perfect sense :) I ran the Access phpunit test group locally and it passes.
Comment #179
Wim Leers#177: It was just to keep the assertions as similar as possible … but absolutely, +1.
Back to RTBC with that nitpick fixed.
Comment #180
alexpottThis issue addresses a critical bug and is allowed per https://www.drupal.org/core/beta-changes. Committed 7e8d212 and pushed to 8.0.x. Thanks!
Comment #182
Fabianx CreditAttribution: Fabianx for Acquia commentedThis looks great! I took the time to take a look to see how this influences other cacheability work.
It is fantastic how simple this became in essence! We just ensure that we pass data correctly over and merge it appropriately.
Belated: RTBC +1
Follow-up for the BubbleableMetadata part:
https://www.drupal.org/node/2474121
Comment #184
yched CreditAttribution: yched commentedFYI, about Views cherry-picking in the formatter render array - we took care of carrying over CacheableMetadata here, but not #attached
#2496039: Formatter's #attached assets are not carried over by Views
Comment #185
yched CreditAttribution: yched commentedOops - actually this patch here did merge BubbleableMetadata in core/modules/views/src/Plugin/views/field/Field.php,
but then #2474121: CacheableMetadata should get BubbleableMetadata's merge/applyTo/createFromRenderArray/createFromObject methods changed that back to CacheableMetadata, which leaves #attached out.
Pinging over there then.