Problem/Motivation
In #2865616-49: Add an option for EntityConverter to load the latest entity revision and fix all entity forms to use this option. and following we discovered that the rendering of HEAD links can be very expensive, depending on the amount of links a node defines.
Proposed resolution
Remove the adding of every relationship (delete-form, edit-form etc.) which is done by both node and taxonomy modules in different ways. These are expensive to access check, and we don't even have theoretical use cases for them, let alone real life examples. They also add additional cache contexts to node renders, potentially reducing cache hit rate.
There is no way to verify that no-one uses them, except for the complete lack of use cases for doing so. However the only thing that would use them would be some kind of external tool (like a scraper) rather than Drupal code.
In place of this, generically add shortlink and canonical links for every entity type. This will maintain the useful links for node and taxonomy, and also add them to other entity types.
Remaining tasks
Perform more benchmarks once the following issues are in:
- #2865616: Add an option for EntityConverter to load the latest entity revision and fix all entity forms to use this option.
- #2620980: ContentEntityStorageBase::loadRevision() should use the static and the persistent entity cache like ContentEntityStorageBase::load()
- #2784201: Track the latest_revision and latest_revision_translation_affected ID in the entity data table for revisionable entity types
- Validate the proposed solution
- Write a patch
- Reviews
User interface changes
TBD
API changes
None foreseen
Data model changes
None foreseen
Release note snippet
Drupal-specific <link>
tags have been removed from node and taxonomy term pages to improve performance. There are no known uses of these tags in the wild.
<link rel="canonical">
and <link rel="shortlink">
tags remain, and have been extended to cover all entity types.
Comment | File | Size | Author |
---|---|---|---|
#91 | 2922570-90.9.2.5.patch | 17.69 KB | Tomefa |
#90 | 2922570-90.9.2.4.patch | 17.72 KB | dww |
#89 | 2922570-89.9.2.x.patch | 18.27 KB | dww |
#88 | D91x-2922570-88.patch | 17.54 KB | joseph.olstad |
#80 | 2922570-80.patch | 17.66 KB | paulocs |
Comments
Comment #2
dawehner#2406533: edit-form, delete-form etc. <link> tags added on /node/{node} are invalid according to W3C Validator added the original access check
Comment #3
plachComment #4
dawehnerReading through #1970360: Entities should define URI templates and standard links it was never discussed that this might become a performance issue.
I am also wondering whether we could make this somehow an opt IN feature rather than an opt out feature.
Comment #5
plachAfter researching this a bit more, I realized that probably the proposed solution is not viable. AAMOF the access checks for many node routes, including the edit form, are performed anyway while generating local tasks. In case of an anonymous user, these access checks are performed twice.
This is probably not hugely problematic in the current HEAD, because these checks are run only with cold caches and a big part of the effort is loading the entity, which is then statically cached. So performing those checks multiple times is not making a big difference.
OTOH #2620980: ContentEntityStorageBase::loadRevision() should use the static and the persistent entity cache like ContentEntityStorageBase::load() and #2784201: Track the latest_revision and latest_revision_translation_affected ID in the entity data table for revisionable entity types are likely going to bring us in a similar scenario while loading the latest revision, but I don't think we can skip these checks altogether.
We could explore statically caching access checks and entity conversion so we perform those just once, but I don't think it would imply a huge gain once those two issues are fixed.
Comment #6
dawehnerI personally haven't seen those calls, but maybe I was blind, or render caching got applied to those blocks.
Comment #7
plachLikely render cache, I have it disabled in my local env.
Comment #8
catchDoes render caching get applied to the HEAD links too? I think it does now so this might be moot after all. I left a comment on the main issue which would help the specific case there though.
Comment #9
dawehnerThese links are generated on the controller level, which are only cached by the page level caching (page cache vs. dynamic page cache).
Comment #10
plachI'd say let's get back to this and perform more benchmarks once the following issues are in:
Comment #11
dawehnerFair point, let's hope the best as always.
Comment #12
Wim Leers#1970360: Entities should define URI templates and standard links added the links, #2406533: edit-form, delete-form etc. <link> tags added on /node/{node} are invalid according to W3C Validator added the access checking. Because it was causing W3C validation errors. The performance problems were even pointed out (incidentally by yours truly): #2406533-67: edit-form, delete-form etc. <link> tags added on /node/{node} are invalid according to W3C Validator. Ah well.
Comment #13
plach#2865616: Add an option for EntityConverter to load the latest entity revision and fix all entity forms to use this option. was committed.
Comment #19
BerdirSo here's a pretty aggressive proposal to resolve this. From the original issue that added this:
> We can do the same in HTML
elements. Hypermedia FTW, again. (What could we do with that information client-side? The mind boggles!)
Yeah, my mind is indeed still boggled about it. As in, I don't understand the use case, to be honest. I get it for REST/JSON-API, fair enough, as these are then standardized protocols that can be more or less understood by generic clients. (And also fair point that for a decoupled site that actively uses those endpoints instead of the HTML format, this won't actually help, I get that too). But some random HTML forms, I can't imagine what the benefit there is.
If someone has a use case, IMHO a contrib module relatively easy provide this as well for all entity types with some trivial checks on the route name. Like content_translation_page_attachments() already does for translations and taxonomy does it in a hook too: taxonomy_page_attachments_alter().
Beside this performance issue, there are also #2856823: delete-form and edit-form links in header for anonymous users and #2821635: edit-form, delete-form etc. <link> tags added on /taxonomy/term/{taxonomy_term} are invalid according to W3C Validator (duplicate) to add access checks to terms as well. And [#2782283 ] about making it more generic.
So I'd propose to just keep the canonical and shortlink as hardcoded special cases but do those for all entity types*. We might also want to move content_translation_page_attachments() into a default implementation, as everything it depends on is actually generic entity API concepts, content_translation is just about the backend UI. We would need access checks for the translations but not the current language IMHO, because we know that you can access it.
To be defined: BC? Opt-in? Opt-out? This first patch is just to show how it could look like.
Comment #20
BerdirI guess that would break on node preview, this should fix that. Although preview of an already saved entity would still add those, but we could unset them in the node preview controller if we care.
Comment #23
catchTo me this seems like 'data structure' - so could go into a minor release without a bc layer. Would obviously be possible to do something with
Settings
for opt-in/opt-out though if we can think of something the change would actually break.Comment #24
BerdirHad the wrong variable name, so that actually didn't do much yet. This should be better. Also fixed a bunch of tests, but adding url.site to more responses (had to make the URL absolute so we do need that cache context unfortunately, no way around it) might fail some additional ones.
Also fixed a bug in \Drupal\Tests\node\Functional\NodeViewTest::testLinkHeader(), it passed the alias = TRUE option to the wrong method. Was wondering about adding an alias to test that explicitly, but skipping that for now as it's also tested in some other places like \Drupal\Tests\path\Functional\PathAliasTest::testNodeAlias().
Comment #26
BerdirSome more test fixes around cache contexts. We could still split up the part that's making it the default from simplifying node/term, would require fewer test changes.
Comment #27
BerdirReroll for 9.1.x, conflict on setUp() return type changes.
Comment #29
catchThis isn't passing $langcode. Also do we need the method at all now?
Otherwise looks great. I'd prefer to make this change than #2856823: delete-form and edit-form links in header for anonymous users.
Comment #30
BerdirThanks for the review.
> This isn't passing $langcode. Also do we need the method at all now?
The parent method doesn't have a $langcode parameter, that was always bogus or at least for a very long time.
Not sure if I remember why I left the method, would have to look it up. BC maybe? I think there was some reason.
Comment #31
anmolgoyal74 CreditAttribution: anmolgoyal74 at OpenSense Labs for DrupalFit commentedFixed the minor CS failure.
removed the unused use statement.
Comment #32
catchCan't see why we'd need bc for the method, even if something passed $langcode or extended the controller it shouldn't make a difference.
Comment #33
catchComment #34
catchSince my question was answered and the coding standards issue has been fixed, moving this to RTBC.
Adding issue credit from #2856823: delete-form and edit-form links in header for anonymous users which I'm about to mark as duplicate.
Comment #45
catchComment #46
alexpottThis needs a re-roll unfortunately.
Comment #56
catchAlso marking #2454915: Entity link annotations in HTML head are not valid HTML as duplicate and adding credit from there.
Comment #57
ravi.shankar CreditAttribution: ravi.shankar at OpenSense Labs commentedAdded reroll of patch #31.
Comment #59
alexpottI think we need a release note to document that node html output is changing. Also this should have a change record which details which link relationships are being removed and how to add them back if they are being relied upon.
Comment #60
paulocsComment #61
paulocsFixing the tests.
Comment #62
paulocsMoving back to NW as the Change Record need to be created.
Comment #64
dwwThis all looks promising. Haven't super closely reviewed, yet, but a quick skim looks good.
Sorry for the noise, but I'm going to be running this on a 8.9.x site, and the latest patches don't apply there. So here's a composer-friendly reroll. ;)
Comment #65
dwwVersion change wasn't intentional...
Comment #67
SpokjeLet's see if Version
9.2.x-dev
sticks now.Comment #69
tobiasbReroll against latest core changes.
Comment #70
longwaveThis method can be deleted entirely, as it just calls the parent? This was also noted in #29-32 but I don't see the point of keeping it.
Comment #71
BerdirThe variable name is different between the methods, which is relevant for controller methods.
Comment #72
longwaveOh I see. Worth a followup to deprecate that and unify it somehow?
Added a release note snippet and change notice: https://www.drupal.org/node/3216941
The code looks good to me, so back to RTBC.
Comment #73
longwaveComment #74
joseph.olstadgreat work on this one, I did some XHProfGUI profiling not long ago and confirm that we do need the performance enhancement. Has anyone done any profiling work on the latest patch yet? just wondering.
I recently used XHProfGUI and would recommend it but it did take some effort to set up. https://github.com/perftools/xhgui
It would be nice to know how much of a performance gain this patch will bring (approximately).
Comment #75
BerdirI can try, but there's likely not a simple answer for this, as it depends a lot on what kind of node access checks you have. If you have node grants, are using modules like group/og or have expensive access logic in custom hooks then the benefit is going to be much bigger than just with core.
There might also be fewer cache variations and cache redirects in some cases, which is not something you can quantify easily with profiling of single requests.
Comment #76
joseph.olstadYa for profiling access checks here make sure to use a non-administrative role as the admin user account bypasses all these checks so to test access check improvements you'd want to use a different role/account.
Comment #77
BerdirYes, not just non-adminstrative but explicitly as anon user, because the old logic only checked access for anonymous users.
Did some testing. Disabled page and dynamic page cache to simulate uncached pages for anonymous users, created a very simple page node.
Actually surprised to see a bigger improvement as I expected here. Comparing also with blackfire and seeing a considerable improvement there as well (as always, take with a grain of salt as blackfire tends to considerable increase the cost of function calls. Response time there is 50% higher than with ab). Url/Route access checking really is expensive, especially of entity routes. We need to load the route, do upcasting and all kinds of other things beside just checking access.
So yes, lets do this :)
Considering the performance gains and amount of issues that were closed as duplicate, I'd even say this could be set to major.
I also added a code snippet and references to the change record if someone wants to have those relationships back.
Comment #78
larowlanThanks for working on this folks, will be great to see this solved once and for all.
Got a couple of questions.
Here we say 'non-aliased' but then we use set alias to TRUE? Wouldn't non-aliased use FALSE?
I've searched the code in UrlGenerator and I don't see any mention of 'alias'
I was expecting this to use 'path_processing'
nit: as arrays > 80 chars these should be wrapped
now this defers to the parent, is there any need to keep it?
If 'alias' => TRUE is indeed not supported, then this shows the test is too close to the implementation, we should instead be compiling the expected uri manually, e.g.
'node/' . $node->id()
hmm but we're using alias TRUE in HEAD
I wonder if that's broken too 🤔
Comment #79
BerdirThanks for the review.
1./5. The alias option is weird, but correctly use. Easy to verify, just look at an article that does have an alias and you can see that shortlink is node/N. It kinda means "this url is already an alias, do not look for an alias again". Super weird, but has been like this for as long as I can remember. The option is in \Drupal\path_alias\PathProcessor\AliasPathProcessor. Disabling path processing entirely is not the same thing and not what we want, that would also not add the language prefix then for example.
3. See #70/#71.
4. I get the point, but it's an absolute link, so we would also need to replicate adding the domain/path to it and so on. This method apparently didn't test shortlink before, but there is an existing test just below that for this. And all existing tests also used toUrl().
Leaving at needs work for #2.
Comment #80
paulocsHere is patch that addresses #78.2
I also changed
protected $defaultCacheContexts = ['languages:language_interface', 'theme', 'url.query_args:_wrapper_format', 'user.permissions', 'url.site'];
inUserTranslationUITest.php
and
protected $defaultCacheContexts = ['languages:language_interface', 'theme', 'url.query_args:_wrapper_format', 'user.permissions', 'url.site'];
inTermTranslationUITest.php
Comment #81
longwaveAll points in #78 answered or fixed, back to RTBC.
Comment #83
catchYeah I think this dates back about 12-15 years or so, must be barely used any more except in this issue. The 'alias' option used to mean "this is an alias, don't try to look up an alias for it again", in this issue it's being used purely to disable alias look-up. So I think we should open a follow-up to rename the option to something a bit more semantic - have done so here: #3218088: Rename the 'alias' option for URLs.
Nice to see how much this cleans things up too.
Committed bfe59b2 and pushed to 9.3.x. Thanks!
Comment #84
longwaveI will leave it up to others to decide if this is a release highlight or not.
Comment #85
catchI think it might be, adding the tag.
Comment #86
larowlanthanks @Berdir, I see the code now.
again this slide proves true http://larowlan.github.io/ds.core-til/#/9/2
glad to see this fixed
Comment #88
joseph.olstadToo bad it missed 9.2.0
we need speed and are impatient so I'm rerolling backports
Reroll for D91x and also applies cleanly for HEAD of D92x (same patch for both)
Comment #89
dww#88 no longer applies to 9.2.x branch. Here's another backport for composer folks.
Comment #90
dwwSorry for the noise, but #89 only applies to
9.2.x
branch. If you've got 9.2.4 official, it doesn't apply due to #3139404: Replace usages of AssertLegacyTrait::assertText, that is deprecated. So here's one that should apply to 9.2.4 itself...Comment #91
Tomefa CreditAttribution: Tomefa as a volunteer commentedPatch in #90 wasn't working for me on drupal 9.2.5, here the reroll for Drupal core 9.2.5
Comment #92
Zsuffa Dávid CreditAttribution: Zsuffa Dávid commentedPatch #91 works for me on drupal 9.2.6. thx.
Comment #93
gilles_webstanz CreditAttribution: gilles_webstanz at WebstanZ for WebstanZ commentedHello,
I tested the Patch #91 on drupal 9.2.5 and it works.
Thank you !