Problem/Motivation
When we converted Drupal 7 to the Symfony routing system we made fast 404s only do stuff on exception - i.e. when routing has determined there is a 404 already. This makes the system/files part of the suggested config in default.settings.php pointless but also it means that we have to do routing for paths that we would previously (in Drupal 7) not have done.
Locally fast 404s are not fast for me. In fact they are slower. Compare:
Request: /something
Document Path: /something
Document Length: 23112 bytes
Concurrency Level: 1
Time taken for tests: 4.321 seconds
Complete requests: 1000
Failed requests: 0
Non-2xx responses: 1000
Total transferred: 24727000 bytes
HTML transferred: 23112000 bytes
Requests per second: 231.42 [#/sec] (mean)
Time per request: 4.321 [ms] (mean)
Time per request: 4.321 [ms] (mean, across all concurrent requests)
Transfer rate: 5588.09 [Kbytes/sec] received
Request: /something.jpg
Document Path: /something.jpg
Document Length: 193 bytes
Concurrency Level: 1
Time taken for tests: 6.423 seconds
Complete requests: 1000
Failed requests: 0
Non-2xx responses: 1000
Total transferred: 704000 bytes
HTML transferred: 193000 bytes
Requests per second: 155.68 [#/sec] (mean)
Time per request: 6.423 [ms] (mean)
Time per request: 6.423 [ms] (mean, across all concurrent requests)
Transfer rate: 107.03 [Kbytes/sec] received
Proposed resolution
Emit a cacheable exception from Fast404ExceptionHtmlSubscriber so that it can be cached in the same way as other 404s.
Remaining tasks
User interface changes
None
API changes
Fast404ExceptionHtmlSubscriber now takes the cache_tags.invalidator
service instead of http_kernel
service.
Data model changes
Fast 404s are cacheable in the page cache and work the same way as other 404s.
Release notes snippet
It is no longer necessary or recommended to configure fast 404s in settings.php.
Comment | File | Size | Author |
---|---|---|---|
#41 | 3292908-cacheable-41.patch | 12.13 KB | alexpott |
#41 | 38-41-interdiff.txt | 843 bytes | alexpott |
#38 | 3292908-cacheable-38.patch | 12.12 KB | alexpott |
#38 | 27-38-interdiff.txt | 5.34 KB | alexpott |
#27 | 3292908-27.patch | 7.2 KB | dww |
Comments
Comment #2
alexpottWith this patch...
So faster than for /something... but not by much. Way less HTML transferred so probably still worth doing and my test set up is not that similar to a real production work load.
Comment #4
Berdirthat means an extra config call on every request, no? should we convert fast 404 to settings?
Comment #5
BerdirDid you profile the difference?
I'm a bit confused by this. regular 404 also happens as part of 404 exception handling, as long as this is first, generating a fixed predefined response string should be faster than for example rendering a node? Maybe you tested with the default 404 controller, which is probably faster, but I'd expect it to be slower if you have a 404 node or so, which is pretty common to have a custom and nice page with content on it.
I also don't understand how this would work as a middleware. we _need_ routing for fast 404 extensions, for private files, generating image styles and so on, as the failing tests are showing?
Comment #6
BerdirI can confirm that regular 404 is faster, but that's because that's a cached page. The correct fix here is to return a cacheable response I think. Of course, that does mean we need to consider cache invalidation here, but the default 404 stuff *probably* has that covered. To test, we need to access a certain private file path, then create said file, then access again.
Comment #7
BerdirIf you disable page cache, you get a very different picture, blackfire numbers, so a bit slower overall:
cached 404: 4ms
fast 404: 10ms
non-cached 404: 20ms
so fast 404 is twice as fast, you only get the cache benefit in case of repeated requests on the same URL.
Comment #8
catchYes we should have done that when we introduced $settings...
The idea is that 'exclude paths' don't get fast_404s, so that image styles, private files (and eventually asset aggregates if that patch ever lands) are ignored. Maybe we should consider converting this particular setting to container parameters so that the relevant subsystems can add entries themselves. From the test failures it seems that logic isn't working, but conceptually it should work and it used to in 7.x
Worth referencing the original issue where this was introduced: #76824: Drupal should not handle 404 for certain files.
Comment #9
Ghost of Drupal Past#1833516: Add a new top-level global for settings.php - move things out of $conf two choice quotes
Aside from two examples for the Configuration overrides, this issue is the only one left in settings.php which hasn't been converted.
Comment #10
alexpottSo here's what I think we can do...
1. we can put the config into a container parameter - like we do with default language so getting it is free
2. The fast 404 service can be both a middleware and an exception subscriber and we can make the middleware registration dependent on a config option
3. We can add proper test coverage
This way we'll get the old D7 behaviour where you could uncomment the line in settings.php to get super fast 404s. I also think that regardless of configuration system/files should never fast 404.
Comment #11
alexpottHere's a patch that implements #10. We get back fast 404s with no routing, no queries for fast 404s as the config is now stored on the container. I think this patch now gets us back to where D7 is at.
I guess there is a question about whether this is worth it. We could cache the fast 404s and get rid of the excluded paths concept altogether since the exception subscriber part of this only happens after routing and when there is a real 404.
Comment #12
alexpottFixing the test group and moving it to a better location.
Comment #13
catchThis is good but looking at the patch I'm wondering if we'd be better off moving it to $settings.
If we always set a cacheable header regardless of page cache settings maybe that'd be a good trade-off to simplify things a bit.
Comment #14
longwaveI also wonder whether all this complexity is worth it. If you want even faster 404s that match a regex, you can do it directly in the webserver configuration so the requests never even reach PHP.
Comment #15
alexpottI think changing the exception subscriber to cache the responses is by far the best way to do this. We get the same speed as the page cache and the consistency of Drupal handling the responses as it would for most other things. If we want to make #12 more similar to D7 we need to change the middleware to have a higher priority than
http_middleware.page_cache
- as there is no point checking caching if you're just going to do a fast 404 immediately after.No interdiff because a different approach.
I also think that with the patch attached here we can remove the
system.performance:fast_404.exclude_paths
setting. This setting is only really required when the fast 404s occur prior to routing and we've not been able to do that since D7 :)Comment #16
longwaveIn practice what is going to trigger this if there is no UI?
I think we could do that anyway - untested but you should be able to use negative lookahead in the
paths
setting.Comment #18
alexpottRe #16 this event will trigger via config import or a config edit via drush. I think we should have way more config listeners that do things like this rather than having anything in forms.
Re removing
system.performance:fast_404.exclude_paths
I think we could convert #3216687: Remove 'fast_404.excluded_paths' from system.performance.yml into the issue to do this. Especially as it will need an upgrade path it would be nice to handle elsewhere.Comment #19
longwaveThe fix is nice and clean and the test looks comprehensive.
Worth injecting
cache_tags.invalidator
instead? If you think not, then this is RTBC.Comment #20
alexpottYeah let's inject it and remove the bogus http kernel injection that's unused.
Comment #21
longwaveLooks good to me.
Comment #22
alexpottWe need to update the issue summary with the findings and the new approach and create a CR about the caching change and mention the constructor change.
Comment #23
longwaveDo we also need to do a deprecation dance in the constructor?
Comment #24
alexpottI've thought about the deprecation dance - here it is going to be complex - we'd have to remove the typehint completely. Given that there are no classes extending in contrib I think it is more valuable to have the correct typehint. Especially as event subscriber constructors are not API - see https://www.drupal.org/about/core/policies/core-change-policies/bc-policy
I've updated the issue summary. Note sure a release note is worth it. All other 404s are cached in page cache so this is not really a big change.
I've add the change record - https://www.drupal.org/node/3293205 - this notes the fast 404s are cacheable responses now and that the constructor has changed. Sites are extremely unlikely to have to make any change to benefit from this.
Setting back to rtbc as per #21
Comment #25
dwwLooks great. I love the irony of this issue! ;) Nice job, everyone.
I can only spot a few pedantic nits. Please ignore, if you prefer.
Nit: These aren't exactly in order now. /shrug
Nit:
===
Nit:
: void
?Comment #26
acbramley CreditAttribution: acbramley at PreviousNext for Service NSW commentedAnother minor nit, do we not want to save
$this->assertSession()
to a local variable to save setting it up on every line?Also might be nice to abstract
$this->assertSession()->responseNotContains('modules/system/css/');
into a function likeassertNotFast404
or something similar to make it easier to improve/refactor in future?Comment #27
dwwThis patch fixes all nits from #25. I didn't touch
core.services.yml
so I'm not sure why it's appearing in the interdiff.Re: #26.1: Core isn't consistent about this and there's no standard. Many tests do it either way. If anything, it looks like not having a local var is the more common approach. So I didn't want to change that here.
Re: #26.2: Thing is, we do both
responseContains()
andresponseNotContains()
in this test, so we'd need 2 new custom assert methods (or 1 method that takes a param). In some ways, the custom assert method makes it harder to know what the test is doing. I'm a little worried about YAGNI for 2 1-line functions that are only invoked a handful of times for the sake of "easier to improve/refactor in future?".So for now, no changes for either suggestion in #26, but I'd be happy to do either of them if a core committer prefers...
Thanks!
-Derek
Comment #28
borisson_Changes in #27 look great, they fix the remakrs in #25, and I agree that the refactor in #26 is not yet needed.
Back to rtbc.
Comment #29
alexpottRe #26 - yeah the saving of things to local class properties in tests is often problematic. In this case it is not but things like services and users and not actually that good an idea because they often go stale - especially in BrowserTestBase tests.
Comment #30
acbramley CreditAttribution: acbramley at PreviousNext for Service NSW commented@alexpott not class props, just a variable, i.e
$assert = $this->assertSession()
Comment #31
Ghost of Drupal Pastwhat does this mean for a settings.php edit, will drush cr pick up the change or an explicit call to the tag invalidator is needed? If the latter, we might want to add a note to settings.php to do
drush php '\Drupal::service("cache_tags.invalidator")->invalidateTags(["4xx-response"]);'
after editing.Comment #32
catchThe reason we originally added settings.php configuration for this was because it could break actual Drupal 404 errors due to running before routing.
If we're letting routing happen, then do we need the configuration at all?
There are some very specific use cases for not showing a minimal 404 for files (like an image hosting site), but for those there's the option of removing the event subscriber altogether.
Could be done in a follow-up but feels worth looking at.
If we leave this in config, I think we should take the lines out of default.settings.php and put some comment documentation in system.performance.yml instead. Or actually move it to $settings altogether.
Comment #35
alexpottI've marked #3216273: Add tests for Fast404ExceptionHtmlSubscriber as duplicate so crediting the contributors.
Comment #36
alexpottRe #32 I think once this lands we can re-purpose #3216687: Remove 'fast_404.excluded_paths' from system.performance.yml to decide wether to keep this in config or move it to settings. The exclude_paths config is meaningless in the current implementation. So deciding what to do with that and whether all the things should be config or settings feels all intimately related. As #32 states this could be done in a follow-up and I think that that would be best. Tentatively putting back to rtbc.
Comment #37
catchI think we should still take the $config lines out of settings.php in this issue - it's not doing anything useful, and to the extent that it would, the config listeners here won't work.
Comment #38
alexpottSure - I've removed the comments and move them to the event listener so we don't lose the docs.
Comment #39
longwaveAdd a void return type? Otherwise this looks good to go, all review points have been addressed.
Comment #40
Ratan Priya CreditAttribution: Ratan Priya at OpenSense Labs commentedComment #41
alexpottSure - I'm not a huge fan of random non interface :void - but I also am not sure it matters.
Comment #42
longwaveThanks - I think it is easiest to add return type everywhere for now, otherwise we have to come up with a set of explicit rules.
Comment #44
catchCommitted/pushed to 10.1.x, cherry-picked to 10.0.x and 9.5.x
Comment #45
andypostI published CR but it doesn't mention that this settings no longer in settings.php file
Comment #47
Wim LeersLooks like this introduced a critical regression: #3312198: Regression concerning the cache of private files.
Comment #48
AnybodyIf anyone else is using the Search 404 module, I think this change should be considered over there. If it's no more recommended to use
fast_404
in settings.php it might mean that by removing these settings, all those 404's are handled by Search 404, resulting in many database queries for the missing files.That might be a heavy performance trap? See #3360782: Exclude fast_404 paths and file extensions by default (Performance) and same for Redirect 404: #3360799: Redirect 404: Exclude file paths by default (Performance)