Problem/Motivation
The page cache middleware was supposed to prevent the construction of services further up the stack. This has been the responsibility of the responder attribute on the http_middleware service tag implemented in #2473113: All stack middlewares are constructed at the same time even for cached pages during the development of Drupal 8.0. But it was rendered ineffective by #2408371: Proxies of module interfaces don't work before 8.0 was even released.
Steps to reproduce
Proposed resolution
Allow to pass in a service closure to the page cache middleware as its first argument. Services further up the stack and their dependencies are only constructed in case of a cache MISS. On a cache HIT, a response is served with considerable reduced overhead compared to previous versions.
Stop marking services lazy in StackedKernelPass. It did not have an effect in any shipped release.
Remaining tasks
User interface changes
Introduced terminology
API changes
Data model changes
Release notes snippet
| Comment | File | Size | Author |
|---|---|---|---|
| #22 | cms-dev-patched.txt | 1.21 KB | znerol |
| #22 | cms-dev-baseline.txt | 1.21 KB | znerol |
| #20 | core-dev-patched.txt | 1.21 KB | znerol |
| #20 | core-dev-baseline.txt | 1.21 KB | znerol |
Issue fork drupal-3538294
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
- 3538294-regression-all-stack
changes, plain diff MR !12857
Comments
Comment #2
znerol commentedI guess this could be solved with a service closure for the
$http_kernelargument in PageCache middleware.Comment #3
ghost of drupal pastEdit: nevermind, found it, lazy services are processed in ProxyServicesPass
Comment #4
catch#2 looks like it should work.
Comment #6
znerol commentedAdded a draft MR. This prevents the event dispatcher from being instantiated when a page is delivered from the cache.
Comment #7
znerol commentedSetting to NR for early feedback. This still needs tests for the deprecation in the page cache middleware. I think we should deprecate the
respondertag attribute as well. This could be done in a follow-up though.Comment #8
ghost of drupal pastAmazing work, thanks.
But if this is necessary then what is a lazy service, wasn't that supposed to solve this? And if it doesn't, should the code setting that up be removed?
Comment #9
znerol commentedI do not know what happened to lazy services since #2473113: All stack middlewares are constructed at the same time even for cached pages. They are being removed gradually anyway, so didn't bother to look.
Comment #10
znerol commentedYes. In a follow-up.
Comment #11
godotislateA couple questions on the MR.
Comment #12
znerol commentedWent through an epic git bisect only to find that this broke just two months after it was introduced. Everything happened before 8.0 was even released.
Lazy middlewares and responder tag introduced in: #2473113-47: All stack middlewares are constructed at the same time even for cached pages commit date: 2015-04-30
Lazy middlewares and responder tag broken by: #2408371-83: Proxies of module interfaces don't work commit date: 2015-06-30
Comment #13
znerol commented#2408371: Proxies of module interfaces don't work introduced
ProxyServicesPass. But that runs beforeStackedKernelPass. So thelazyflag addedStackedKernelPassto middleware definitions on top of the first responder has no effect.I tried to move the
StackedKernelPassbeforeProxyServicesPass.But
ProxyServicesPassdoesn't like that:Besides the
HttpKernel, the following middleware classes are reported:Drupal\Core\ProxyClass\StackMiddleware\ContentLengthDrupal\Core\ProxyClass\StackMiddleware\KernelPreHandleDrupal\Core\ProxyClass\StackMiddleware\SessionDrupal\big_pipe\ProxyClass\StackMiddleware\ContentLengthThere are for sure more missing in contrib and custom code. Thus, trying to fix the lazyness is not an option I guess.
Comment #14
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #15
nicxvan commentedI unpublished the cr.
Comment #16
ghost of drupal pastThe CR is not clear to me. Is there a before/after of what should be done?
Comment #17
znerol commentedThe CR is a stub. Help welcome.
Comment #18
znerol commentedComment #19
znerol commentedComment #20
znerol commentedPerformance (Drupal core standard install)
Software:
Webserver:
Core baseline:
Core + patch:
Comment #21
znerol commentedDrupal CMS baseline:
Drupal CMS + patch:
Comment #22
znerol commented(forgot to attach raw data)
Comment #23
godotislateThere are a couple test failures, not sure if intermittent or unrelated.
One thing to think about, and maybe this is a follow up:
Deprecate prepending the decorated
http_kernel.basicservice object directly, so that it's always the service closure prepended as an argument to each http middleware instead. That way we eventually don't need to run reflection and get the constructor arguments at all.Comment #24
znerol commentedThe failure of
UncaughtExceptionTest::testLoggerException()is interesting. This test examines the PHP error.log in order to check whether exceptions get logged when the normal logger failed.https://git.drupalcode.org/issue/drupal-3538294/-/pipelines/560646/test_...
I reproduced it locally and found that
NegotiationMiddlewareappears in the stack trace with the patch, but doesn't on the 11.x branch. The reason is thatMonkeysInTheControlRoommiddleware is registered with the samepriority. But sincePriorityTaggedServiceTrait::findAndSortTaggedServices()uses a different method of sorting, the middlewares swapped their location in the stack.Bumped the monkey middleware to an appropriate priority to solve the test fail (
404).Comment #25
znerol commentedComment #26
catchI think the dates in #12 should be 2015 rather than 2026?
It's a shame we don't currently have a way to cover this with performance tests, but it would end up being almost as specific as the new test coverage, and the explicit test coverage added here looks like it should stop this regressing again. Maybe one day - it should be enough of an improvement to show on the dashboard though which is nice.
Would a service locator be an option here instead of the lazy services pass? We've been switching to those in #3483996: Remove lazy declaration and proxy class for cron and use service closure instead and similar issues.
Comment #27
znerol commentedThanks, fixed.
I haven't looked into service locators. The cited issue is using service closures (same tech as the PR). Lazy services are not part of the fix (the dysfunctional lazy flagging is removed in this PR).
As for the question whether it would be possible to do this without a custom service pass: I do not yet know. The service pass has two responsibilities: Assemble every service tagged with
http_middlewareinto a stack. This is done by prepending a reference to the decorated service to the arguments array of the decorator. The second responsibility is to inject all discovered middlewares into theStackedHttpKernel. This is required to propagate theterminatecall. Maybe the symfony container provides all the necessary tooling to realize that. But even then, it would be necessary to add some kind of BC pass to move old declarations (tagged withhttp_middlewareto their spot in the stack. I'd be interested to explore. Not sure whether this should be done in this issue or in a follow-up.Comment #28
znerol commentedCR is ready. Also pushed an update to the
StackedKernelPassdocs.Comment #29
berdirCan confirm that this loads around 16 fewer services using my dependency scripts from #3540386: [meta] Explore using more service closures to break deep dependency chains and load fewer services
Profiling is always tricky. The differences in #21 seem huge and honestly too good to be true :) I tested with just core and standard profile, not sure how many more services cms adds.
When I test with AB, thn it's something like 48ms vs 54ms or so, varies between runs. There's a ton of overhead there, this is on WSL with ddev, so I'm likely mostly testing all kinds of network things. Instead of trusting ab numbers, it might be more useful to parse php fpm logs.
Testing with blackfire gives more useful results I think: https://blackfire.io/profiles/compare/3f98e259-aff4-401b-b363-13523b17fe...
That's 9ms to 7ms, so a 2ms improvement, ~23%. blackfire in my experience blows up requests between 2x to 5x. But much more meaningful than that is the memory improvement, which is ~4MB to ~2MB, almost a 50% improvement.
Note: I was testing with redis enabled as a forgot/was too lazy to switch.
Comment #30
catchIt's unambiguously a significant improvement, but also this is a good reminder why we didn't add request timings to performance tests because the variation is just too much to get anything close to reproducible.
Comment #31
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #32
znerol commentedComment #33
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #34
znerol commentedComment #35
godotislateThis slipped under my radar for a bit, but it LGTM.
Comment #36
longwaveGreat find. Certainly undeniable that there is a performance improvement here!
Could/should we add an explicit test for this exact case to prevent a regression? All we would need to break this again is remove the
\Closuretype in the PageCache constructor, I think. We deletedtestLazyLateMiddlewares()because it's broken and wasn't testing what we thought it was, but perhaps we should port the underlying intent to this new way of doing things.Comment #37
catchI think that would explode because we also call the closure in runtime code, so it would have to be an actual revert, not just a missing type hint.
$response = ($this->httpKernel)()->handle($request, $type, $catch);Had one idea for a test that might work, (although typing this out seems a bit flawed). We could add a test stack middleware that throws an exception in its constructor and comes after page cache, then a test-only run would hit the exception. The problem with that though is that test would also pass if we delete the test module/middleware because we're testing the absence of something happening.
Comment #38
berdir> The problem with that though is that test would also pass if we delete the test module/middleware because we're testing the absence of something happening.
Instead of an exception, we could increment a key value counter, then we can verify the code runs, something like?
reset counter
Initial request to warm up cache, ensure counter is 1
page cache hit, counter is still 1
maybe even invalidate page cache, visit again, ensure counter was incremented again?
maybe a follow-up to explore so we can get this in and not hold it up on that?
Comment #39
znerol commentedI pushed a test which asserts that
http_kernel.basicis not initialized whenhttp_kernelis retrieved from the container.Also rebased and converted phpunit annotations to attributes.
Comment #40
catchNew test coverage looks great to me https://git.drupalcode.org/project/drupal/-/merge_requests/12857/diffs?c...
I still need to review the MR in a bit more depth, but on the basis of the test addition happy for this to stay at RTBC both for resolving #36 and also the test itself looking great.
Comment #43
catchCommitted/pushed to 11.x, thanks!
Comment #45
longwaveTagging as a highlight, improving performance is always good. See #20/#21 for some statistics, not all sites will manage this improvement but in general cached pages should load a bit faster for everyone.