Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Follow-up to #2470679: [meta] Identify necessary performance optimizations for common profiling scenarios
Problem/Motivation
Page Cache is still slower as in Drupal 7.
Identify what takes how much time in the process of bootstrapping Drupal to the page cache.
Remaining tasks
- Profile
User interface changes
None.
API changes
N/A
Comment | File | Size | Author |
---|---|---|---|
#38 | interdiff.txt | 4.73 KB | Fabianx |
#38 | meta_page_cache-2501989-37.patch | 104.7 KB | Fabianx |
Comments
Comment #1
Fabianx CreditAttribution: Fabianx as a volunteer commentedCurrent findings (detailed report to follow):
- Request_Policy is overall a bad idea - performance wise.
There should never be a case where a request policy cannot prevent an object entering the page cache
=> always cache miss
=> better performance
It also prevents from doing a page cache earlier in the process (e.g. settings.php or preContainer - container loading regardless of method takes 50% of the time a full page cache Drupal 7 response did take)
Proposed resolution: Drop request policies as a concept
-
HttpKernel.Basic gets @event_dispatcher, @controller_resolver and @request_stack injectedIt is in itself not lazy and especially @event_dispatcher gets a lot of configuration injected itself.
That is all not necessary.
However making that service lazy automatically was somehow not possible? (or not performant?)
The following works however:Edit: It is lazy, I had wrong information in mind.
plus
Needs an issue and is an easy win. (for page_cache)
Proposed resolution:
Do not inject parameters into the http.basic kernel.
EDIT: Nnm, it is proxied in the original, so the above is not needed ...
Comment #2
Fabianx CreditAttribution: Fabianx as a volunteer commentedComment #3
dawehnerGreat analysis!
Note: On an ideal project I'd use a proper index.php, which means you can setup your kernel as you want.
In HEAD we do something like:
But ideally you would like to have something like
Just adding a couple of issues which fixes some of the problems ...
#2384675: Deprecate conf_path()
One small bit: https://github.com/symfony/symfony/issues/14906
As noted, this will be removed mostly in #2481453: Implement query parameter based content negotiation as alternative to extensions
Comment #4
Fabianx CreditAttribution: Fabianx as a volunteer commented#3: Thanks for opening the issues!
#2384675: Deprecate conf_path() unfortunately won't help, because it still re-creates the request from globals.
IMHO Drupal Kernel should inject the Site Path statically into the PublicStream Wrapper as default value.
Comment #5
dawehnerIf I understand your point correctly the problem is that we create the request object multiple times?
\Drupal\Core\DrupalKernel::getSitePath
though is certainly just a simple getter, so this cannot happen?
Oh I see ... so why do we access the public stream wrapper before the kernel?
Comment #6
catchAbout 50% of the classes loaded from page cache are from #2497143: Proxy services in the container force interfaces to be loaded - this will help with container/autoloader overhead. Made that a child issue of this one.
In general the idea is to drop request policies but keep response policies? I agree that apart from the session cookie there is not much of a use case for request policies - you almost always want to prevent something entering the cache rather than stopping a cache hit.
Without debugging, isn't this phpstorage loading the container?
Comment #7
znerol CreditAttribution: znerol commentedRequest policies are there in order to allow contrib authors and site owners to make requests cachable which by default would not be cached. The only use-case in core is the toolbar.
If you extend request policies you normally add allow-rules, while if you extend response policies you add deny-rules.
Comment #8
catchToolbar is probably going to stop doing that assuming #2217985: Replace the custom menu caching strategy in Toolbar with Core's standard caching. gets done.
I've personally forced certain responses to cache in reverse proxies before for authenticated users, but that's always been done by changing the http headers for the response. I guess it comes down to whether that should also be supported by the internal page cache.
Comment #9
Fabianx CreditAttribution: Fabianx as a volunteer commented#6: Correct, that is the container loading from PHPStorage using public://.
#7: But can't you just add a ResponseSubscriber instead to allow caching by overriding e.g. a max-age=0 set on the Response?
Also how does a request policy help if the cache item never makes it into the cache in the first place? I will need to check that ...
Comment #10
catchThe trick with toolbar is that it allows the toolbar path to be cached for authenticated users in the internal page cache, see https://api.drupal.org/api/drupal/core%21modules%21toolbar%21src%21PageC...
Non-zero max age doesn't let you do that - it's skipping the session check that's important there.
That implementation won't be necessary in core once the toolbar is using the new menu render caching, so it comes down to whether there's a use case for that.
Comment #11
Fabianx CreditAttribution: Fabianx as a volunteer commentedHm, so but not having it is not a security issue, right - it would just make things slower - because they only override things that are disallowed?
So if I moved page_cache to a bootstrap service via settings.php and static namespace registration for the page_cache module, I would miss out on request policies, but then the page_cache middleware could still check them.
Double check, but that check itself costs overall only 5 ms (and probably less because it would only need to deal with denied things, so the double hit is worth the 1-2 ms for cache hit.
Comment #12
Wim Leers#2: wow, great detective work!
#3:
:P
Comment #13
Fabianx CreditAttribution: Fabianx as a volunteer commentedGiving short update:
- Using our own composer autoloader should save 500 us
- Spoke with znerol, while removing request policies is not a good idea, removing the possibility to return DENY from request policies is.
- If only DENY is allowed, it is no security issue to not take request policies into account, page_cache then just is asked twice.
( Will also need logic to check $_COOKIE for SESS* as that is no longer available pre-container. )
=> Overall: Pre-container page_cache is feasible.
Upstream Symfony is very happy to accept patches for performance improvements - we'll see what will be possible to optimize.
TL;DR: I am positive D7 page cache speed can almost be reached.
Comment #14
Wim Leers\o/
Comment #15
Fabianx CreditAttribution: Fabianx as a volunteer commentedQuick update:
- 3.5 ms ( 5.7 ms before) reached with autoloader and some Symfony optimizations.
- 2.2 ms reached with a pre-kernel middleware.
I think around 1.9 ms is D7 on that same machine ...
Comment #16
Wim LeersNice! :)
Comment #17
Fabianx CreditAttribution: Fabianx as a volunteer commentedIssues that need to be created based on my latest local patch:
- vendor/symfony/http-foundation/ServerBag.php - check if key starts with 'H' via array string syntax; not sure it will fly upstream, but it saves a lot.
- vendor/symfony/http-foundation/ServerBag.php - Use if (!empty($authorizationHeader)) instead of === null to remove unnecessary stripos() calls
- Ensure Request::createFromGlobals() is only called once per request, used a GLOBAL set by DrupalKernel in my patch, could use the new &DrupalStatic component if available.
- Move accept header negotiation after page_cache (saves 0.2 ms - not yet done locally, so still to gain from above numbers!)
- Instead of populating classmap with files, use files directly (order matters, saves 11 autoloader calls)
- Use getRequestUri() instead of getUri() and figure out domain in a cheaper way than $request->getUri(), remove $request->getFormat().
- Prepare Response already in PageCache ($response->prepare() and save prepared Response to cache instead of whatever it gets, saves loading of CacheableResponse and CacheableResponseInterface, e.g. have very fast PreparedResponse class, which just does the necessary header sending and response sending.
- Consider using some D7 code for things like request_uri() in an utility class, our D7 version uses 3 us, while Symfony needs 77 us for getRequestUri() and I _think_ around 200 us for getUri().
Part of autoloader issue, so don't need issues, but noting as subtasks here:
- Use optimized PSR-4 loading of optimized build autoloader
- Register APC prefix in optimized autoloader
Comment #18
Fabianx CreditAttribution: Fabianx as a volunteer commented!!! WARNING !!! - The following code is very very hacky and very work-in-progress - it might not be suitable to minors and could generate feelings of severe hacky-code-paranoia. - !!! WARNING !!!
So after you have been warned (and with the @todo taken into account that this might not be possible, here are the numbers):
With pre-container-kernel "middleware" (hacked into index.php for now):
Not sure all of this is feasible, but pre-container middleware could be very very fast - is what it shows.
Without pre-container kernel middleware:
--
Patch includes all things from above and we need to split into sub-issues.
Comment #19
Fabianx CreditAttribution: Fabianx as a volunteer commentedComment #21
Fabianx CreditAttribution: Fabianx as a volunteer commentedOh, yeah, uhm, absolute paths won't work ...
Need to fix that here, too ...
Well at least the test-run will be really fast ;).
Comment #22
dawehnerImpressive numbers!
I think we can rid of timer now, given that we removed the timer out of the Drupalkernel
Comment #23
Wim LeersYes, absolutely impressive numbers! Go Fabian :)
Comment #24
Fabianx CreditAttribution: Fabianx as a volunteer commentedPatch won't apply either, but new patch solves the absolute paths by moving variable initialization to __construct() :).
!!! WARNING !!! - STILL VERY HACKY - !!! WARNING !!!
Comment #25
Fabianx CreditAttribution: Fabianx as a volunteer commentedNew find:
No middleware in core actually implements the TerminableInterface.
Therefore we can remove those services already during container building and not pass them on to Stack.
As we have a container based middleware, I would even vote to remove that feature completely and always pass an empty array ...
Then only the first middleware after a responder like page_cache needs to be proxied, which saves time and CPU.
Edit: By doing so one saves another 250 us and by moving page_cache as first middleware (prio: 600), we save another 300 us.
=>
New low:
=====
Request->getHost() is very slow, should be cached in the Request once trusted proxies are set - Symfony is very problematic there with using statics: It means an earlier getHost() potentially is different from a later one. If someone sets new information on the request, IMHO need to call resetHost() on the request.
There is a really good reason why PSR-7 objects are immutable - I get that now ...
======
There is several file_exists() in MTimeProtectedFastFileStorage (not gonna optimize those) and one is_readable() in Settings::initialize, which probably could be changed to @require_once instead ...
There is one file_exists() in findSitePath().
======
Comment #26
Fabianx CreditAttribution: Fabianx as a volunteer commentedSpent a little more time to optimize pre-kernel page cache:
Latest achievement:
LEFT: Drupal 7
RIGHT: Drupal 8
=> Comparable performance is _reachable_.
=========
It depends on how much upstream patches are accepted and to what lengths we want to go, but it seems possible at least.
( and the above includes cache tags for Drupal 8, but none for Drupal 7 (!))
Latest upstream patches include:
- Add static caches for host, scheme and httpHost.
Latest local optimizations include:
- Use static::CACHE_PERMANENT in DatabaseBackend.php to save class load of 'Cache" class
- Wrap Response in a new Response in PageCache
- Use $request->getSchemeAndHttpHost . $request->getRequestUri() as $cid. (good enough and faster than to normalize)
- Hardcode values for authenticated and anonymous in bootstrap.inc -- saves a class load, those are deprecated anyway and don't think we'll change them in D8 again ...
Probably the Response->prepare() change must be reverted though as the request can change (e.g. HTTP 1.0 vs. HTTP 1.1)
Comment #27
Wim LeersWOW!!!!!!!!
Comment #28
catch#1055862: Require sites.php to opt-in for multi-site support/functionality was intended to remove all file_exists() from findSitePath() but the actual committed patch didn't. I'd still go for refactoring that so we can fully ditch it.
Also wow++
Comment #29
Fabianx CreditAttribution: Fabianx as a volunteer commentedLatest patch and latest benchmark:
Drupal 7 vs. Drupal 8 with pre-container page cache.
Obviously with more complexity (brought Response::prepare back and added the two hardcoded request policies (check for cli and session active) there is more effort, so the GAP is smaller.
But overall this page cache is about as fast / slow as Drupal 7 (but Drupal 8 has the cache tags supported).
And the attached patch might even work ...
Lets see ...
Given the above numbers I recommend we open the major / critical issue to remove the possibility of request policies to deny something, but only return ALLOW. (API change). That is at this point the only thing blocking this patch.
Oh and likely session_name($session_configuration->getName()) should be called, then use session_name() to check for request->cookies->get(session_name()) ourselves.
As ::lookup in page cache relies on session_name() for some decisions.
Comment #31
Fabianx CreditAttribution: Fabianx as a volunteer commentedAnother try.
Comment #33
Fabianx CreditAttribution: Fabianx as a volunteer commentedOkay, two competing autoloaders are not fun. (PHPUnit loaded the composer one early)
Solved it by returning the composer autoloader in case the factory is loaded already.
Also moved file loading to a static factory method, again 3 function calls more, but no overhead lost. (phew)
Could be green (very optimistic), lets see ...
Comment #35
BerdirHm.. not being able to deny caching does break a use case I have (I have a special cookie that I'm using that I want to prevent the page cache on certain pages only).
If we remove that, can we at least make sure that we have support to set this through settings.php, which is what I used before request policies were added. So some sort of global or static method that I can call through that?
Comment #36
Fabianx CreditAttribution: Fabianx as a volunteer commentedOkay, so _that_ happens when you forget to turn off page_cache for POST requests :p.
Comment #38
Fabianx CreditAttribution: Fabianx as a volunteer commentedFix tests from removal of getRequestFormat().
Comment #41
andypost