Problem/Motivation
During the installation of Drupal we rebuilding and invalidate the caches a lot. This does a lot of queries on the database.
Proposed resolution
Use the MemoryBackend during installation - which results in the following comparison of installing standard via drush: https://blackfire.io/profiles/compare/9d742dcd-4160-45c2-96a9-09d2c5b56f...
Also during the interactive installer we are over 1.5 secs quicker and make at least one less query during the install. See #20 and #21.
The query count is also instructive - for a standard installer without this patch we do 17723 queries - with this patch we do 10733 queries.
Remaining tasks
User interface changes
None
API changes
None
Data model changes
None
Original issue summary
From comment #2294569-18: Determine cause of high memory consumption:
Performance wise I will open an issue to disable all caches during installation - this brings down to the time to install standard.profile from 35s to 9s.
Doing so! :)
| Comment | File | Size | Author |
|---|---|---|---|
| #101 | 2488350-by-alexpott-Wim-Leers-Berdir-Dimiter-c.patch | 6.11 KB | sylus |
| #100 | 2488350-3-100.patch | 6.23 KB | sylus |
| #98 | 2488350-3-98.patch | 5.56 KB | berdir |
| #88 | 2488350-3-88.patch | 5.62 KB | alexpott |
| #86 | 2488350-86.patch | 11.37 KB | wim leers |
Comments
Comment #4
alexpottStandard installs via Drush
Database compared with Memory https://blackfire.io/profiles/compare/f74b5c05-de0a-4213-a05e-516951436c...
Database compared with Null https://blackfire.io/profiles/compare/91f2097b-63ee-4215-bd38-193047f933...
Memory to Null https://blackfire.io/profiles/compare/08e021fd-6072-4004-9fbf-778e7c468d...
So both Memory and Null result in less queries (nearly 6000!) - Null results in less memory usage too... Memory increases it.
Comment #5
alexpottHere's a patch
And a profile of it. https://blackfire.io/profiles/compare/ea0491b2-1434-4934-a3a8-590f75111b...
Over 5000 less queries and less memory during the installer - I think this sort of performance benefit is worthy of a critical.
Comment #6
timmillwoodThis is pretty awesome, and a very nice patch, but wouldn't it be simpler just to call
drupal_installation_attempted()in CacheFactory? rather than adding InstallationAttemptedFactory.Comment #7
catchCan we mark the services private?
Drupal should be capitalised. Can we explicitly mark the class @internal? drupal_installation_attempted() should eventually be retired completely, not just refactored out to a service.
This patch might be the only way to do this at the moment, but there are old issues like #2115533: Add InstallerInterface to provide a stable non-interactive installer and #2213357: Use a proper kernel in early installer were moving towards the installer having its own kernel/container at which point we'd not need to rely on this installing-or-not global switch.
Comment #8
dawehnerI really love those improvements ... there is hope.
One day we should talk upstream about calculated container parameters ...
It feels like this belongs more into \Drupal\Core\Install ...
TRUE or FALSE, no matter what, its a string
Does that mean we should deprecate that method?
Let's use
::classComment #9
alexpottComment #10
alexpottThanks for the reviews!
Fixed everything in #7. Yes it would be nice to have an installer kernel but at the moment we swap out to the full kernel and container after system is installed.
#8
Comment #11
kingdutchPatch applied and looks good. Ran a Drupal installation (standard profile) on Docker for Mac using Drush. Once before and once after the patch.
Using the
timecommand I timed the whole thing for a rough benchmark.Before: 3m17.927s
After: 1m43.933s
Marking this cool improvement RTBC!
Comment #12
alexpottNoticed a minor docs thing.
Comment #13
heddnIs there any way to do this without the change to the interface? It seems like this should be a more encapsulated change that doesn't require change to CacheFactory. Especially when I'm seeing comments in #7 that some of this could go away sometime.
Or is CacheFactory considered @internal?
Comment #14
berdirConstructors are excluded from BC, it also has a default value, so a subclass could still call it.
We call drupal_installation_attempted() in many places, so I think it wouldn't have hurt too much to add one more. But it's IMHO also OK to make this change, at least for 8.4.
However, what we might want to do is remove the check from \Drupal\Core\Cache\ChainedFastBackendFactory::__construct() because this makes that pretty bogus now?
I'm trying this on a custom install profile, but I'm running into a fatal error. The problem is that we have some things that do not work at all with a null backend. \Drupal\hal\LinkManager\RelationLinkManager::getRelations() is one example and I'm pretty sure that's exactly the one I'm running into because default content denormalize is completely broken then, silently, without an exception or anything, it just doens't work..
That means this is blocked on #2868362: HAL RelationLinkManager caches and returns entity type definition object instead of id. And we might need to check if we more code like that somewhere, could also exist in contrib/custom code. But I suppose possibly breaking them is acceptable, since installer related stuff is also excluded from BC, this could theoretically fall under that..
Combined with that issue, installation works for me.
Before: 3m55.179s
After: 3m21.758s
I also tried MemoryBackend, didn't make a difference (2s slower actually, which isn't enough of a difference to be relevant). However, now that I think about it, using MemoryBackend would not trigger the RelationLinkManager bug.
Another idea would be cache tag invalidation, if we use a null backend here, might also want to avoid cache tag checksum updates? On my install profile, I get this after an install:
Another fun idea: Instead of adding an argument to CacheFactory, we could change the container during the installer to directly use MemoryBackendFactory, like kernel tests?
Comment #15
berdirTesting with standard as well:
Before: 0m30.702s
After: 0m24.489s
Not sure much virtualization slows it down, but if you're seeing improvements as big as #11 with this, then I'd really recomend to tune mysql a bit, see https://www.md-systems.ch/de/blog/techblog/2012/02/15/improve-mysql-perf... for example.
Comment #16
xjmLet's actually postpone then.
Also, I don't really see how a performance improvement for the installer could be considered critical (no matter how awesome), so downgrading to major.
Comment #17
alexpottDiscussed with @Berdir - regardless of #2868362: HAL RelationLinkManager caches and returns entity type definition object instead of id there might be other things that do this in contrib or custom land during the install. We can swap to the MemoryBackend instead.
Comment #18
alexpottHere's a switch to memory backend. And here is the profile for drush site-install https://blackfire.io/profiles/compare/9d742dcd-4160-45c2-96a9-09d2c5b56f...
Comment #19
timmillwoodJust to add my stats with patch from #12.
Without patch:
drush si --account-pass=admin standard -y 13.98s user 0.53s system 66% cpu 21.803 totalWith patch:
drush si --account-pass=admin standard -y 12.46s user 0.42s system 76% cpu 16.941 totalOutput from
timewith native local environment on Ubuntu 17.04 installed viaapt-get install lamp-server^on Intel i5-7600K with 16GB ram.Update: patch from #18:
drush si --account-pass=admin standard -y 12.49s user 0.41s system 75% cpu 17.087 totalComment #20
alexpottSo I've tried to profile the interactive installer. The key step is
install_profile_modules. With the patch applied my test run spent 8523ms as opposed to 9757ms. In terms of total memory used with patch 106954752 bytes as opposed to 96468992 bytes. We also save entire web request as we only have to call it 11 times instead of 12 - this is because we can do more in a batch step because we're doing less per module install.See patch attached for how I got these figures. You need to do a
composer require symfony/stopwatchto get the utility classes.Fuller details
With patch
Without patch
Comment #21
alexpottDid another run to get memory_max used per task.
Patch
VS (8.4.x)
So interestingly not seeing the total memory increase here and actually seeing a less high max memory usage!
Fuller details
With patch
Without patch
Comment #22
xjmWRT the difference between critical runtime vs. installer issues, we added this draft revision to the policy:
https://www.drupal.org/node/45111/revisions/view/10314345/10471220
Comment #23
dawehnerThis makes sense at least, given that the batch process ensures that the memory is lost in the meantime.
Comment #24
dawehnerAt least for me I use the installer more often than most of the administration pages.
Comment #25
berdirAlso, blackfireio might mess with your memory usage.
I tried to do a blackfire run of my custom install profile installation, but gave up after PHP ended up using about 6GB of memory and my system was swapping, decided that would not result in any meaningful information anyway ;)
Comment #26
gábor hojtsyComment #27
alexpottJust re-uploading the patch from #18 so it's the last file on the issue. Crediting @Berdir for the testing that lead to the swap of Null to Memory. Crediting @dawehner for code review in #8 and @catch for #7
Comment #28
alexpottComment #29
alexpottWrote a change record https://www.drupal.org/node/2876030
Comment #30
alexpottComment #31
dawehnerThis in general could lead to a lot of improvements all over the place. On the other hand it might be tricky, but I think on the longrun, we want to probably go down this route.
Comment #32
alexpottIn reviewing the profile for a standard install after applying this patch, #2302137: Improve performance when menu link value matches with the original value and #2872611: Optimise \Drupal\Core\Extension\ThemeHandler::refreshInfo() for the early installer to 8.3.x - https://blackfire.io/profiles/compare/a2ccd04e-d0e7-49fd-8e3b-18bf52a6c2... we're down to 12,000 or so queries during a standard install. Another 1500+ are due to cache_tags.invalidator.checksum being called from cache_tags.invalidator whilst invalidating cache tags. Since there are no database cache bins after this patch this is unnecessary too.
I don't think the pattern of injecting the core.installation_attempted into either or these services is a good idea. So I want to explore altering services during a compiler pass or something like that.
Comment #33
alexpottSo we already have \Drupal\Core\Installer\InstallerServiceProvider for the super early install environment where nothing works because there are no storages. So the patch adds \Drupal\Core\Installer\FullInstallerServiceProvider which does alot of the same things apart from not replacing things where we have storages. Here's the resulting drush standard install profile difference - https://blackfire.io/profiles/compare/cb0e6912-3c65-47ad-b2cd-64427bdfdb... - it is pretty sweet. We've also got rid of some of the expensive route rebuilding queries because we only finally dump the router once the install is done.
No interdiff because it's a new approach and one that allows us to make other changes to the installer container without having to inject the is installation_attempted thing everywhere.
Comment #35
alexpottHo hum it looks like the router improvements will have to wait for a follow-up. Which makes sense since they probably should have more discussion. Here's a more minimal change - but also at least it proves what @dawehner says in #31.
Lots of possible improvements and some might be a bit tricky but at least this now puts us in a place to explore this.
Here's a profile of the latest patch... https://blackfire.io/profiles/compare/cc3bf028-8144-4f1f-a923-e6e9867d02...
Comment #37
alexpottWe're hitting a segfault in
\Drupal\node\Tests\Views\FrontPageTest::testCacheTagsWithCachePluginTimelooking at the fails it is clearly to do with garbage collection. Going on a hunch our biggest use of objects is during the recursive container building so let's see what happens when we force it to be off during that.Comment #41
alexpottImplementing the new segfault patch as this issue seems to expose it. See #2859704-32: Intermittent segfaults on DrupalCI (some "did not complete due to a fatal error" with no additional info)
Comment #42
wim leersVery interesting progress here!
Comment #43
alexpottHow moving segfault seems to have moved on yet again... nice. So re-uploadng the patch in #35 as that is the correct patch to review.
Comment #44
catchShould this just be removed? Otherwise this looks great and more than resolves my reviews from earlier.
Comment #45
alexpott@catch yep that should go - it was an idea that maybe the installer should have it's own services yaml - but we can explore that later it is every necessary. Not reason for it now.
Comment #46
catch#45 has the gc hunks that were removed in #42.
Comment #47
alexpottToo many patch on a single checkout.... thanks @catch
Comment #48
wim leersBut what about other services having this service tag?
I think a safer approach is to make cache tags have zero effect in the installer environment: i.e. replace
\Drupal\Core\Cache\CacheTagsInvalidatorwith a no-op implementation.Comment #49
alexpott@Wim Leers hmmm. I'm not sure about that. This is the only core service with that tag. Whether or not something should be removed depends on what it is doing - no?
Comment #50
berdirI was thinking about that as well.
\Drupal\Core\Cache\CacheTagsInvalidator also supports invaliding on cache backends, which for example MemoryBackend uses. And likely requires, otherwise cache tag invalidations wouldn't work at all and I can totally imagine that this would break stuff.
So only replacing the checksum service likely makes sense. Redis for example replaces that specific service as well and doesn't define another one.
Comment #51
dawehnerIt feels better to implement a variant of the interface which does nothing rather than removing the tag.
Comment #52
dawehner@alexpott and myself had a little chat about it. I agree now, this is the minimal change we can do. The service is still around, but its simply not called anywhere. On top of that keep in mind, this is "just" the installer ...
Comment #53
catchOnly question here is whether we want a change record for the installer change.
We should probably delete the cache factory CR.
Comment #54
alexpottI've deleted the CR. I'm not sure that we need a CR because there is nothing that a module developer needs to do or can do to effect this change. The installer still has the same services, you can still cache things and invalidate tags that fact that this is stored in memory and not the DB should be and is transparent to the module / theme /site builder.
Comment #55
wim leersAgreed with #54.
Comment #56
dawehnerIn case anyone "relied" on it, they would have realized pretty quickly that the cache is cleared constantly.
Comment #58
alexpottMaybe all the objects in the cache are causing issues.
Comment #59
catchIf that passes, +1, it's removing more of a workaround (the clone) than it's adding (the cast). Also a small enough change I'd be OK with it happening here.
Comment #61
catchgc again, here's the start of the backtrace:
Comment #63
JvE commentedNice, this patch looks promising.
Unfortunately it doesn't seem to play nice with the config_installer profile.
I get this crash towards the end of the installation:
Update: Apparently a multilingual issue. "Fixed" by adding the --locale flag to the drush invocation. Strange that it works fine when both that option and this patch are absent, but not when only the option is absent.
Result of the patch: total install time went from 3m30s to 3m15s
Update 2: Locale flag was a red herring, it started working because the database wasn't properly dropped before reinstall. Completely fresh installs always run into this error.
Update 3: Moving all config translations from
sync/language/nl/to the config files insync/(and making them langcode: nl) and then deleting thesync/language/nl/directory made it work.This definitely needs more testing with multilingual setups. They seem to cause the system to confuse the config items on the filesystem, in the database and in the cache.
Comment #66
psf_ commentedHi, any advance here? We are developing a profile that require about 20 min in each install test :_ D
Thanks : )))
Comment #67
alexpott@psf_ how much does the patch here save you? Other profiles take a different route and build an install first and then use that install for each test. See https://github.com/BurdaMagazinOrg/thunder-distribution/blob/723feaafdbf... for example
Comment #68
psf_ commentedMy profile is broken now, I can't do a complete install to verify, but in the very first step install 84 modules. Without the last path here it take about 18 min, with the patch take about 17 min.
Comment #69
psf_ commentedThe patch work nice for me, in D8.5.6 with 84 modules and multi language.
Comment #70
psf_ commentedI builded a LXC VM where all file systems are in memory to try own profile. In that conditions, and without this patch, the installation time it's the about same that with disc access. I think that the big problem is that MySQL is slow...
In the DrupalCon 2018 was presented a new command to install, to dev not production, that use SQLite and it install the standard profile in about 1.5 minutes: Link
The default MySQL engine, InnoDB, use transactions that is slow but I don't know if we can use other engine or the transactions is a must to Drupal.
Comment #71
dimiter commentedThe patch mentioned in #58 did not apply any more to 8.6.5, so I have re-rolled it.
Comment #72
dimiter commentedI have removed the patch I had supplied above, because it didn't seem to work as I had hoped it would. We ended up just removing the patch in order to be able to upgrade to >= 8.6 .
Comment #74
alexpottRerolled on top of 8.8.x - which now being PHP7 only allows us to think about this again.
Comment #76
alexpottComment #77
wim leersLooks like this makes Drupal core test runs ~10% faster!
I think it'd be good to document the rationale for this change. #58 introduced it.
AFAICT it's to avoid the use of
cloneand instead rely on array-to-object casting to break any references. Because usingclonecaused some problems. But why exactly?Nit: should use
Blah::class.Fixed point two myself, leaving point one for @alexpott.
Comment #78
alexpottHere's some comments I prepared earlier... over in #3055443: Switch to a null backend for all caches for running the database updates which has the same changes for the same reasons. It's about keeping the number of objects down so that doing this doesn't have as big an impact on PHP's internal object tracking for garbage collection. Understandably we get hundreds of cache objects created if we don't make this change and PHP has to keep track on them all and garbage collect them. But must cache writes are not read again in the same request so there's no point in creating an object early. And cloning creates even more objects.
Comment #79
catchJust for archaeology's sake, we first added a no-op cache implementation in 2007 #138706: FormAPI 3: Ready to rock. This wasn't for performance reasons but to have a backend that worked both before and after the cache tables being created. We removed it again in #1792536: Remove the install backend and stop catching exceptions in the default database cache backend when the default database cache started to create its own tables and not rely on system_schema().
The comments added in #78 look good.
Comment #80
wim leersSee, my guess was wrong! Thanks for adding that comment :)
RTBC++
Comment #81
alexpottWe have too many installer service providers... will move some stuff into a trait.
Comment #82
alexpottFixed #81. Also without this patch standard install does 17723 queries. With it does 10733 queries... putting that in the issue summary. The patch in #78 also results in the same number of queries as this one so the changes here don't affect the outcome of the patch.
Comment #83
wim leers"InstallerCaching" vs "use memory cache backend and replace cache factory"
I think the "use memory cache backend" could be removed if the trait's name and the method name would be improved. Also, it does much more than just using a memory cache backend; I think most of the now no longer executed queries are for cache tag invalidations?
What about
InstallerContainerOptimizerTraitandoptimizeContainerForInstallation()?Comment #84
alexpottLooking at all the optimisations we need throughout the installer need to be in NormalInstallerServiceProvider so we can use class inheritance and make it a bit simpler (in a way) - at least less code and places to put things.
Comment #85
alexpott@oknate pointed out on #3055443-18: Switch to a null backend for all caches for running the database updates that some British spellings are in my comments.
Comment #86
wim leers#84: even better!
#85: I noticed that too but bit my tongue 😜
In this interdiff:
InstallerServiceProviderno longer needs to do thanks to it extendingNormalInstallerServiceProvider.Comment #87
alexpott@Wim Leers nice one! Those changes look great to me.
Comment #88
alexpottRerolled now that #3055443: Switch to a null backend for all caches for running the database updates has landed.
Comment #89
catchComment #91
mbovan commentedThe latest patch (#88) applies cleanly to 8.9.x branch and passes tests.
Is there anything remaining preventing it to be RTBC taking into account that it was already RTBC'd in #79?
Comment #92
alexpott@mbovan i don;t think so.
Comment #93
catchLooks great now and should improve installer performance a fair bit.
Comment #94
alexpottThere are no API changes here now.
Comment #96
catchCommitted f14155d and pushed to 9.1.x. Thanks!
This is arguably beta-safe but marking fixed against 9.1.x for now.
Comment #98
berdirConfict due to #3138718: Convert British English spellings to American English, for the umpteenth time, so rerolling for 8.9.0. And probably only conflicted because this issue made out of scope changes to that block and already included the spelling fix :)
Comment #99
heddnComment #100
sylus commentedI needed a patch that installs on top of Lightning so re-rolled the patch in #98 for that purpose.
Comment #101
sylus commentedJust adding a patch against latest profile inheritance for 9.0.x