Drupal 7 issue for #2747075: [meta] Improve WebTestCase / BrowserTestBase performance by 50%
Problem/Motivation
- Tests are slow and installation takes a long time (both for d.org test runs and local development)
- Backdrop found out that installation is deterministic and can easily be cached per profile
Credit: https://github.com/backdrop/backdrop/pull/1366
Their patch works, but is preparing for a known set of profiles instead of demand.
Proposed resolution
- Introduce --cache option for BC compatibility
- Cache database tables and files after installation
- Re-use cache when it exists
- Prevent race conditions with transactions
- Also add --cache-modules for local development only
99% of people don't change the database data or schema during development, so the cache will work fine.
Remaining tasks
- Get it committed to Drupal 7
User interface changes
- None, commandline only. UI change can be follow-up
API changes
- None
Profiling of setUp()
Before is without cache.
--cache

--cache-modules

| Comment | File | Size | Author |
|---|---|---|---|
| #81 | 2759197-simpletest-perf-63-reroll-81.patch | 15.33 KB | joelpittet |
| #63 | 2759197-simpletest-perf-63.patch | 15.32 KB | Stevel |
Comments
Comment #2
fabianx commentedComment #3
fabianx commentedComment #4
fabianx commentedComment #5
fabianx commentedThis should be fixed separately in a child issue.
Comment #6
stefan.r commentedI like the idea of this, it's fully opt-in, and if it's truly 50% that's a great improvement.
Comment #7
fabianx commentedHere are the benchmarks for one setUp() call with enabling common_test module (if not --cache-modules), but drupal_render() calls it 6 times (as there are 6 test* methods).
drupal_render() 62 passes, 0 fails, 0 exceptions, and 13 debug messages
Without cache:
Test run duration: 1 min 4 sec
With --cache:
Test run duration: 34 sec
And the diff to the version without cache (just setUp):
With --cache-modules:
Test run duration: 15 sec
And the diff to the version without cache (just setUp):
Need to work with Mixologic that we get the test runner --directory patch applied, so this can be tested.
Comment #8
fabianx commentedThe --cache part did resetAll() twice. So the benchmark was off. Fixed in above comment.
And now we got the 50%.
Comment #9
fabianx commentedFixed the bug, showed that the critical path has not changed (critical-path-unchanged.txt; created with diff -w) and added interdiff.
Comment #10
mpdonadioWon't this break transactions / rollbacks, or is there lack of test coverage for D7 with this area?
Comment #11
fabianx commentedVery good point!
Removed that and copy speed is thanks to the turn on indexes the same.
Comment #12
fabianx commented#10:
Seems we don't have tests for that except for the tables that had been excluded, but I don't think it was that important.
Comment #13
mpdonadioI think this will also help the copy speed. The FK check is probably unneeded since there aren't proper FKs, but I think it will help.
Comment #15
mpdonadioThat's what I get for trying to do this while I eat. This should be better.
Comment #16
mpdonadio/givesup
Comment #18
gor commentedMyIssam will brake transactions for nodes.
But difference even on /dev/shm stored tables is huge enough to do exclude list.
That's why I originally added it to backdrop patch.
Fabian one more time! Well done! Dris made a right choice :)
Comment #19
gor commented@Fabianx, I see that you got rid of profile related cache and generate cache on first test. Am I correct?
In first version I was doing the same but then I faced issue when some tests run under tests profile and another under standard.
As a result, cached tables did not represent proper set of modules, settings and test was failing.
That's why I included extra step as fake setUp before run tests and do it for each default profile.
Comment #20
fabianx commented#15: It was not you, it was DrupalCI having a conflict due to #2551981: Add --directory option to run-tests.sh test discovery. However as DrupalCI does not have --cache for obvious reasons to test the speed-up on the CI, we need to hardcode $test->cache = TRUE inside the patch for now.
Thanks for your contribution!
#18: I do agree and the boost is huge. However as this code will also run for contrib project tests a fixed whitelist is unfortunately not possible.
#19: I resolved the problem of concurrency by using a transaction for the DB copying plus try {} catch. For parallel runs this gives a minimal performance loss of 2s (until the caches are primed), but ensures I don't need any complicated locking. The transaction also ensures that the cache is always consistent.
As for different caches per profile, that is why I use $this->profile as normal cache key and a normalized version of func_get_args() for the module cache key.
I am not sure why your implementation did not work, but as $this->profile is setup before installation, I cannot see where this would fail.
Comment #21
fabianx commentedUnfortunately turning the checks off does not make the copy faster for me and the statements itself take 30 ms instead.
I found two flaws however and fixed them:
- Using a cache key suffix was not a good idea as it tried to copy the already cached tables as well. With a cache_key prefix, it works way better.
- Using a transaction won't work as DDL is non transactional in MySQL, so I removed it.
Uploading a new patch, which should pass as there should be no changes.
Comment #22
gor commented#21 Yeah, concurrency issues with cache generation was a part of moving cache create process outside of tests.
Comment #23
fabianx commented- Fixed concurrency problem using a different method (re-using semaphore table, but creating a new simpletest_semaphore table, which _only_ exists in the cache copy and is not copied over)
- Allowed cache to be re-created if e.g. the user pressed ctrl-c during storeCache()
- Warning the user if cache copy failed via a debug message, but not fataling
This works like a charm, now testing on d.org again.
Comment #24
fabianx commentedPostponed for the bug fix for:
drupal_static_reset('drupal_get_schema_versions');
Comment #25
fabianx commentedTurns out the drupal_static_reset() is no longer needed as I cold not reproduce the error anymore and was just a problem due to me forgetting to module_load_all() after loading the cache.
Fixed in interdiff, therefore no longer blocked.
Testing test suite with --cache enabled on d.org again.
Comment #27
fabianx commentedOh, it seems it _is_ a problem after all.
I now could also reproduce it. The problem is that some kind of state is held between test runs even with vanilla Drupal 7:
e.g. if I change:
testUpdateAccess => ttestUpdateAccess, _then_ the test fails, too.
Also the wrong state is stored in cache.
E.g. I apply the fix, but have corrupted state in cache, then it fails, too.
Will open a child issue now.
Comment #28
fabianx commentedOMG I know why:
- After the first test has run all the modules install files are already loaded in memory, so function_exists() works obviously.
This means that drupal_get_schema_versions() is called for the module before its .install file has been loaded!
Because we never install some modules in the cached version, the schema is wrong then.
And the reason is that more modules could be loaded already than had install files loaded, which mostly is for core modules.
The reason is purely test based however:
module_list() uses a static, therefore it has the module_list of the host installation.
Therefore the first call to drupal_get_schema_versions() sets all modules that the host has to array(), e.g. block, node, ...
But the module list cannot be refreshed before the system table exists, therefore the test runner cannot refresh it prior to installation, but the installation of system module already leads to the problem then.
The static reset fixes it obviously, but could lead to performance penalty.
Another possibly is to module_load_install() before checking the schema version, but that also runs always.
The best way to fix this is to reset the schema versions when the module list is reset in system_install():
And that fixes the problem as well.
Comment #29
fabianx commentedUploading a patch to see if its fixed.
Comment #31
fabianx commentedThe test fail looks unrelated.
Comment #32
fabianx commentedSelf-reviewed as if I was reviewing this for commit:
nit - These comments could be clearer. Ideas?
Missing return type.
nit - trailing "whitespace"
This needs a try { } catch as well as copyCache() can throw an Exception().
It might be even better to now put the try { catch } directly into copyCache().
nit - $this->profile or on the profile used by the test.
The prefix is already quite long, while 1cache is pretty explicit, it might be better to use:
1c_
to allow for longer table names as the 'simpletest' prefix and profile name and modules hash already have quite some length.
I do agree though that it is good to use the profile to also show which cache this is as this improves debuggability - as if it was just a hash.
--
Should also add a comment that 1 is needed as first character to ensure proper cleanup.
This should explain a little more why it is important that everyone can pass this stage.
Missing types.
prefixId / cacheKey => prefix_id / cache_key
Missing types
We use full descriptive parameter names in Drupal.
Some helper variables would be nice.
nit - can use $this->pass() instead.
Normal Cache should be "Installation Cache" as that is what we cache.
In theory this would not be needed, but there is a comment over there that cache tables need this change to properly register the new DB prefix, so we are running this when using --cache-modules.
Removing this woulds save ~ 1 sec, too.
If we do the 1cache_ change, this should be 1c_ as well.
Also needs a comment.
nit - setUp => the test setup
We already use a public property for the test results.
This avoids having to add a setter / getter and also allows tests to opt-out of caching by setting $this->cache = FALSE; in their setUp() method.
Comment #33
stefan.r commentedMay be prudent to keep 15 for now. As to 1, the short descriptions sound fine, but it could be nice to add a long description describing what those flags actually do.
Comment #34
stefan.r commentedI am fine with trying to get this into 7.50, but we need to clearly mark these options as experimental, and ensure no single line of code changes behavior with the options disabled (it doesn't look like it does).
Comment #35
fabianx commentedAs this only affects tests, this does not need to make 7.50 to go in, but can go in afterwards as experimental option.
Comment #36
fabianx commentedHere is the cleanup with interdiff, etc.
Can still be reviewed if someone wants to.
Comment #38
fabianx commentedNo longer blocked on the other issue, but needs to be rerolled now.
Comment #39
David_Rothstein commentedTo expand on that, one thing we talked about was that a testing-only experimental feature can probably be committed to core at any time - because it won't affect live sites, and also because core/contrib tests tend to run off 7.x-dev which means the feature can be meaningfully used as soon as it's in, regardless of when the next release is.
We would still wait until 7.60 (or later) to consider making this a "non-experimental" feature.
Comment #40
fabianx commentedJust a re-roll.
This needs some good reviews now.
Comment #41
fabianx commentedThis is blocked on reviews now.
Comment #42
MixologicTest run duration: 9 min 46 secfor the last branch test vsTest run duration: 9 min 51 secfor the patch in #40.What happened to the speeds we were seeing in #29?
Comment #43
fabianx commented#42: The thing is opt-in. You will need to put --cache for the test runner for that to work.
I hardcoded $this->cache = TRUE for the test run in #29. I can easily hard-code it again - but I think we tested performance on test bot enough now.
Comment #44
MixologicAh. Thanks for explaining that. ++
Comment #45
mpdonadioI had some suggestions in #13 for potentially speeding up the table copies. Did you get a chance to look at these? I'm not on my Drupal dev machine right now, or I would look at this again.
Comment #46
fabianx commented#45: I have tried those, but as we disable / enable the keys already, this did not bring any further speedup for me.
Comment #47
hestenetComment #48
tobiasbBefore
After (--cache):
After (--cache-modules):
Comment #49
joseph.olstadWow, amazing numbers, seeing as the D8 and D7 tests are currently very slow and the test server is always overloaded, it seems like this patch should get more attention. We could run twice as many tests in the same time given the reported performance benefits it's two fold.
Comment #50
larowlanThis doesn't work for me because of table name length violations.
My profile name is 13 characters long.
So with the simplest_{hash} prefix, the $to_prefix is 36 characters.
With a table name like cache_entity_message_type_category, the overall table name length comes to 69 characters.
Which is larger than the 64 characters.
Will attempt to come up with an approach to mitigate.
Comment #51
larowlanI don't think we need to include the profile in the table name given it is already used in calculating the hash
This works for me.
Also dropped the underscores.
3 minute test takes 21 seconds on subsequent run.
Big profile, lots of features (lots of features processing during install).
Comment #52
fabianx commentedNice, that change works for me :).
With --cache or with --cache-modules? (just curious)
Comment #53
joseph.olstaddrupal.org could patch it's D7 test runners using #51, would really help speed up the testing, lighten the server load.
Comment #54
fabianx commentedTesting speed increase on test runners again (hardcoded --cache to TRUE and --cache-modules to true in other patch), DO NOT USE.
Comment #55
fabianx commentedThe problem is that we need to get the Drupal 8 patch in first, before we can profit from it.
The other problem is that the Drupal 8 patch does not profit from raw installation caching as much as the Drupal 7 patch, so it is not secure to get in ...
Comment #56
jibranI don't think d8 patch should block this issue.
No need to pass anything to
timefunction.Comment #57
jibranComment #58
larowlanMy base class has both set to true
Comment #59
Stevel commentedThere seems to be a problem when the database gets out of sync with the file system (for instance when you clean up the test environment).
There can be various reasons:
- The 'clean up' action checks for folders starting with '1c_', but the cache folders don't have an underscore. This is an easy fix: just check for folders starting with '1c' instead of '1c_' (or change the prefix to something else)
- The database is migrated from another environment (or restored from a backup). In this case, there are no database tables to start with. Perhaps the code in copySetupCache() should check that there are tables to copy, and return FALSE otherwise?
Another question about copySetupCache(): why is the simpletest_semaphore table skipped? This skip doens't seem necessary (and would break install profiles that enable the simpletest module (theoretical, none that I know of)
Comment #60
fabianx commented#59: Good catch!
- #51 broke it, by removing the underscore.
- As we call it after an installation there should be tables to copy ...
- simpletest_semaphore is just an internal table to ensure not two processes try to gain the 'lock' to create the setup cache first. It is not part of the actual Drupal installation.
Leaving at CNW to revert parts of #51.
Comment #61
Stevel commentedScenario to reproduce:
But why not use the original semaphore table and combine the cache prefix and name in the same column, instead of creating a table per cache id with a single row in it?
Comment #62
Stevel commentedWriting an updated patch...
Comment #63
Stevel commentedComment #64
Stevel commentedAnd some patches that force the testbot to use the cache options.
Comment #69
joseph.olstadThe testrunner is broken for D7, I've openned an issue:
Composer command failed
#2970950: D7 test runner not working since may 4th 2018 'Composer command failed'
I think it was caused by this:
#2968630-2: Send phpversion to the composer command
Comment #70
joseph.olstadso ya, I think @Stevel patches are fine but they won't be able to be tested until testrunner gets fixed
Please feel free to make some noise in
#2970950: D7 test runner not working since may 4th 2018 'Composer command failed'
#2968630-2: Send phpversion to the composer command
Comment #74
joseph.olstadtest runner is fixed
Comment #75
marcelovaniWhat is the current situation with this issue?
Comment #76
fabianx commentedI am providing now a justification, why this can go into Drupal 7 even though it never entered Drupal 9:
- For Drupal 9 it's less useful, because we have minimal profile there so the (--cache) saves almost nothing
- --cache-modules would still be useful, but a large part of rebuilds is in the container, which varies a lot and is hard to cache (I tried really really hard but without further restrictions this just cannot be done efficiently)
So for Drupal 9 this is not really useful, but for Drupal 7 it saves a ton of time.
And even if we don't use it for testbot, anyone that still needs to run tests locally on D7 can profit from this.
So let's get this finally in and push it over the finish line.
Comment #77
fabianx commentedComment #78
mcdruid commentedtagging so that we try to get this into the next release; will do a proper review of #63 beforehand
Comment #79
mcdruid commentedThis needs a reroll, but I'm very keen to get it in to the next release.
Comment #80
joelpittetReroll fuzzed well, here it is.
Comment #81
joelpittetHmm, maybe I wasn't up to date, anyway new fuzz
Comment #83
mcdruid commentedStill needs a CR, but this is in. Thanks everyone!
Very happy that this was my 100th commit to core :)
Comment #84
joseph.olstadGreat work @mcdruid, thanks!