Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
After #3031128: Update from 8.6.7 to 8.6.8 warnings - Drupal\Core\Extension\Extension has no unserializer was committed, test run times went from ~51 minutes to ~1h42 minutes.
Not 100% clear what in that commit caused it, but the callback in fixSerializedExtensionObjects()
seems a likely culprit.
Proposed resolution
Perhaps this error handler can be turned off during test runs?
Remaining tasks
User interface changes
API changes
Data model changes
Release notes snippet
Comment | File | Size | Author |
---|---|---|---|
#33 | php-7.2.x-apache_mysql5.5-buildTimeTrend-after.png | 17.29 KB | tacituseu |
#33 | php-7.2-apache_mysql5.5-buildTimeTrend-after.png | 16.74 KB | tacituseu |
#18 | php-7.2.x-apache_mysql5.5-buildTimeTrend.png | 17.94 KB | tacituseu |
#18 | php-7.2-apache_mysql5.5-buildTimeTrend.png | 16.79 KB | tacituseu |
#13 | 3032869-2-13.patch | 1.51 KB | alexpott |
Comments
Comment #2
tim.plunkettOne patch reverts #3031740: Updating to 8.6.8 or 8.6.9 with Drush 8 causes data loss via update_fix_compatibility() and #3031128: Update from 8.6.7 to 8.6.8 warnings - Drupal\Core\Extension\Extension has no unserializer, the other changes nothing. Let's compare test runs.
Comment #3
dwwHere's a patch that implements the suggested solution from the summary:
Early return in
UpdateKernel::fixSerializedExtensionObjects()
ifdrupal_valid_test_ua()
returns something TRUEish.Curious both about the bot result, and the timing. ;)
Comment #5
tim.plunkettempty.patch (HEAD)
Unit: 26 sec
Kernel: 7 min 10 sec
Simpletest: 55 sec
Functional: 56 min 36 sec
FunctionalJS:
Total: 71 min 21 sec
test.patch
Unit: 24 sec
Kernel: 6 min 56 sec
Simpletest: 57 sec
Functional: 32 min 52 sec
FunctionalJS: 6 min 6 sec
Total: 47 min 15 sec
#3
Unit: 28 sec
Kernel: 7 min 4 sec
Simpletest: 57 sec
Functional: 32 min 51 sec
FunctionalJS: 5 min 42 sec
Total: 47 min 2 sec
Looks like the one failure was for the test of that code...
The HEAD run wasn't _as_ slow as it's been all day, but this still illustrated the problem
Comment #6
dwwYup, nice. Thanks for the timing result summary. Very helpful.
Let's see what @alexpott has to say on how to proceed.
Cheers,
-Derek
Comment #7
alexpottugh.
I think there is some recursion going on.
Comment #8
LendudeFirst alarms raised in #2946294: Fix race conditions in OffCanvasTestBase by @tacituseu. Please consider @tacituseu for credit here.
Comment #10
tacituseu CreditAttribution: tacituseu commentedI suspect because of #2474909: Allow Simpletest to use the same APC user cache prefix so that tests can share the classmap and other cache objects and
'apcu_ensure_unique_prefix'
the affected tests are basically flushing cache from under all other currently running ones, no idea why this doesn't seem to affect 8.6.x to which it was committed too.Edit: The behaviour was similar when APCu was doing forced expunges because it was running out of memory (see: #2926309-21: Random fail due to APCu not being able to allocate memory).
Comment #11
alexpottHere's another attempt to ensure we only do one clear.
Comment #12
alexpottfor what it is worth - Functional testing took 57 min 44 sec for #7 so that didn't help at all.
If #11 doesn't help then it's not the recursion that's the problem it's the clearing of the caches which is quite a big concern.
Comment #13
alexpottThinking about this some more regardless of whether #11 makes things faster we still want the check from #7 because atm there are no other errors at this point there might be in the future and the correct action might not be clearing the cache.
Comment #14
alexpottSo for #11 Functional Test run duration: 44 min 8 sec
And for #13 Functional Test run duration: 44 min 12 sec
So this is much better. It's still longer but getting back towards where we were. This result points to our update path testing only being as fast as it is because of caches and pseudo caches in state which is quite worrying really.
Comment #15
alexpottI suspect the initial 44 min runs are part of the natural variation of aws testbots...
Looks like we have two sizes of instances :)
Imo this fix is good.
Comment #16
alexpottSo one thought I've got is - is the check on the error string too fragile? It's not necessary for the improves in test times.
OTOH being specific here might prevent future surprises.
Comment #17
alexpottI re-ran the revert patch a few times. Less variation hmmm.... but maybe a time of day thing. Tricky. Not sure why that'd be. Resubmitted some more tests of #13 to see how that's doing now.
Here's the results of re-run the revert:
https://www.drupal.org/pift-ci-job/1200735 00:34:15.642 Test run duration: 26 min 56 sec
https://www.drupal.org/pift-ci-job/1200736 00:36:07.507 Test run duration: 28 min 40 sec
https://www.drupal.org/pift-ci-job/1200737 00:43:50.140 Test run duration: 33 min 19 sec
https://www.drupal.org/pift-ci-job/1200739 00:34:20.779 Test run duration: 25 min 53 sec
https://www.drupal.org/pift-ci-job/1200740 00:45:58.911 Test run duration: 33 min
Comment #18
tacituseu CreditAttribution: tacituseu commentedRe #15: they all run on the same instances (c4.8xlarge), some time trends:
PHP 7.2 & MySQL 5.5
PHP 7.2.x-dev & MySQL 5.5 (ignore the spike between #2122-#2160 - it is an anomaly)
Comment #19
alexpotthttps://www.drupal.org/pift-ci-job/1200765 Test run duration: 44 min 30 sec
https://www.drupal.org/pift-ci-job/1200762 Test run duration: 27 min 34 sec
https://www.drupal.org/pift-ci-job/1200763 Test run duration: 29 min 12 sec
https://www.drupal.org/pift-ci-job/1200766 Test run duration: 26 min 29 sec
https://www.drupal.org/pift-ci-job/1200767 Test run duration: 34 min 57 sec
https://www.drupal.org/pift-ci-job/1200789 Test run duration: 43 min 45 sec
Comment #20
alexpott#19 is from another round of testing for #13/ #11 - still looks much quicker than HEAD.
It'd be great someone can think about #16. I'm unsure and could go either way.
Comment #22
tim.plunkettThis whole method is already really specific and targeted anyway. Relying on
$container->get('state')->get('system.theme.data', []);
to be the indicator of a needed cache clear is obscure enough that I don't see a problem with hardcoding the expected error message.Comment #23
tacituseu CreditAttribution: tacituseu commentedGoing over #15,#17,#19 it seems to be affecting only PHP 7.2 and PHP 7.2-dev MySQL environments.
Comment #24
xjmThat would explain why we weren't seeing such a problem with 8.6.x -- because the default tests for 8.6.x run against 7.0 and 7.2 is just a nightly.
Comment #25
tacituseu CreditAttribution: tacituseu commented@xjm: Not really, I did check PHP 7.2 and PHP 7.2.x-dev with 8.6.x for anomalies and not the default one.
Comment #28
xjmCommitted to 8.7.x and cherry-picked to 8.6.x. Thanks!
Comment #30
xjmThis has been temporarily reverted from 8.6.x only.
Comment #32
xjmRecommitted to 8.6.x. Thanks!
Comment #33
tacituseu CreditAttribution: tacituseu commentedSame time trends as #18:
PHP 7.2 & MySQL 5.5 (~15 min difference)
PHP 7.2.x-dev & MySQL 5.5 (~5 min)
Comment #34
alexpott@tacituseu are you saying that we've not yet fixed the problem? It looks like you are saying we still have a serious regression in PHP 7.2 test performance. I've just had a test run timeout on another issue where functional tests took 54 min 50 sec.
Comment #35
tacituseu CreditAttribution: tacituseu commented@alexpott: It's tricky as it depends how you define the problem, all I'm saying is that there's still some time penalty compared to pre-#3031128: Update from 8.6.7 to 8.6.8 warnings - Drupal\Core\Extension\Extension has no unserializer times (and so far only on two particular environments). I don't know what exactly is the mechanism behind it, didn't dig in yet.
It still only seems to affect
8.7.x
and PHP 7.2.* combination and for some reason has no effect on8.6.x
on the exact same PHP 7.2.* environments (or any other combination on8.7.x
andPHP 7
I checked for that matter - however the data points there are scarce).The
PHP 7.2 & MySQL 5.5
environment hasPHP 7.2.8
and PHP 7.2.x-dev isPHP 7.2.17-dev
, so that might account for some (~10 min) of the difference between the two.Edit: didn't notice any timeouts.
Comment #36
alexpott@tacituseu thinking about this has made me realise that we're clearing the cache all the time during 8.7.x updates without fixing it because we no longer use system.theme.data. Opened a new issue to address this - see #3037823: The system.theme.data key remains corrupted in state causing performance issues and is not used in >= 8.7