Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
It's the old Rock, Pebbles, and Sand analogy but for test. We need to run the slowest tests first so we know that they won't hog the test runner while everything else has finished. By running the slowest tests early we make optimal use of all 32 cores available.
Three possible approaches:
- Mark the slow tests with #slow. TestDiscovery orders by alphabet so the # ensures that these tests come first. It can be any tag obviously, but #slow has a nice ring to it.
Reverse the order of the tests. This would work because the slowest tests come late in the alphabet (Update, Views, Workflow).Randomise the order of the test. This helps because the same type of test tend to be slow and this evenly distributes them though the run.
Pros-cons:
- Option one would give us the highest level of control, but would take manual management and updating. Drawback pointed out by @Mile23 in #5 was addressed in #2296615: Accurately support multiple @groups per test class
- Option two is only by coincidence that it works, if we add a slow test to the access group, we are back to square one. But it seems like a logical assumption that updates tests are always going to be the biggest bottleneck, so it solves that. No manual management.
- Option three is random, so the slow tests could still wind up at the end, but over the long haul this should even out, new slow tests will just get shuffled in. No manual management.
Comment | File | Size | Author |
---|---|---|---|
#28 | 3038901.25_28.interdiff.txt | 357 bytes | dww |
#28 | 3038901-28.patch | 3.16 KB | dww |
#25 | 3038901-25_vs_3040694-11.interdiff.txt | 18.03 KB | dww |
#25 | 3038901.14_25.interdiff.txt | 1.75 KB | dww |
#24 | 3038901_14-vs-3040694_11.interdiff.txt | 19.78 KB | dww |
Comments
Comment #2
LendudeHere are the three options as shown in patches
Comment #3
LendudeHEAD
random
reversed
#slow
All tests done on PHP 7.2 with MYSQL. There is of course variation in test times but I ran this a couple of times in #3038758: [ignore] Lendude testing issue and these times seem a good indicator, same for the test of HEAD.
Comment #4
MixologicMaybe we could reuse the phpunit @large annotation and set the timeout to something long, like 20 minutes? Not sure if that's using that annotation appropriately, or not.
Comment #5
Mile23So when you say 'orders by alphabet' that's a bit of a problem, because TestDiscovery doesn't know how to deal with multiple groups. So if you make the first group '#slow' then 'settings_tray' is ignored by run-tests.sh. Anyone using run-tests.sh to run that group then loses those tests.
But hey, guess what. We have this: #2296615: Accurately support multiple @groups per test class
We could also introduce our own group annotation scheme like @early or something, which would then be a sort for run-tests.sh. Then establish some metrics with, for instance, speedtrap: https://github.com/johnkary/phpunit-speedtrap
Comment #6
Lendude@Mile23 yeah the loss of group also occurred to me, but -5 minutes on all run times is straight money in the bank, so it might be worth it. Or it means a +1 for randomise which currently doesn't have that drawback.
Doing an @early annotation sounds like a pretty good idea too, since that would clearly signal that this is a Drupal only thing instead of misusing the groups annotation for this.
The charm of the #slow group for me was that it really doesn't require any additional work in TestDiscovery nor run-tests.sh, updated the IS with the drawbacks of #5
Comment #7
alexpottI think Using @large is a bit problematic
see https://phpunit.de/manual/6.5/en/appendixes.annotations.html#appendixes....
60 seconds :) yes we can change it but we're amusingly different from that.
I think using groups is a good idea but it does sound like we need to fix #2296615: Accurately support multiple @groups per test class first.
Comment #8
Lendude#2296615: Accurately support multiple @groups per test class is in, nice! Updated the IS again to reflect that one of the drawbacks of using groups has been addressed. It seems like that option has the most traction, so struck out the others for now.
#2296615: Accurately support multiple @groups per test class also addressed an issue I had that the #slow group always had to be the first group in the annotation. Since the groups now get sorted, it should always float to the top of the list.
So the only question I still see is do we want to use #slow or something similar that uses the fact that we sort by alphabet or do we want a dedicated group (like @large) for which we provide special handling in run-tests.sh that moves that group to the front of the list.
To me, no new special cases/handling/logic in run-tests.sh sounds good, so I would opt for the #slow group, but curious what others think.
Comment #9
Mile23Here's a first pass at @early.
Comment #10
LendudeCouple of things I see with #9
This would lead to the entire group to be place at the front, so the whole group becomes early when one test in that group is @early. So I think we need to make @early into a pseudo group and place that at the start of the list. Since tests can now be in multiple groups anyway, that shouldn't be a problem.
This doesn't get picked up by the TestDiscovery regex
preg_match_all('/^[ ]*\* \@([^\s]*) (.*$)/m', $doc_comment, $matches);
it's also looking for a value, so if we don't want to modify the regex we need to do something like
@early true
, also annotations are not automatically added to the info, so we need to add it to the infoThis addresses those issues, lets see what happens.
Comment #12
LendudeFixed test.
Comment #13
Mile23Using a group is fine, too. I just wanted to avoid magical naming conventions that work because other things don't work.
Comment #14
LendudeAgree 100%, so I really wanted to see what it would take to get @early to work, 'cause that is a much less magical solution. But looking at it, I'm not sure if its worth the added complexity, since we would also need to update the regex, because having to do
@early something
is really bad DX I think.Lets try another #slow when #3039572: Prevent tests with multiple groups running multiple times lands.
Comment #15
LendudeHEAD => #14
Kernel: 6 min 57 sec => 5 min 12 sec
Functional: 34 min 23 sec => 31 min 38 sec
Comment #16
larowlanIt's primitive, but it works.
Comment #17
dwwI appreciate the primitive simplicity. ;)
Do we document test groups anywhere?
- If so, can we update about this? What's the criteria for being #slow? How will people know when to use this group or not? Is the expectation that core committers will Just Know(tm) and get it right from here on out? Or will @Lendude keep a watchful eye on all tests added to core? ;) Neither seems particularly sustainable.
- If not, can we add some? ;) Not sure where. Perhaps in core/tests/README.md and/or core/core.api.php? Both mention that test groups exist, but neither really gives any guidance about how to define them.
Otherwise, +1 for RTBC on this. Getting the slow operations started first is a classic scheduling algorithm in concurrent systems.
Cheers,
-Derek
Comment #18
Mile23Doc pages: https://www.drupal.org/docs/8/phpunit
We don't have much about groups, other than mentioning you need at least one in this document: https://www.drupal.org/docs/8/phpunit/phpunit-file-structure-namespace-a...
As far as metrics for which tests get the treatment: I'd say we should include a timing tool that fails your test run if an unmarked test takes too long. I'd say something like https://github.com/johnkary/phpunit-speedtrap but that requires PHP 7.1.
It's not a very complicated listener to put together, but does require doing. And then bikeshedding. :-)
Note that one benefit here is that we can say:
Comment #19
dwwYeah, I'm actually a maintainer of that doc guide. ;) I was thinking it'd be good to document this stuff directly in core, not just in a handbook page.
Meanwhile, https://www.drupal.org/docs/8/phpunit/phpunit-file-structure-namespace-a... does mention
@group
, but only to say:Not particularly helpful...
So yes, we could add something about #slow to that page. But I'm wondering if we can also put it into some of the included-with-core-in-Git docs, and if so, if there's a preference about where and what.
Speedtrap sounds nice, but that's obviously a separate issue. For now, I'd like some clarity (I guess from @Lendude) on the criteria used to generate this patch so we have something to start with.
Thanks,
-Derek
Comment #20
Lendude@dww excellent point on the docs. I would argue against putting it in the core git docs since this is a far from normal occurrence.
For now I would add a point to the 'group' explanation pointed out in #19. Something like:
An additional group '#slow' can be added to test that have a long running time to ensure these tests are run first.
Not very specific, but then again, currently there are no specific criteria for when to add this.
Which brings me to how I build this patch. I just looked at the tests that are last to finish or that ended way out of alphabetical order and tagged them to see if this had any effect. And I totally agree that this is not a sustainable method of doing this :)
#3033037: DrupalCI should track and display job run times should help determine if new slow tests get added by patches, so we could bump the priority on that one.
Adding an automated way of detecting this is obviously even better, so I agree with @Mile23 we should do the speedtrap follow-up and see if that works for us.
Comment #21
Mixologic#3033037: DrupalCI should track and display job run times is about the time of the entire build run (1 hr 10 min etc).
#2943340: Process PHPUnit junit reports separately is the drupalci issue that would have us use the phpunit junit.xml, which already has the test timing data in it, vs the junit that run-tests.sh generates.
Comment #22
Mile23Here's your follow-up: #3040694: Use the phpunit result cache to order tests by slowest first in gitlab jobs
Comment #23
amateescu CreditAttribution: amateescu commentedHow about including all the update path tests in the
#slow
group? They're all really slow "by nature" :)Comment #24
dww@amateescu re: #23: At #3040694: Use the phpunit result cache to order tests by slowest first in gitlab jobs we've got a patch that puts all tests that take longer than 75 seconds in the #slow group, and a mechanism to fail new tests that are added which take more than 90 seconds and aren't #slow. There's a lot from core/modules/system/tests/src/Functional/Update in that patch, but definitely not everything. ;) If we put too many things in #slow, it starts to undo the effectiveness of the scheduling algorithm.
Anyway, I'm really in favor of #3040694 as a quantifiable way to decide what belongs in #slow or not.
I just tried applying both patches. Much to my surprise, a few of the tests marked #slow in here didn't hit the 75 second mark over there. Of course that patch catches a lot more tests than this did. Looking at the interdiff, I also noticed one of the ones in here has a typo (
@gruop #slow
).So, not entirely sure how to proceed now. Honestly, I think #3040694-11: Use the phpunit result cache to order tests by slowest first in gitlab jobs is the actually RTBC patch now. But in fairness, this was @Lendude's cool idea, and @Mile23 and I just built upon it. In terms of issue history, commit credits, etc, I think we should commit something here first, with @Lendude as the primary author, then we can re-roll #3040694 to remove whatever the conflicts are, and commit @Mile23's cool test listener and the rest of the tests that should also be #slow.
How does that sound to everyone?
NW for the typo, and removing the #slow tests from here that aren't 75 seconds slow.
Comment #25
dwwComment #26
amateescu CreditAttribution: amateescu commented#24 sounds great to me :)
Comment #27
Lendude#24 makes sense to me, lets land this soon so we get the initial benefit and continue the awesome stuff in #3040694: Use the phpunit result cache to order tests by slowest first in gitlab jobs to automate this which might take a bit longer to perfect.
Comment #28
dwwLet's keep both of the Kernel tests from #14.
Comment #30
larowlanCommitted dcd66e1 and pushed to 8.8.x. Thanks!
c/p to 8.7.x