Speed up the tests on DrupalCI by not running slow tests last [#3038901]

It's the old Rock, Pebbles, and Sand analogy but for test. We need to run the slowest tests first so we know that they won't hog the test runner while everything else has finished. By running the slowest tests early we make optimal use of all 32 cores available.

Three possible approaches:

Mark the slow tests with #slow. TestDiscovery orders by alphabet so the # ensures that these tests come first. It can be any tag obviously, but #slow has a nice ring to it.
~~Reverse the order of the tests. This would work because the slowest tests come late in the alphabet (Update, Views, Workflow).~~
~~Randomise the order of the test. This helps because the same type of test tend to be slow and this evenly distributes them though the run.~~

Pros-cons:

Option one would give us the highest level of control, but would take manual management and updating. Drawback pointed out by @Mile23 in #5 was addressed in #2296615: Accurately support multiple @groups per test class
Option two is only by coincidence that it works, if we add a slow test to the access group, we are back to square one. But it seems like a logical assumption that updates tests are always going to be the biggest bottleneck, so it solves that. No manual management.
Option three is random, so the slow tests could still wind up at the end, but over the long haul this should even out, new slow tests will just get shuffled in. No manual management.

Comment	File	Size	Author
#28	3038901.25_28.interdiff.txt	357 bytes	dww
#28	3038901-28.patch	3.16 KB	dww
#28	8.8.x: PHP 7.2 & MySQL 5.5 26,131 pass
#25	3038901-25_vs_3040694-11.interdiff.txt	18.03 KB	dww
#25	3038901.14_25.interdiff.txt	1.75 KB	dww
#25	3038901-25.patch	2.65 KB	dww
#25	8.8.x: PHP 7.2 & MySQL 5.5 25,271 pass
#24	3038901_14-vs-3040694_11.interdiff.txt	19.78 KB	dww
#14	3038901-14.patch	4.6 KB	Lendude
#14	8.8.x: PHP 7.2 & MySQL 5.5 25,268 pass
#12	3038901-12.patch	6.88 KB	Lendude
#12	8.8.x: PHP 7.2 & MySQL 5.5 25,228 pass
#10	3038901-10.patch	5.48 KB	Lendude
#10	8.8.x: PHP 7.2 & MySQL 5.5 25,198 pass, 1 fail
#9	3038901_9.patch	2 KB	Mile23
#9	8.8.x: PHP 7.2 & MySQL 5.5 Patch Failed to Apply
#2	reverse-3038901-2.patch	364 bytes	Lendude
#2	8.8.x: PHP 7.2 & MySQL 5.5 25,220 pass
#2	randomize-3038901-2.patch	345 bytes	Lendude
#2	8.8.x: PHP 7.2 & MySQL 5.5 25,220 pass
#2	mark-slow-tests-3038901-2.patch	4.63 KB	Lendude
#2	8.8.x: PHP 7.2 & MySQL 5.5 25,220 pass

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Comment #1

10 March 2019 at 09:11

Lendude created an issue. See original summary.

Comment #2

Lendude

Dutch

Amsterdam

CreditAttribution: Lendude at ezCompany commented 10 March 2019 at 09:13

Status:

Active

» Needs review

File	Size
mark-slow-tests-3038901-2.patch	4.63 KB
8.8.x: PHP 7.2 & MySQL 5.5 25,220 pass
randomize-3038901-2.patch	345 bytes
8.8.x: PHP 7.2 & MySQL 5.5 25,220 pass
reverse-3038901-2.patch	364 bytes
8.8.x: PHP 7.2 & MySQL 5.5 25,220 pass

Here are the three options as shown in patches

Comment #3

Lendude

Dutch

Amsterdam

CreditAttribution: Lendude at ezCompany commented 10 March 2019 at 18:15

HEAD

Unit: 25 sec
Kernel: 7 min 16 sec
Functional: 34 min 31 sec
Javascript: 6 min 4 sec
Total: 52 min

random

Unit: 33 sec
Kernel: 5 min 32 sec
Functional: 30 min 31 sec
Javascript: 5 min 48 sec
Total: 47 min

reversed

Unit: 42 sec
Kernel: 5 min 7 sec
Functional: 34 min 5 sec
Javascript: 3 min 54 sec
Total: 46 min

#slow

Unit: 30 sec
Kernel: 5 min 10 sec
Functional: 31 min 47 sec
Javascript: 4 min 8 sec
Total: 46 min

All tests done on PHP 7.2 with MYSQL. There is of course variation in test times but I ran this a couple of times in #3038758: [ignore] Lendude testing issue and these times seem a good indicator, same for the test of HEAD.

Comment #4

Mixologic

He/Him/His

English

Portland, OR

CreditAttribution: Mixologic at Drupal Association commented 10 March 2019 at 19:15

Maybe we could reuse the phpunit @large annotation and set the timeout to something long, like 20 minutes? Not sure if that's using that annotation appropriately, or not.

Comment #5

Mile23

English

Seattle, WA

CreditAttribution: Mile23 as a volunteer commented 11 March 2019 at 01:39

+++ b/core/modules/settings_tray/tests/src/FunctionalJavascript/SettingsTrayBlockFormTest.php
@@ -9,6 +9,7 @@
+ * @group #slow
  * @group settings_tray

So when you say 'orders by alphabet' that's a bit of a problem, because TestDiscovery doesn't know how to deal with multiple groups. So if you make the first group '#slow' then 'settings_tray' is ignored by run-tests.sh. Anyone using run-tests.sh to run that group then loses those tests.

But hey, guess what. We have this: #2296615: Accurately support multiple @groups per test class

We could also introduce our own group annotation scheme like @early or something, which would then be a sort for run-tests.sh. Then establish some metrics with, for instance, speedtrap: https://github.com/johnkary/phpunit-speedtrap

Comment #6

Lendude

Dutch

Amsterdam

CreditAttribution: Lendude at ezCompany commented 11 March 2019 at 07:46

Issue summary:

View changes

@Mile23 yeah the loss of group also occurred to me, but -5 minutes on all run times is straight money in the bank, so it might be worth it. Or it means a +1 for randomise which currently doesn't have that drawback.

Doing an @early annotation sounds like a pretty good idea too, since that would clearly signal that this is a Drupal only thing instead of misusing the groups annotation for this.

The charm of the #slow group for me was that it really doesn't require any additional work in TestDiscovery nor run-tests.sh, updated the IS with the drawbacks of #5

Comment #7

alexpott

he/they

English

🇪🇺🌍

CreditAttribution: alexpott commented 11 March 2019 at 16:55

I think Using @large is a bit problematic

If the PHP_Invoker package is installed and strict mode is enabled, a large test will fail if it takes longer than 60 seconds to execute. This timeout is configurable via the timeoutForLargeTests attribute in the XML configuration file.

see https://phpunit.de/manual/6.5/en/appendixes.annotations.html#appendixes....

60 seconds :) yes we can change it but we're amusingly different from that.

I think using groups is a good idea but it does sound like we need to fix #2296615: Accurately support multiple @groups per test class first.

Comment #8

Lendude

Dutch

Amsterdam

CreditAttribution: Lendude at ezCompany commented 11 March 2019 at 19:34

Issue summary:

View changes

#2296615: Accurately support multiple @groups per test class is in, nice! Updated the IS again to reflect that one of the drawbacks of using groups has been addressed. It seems like that option has the most traction, so struck out the others for now.
#2296615: Accurately support multiple @groups per test class also addressed an issue I had that the #slow group always had to be the first group in the annotation. Since the groups now get sorted, it should always float to the top of the list.

So the only question I still see is do we want to use #slow or something similar that uses the fact that we sort by alphabet or do we want a dedicated group (like @large) for which we provide special handling in run-tests.sh that moves that group to the front of the list.

To me, no new special cases/handling/logic in run-tests.sh sounds good, so I would opt for the #slow group, but curious what others think.

Comment #9

Mile23

English

Seattle, WA

CreditAttribution: Mile23 as a volunteer commented 11 March 2019 at 19:55

File	Size
3038901_9.patch	2 KB
8.8.x: PHP 7.2 & MySQL 5.5 Patch Failed to Apply

Here's a first pass at @early.

Comment #10

Lendude

Dutch

Amsterdam

CreditAttribution: Lendude at ezCompany commented 12 March 2019 at 08:31

File	Size
3038901-10.patch	5.48 KB
8.8.x: PHP 7.2 & MySQL 5.5 25,198 pass, 1 fail

Couple of things I see with #9

```
+++ b/core/modules/simpletest/src/TestDiscovery.php
@@ -198,16 +200,26 @@ public function getTestClasses($extension = NULL, array $types = []) {
+    $list = array_merge($early_list, $list);
```
This would lead to the entire group to be place at the front, so the whole group becomes early when one test in that group is @early. So I think we need to make @early into a pseudo group and place that at the start of the list. Since tests can now be in multiple groups anyway, that shouldn't be a problem.
```
+++ b/core/modules/system/tests/src/Functional/Module/InstallUninstallTest.php
@@ -10,6 +10,8 @@
+ * @early
```
This doesn't get picked up by the TestDiscovery regex preg_match_all('/^[ ]*\* \@([^\s]*) (.*$)/m', $doc_comment, $matches);

it's also looking for a value, so if we don't want to modify the regex we need to do something like @early true, also annotations are not automatically added to the info, so we need to add it to the info

This addresses those issues, lets see what happens.

Comment #11

12 March 2019 at 09:36

Status:

Needs review

» Needs work

The last submitted patch, 10: 3038901-10.patch, failed testing. View results

Comment #12

Lendude

Dutch

Amsterdam

CreditAttribution: Lendude at ezCompany commented 12 March 2019 at 15:33

Status:

Needs work

» Needs review

File	Size
3038901-12.patch	6.88 KB
8.8.x: PHP 7.2 & MySQL 5.5 25,228 pass

Fixed test.

Comment #13

Mile23

English

Seattle, WA

CreditAttribution: Mile23 as a volunteer commented 13 March 2019 at 02:04

Using a group is fine, too. I just wanted to avoid magical naming conventions that work because other things don't work.

Comment #14

Lendude

Dutch

Amsterdam

CreditAttribution: Lendude at ezCompany commented 14 March 2019 at 09:36

File	Size
3038901-14.patch	4.6 KB
8.8.x: PHP 7.2 & MySQL 5.5 25,268 pass

I just wanted to avoid magical naming conventions that work because other things don't work.

Agree 100%, so I really wanted to see what it would take to get @early to work, 'cause that is a much less magical solution. But looking at it, I'm not sure if its worth the added complexity, since we would also need to update the regex, because having to do @early something is really bad DX I think.

Lets try another #slow when #3039572: Prevent tests with multiple groups running multiple times lands.

Comment #15

Lendude

Dutch

Amsterdam

CreditAttribution: Lendude at ezCompany commented 14 March 2019 at 11:02

HEAD => #14
Kernel: 6 min 57 sec => 5 min 12 sec
Functional: 34 min 23 sec => 31 min 38 sec

Comment #16

larowlan

🇦🇺🏝.au GMT+10

CreditAttribution: larowlan at PreviousNext commented 14 March 2019 at 21:47

Status:

Needs review

» Reviewed & tested by the community

It's primitive, but it works.

Comment #17

dww

we/he/they

CreditAttribution: dww as a volunteer commented 14 March 2019 at 22:03

I appreciate the primitive simplicity. ;)

Do we document test groups anywhere?

- If so, can we update about this? What's the criteria for being #slow? How will people know when to use this group or not? Is the expectation that core committers will Just Know(tm) and get it right from here on out? Or will @Lendude keep a watchful eye on all tests added to core? ;) Neither seems particularly sustainable.

- If not, can we add some? ;) Not sure where. Perhaps in core/tests/README.md and/or core/core.api.php? Both mention that test groups exist, but neither really gives any guidance about how to define them.

Otherwise, +1 for RTBC on this. Getting the slow operations started first is a classic scheduling algorithm in concurrent systems.

Cheers,
-Derek

Comment #18

Mile23

English

Seattle, WA

CreditAttribution: Mile23 as a volunteer commented 14 March 2019 at 22:30

Doc pages: https://www.drupal.org/docs/8/phpunit

We don't have much about groups, other than mentioning you need at least one in this document: https://www.drupal.org/docs/8/phpunit/phpunit-file-structure-namespace-a...

As far as metrics for which tests get the treatment: I'd say we should include a timing tool that fails your test run if an unmarked test takes too long. I'd say something like https://github.com/johnkary/phpunit-speedtrap but that requires PHP 7.1.

It's not a very complicated listener to put together, but does require doing. And then bikeshedding. :-)

Note that one benefit here is that we can say:

./vendor/bin/phpunit -c core --exclude-group #slow

Comment #19

dww

we/he/they

CreditAttribution: dww as a volunteer commented 14 March 2019 at 23:24

Yeah, I'm actually a maintainer of that doc guide. ;) I was thinking it'd be good to document this stuff directly in core, not just in a handbook page.

Meanwhile, https://www.drupal.org/docs/8/phpunit/phpunit-file-structure-namespace-a... does mention @group, but only to say:

Every test class MUST specify at least one (first) @group that matches the originating module short name (or Drupal core component name).

Not particularly helpful...

So yes, we could add something about #slow to that page. But I'm wondering if we can also put it into some of the included-with-core-in-Git docs, and if so, if there's a preference about where and what.

Speedtrap sounds nice, but that's obviously a separate issue. For now, I'd like some clarity (I guess from @Lendude) on the criteria used to generate this patch so we have something to start with.

Thanks,
-Derek

Comment #20

Lendude

Dutch

Amsterdam

CreditAttribution: Lendude at ezCompany commented 15 March 2019 at 06:36

@dww excellent point on the docs. I would argue against putting it in the core git docs since this is a far from normal occurrence.

For now I would add a point to the 'group' explanation pointed out in #19. Something like:
An additional group '#slow' can be added to test that have a long running time to ensure these tests are run first.

Not very specific, but then again, currently there are no specific criteria for when to add this.

Which brings me to how I build this patch. I just looked at the tests that are last to finish or that ended way out of alphabetical order and tagged them to see if this had any effect. And I totally agree that this is not a sustainable method of doing this :)

#3033037: DrupalCI should track and display job run times should help determine if new slow tests get added by patches, so we could bump the priority on that one.

Adding an automated way of detecting this is obviously even better, so I agree with @Mile23 we should do the speedtrap follow-up and see if that works for us.

Comment #21

Mixologic

He/Him/His

English

Portland, OR

CreditAttribution: Mixologic at Drupal Association commented 15 March 2019 at 14:19

#3033037: DrupalCI should track and display job run times is about the time of the entire build run (1 hr 10 min etc).
#2943340: Process PHPUnit junit reports separately is the drupalci issue that would have us use the phpunit junit.xml, which already has the test timing data in it, vs the junit that run-tests.sh generates.

Comment #22

Mile23

English

Seattle, WA

CreditAttribution: Mile23 as a volunteer commented 15 March 2019 at 22:01

Here's your follow-up: #3040694: Use the phpunit result cache to order tests by slowest first in gitlab jobs

Comment #23

amateescu CreditAttribution: amateescu commented 20 March 2019 at 14:01

How about including all the update path tests in the #slow group? They're all really slow "by nature" :)

Comment #24

dww

we/he/they

CreditAttribution: dww as a volunteer commented 20 March 2019 at 16:45

Status:

Reviewed & tested by the community

» Needs work

File	Size
3038901_14-vs-3040694_11.interdiff.txt	19.78 KB

@amateescu re: #23: At #3040694: Use the phpunit result cache to order tests by slowest first in gitlab jobs we've got a patch that puts all tests that take longer than 75 seconds in the #slow group, and a mechanism to fail new tests that are added which take more than 90 seconds and aren't #slow. There's a lot from core/modules/system/tests/src/Functional/Update in that patch, but definitely not everything. ;) If we put too many things in #slow, it starts to undo the effectiveness of the scheduling algorithm.

Anyway, I'm really in favor of #3040694 as a quantifiable way to decide what belongs in #slow or not.

I just tried applying both patches. Much to my surprise, a few of the tests marked #slow in here didn't hit the 75 second mark over there. Of course that patch catches a lot more tests than this did. Looking at the interdiff, I also noticed one of the ones in here has a typo (@gruop #slow).

So, not entirely sure how to proceed now. Honestly, I think #3040694-11: Use the phpunit result cache to order tests by slowest first in gitlab jobs is the actually RTBC patch now. But in fairness, this was @Lendude's cool idea, and @Mile23 and I just built upon it. In terms of issue history, commit credits, etc, I think we should commit something here first, with @Lendude as the primary author, then we can re-roll #3040694 to remove whatever the conflicts are, and commit @Mile23's cool test listener and the rest of the tests that should also be #slow.

How does that sound to everyone?

NW for the typo, and removing the #slow tests from here that aren't 75 seconds slow.

Comment #25

dww

we/he/they

CreditAttribution: dww as a volunteer commented 20 March 2019 at 16:51

Status:

Needs work

» Needs review

File	Size
3038901-25.patch	2.65 KB
8.8.x: PHP 7.2 & MySQL 5.5 25,271 pass
3038901.14_25.interdiff.txt	1.75 KB
3038901-25_vs_3040694-11.interdiff.txt	18.03 KB

Comment #26

amateescu CreditAttribution: amateescu commented 20 March 2019 at 17:06

Status:

Needs review

» Reviewed & tested by the community

#24 sounds great to me :)

Comment #27

Lendude

Dutch

Amsterdam

CreditAttribution: Lendude at ezCompany commented 20 March 2019 at 18:40

#24 makes sense to me, lets land this soon so we get the initial benefit and continue the awesome stuff in #3040694: Use the phpunit result cache to order tests by slowest first in gitlab jobs to automate this which might take a bit longer to perfect.

Comment #28

dww

we/he/they

CreditAttribution: dww as a volunteer commented 20 March 2019 at 23:47

File	Size
3038901-28.patch	3.16 KB
8.8.x: PHP 7.2 & MySQL 5.5 26,131 pass
3038901.25_28.interdiff.txt	357 bytes

1 file was hidden/shown/deleted

File	Size
3038901-25.patch	2.65 KB
8.8.x: PHP 7.2 & MySQL 5.5 25,271 pass

Let's keep both of the Kernel tests from #14.

Comment #29

21 March 2019 at 09:52

larowlan committed dcd66e1 on 8.8.x

Issue #3038901 by Lendude, dww, Mile23: Speed up the tests on DrupalCI...

Comment #30

larowlan

🇦🇺🏝.au GMT+10

CreditAttribution: larowlan at PreviousNext commented 21 March 2019 at 09:53

Version:	8.8.x-dev	» 8.7.x-dev
Status:	Reviewed & tested by the community	» Fixed

Committed dcd66e1 and pushed to 8.8.x. Thanks!

c/p to 8.7.x

Comment #31

21 March 2019 at 09:53

larowlan committed 28ec907 on 8.7.x

Issue #3038901 by Lendude, dww, Mile23: Speed up the tests on DrupalCI...

Comment #32

4 April 2019 at 09:54

Status:

Fixed

» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Speed up the tests on DrupalCI by not running slow tests last

Comments

HEAD

random

reversed

#slow

Child issues

Related issues

Referenced by