Problem/Motivation

There are way too many test failures in CI. In fact, it's nearly impossible to have a fully green commit or daily pipeline.

This impairs the credibility of the framework in identifying real issues vs just reporting flakiness.

It's possible that at least some of the failures are due to concurrency. There is upstream work to implement retries ("#[Retry] attribute to support retrying flaky test", #6182), but even if it ever sees the light, it will only enable retries within the same PHPUnit process where the failure happened. That may not be enough in run-tests.sh, as concurrency will still be a factor during the retries.

Steps to reproduce

Look at any pipeline run and get frustrated by the number of failed jobs.

Proposed resolution

Introduce a configuration file for run-tests.sh, holding a list of tests that can be retried on failure and the allowed number of retries for each.

During the concurrent test run, keep track of the failing tests.

Once the concurrent test run has completed, loop through the remaining failed tests and run them in non-concurrent mode (or with lower concurrency, to be decided).

If there are still failing tests after all the retries, fail the entire job.
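The steps above could be sketched roughly as follows. This is only an illustration in Python, not the run-tests.sh implementation: the configuration is assumed to be a plain mapping from test class name to allowed retries (the issue does not define a file format), and `RETRY_CONFIG`, `rerun_failures`, `job_exit_code`, the `run_test` callback and the test class name are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical retry configuration: test class name -> allowed number of retries.
# The real file format for run-tests.sh is not defined in this issue.
RETRY_CONFIG: Dict[str, int] = {
    "Drupal\\Tests\\example\\Functional\\FlakyTest": 2,
}


def rerun_failures(
    failed: List[str],
    retry_config: Dict[str, int],
    run_test: Callable[[str], bool],
) -> List[str]:
    """Re-run failed test classes one at a time; return those still failing.

    `failed` is the list collected during the concurrent run. `run_test` is a
    stand-in for launching a single test class and reporting whether it passed.
    """
    still_failing: List[str] = []
    for test in failed:
        allowed = retry_config.get(test, 0)  # tests not listed are never retried
        passed = False
        for _ in range(allowed):
            if run_test(test):  # sequential run, so no concurrency is involved
                passed = True
                break
        if not passed:
            still_failing.append(test)
    return still_failing


def job_exit_code(still_failing: List[str]) -> int:
    """Fail the entire job if any test is still failing after all retries."""
    return 1 if still_failing else 0
```

Keeping the retry pass separate from the concurrent pass means a retried test runs without the concurrency that may have caused the original failure, which is the point of doing this in run-tests.sh rather than inside a single PHPUnit process.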

Remaining tasks

User interface changes

Introduced terminology

API changes

Data model changes

Release notes snippet

Comments

mondrake created an issue.

mondrake’s picture

Personally, I will only work on this once #3526459: Autodetect available CPUs in run-tests.sh for concurrency is in.

mondrake’s picture

Title: Let run-tests.sh re-run flaky tests on failure » Let run-tests.sh re-run flaky test classes on failure
Related issues: +#3526459: [PP-1] Autodetect available CPUs in run-tests.sh for concurrency
mondrake’s picture

Uh, there already is #3165263: Allow known flaky tests to be automatically repeated, and I even commented on it... Anyway, 5 years ago the scenario was:

Mark a patch RTBC and wait a few days/weeks for a random test failure.

Right now it's much worse: the probability of random test failures is such that, with multiple child pipelines, in the vast majority of cases at least one failure occurs, invalidating the main pipeline overall.
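To illustrate why multiple child pipelines make this worse, here is a back-of-the-envelope calculation. It assumes each child pipeline fails randomly and independently with the same probability, which is a simplification; the function name and the 10% figure are made up for illustration and are not measurements from Drupal CI.

```python
def p_any_failure(p_single: float, n_pipelines: int) -> float:
    """Probability that at least one of n independent child pipelines fails."""
    return 1.0 - (1.0 - p_single) ** n_pipelines


# Illustrative numbers only: a 10% random failure rate per child pipeline.
for n in (1, 4, 8):
    print(f"{n} child pipeline(s): {p_any_failure(0.10, n):.3f}")
```

Under these assumptions, the chance that the main pipeline is invalidated grows quickly with the number of child pipelines, even when each one is individually fairly reliable.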

mondrake’s picture

I closed #3165263: Allow known flaky tests to be automatically repeated, as it referred to a scenario that occurred in DrupalCI: issues being moved from RTBC to NW because of daily reruns. Now, with GitLab CI, issues are no longer knocked back on d.o., but the general situation still makes things difficult (developers retrying MR pipelines multiple times, etc.).

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

mondrake’s picture

Issue tags: +PHPUnit 13