Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
9.2-dev HEAD test failed today in CKEditorIntegrationTest in the media_library module:
https://www.drupal.org/pift-ci-job/1942623
In case that goes away, the error says:
Media_library.Drupal\Tests\media_library\FunctionalJavascript\CKEditorIntegrationTest
✗
Drupal\Tests\media_library\FunctionalJavascript\CKEditorIntegrationTest
fail: [Other] Line 0 of sites/default/files/simpletest/phpunit-100.xml:
PHPUnit Test failed to complete; Error: PHPUnit 8.5.13 by Sebastian Bergmann and contributors.
Testing Drupal\Tests\media_library\FunctionalJavascript\CKEditorIntegrationTest
.F. 3 / 3 (100%)
Time: 1.04 minutes, Memory: 4.00 MB
There was 1 failure:
1) Drupal\Tests\media_library\FunctionalJavascript\CKEditorIntegrationTest::testButton
Failed asserting that a NULL is not empty.
/var/www/html/vendor/phpunit/phpunit/src/Framework/Constraint/Constraint.php:118
/var/www/html/vendor/phpunit/phpunit/src/Framework/Constraint/LogicalNot.php:111
/var/www/html/vendor/phpunit/phpunit/src/Framework/Assert.php:2887
/var/www/html/vendor/phpunit/phpunit/src/Framework/Assert.php:863
/var/www/html/core/modules/media_library/tests/src/FunctionalJavascript/CKEditorIntegrationTest.php:261
/var/www/html/vendor/phpunit/phpunit/src/Framework/TestResult.php:691
Steps to reproduce
Proposed resolution
Remaining tasks
User interface changes
API changes
Data model changes
Release notes snippet
Comment | File | Size | Author |
---|---|---|---|
#14 | 3192260-14-fail-100x30.patch | 1.94 KB | longwave |
|
Issue fork drupal-3192260
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
dwwYup. :( Hit this at https://www.drupal.org/pift-ci-job/1944871 for #997826: #states doesn't work correctly with type text_format. Moar tagz. ;)
Comment #3
catchComment #6
danflanagan8Adding a (likely) related issue.
Comment #7
danflanagan8Comment #8
danflanagan8Let's start over kind of...
Here's a patch that will simply runs the test that is randomly failing 100 times. Let's see if it fails and then go from there. I've never personally done this before. Fingers crossed.
I was referred by @longwave to #3055648: Frequent random fail in \Drupal\Tests\media_library\FunctionalJavascript\MediaLibraryTest.
Comment #9
danflanagan8Oops. Missing some quotes. Trying again.
Comment #10
longwaveUnfortunately the Jenkins log shows it did run the test 100 times but it succeeded each time - either 100 times is not enough or there is some other trigger for this bug :(
Comment #11
danflanagan8And I ran it a second 100 times! I'm going to pump this up. The related issue I added tested 400 times. I'm going to give that a shot.
Comment #12
danflanagan8Well that's 600 consecutive passes of the randomly failing test. This may be a tough one to trigger and then convincingly consider fixed.
If it is indeed randomly failing, it seems like it simply must be due to the relatively short wait being used for certain ajax. That's what the MR changes, for what it's worth.
Comment #13
longwaveTheory: due to parallelism in the test runner, the random fail only happens when certain other heavyweight tests are running at the same time that cause enough CPU load for this particular test to slow down and not react in time. But because this is effectively random due to the the various combinations of JavaScript tests that *could* be running at once, triggering this on demand is going to be difficult.
I don't see how increasing the timeout can hurt, it doesn't mean the test will take longer, it just means that it will consider waiting longer if the element is not there yet - and the existing values seem to be somewhat arbitrary. Therefore I'm going to mark this RTBC even though we couldn't prove that it fixes the issue.
Comment #14
longwaveIn fact let's try bumping concurrency to 30 and repeat to 100 and see what happens - unsure if it will run out of memory doing this though.
Comment #15
dwwThanks, @danflanagan8 and @longwave for moving this forward!
Drat, #14 didn't trigger this, either.
However, the theory in #13 sounds totally plausible. We haven't made any other real progress tracking this down or conclusively proving where the race condition lies.
The MR indeed only removes the custom timeouts for
waitForElementVisible()
so they use the default of 10000 milliseconds. Those were added in commit 6280478a1a from #2994699: Create a CKEditor plugin to select and embed a media item from the Media Library when this test was first added to core. Searching that issue for "waitForElementVisible", there are a few hits, but no explanations as to why those shorter timeouts were chosen. Therefore, +1 to @longwave's characterization of them as "arbitrary". Agreed that using the default value won't always cause this test to run longer, just that it'll potentially wait longer if it needs to, to avoid the random fail. Seems like a win to me.We could potentially try #14 again, but artificially shrink those timeouts to only 10 milliseconds each or something, but I don't think it's worth the trouble. We clearly have race conditions in this test, and letting the test wait longer for the elements it's looking for is a good way to reduce or eliminate those races.
Therefore, I think we should commit this. RTBC.
Thanks again!
Comment #19
catchWondering if this was an increase when it was first committed, then the default was increased later, but see no reason to artificially shorten the timeouts.
Committed/pushed to 9.2.x and cherry-picked to 9.1.x and 8.9.x, thanks!
Comment #20
danflanagan8Thanks everyone!
Comment #22
xjm