Problem/Motivation
There seem to be numerous random fails in ImageStylesPathAndUrlTest
testImageStylePrivateWithConversion()
https://git.drupalcode.org/issue/drupal-3483501/-/jobs/3348830
There was 1 failure:
1)
Drupal\Tests\image\Functional\ImageStylesPathAndUrlTest::testImageStylePrivateWithConversion
Behat\Mink\Exception\ExpectationException: Current response status code is
403, but 200 expected.
/builds/issue/drupal-3483501/vendor/behat/mink/src/WebAssert.php:888
/builds/issue/drupal-3483501/vendor/behat/mink/src/WebAssert.php:145
/builds/issue/drupal-3483501/core/modules/image/tests/src/Functional/ImageStylesPathAndUrlTest.php:315
/builds/issue/drupal-3483501/core/modules/image/tests/src/Functional/ImageStylesPathAndUrlTest.php:136
FAILURES!
Tests: 10, Assertions: 240, Failures: 1.
testImageStyleUrlAndPathPrivate()
https://git.drupalcode.org/issue/drupal-2630732/-/jobs/3305640
There was 1 failure:
1)
Drupal\Tests\image\Functional\ImageStylesPathAndUrlTest::testImageStyleUrlAndPathPrivate
Behat\Mink\Exception\ExpectationException: Current response status code is
403, but 200 expected.
/builds/issue/drupal-2630732/vendor/behat/mink/src/WebAssert.php:888
/builds/issue/drupal-2630732/vendor/behat/mink/src/WebAssert.php:145
/builds/issue/drupal-2630732/core/modules/image/tests/src/Functional/ImageStylesPathAndUrlTest.php:315
/builds/issue/drupal-2630732/core/modules/image/tests/src/Functional/ImageStylesPathAndUrlTest.php:85
FAILURES!
Tests: 10, Assertions: 240, Failures: 1.
...
Steps to reproduce
Proposed resolution
The root cause of this issue is discussed in #3496257: Race conditions in CacheCollector/State (again)
Remaining tasks
User interface changes
Introduced terminology
API changes
Data model changes
Release notes snippet
Issue fork drupal-3487371
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
- 3487371-11.1.x
changes, plain diff MR !10718
- 11.x
compare
- 3487371-2500x-ImageStylesPathAndUrlTest-as-is
changes, plain diff MR !10701
- 3487371-2500x-ImageStylesPathAndUrlTest-key-value
changes, plain diff MR !10702
- 3487371-ImageStylesPathAndUrlTest-fix
changes, plain diff MR !10703
- 3487371-debug-failing-test
changes, plain diff MR !10626
- 3487371-random-test-failure
compare
Comments
Comment #2
donquixote commentedThis use of randomString() looks suspicious.
EDIT: But does not explain why the test would fail further up.
Comment #4
spokjeComment #9
spokjeMR !10701 (which is
Drupal\Tests\image\Functional\ImageStylesPathAndUrlTestand some pipeline changes to make it, and only it, run 1750 times) gives us 13 fails out of the 1750 runs.These 13 fails breaks down to:
2x
testImageStyleUrlAndPathPrivateUnclean()10x
testImageStyleUrlAndPathPrivateLanguage()1x
testImageStyleUrlAndPathPrivate()MR !10702 shows that changing \Drupal::state()->set()/get() to Drupal::keyValue()->set()/get(); as suggested in the Slack thread passes a 1750 times run without errors.
The only reason this MR shows as failed is the fact that the artifacts are too big to upload:
Comment #11
spokjeMR!10703 contains the changes that should be committed after review.
Comment #12
spokjeComment #13
smustgrave commentedBased on the findings in #9, and I re-ran 10703 a few times and never got these random failures, when before I probably least got 1 each time.
Comment #14
dwwFantastic work! Great find.
Agree that the changes to
ImageStylesPathAndUrlTestlook good. That much is indeed RTBC.However, grep is finding more
\Drupal::state()in core/modules/image/tests/src/* and I'm wondering if we want to expand the scope here to fixImageStyleFlushTestandImageEffectsTestwhile we're at it. I don't remember seeing those randomly fail as often, but it seems like we shouldn't just do this conversion when we happen to be nailed by the trouble. I'm in favor of preemptively fixing all the tests to stop using state(). So we should probably fix more than 1 test class per issue.Comment #15
spokjeStrongly disagree with bundling more issues together, but feel free to add-n-fix whatever you like.
Problem here IMHO is where the scope-creep ends, also more tests to test 2500x, or at least as-many-times-fit-in-an-hour makes things very slow, distinct tests run lower and fail rate/ success prove drop.
Doubt that it's OK to just create an MR that changes state to keyvalue, without the above test-runs, since the mostly (very) low failure rate will probably mean a single run will pass the tests.
To me that would prove absolutely nothing about the need for the fix, but n=1, YMMV etc, etc :)
Let the powers-that-be decide on this one.
In the mean time I will continue to create single test fix issues, and surely not for the easy core credits, which I really don't need in the first place.
Comment #16
nicxvan commentedIn general I would agree with @dww's comment to expand, but here I think @spokje's reasoning is sound. These types of test failures require really long running multiple tests so it makes sense to fix one at a time.
Further there is a separate issue to address the race condition and anecdotally I've never seen this other tests fail.
Comment #17
dwwI was thinking that if release managers are saying “we shouldn’t use state in tests, only keyvalue, that we could do bulk.
But yes, strongly appreciate the long time it takes to do even single tests carefully. Wouldn’t dream of accusing you of lots of issues for credit. You’re a prolific contributor!
I was only wondering if we could add more scope without as much effort and rigor if we want to convert entirely. It’s a bit weird having a mix of both, so I’m hoping to minimize the time it takes to convert.
All that said, yes, following the issue where the underlying race condition in State API is hopefully going to be fixed. So maybe it’s moot to do this conversation?
Yeah, I’m torn, I guess we’ll leave it for the core committers to decide. Back to RTBC.
Thanks!
-Derek
Comment #18
donquixote commentedI'd say this issue needs an explanation why the problem occurs with state, but not with keyvalue.
From the comments here I understand it is a race condition, and from looking at the code, I assume it is caused by the cache layer in state, because otherwise state is just a wrapper around keyvalue.
So, this should be in the issue summary..
I do find this issue from April 2024, which might be related:
#3438424: [random test failures] Race condition in state when individual keys are set with an empty cache
Comment #19
spokje@donquixote There's a problem with multiple issues: When you're working on a lot of them, you forget that other people didn't. :)
Here's the issue where the root cause is attacked, hope that helps?
#3496257: Race conditions in CacheCollector/State (again)
Comment #20
catchGenerally on the scope:
I would essentially commit any quick fix to
ImageStylesPathAndUrlTestto stop it random failing including skipping it, because it is so annoying when it fails all the time.Fixing it instead of skipping it is much better.
I do think we should switch all test usage from state to key/value because it massively simplifies things, however it's currently useful that we haven't done that because it's the only way I know to reproduce #3496257: Race conditions in CacheCollector/State (again). So would be quite happy to commit individual fixes for the frequently random tests to make the pipelines more reliable, then handle other issues which aren't randomly failing (as often) together in one issue later.
Comment #22
catchCommitted/pushed to 11.x, thanks!
This doesn't apply to any previous branches including 11.1.x. I think it would be worth backporting to 10.5, 11.1, and 10.4 (11.0 and 10.3 if it's an easy cherry pick) so that we get less random failures on branch runs. Probably a backport to one branch would cherry-pick to all the others.
Comment #29
spokjeNot sure what either I or this issue has done to deserve waiingt 2:42 minutes for a full spellcheck on all files, but here we are...
Comment #31
catchCherry-picked back through to 10.3.x, thanks!
Comment #37
quietone commentedUpdated the proposed resolution with the issue where the root cause was discussed.