Problem/Motivation
Drupal\config\Tests\ConfigFormOverrideTest and Drupal\system\Tests\DrupalKernel\ContainerRebuildWebTest have 5 random failures.
See https://qa.drupal.org/pifr/test/1119568.
#2516918: Prevent mobile browsers from zooming on all form inputs is a CSS-only patch, yet still has these failures. Also seen at #2476407-71: Use CacheableResponseInterface to determine which responses should be cached, and several other places.
Proposed resolution
Remaining tasks
Find the root cause.
User interface changes
None.
API changes
None.
Data model changes
None.
Comments
Comment #2
Wim Leers
Two more cases of this problem.
Comment #3
Wim Leers
And another.
Comment #4
Wim Leers
@borisson_ used git bisect to find the root cause, over at #2504139-69: Blocks containing a form include the form action in the cache, so they always submit to the first URL the form was viewed at. He thinks the culprit is #2501931: Remove SafeMarkup::set in twig_render_template() and ThemeManager and FieldPluginBase:advancedRender.
Comment #5
Wim Leers
TBH, I don't yet see how #2501931: Remove SafeMarkup::set in twig_render_template() and ThemeManager and FieldPluginBase:advancedRender could cause this. The failures in both tests seem related to some container/settings/config override, in any case, "boot stuff". #2501931 didn't touch that at all.
Comment #6
Wim Leers
Curiously, it seems that only Classic TestBot is failing, not DrupalCI!
Comment #7
alexpott
I've run the Drupal\config\Tests\ConfigFormOverrideTest test 200 times locally - no fails.
Comment #8
alexpott
Comment #9
olli
Another one in #2539222-9: Exception when deleting a translation when there is no canonical link template. Critical?
Comment #10
justAChris
Another, I think: #2550985-10: Remove SafeMarkup::set in _batch_test_finished_helper()
Comment #11
davidhernandez
Just saw ConfigFormOverrideTest fail on this one: #2363423: views-view-fields.html.twig gets escaped.
Comment #12
Wim Leers
And another: #2552013-11: Follow-up for #2481453: ensure the 'url.query_args': MainContentViewSubscriber::WRAPPER_FORMAT cache context is set.
Comment #13
Wim Leers
Adding #11's.
Comment #14
davidhernandez
I'm not sure how you could bisect this to a commit unless you can reliably trigger the test fails. Is anyone actually able to do that?
The asserts that are failing are right after a call to writeSettings(). Some random environment problem causing a write problem? Those aren't the only tests calling writeSettings() or doing any writing, so that doesn't make sense. \Drupal\config\Tests\ConfigEntityFormOverrideTest is doing some similar things and it isn't failing.
Comment #15
borisson_
The bisect I did, which @Wim Leers mentioned in #4, is not related to these random failures.
There was a misunderstanding.
Comment #16
Wim Leers
I've been working with Alex Pott, Mixologic and amateescu to find the root cause of this problem.
Mixologic and I quickly came to suspect PIFR, because of #6. Over the weekend, Mixologic already implemented #2551309: Increase opcache.max_accelerated_files (but had not yet updated that issue). Since MWDS happened, PIFR had built up quite the queue of tests (DrupalCI doesn't suffer from this problem since it has autoscaling). On top of that, there are >=1 patches that caused testbot to enter an endless loop (hence reducing the effective number of testbots). So, to help reduce the queue, Mixologic spawned a whole bunch of additional (PIFR) testbots.
Those new testbots all have the changes from #2551309: Increase opcache.max_accelerated_files. We don't know yet with 100% certainty why those are causing test failures, but we strongly suspect that the failing tests are doing writes in quick succession, which causes the written files' mtime to be the same, which causes PHP 5.5's OpCache to continue to use the old (but modified-in-the-same-second) cached files.
And it's because of that mix of majority-newly-spun-up, minority-already-running PIFR testbots that most (but not all!) PIFR testbots are showing these failures.
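To make the suspected mechanism concrete, here is a toy model in Python. It is only a sketch of the hypothesis above, not how OpCache is actually implemented: MtimeCache, write_with_mtime and demo are hypothetical names, and the assumption is simply that a cache decides whether a file changed by comparing its mtime at whole-second resolution. The mtime is set explicitly with os.utime so that the "two writes in the same second" case is reproducible.

```python
import os
import tempfile


class MtimeCache:
    """Toy stand-in (hypothetical) for a cache that revalidates by whole-second mtime."""

    def __init__(self):
        self._entries = {}  # path -> (mtime in whole seconds, cached contents)

    def load(self, path):
        mtime = int(os.stat(path).st_mtime)  # sub-second precision is discarded
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # same second as before: assume the file is unchanged
        with open(path) as f:
            contents = f.read()
        self._entries[path] = (mtime, contents)
        return contents


def write_with_mtime(path, contents, mtime):
    """Write the file, then force its mtime so the timing is deterministic."""
    with open(path, "w") as f:
        f.write(contents)
    os.utime(path, (mtime, mtime))


def demo():
    cache = MtimeCache()
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "settings.php")
        t = 1_440_000_000  # arbitrary fixed timestamp
        write_with_mtime(path, "v1", t)
        first = cache.load(path)
        write_with_mtime(path, "v2", t)      # second write lands in the same second
        stale = cache.load(path)             # the cache cannot see the change
        write_with_mtime(path, "v2", t + 1)  # a write one second later is detected
        fresh = cache.load(path)
    return first, stale, fresh


print(demo())
```

In this model the second load still returns the old contents, which would match asserts failing right after a settings write if the real cache behaves similarly.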
Over at #2126447-72: Patch testing, @amateescu did an experiment: he posted patches that reverted all 3 commits of today. That could've indicated that HEAD was broken, but it showed the opposite: that it must have been testbot.
Mixologic is now fixing these newly spun up testbot instances.
Comment #17
Wim Leers
More.
Comment #18
Wim Leers
Mixologic has updated the existing classic (PIFR) testbots, and now they're no longer causing these failures. Hurray! :)