Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
There is a race condition when cache tables are recreated during a truncate. This causes exceptions to be thrown.
The original report
after upgrading to 8.6.12 from 8.6.10 seeing this error.
see screenshots
how to create.
click the "clear cache" button in admin/content/config
how to fix.
restart database server (mariadb on Fedora 25 linux)
recurrence
seems to be about every other day.
Proposed resolution
Catch a more generic exception if there are still problems the system will crash in other ways.
Remaining tasks
User interface changes
None
API changes
None
Data model changes
None
Release notes snippet
N/a
Comment | File | Size | Author |
---|---|---|---|
#15 | 3043907-15.patch | 8.94 KB | alexpott |
#15 | 5-15-interdiff.txt | 1.48 KB | alexpott |
#5 | 3043907-5.patch | 8.87 KB | dawehner |
error2.png | 329.49 KB | starlightE | |
error1.png | 130.65 KB | starlightE |
Comments
Comment #2
cilefen CreditAttribution: cilefen as a volunteer commentedI recommend putting the exception text in the issue summary so someone may find this. Contributors are busy and may not look at screenshots.
Also I recommend retitling this to “Cache clear occasionally throws” followed by a condensed bit of the exception message.
Comment #3
cilefen CreditAttribution: cilefen as a volunteer commentedThe class containing ensureBinExists was last changed in December 2017.
I find it interesting that ensureBinExists tries to catch a SchemaObjectExistsException however Connection::handleQueryException will never produce a SchemaObjectExistsException.
The cache clearing codepath executes "TRUNCATE TABLE" on each cache table. "Truncate operations drop and re-create the table, which is much faster than deleting rows one by one, particularly for large tables", according to the MySQL docs.
So I think that ensureBinExists isn't performing as intended because it cannot catch the correct exception in the event another process has already created the table.
Comment #4
catch#3046962: Race condition causing 500 on cache table creation was duplicate.
Comment #5
dawehner@catch
Thank you for finding the duplicated issue.
It seems there are two things we can do:
I've implemented the second option as it seems to be pragmatic.
Comment #6
catchAgreed with catching the less specific exception, it could potentially lead to silently not creating the table, but I cant really think of a situation where that would happen permanently and not affect anything else that would log/show an error.
Comment #7
johnwebdev CreditAttribution: johnwebdev commentedThat seems like a work-around though...? Can we reproduce this in a test? We should handle this and not gracefully ignore it IMO. Lock mechanism seems more valid to me.
Comment #8
dawehnerit is really hard to write reliable/useful tests for race conditions. I can't think of a good one, but I might miss something.
Comment #9
alexpottI've discussed the solution with @dawehner and I agree with @catch that this seems like the best solution. I don't think we can use the lock system because the lock system is affected by this :)
I've checked that all code that does this on demand table creation is fixed and it appears to be. I think we should file a non-blocking follow-up to try to move some of this auto table creation logic to a trait.
Comment #10
alexpottI found the followup already in the issue queue :) #2721271: Simplify exception handling for lazy-load pattern
Comment #11
johnwebdev CreditAttribution: johnwebdev commentedYeah, not sure. OWASP suggests black box testing. not sure if it's ever been done in Drupal, but it's definitely a interesting approach to see if more unexpected behaviour can be found by concurrency issues.
Comment #12
dawehner@johndevman I'd be interested in seeing whether this is actually possible. That would be cool.
Comment #13
Wim LeersWe could write a test that manually throws the exception that should have been caught. Though that would again lead to a false sense of safety.
Given that, I think we can move forward with the current patch.
Comment #14
cilefen CreditAttribution: cilefen as a volunteer commentedUpdating credit...
Comment #15
alexpottRemoving some unused uses.
Comment #17
alexpottUnrelated Javascript test fail.
Comment #18
catchCommitted/pushed to 8.8.x and cherry-picked to 8.7.x, thanks!