Problem/Motivation
DatabaseBackendUnitTest fails currently with PostgreSQL as database backend. Note: #2443647: PostgreSQL: Fix system\Tests\Cache\ChainedFastBackendUnitTest fails for the same reason, so no matter which one we work on first, it will fix the other issue too.
This is because DatabaseCacheTagsChecksum does a select query via the query method, which we do not wrap transactions around. Normally all select queries get the savepoint added and released, which would allow the cachetags table to get created and Drupal can continue on.
Options:
- Change DatabaseCacheTagsChecksum to use select class?
- Add transaction handling in DatabaseCacheTagsChecksum (I made a small patch, but seemed to have more failures).
- Add transaction handling, but only for preg match of select queries in Connection::query() in pgsql driver.
Proposed resolution
Identify and fix the failing tests.
Remaining tasks
Write patch.
User interface changes
None.
API changes
None.
Comment | File | Size | Author |
---|---|---|---|
#27 | 2443651.27.patch | 3.24 KB | alexpott |
#27 | 20-27-interdiff.txt | 1.25 KB | alexpott |
#20 | 2443651.20.patch | 2.97 KB | alexpott |
#9 | postgresql_fix-2443651-9.patch | 1.15 KB | mradcliffe |
#9 | interdiff-2443659-5.txt | 1.34 KB | mradcliffe |
Comments
Comment #1
mradcliffeComment #2
bzrudi71 CreditAttribution: bzrudi71 commented+1 for: change DatabaseCacheTagsChecksum to use select class. Seems like the cleanest and best solution to me but still open for other solutions :-)
Comment #3
BerdirThis is in the critical path just like cache get, where we explicitly avoid using the select method ( see DatabaseBackend).
Why is this a problem but that one isn't? Why doesn't every single test fail because of this? Maybe they implicitly rely on set() being called first?
Comment #4
daffie CreditAttribution: daffie commentedIt all looks good to me.
I can confirm that the test fails for postgreSQL and with the patch the test passes for postgreSQL.
As far as I can see the code changes but does the same.
So for me it is RTBC.
Comment #5
BerdirMy questions have not been answered yet.
Comment #6
amateescu CreditAttribution: amateescu commentedThis would've been fixed by #2371709: Move the on-demand-table creation into the database API :/
Comment #7
bzrudi71 CreditAttribution: bzrudi71 commentedYep, #2371709: Move the on-demand-table creation into the database API seems to be the root cause and should fix this issue without doing quirks within the pgdriver itself. If the try->catch stuff is gone, PG should pass tests...
Thanks @amateescu for the link to the other issue.
Comment #8
andypostAlso using select here is performance regression because query alter overhead.
The change should be measured.
Comment #9
mradcliffeWould wrapping it in a transaction be worse for performance? If the query is not guaranteed to succeed, then it should be wrapped in a transaction for PostgreSQL (and MySQL strict mode).
I looked into driver-specific code, but I think that's too expensive for PostgreSQL because I had to preg_match all select queries run through query. This is just one query. High-performance sites hopefully will be running a non-database backend for caching.
I think band-aid fixing this now is fine because the other issue stalled out.
Comment #10
bzrudi71 CreditAttribution: bzrudi71 commentedUsually transactions are not that worse in terms of performance as one would think, so this seems the way to go before we get rid of the root cause in [#2371707]. And please let us not even think of preg_matching in queries any more :-) To make sure this one is good to get in I will start a complete bot run tomorrow before RTBC.
Comment #11
BerdirAs commented before, I would still like to understand why this fails but the actual backend works, which uses an absolutely identical pattern? And why does it *only* fail in that kernel test, and not everywhere? We have to run into this scenario after *every* installation?
If you look at GenericCacheBackendUnitTestBase::getCacheBackend(), that explicitly calls deleteAll(), which should be not needed, because we should be able to assume that any backend will respect the test environment and be empty already. However, if you look at deleteAll(), it does a truncate query and contains the table-creation logic as well, my guess is that creates the cache table?
But that doesn't explain what creates the for web tests and real installations?
Comment #12
mradcliffeI spent about 9 months going around and around debugging the install sending garbage queries before I gave up and wrapped Select queries in implicit commit save points (#2001350: [meta] Drupal cannot be installed on PostgreSQL and #2181291: Prevent a query from aborting the entire transaction in pgsql).
If I had to guess I think that because KernelTestBase doesn't run through the install it doesn't emulate the same behavior, and then it tries to get cachetags table, which fails, aborts the transaction because there's no implicit commit, and the test fails. DatabaseBackendTagTest calls db_select() on cachetags which will be wrapped in the savepoint handling by Select class so that's why that does not fail.
I think you're correct and we could also add a transaction around the actual cache table select to be safe, but the less amount of these transactions the better in my opinion. Could it be possible that a contrib module that wants to help manage cache could be in the same situation, switch to database back end, and try to get cache table immediately? I don't see a specific test for swapping out the cache backends and immediately trying to get cache, but I don't know if we already have coverage for that somewhere else.
I'd rather stick with the approach of adding transaction/savepoint for now until the cache backend schema situation gets a permanent solution (probably not until after 8.1.x development cycle).
Comment #13
bzrudi71 CreditAttribution: bzrudi71 commentedI vote for this band-aid fix for now. A todo is added and I have no idea how long it will take to fix the root cause. Guess that we will not have it fixed for 8.0.0 :-(
Nitpick: gauranteed -> guaranteed
Comment #14
BerdirNot sure, but anyway, we need to profile this, for example with the internal page cache.
As mentioned, this is one of the 2-3 queries that is executed there, even small changes can have quite an impact there, and it is also a query that is called a lot.
Comment #15
mradcliffeI'll try and get setup with Apache Benchmark and YSlow for phantom.js on Saturday and Sunday. Next step would be to add the transaction handling around the query('select' in the cache DatabaseBackend class as well.
Comment #16
mradcliffeI talked with @Crell yesterday about this, and another solution may be to create tagged services for PostgreSQL, MySQL, and SQLite for these back ends so that each backend can extend parent for performance / niche bugs.
Comment #17
BerdirYes, possibly, but that will require manual configuration to actually use those replacement services AFAIK, it won't happen automatically.
It's easier if there's a factory, which is the case for the cache DatabaseBackend (We're adding a mysql specific implementation in #2336627: Deadlock on cache_config (DatabaseBackend::setMultiple()))
Comment #18
andypost@mradcliffe please elaborate the idea with tagged services.
I don't get how
cachetags
table creation relates to cache backend... does it mean a sort ofPgsqlCacheBackend
that could be useful somehowComment #19
alexpottThese fails are caused in Postgres because
setMultiple()
is called before anything else withintestSetMultiple
. If$backend = $this->getCacheBackend();
didn't delete all data from the cache bins under test then we'd get an error on creating the cache table too.Comment #20
alexpottPatch attached fixes the fails in POstgres and I think will be performance neutral.
Different approach to the earlier patches so no interdiff.
Comment #21
BerdirNot sure I understand the array_values(), I get that we don't need the keys but it doesn't hurt if we pass them in?
If that fixes PostgreSQL, then that's fine with me.
Comment #22
alexpott@Berdir it is a performance optimisation - save unnecessary looping...
If we pass in the keys we have to loop for every call to ->values(). Seems pointless here.
Comment #23
bzrudi71 CreditAttribution: bzrudi71 commentedNice, we have pass now! @Berdirs concerns seem addressed and I found nothing obvious wrong while reviewing the new approach. Let's move forward here. RTBC
Comment #24
amateescu CreditAttribution: amateescu commentedI ran
Drupal\system\Tests\Cache\DatabaseBackendUnitTest
on SQLite with this patch applied and we're good :)Comment #25
bzrudi71 CreditAttribution: bzrudi71 commentedThanks @amateescu for the additional SQLite test :-)
Comment #26
alexpottThanks @amateescu - note that this also fixes #2443647: PostgreSQL: Fix system\Tests\Cache\ChainedFastBackendUnitTest.
Comment #27
alexpottAfter discussing with @catch improved the comments to explain why this function is as it is. Just in case anyone gets the urge to optimise before we have PostgreSQL testbots.
Comment #28
xjm+1 for the comments and thanks for the SQLite run.
Seems like a catch patch.
Comment #29
catchCommitted/pushed to 8.0.x, thanks!