Problem/Motivation
In #2664322: key_value table is only used by a core service but it depends on system install there were test failures with PostgreSQL in ConnectionFailureTest that seemed to be caused by the Connection class having an incorrect statementClass. This was created as a follow-up as it should not block that issue.
Proposed resolution
The issue seems to have come from an inconsistency between the mysql driver and the pgsql driver in the tableExists method. The pg_tables view may be too slow. Find an alternative that is faster. See comment #17. Testing is covered under the ConnectionFailureTest class per @daffie.
A follow-up issue was created to address similar issues with findTables in #2921855: Default Database Schema::findTables() is slow for PostgreSQL.
Remaining tasks
-
User interface changes
None
API changes
None
Data model changes
None
| Comment | File | Size | Author |
|---|---|---|---|
| #55 | 2847855-nr-bot.txt | 146 bytes | needs-review-queue-bot |
| #48 | interdiff-2847855-43-48.txt | 3.29 KB | daffie |
| #48 | 2847855-48.patch | 9.01 KB | daffie |
Comments
Comment #2
benjifisherSee comments #74 through #99 (and perhaps more recent) for the initial discussion of this issue. Credit should go to @mradcliffe and @almaudoh.
I am attaching a patch based on 2664322-94.patch, taking only the part that applies to this issue. I will be surprised if any tests fail on this patch.
We still need to create a test that is fixed by this patch.
Comment #3
benjifisherComment #4
mradcliffeI'm wondering if we could make the $connection variable an explicit reference of a private variable that holds the PDO connection object?
Theoretical patch attached. Probably breaks horribly.
Comment #5
mradcliffeFlip back to needs work.
Comment #6
benjifisher@mradcliffe:
I am glad to see you working on this issue~ I think you have a better idea than I what is going on here.
Can you explain what you meant in #2664322-78: key_value table is only used by a core service but it depends on system install when you said,
The patch being tested makes the
key_valuetable lazy-loaded, so that seems like a good place to look, but I do not know where you see a query to that table. I think you are talking about this test (long comment snipped):Should I dig into the first line of the test to see what you mean?
Comment #7
mradcliffe@benjifisher:
Although I can't recall the specific thing I did to notice that, what I had been doing was to add some debug code to check what class/interface the connection class's statement was using. I think I needed to add the debug code because I was running the one test directly on the command line through run-tests script. Perhaps I put that code in Connection::destroy right before that condition, but I'm not sure exactly why I said that about key_value table query. I
I'm actually surprised the patch I posted in #4 was as successful as it was in the Postgresql run.
Let's move my logic into the pgsql driver instead and then check NULL in Database::openConnection for whether a new connection should be opened as well as isset condition.
Comment #8
mradcliffeMaybe also setting
$this->connection = &$this->actualConnection;would be safer as well even if there are no explicit test failures.Comment #10
benjifisherI have been doing some old-fashioned debugging, trying to come up with a minimal version of the patch from #2664322: key_value table is only used by a core service but it depends on system install that still produces the test failure. The attached patch is what I have come up with. Locally, it passes with MySQL and fails with PostgresQL. I hope the testbot confirms this!
One odd thing is that, when I run Xdebug, I never get to the line of code that throws an uncaught exception. Is that why you suspect a race condition, @mradcliffe?
The problem seems to come from
Drupal\Core\KeyValueStore\DatabaseStorage::getMultiple().Comment #12
daffie commentedMaybe the problem is not the PostgreSQL database, but the different implementations of the method tableExists. This patch make the implementation of PostgreSQL version the same as that of the MySQL version. Lets see if that makes any difference.
Comment #14
cilefen commentedThe priority may be higher: https://www.drupal.org/core/issue-priority
Comment #15
benjifisher@cilefen: I am not sure why you think the priority may be higher than "normal". Are you thinking of the criterion "Cause test failures in environments not supported by the automated testing platform" for "major" priority? The test failure is supported by the testing platform, although the test for PostreSQL is not run on all patches.
@daffie: I think we are getting close! If I did not need to take a break for sleep, I would have dug into
tableExists()myself.For the sake of readers who have not studied the test failures: the "fail" patch in #10 introduces a bunch of new failures on both MySQL and PostgreSQL, and it succeeds in triggering the failure in
ConnectionFailureTest()only with PostgreSQL. I consider this "fail" patch a success. ;) The "fail" patch in #12 only triggers the new failures, giving the same behavior for MySQL and PostgreSQL.There are a bunch of things going on here, and maybe they should be handled in separate issues. (I have not checked yet: some of those issues may already exist.)
PDOStatementinstead ofStatement. This causes an exception inDrupal\Core\KeyValueStore\DatabaseStorage::getMultiple().Drupal\Core\KeyValueStore\DatabaseStorage::getMultiple()is very lax.tableExists()differently.Comment #16
cilefen commentedSeriously I'm just on a phone and saw "race condition".
Comment #17
mradcliffeOh, wow, nice find @daffie. I didn't know there was a difference.
Updated issue summary.
This needs to be wrapped in a transaction (probably using the connection::addSavepoint() method iirc) so that a failure does not stop any subsequent writes.
I think a somewhat faster query than pg_tables could be to join pg_catalog.pg_namespace and pg_catalog.pg_class. But this may not be as fast as a transaction-wrapped select on the table itself.
SELECT 1 FROM pg_catalog.pg_class c INNER JOIN pg_catalog.pg_namespace n ON (n.oid = c.relnamespace) WHERE n.nspname='public' AND c.relkind = 'r' AND relname = :tablenameComment #18
daffie commentedThere are 4 different versions of the method Schema::tableExists(). One version for all databases and three other that all override the base version. This patch adds the code for each version in a try-catch statement. I have added two patch: one with the fix only and one with the fix and the patch from #2664322-111: key_value table is only used by a core service but it depends on system install. Both patches should pass the testbot.
@mradcliffe: Making the code faster is a great idea, but lets do that in another issue.
Comment #19
MixologicI just spent the entire day yesterday digging into the performance issues on postgres on the testbots, because we dont really want postgres to be that much slower in testing, and therefore cost us more resources.
So, what I discovered is that huge majority of the time spend in queries in postgres are to the information_schema tables. In #postgresql yesterday I was made aware that the queries to the information_schema table are *very* slow, :
Which tells me that we should avoid hitting information_schema at all. pg_class and pg_type are the two system catalogs that end up getting hammered with disk writes.
During a mysql test, there is 7.8 GB of total disk writes, and during a pgsql test, there is about 180 GB of disk writing, and almost all of it is to the two files that comrprise pg_class and pg_type.
Comment #20
daffie commentedChanged the PostgreSQL query for Schema::tableExists() so that it will be faster. Using the query from @mradcliffe in comment #17.
Removing the "Needs tests" tag, because the test
ConnectionFailureTestdoes the testing we need for this problem.Comment #21
daffie commentedForgot to add the transaction savepoints.
Comment #24
daffie commentedOnly adding transaction savepoints to the PostgreSQL method Schema::tableExists().
Comment #25
benjifisher@daffie, it looks as though you already suggested simplifying the implementations of
tableExists()in #2797141: Remove the methods tableExists() and fieldExists() from Drupal\Core\Database\Driver\mysql\Schema. I am adding that as a related issue.Comment #29
mradcliffeI updated the issue summary based on comments and added the tests for a spin on 8.7.x. I'm sorry for taking so long.
I looked at the latest patch in #24 and that makes sense. I think we're covering all the bases here by wrapping in our savepoint transaction.
Comment #30
mradcliffeNeeds a re-roll.
Comment #31
nigelcunningham commentedI'm intending to pick this up tomorrow as it and https://www.drupal.org/project/drupal/issues/2664322 are causing me issues with a custom profile install. Assigning to self for that reason.
Comment #32
nigelcunningham commentedUpdating to 8.7-dev at the time of writing (88ad0137b5d0cd9267160e29).
Comment #33
nigelcunningham commentedComment #36
daffie commentedRerolled the patch and added testing.
Comment #38
benjifisherThe failures are expected, so I am setting the status back to NR.
Order matters. When uploading multiple patches, one of which is expected to fail, add a patch that is expected to pass last.
Comment #39
andypostJust quickly skimmed the patch and it looks sqlite could be affected by related
Better to use ::class notation, it is shorter and better navigable in IDEs
Comment #40
daffie commentedDid not know that, thought the ordering was random. @benjifisher: Thanks, I learned something new.
Addressed the remark made by @andypost in comment #39.
Comment #42
mradcliffeShouldn't class doc block have a full sentence before @group?
Shouldn't there be a full sentence before @covers or any other annotation?
I was a little confused by the "table_does_not_extst" as my mind wants to look at it as a typo "table_does_not_exist". It doesn't really matter, but it might be confusing to someone else.
Comment #43
daffie commentedThank you for your review @mradcliffe.
1. Fixed.
2. I am not sure if there needs to be a full sentence before @covers. I followed the example in https://www.drupal.org/node/2116043#s-examples--3. The example is about @coversDefaultClass and applied the same idea to @covers. If you think that adding a sentence is an improvement then I will add one.
3. Fixed. Had to read your comment twice, before I saw my own typo.
Comment #45
mradcliffeWorks for me.
Comment #46
benjifisherOnce an issue is RTBC, the latest patch is scheduled for daily re-testing. There may be exceptions for patches that were not scheduled for test when uploaded, or have the magic string "-do-not-test"; I am not sure. The string "-tests-only-should-fail" is not magic, and I see "re-tests daily" next to the test that we expect to fail.
See also the discussion in #38 and #40.
I am re-uploading the "right" patch from #43.
Comment #47
alexpottIs there a more specific exception to catch here - ie should we only catching \PDOException exceptions? Catching \Exception is very generic.
Comment #48
daffie commentedI see no problem in changing the use of
\Exceptionto\PDOException.Comment #49
xjmSince we should fix this bug in D8 too, I'm filing it against 8.8.x (which is the current bugfix support branch). Patches can be tested against other branches as needed when they're uploaded.
The issue will be automatically updated to 8.9.x after the last 8.8.x bugfix release. Thanks!
Comment #55
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It either no longer applies to Drupal core, or fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
Apart from a re-roll or rebase, this issue may need more work to address feedback in the issue or MR comments. To progress an issue, incorporate this feedback as part of the process of updating the issue. This helps other contributors to know what is outstanding.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.