Problem/Motivation

A random test failure will occur for Drupal\aggregator\Tests\Views\AggregatorItemViewsFieldAccessTest when it is run as the first test in the test suite because the database installation tasks are not run, and config's DatabaseStorage gets a stale connection.

Steps to reproduce:

  1. Execute ALTER DATABASE {db_name} SET standard_conforming_strings = 'on';
  2. Execute ALTER DATABASE {db_name} SET standard_conforming_strings = 'off';
  3. Run Drupal\aggregator\Tests\Views\AggregatorItemViewsFieldAccessTest it will fail.
  4. Run it again it will pass.

Proposed resolution

Modify KernelTestBase and make sure that database installation tasks are run in that test class.

Remaining tasks

  • Write patch (done by alexpott)
  • Testing infrastructure information (provided by Mixologic/mradcliffe)
  • Review patch (done by dawehner)
  • Manually tested (done by plach)

User interface changes

None.

API changes

None.

Data model changes

None.

Beta phase evaluation

Reference: https://www.drupal.org/core/beta-changes
Issue category Bug report because this is a random test failure due to not being caught by run-tests concurrency.
Issue priority Critical because HEAD is broken for postgresql

Comments

alexpott created an issue. See original summary.

alexpott’s picture

Status: Active » Needs review
StatusFileSize
new2 KB

I've fixed the old KernelTestBase and tested WebTestBase and KernelTestBase(tng) and they both pass if they are the first test.

Reverting #1031122: postgres changeField() is unable to convert to bytea column type correctly fixes this - I don't think we should revert that because it was critical. I'm not sure why DrupalCI tests did not fail in that issue - that needs investigation too.

alexpott’s picture

So in the last green run on #1031122: postgres changeField() is unable to convert to bytea column type correctly - https://dispatcher.drupalci.org/job/default/12472/consoleFull - this test does not appear in the output. I suspect a DrupalCI bug fix.

mradcliffe’s picture

I think that test is run, and it passed on DrupalCI. The order of test results is not guaranteed because of concurrency > 1.

09:30:47 Drupal\aggregator\Tests\FeedValidationTest                    12 passes                                      
09:30:48 Drupal\system\Tests\Action\ActionUnitTest                     15 passes                                      
09:30:48 Drupal\aggregator\Tests\Views\AggregatorFeedViewsFieldAccess  15 passes                                      
09:30:48 Drupal\aggregator\Tests\Views\AggregatorItemViewsFieldAccess  20 passes                                      
09:30:52 Drupal\aggregator\Tests\Views\IntegrationTest                 38 passes                                      
Mixologic’s picture

I ran this test all by itself and it fails:

https://dispatcher.drupalci.org/job/DCI_XML/100/

This isn't an issue with drupalci - they do fail if its the first test. The issue is that there is a totally random order to run-tests.sh every time.

This test just happens to pass when some other test has already paved the database way for it.

xjm’s picture

This isn't an issue with drupalci - they do fail if its the first test. The issue is that there is a totally random order to run-tests.sh every time.

But then why did the test never run in https://dispatcher.drupalci.org/job/default/12472/consoleFull (per @alexpott)? That would point at a problem in test discovery.

Edit: Or is the idea that the test did actually run and that comment was erroneous? It does actually seem to be listed on that page.

alexpott’s picture

Alexpott-- I can't find stuff on a webpage. If a webtestbase was first it would then pass cause the db would be fine.

dawehner’s picture

+++ b/core/modules/simpletest/src/KernelTestBase.php
@@ -193,6 +193,20 @@ protected function setUp() {
     }
     $this->kernel->boot();
 
+    // Ensure database tasks have been run.
+    require_once __DIR__ . '/../../../includes/install.inc';
+    $connection = Database::getConnection();
+    $errors = db_installer_object($connection->driver())->runTasks();
+    if (!empty($errors)) {
+      $this->fail('Failed to run installer database tasks: ' . implode(', ', $errors));
+    }
+
+    // After running the database tasks we have to reboot the container so that
+    // a new database connection is created.
+    $this->kernel->shutdown();
+    $this->kernel->boot();

This feels potentially fragile. Can't we call $connection->close(); $connection = Database::getConnection() or something like that?

Mixologic’s picture

@alexpott: to be fair, Drupal\aggregator\Tests\Views\AggregatorItemViewsFieldAccessTest is too long to show up in run-tests.sh default columnar output, so it gets cut off at Drupal\aggregator\Tests\Views\AggregatorItemViewsFieldAccess. Truncated output achievement unlocked.

dawehner’s picture

Title: AggregatorItemViewsFieldAccessTest fails on postgres because it is the very first test » First test fails on postgres because of a stale connection
alexpott’s picture

StatusFileSize
new4.99 KB

Here is an alternate fix. Basically we have a problem with config's database storage being constructed by booting the kernel. Then Postgres's installer tasks come along and use Database::closeConnection() which completely messes up the connection in config's database storage (and the container). There are other methods on the Database class that don't play nice with the container either.

alexpott’s picture

StatusFileSize
new718 bytes
new2.18 KB

Talked to @dawehner on IRC we both agreed that the scope of #11 seems quite big and whilst it might be desirable long term it's not necessary to fix this critical. Uploading #2 with improved docs.

alexpott’s picture

StatusFileSize
new560 bytes
new1.87 KB

Removed unnecessary change.

plach’s picture

Issue summary: View changes
Issue tags: +D8 Accelerate

I manually tested this (updated the STR) and the patch fixes the issue. Given how tests are run in our infrastructure, it's not possible to provide a reliably failing test for this, so manual testing should be enough.

dawehner’s picture

+++ b/core/modules/simpletest/src/KernelTestBase.php
@@ -193,6 +193,22 @@ protected function setUp() {
+    // After running the database tasks we have to reboot the container so that

It still IMHO doesn't explain exactly why we need a new database connection (because the tasks potentially close them?)

alexpott’s picture

StatusFileSize
new1.41 KB
new1.89 KB

Better comment.

plach’s picture

Status: Needs review » Reviewed & tested by the community

Previous patch is green, this is just a doc change, so no need to wait for the bot :)

mradcliffe’s picture

Issue summary: View changes

Updated issue summary with beta evaluation template.

catch’s picture

Status: Reviewed & tested by the community » Fixed

Committed/pushed to 8.0.x, thanks!

  • catch committed d7509fa on 8.0.x
    Issue #2565241 by alexpott: First test fails on postgres because of a...
bzrudi71’s picture

Sad to report that this seems not completely fixed. Over in #2426579: Change method name for pgsql driver's ensureIdentifiersLength() to accurately describe what the method is doing the first test was failing during one of the PG bot runs :(
Please see: https://www.drupal.org/pift-ci-job/31911

alexpott’s picture

@bzrudi71 we at least it is a different type of failure. Either @catch or @Mixologic told me there might be a DtupalCI timing issue with postgres db not being quite ready by the time the job starts. Maybe that caused that.

Mixologic’s picture

We had hacked in a 62 second wait (used to be 5) hoping that would be enough. Will add this to our 'to look at list'

bzrudi71’s picture

Ah yes, thanks! And I just discovered #2564659: pgsql does not always turn on where @isntall already committed a first patch, let's see :)

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.