Problem

Upstream issue: https://bugs.xdebug.org/view.php?id=2222

Drupal does auto-commit transactions on shutdown, as discussed at #1608374: Database autocommit disabling at shutdown. At the moment, quite some code relies on that behavior.

For that to work, Drupal\Core\Database\Transaction::__destruct must run before Drupal\mysql\Driver\Database\mysql\Connection::__destruct. If the order is reversed, this results in nasty fatal errors like the following:

Fatal error: Uncaught Error: Call to a member function inTransaction() on null in /var/www/web/core/modules/mysql/src/Driver/Database/mysql/Connection.php on line 483

Error: Call to a member function inTransaction() on null in /var/www/web/core/modules/mysql/src/Driver/Database/mysql/Connection.php on line 483

Call Stack:
15.7059 86985504 1. Drupal\Core\Database\Transaction->__destruct() /var/www/web/core/lib/Drupal/Core/Database/Transaction.php:0
15.7059 86985504 2. Drupal\Core\Database\Connection->popTransaction($name = 'drupal_transaction') /var/www/web/core/lib/Drupal/Core/Database/Transaction.php:71
15.7059 86985504 3. Drupal\mysql\Driver\Database\mysql\Connection->popCommittableTransactions() /var/www/web/core/lib/Drupal/Core/Database/Connection.php:1447
15.7059 86985880 4. Drupal\mysql\Driver\Database\mysql\Connection->doCommit() /var/www/web/core/modules/mysql/src/Driver/Database/mysql/Connection.php:415

As a result the transaction is not committed. In our project, this suddenly (=zero change) popped up and reliable killed every "drush cr" or simple drush commands like "drush user:unblock".

According to https://www.php.net/manual/en/language.oop5.decon.php the object destruction order during shutdown is random:

PHP possesses a destructor concept similar to that of other object-oriented languages, such as C++. The destructor method will be called as soon as there are no other references to a particular object, or in any order during the shutdown sequence.

Steps to reproduce

See steps to reproduce in #146

We ran into this at drunomics on a rather large project, on CI builds, suddenly without changes to code or the docker-controlled environment. The error only happened during CLI invocations, but was reproducable also without drush.

It turns out that most of the time the object destruction order is correct and no problems arise. But somehow for CLI invocations the order can become reliable different, thus causing the problem.

Proposed resolution

When a transaction is created register a shutdown function to ensure all transactions are committed before PHP starts to destroy objects.

Workarounds

- ensure that develop is not in your xdebug.mode value
- pin xdebug in CI image to xdebug 3.2.2

API changes

Add \Drupal\Core\Database\Database::commitAllOnAllConnections()
Add \Drupal\Core\Database\Connection::commitAll()
Add \Drupal\Core\Database\Transaction\TransactionManagerBase::commitAll()

All functions are additions and marked as @internal

Data model changes

None

Release notes snippet

N/a - sites do not need to make any changes.

Issue fork drupal-3405976

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

fago created an issue. See original summary.

fago’s picture

Not sure what went wrong with this MR - here a patch file.

fago’s picture

Issue summary: View changes
fago’s picture

Status: Needs review » Needs work
petar_basic’s picture

Version: 10.2.x-dev » 10.1.x-dev
mglaman’s picture

We're hitting this same order and stack trace! I can only reproduce this on our CI environment. And only when executing Drush commands.

Fatal error: Uncaught Error: Call to a member function commit() on null in /app/docroot/core/lib/Drupal/Core/Database/Connection.php on line 1508

Error: Call to a member function commit() on null in /app/docroot/core/lib/Drupal/Core/Database/Connection.php on line 1508
Call Stack:
    0.5549   49134872   1. Drupal\Core\Database\Transaction->__destruct() /app/docroot/core/lib/Drupal/Core/Database/Transaction.php:0
    0.5549   49134872   2. Drupal\Core\Database\Connection->popTransaction($name = 'drupal_transaction') /app/docroot/core/lib/Drupal/Core/Database/Transaction.php:71
    0.5549   49134872   3. Drupal\Core\Database\Connection->popCommittableTransactions() /app/docroot/core/lib/Drupal/Core/Database/Connection.php:1447
    0.5549   49135248   4. Drupal\Core\Database\Connection->doCommit() /app/docroot/core/lib/Drupal/Core/Database/Connection.php:1494
mglaman’s picture

Note: we're hitting this on MySQL and SQLite drivers.

fago’s picture

Status: Needs work » Needs review

The previous patch was not correctly committing the transactions. Fixed it, now changes actually persist as the should.

> We're hitting this same order and stack trace! I can only reproduce this on our CI environment. And only when executing Drush commands.

ha, same for us. But it was reproduable with php-cli without drush also + on multiple CI servers.

I was already thinking the problem is probably in the other drivers as well. Let's make sure the mysql works fine, then we can port the fix to all drivers + to 10.2.x. we'll do some more testing with it

fago’s picture

Status: Needs review » Needs work
mglaman’s picture

For 11.x we'll need to refactor for this change record: https://www.drupal.org/node/3381002

And then roll a 10.1.x specific fix, it seems

Edit: I also wish I could know _what_ changed to surface this. I rolled back from an 8.1.26 container to 8.1.25 and it still occurred. I'm really confused.

mondrake’s picture

IMHO, another reason we should do #3398767: [PP-1] Allow explicit COMMIT in Transaction objects and eventually get rid of commits on objects destruction.

fago’s picture

> Edit: I also wish I could know _what_ changed to surface this. I rolled back from an 8.1.26 container to 8.1.25 and it still occurred. I'm really confused.

I can totally second that. We rolled back all changes, made sure images did not change, tried multiple servers, everything. No idea what suddenly caused this still, I'm quite puzzled about that. We've seen the issue to pop up on 01.12 morning, since it's reliably failing on CI.

mglaman’s picture

I tried on 10.2, same error just different flavor of text:

php vendor/bin/drush ucrt foobarbaz '--mail=foobarbaz@example.com' '--password=foobarbaz' '--format=json' -vvv

Fatal error: Uncaught AssertionError: Transaction $stack was not empty. Active stack: 656e425e435885.60641187\drupal_transaction in /app/docroot/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php on line 99
AssertionError: Transaction $stack was not empty. Active stack: 656e425e435885.60641187\drupal_transaction in /app/docroot/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php on line 99

Call Stack:
    0.6608   50491696   1. Drupal\Core\Database\Transaction\TransactionManagerBase->__destruct() /app/docroot/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php:0
    0.6608   50491824   2. assert($assertion = FALSE, $description = 'Transaction $stack was not empty. Active stack: 656e425e435885.60641187\\drupal_transaction') /app/docroot/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php:99
mondrake’s picture

What #15 says for D10.2 is that the TransactionManager object is being destructed while the root Transaction object in the stack is still active.

I guess you know that.

In the hope that this is not because of serialization/deserialization effects, the only reason I can imagine is unexpected destruction order.

While I strongly believe that #3398767: [PP-1] Allow explicit COMMIT in Transaction objects and later issues to remove commits during object destruction should be the master way forward, I think that a temporary fix (for D10.2) could be to commit the client transaction (is this the wanted behavior?) during TransactionManager destruction if the stack is not empty, instead of the current assert.

In fact when I wrote the TransactionManager code I left the assert there to see if there were side effects. This is one.

EDIT - Suggestion above still remains a last resort, TBH, if this cannot be solved in calling code. Is drush-only code being called in the traces reported? If so, would it be possible to check if a $transaction is being opened anywhere, and if so, can it be explicitly committed with unset($transaction) before letting the object go out of scope?

mglaman’s picture

@fago said they could reproduce without Drush

The error only happened during CLI invocations, but was reproducable also without drush.

In my case, the Drush command only has this code interacting with Drupal core https://github.com/drush-ops/drush/blob/12.x/src/Commands/core/UserComma...

        if (!$this->getConfig()->simulate()) {
            if ($account = User::create($new_user)) {
                $account->save();
mglaman’s picture

For now, in our CI, I'm manually applying this patch. It's MR #5677 but on the parent Connection class. And we're getting builds to proceed.

diff --git a/core/lib/Drupal/Core/Database/Connection.php b/core/lib/Drupal/Core/Database/Connection.php
index 7e405e1b1b..868bc8aaf5 100644
--- a/core/lib/Drupal/Core/Database/Connection.php
+++ b/core/lib/Drupal/Core/Database/Connection.php
@@ -270,6 +270,17 @@ abstract public static function open(array &$connection_options = []);
    * Ensures that the client connection can be garbage collected.
    */
   public function __destruct() {
+
+    // Ensure all still-open transactions get auto-committed. Usually, this
+    // happens when the Transaction::__destruct() method is invoked, but during
+    // shutdown the object transaction order is unreliable. If the connection
+    // is destroyed first, we need to make sure to auto-commit all still-open
+    // transactions.
+    // Also see https://www.drupal.org/project/drupal/issues/1608374.
+    foreach (array_reverse($this->transactionLayers) as $name => $active) {
+      $this->popTransaction($name);
+    }
+
     // Ensure that the circular reference caused by Connection::__construct()
     // using $this in the call to set the statement class can be garbage
     // collected.

mondrake’s picture

Can you just try this (in D10.2)?

diff --git a/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php b/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php
index 0205f7dd65..e7a0721361 100644
--- a/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php
+++ b/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php
@@ -96,7 +96,9 @@ public function __construct(
    * When destructing, $stack must have been already emptied.
    */
   public function __destruct() {
-    assert($this->stack === [], "Transaction \$stack was not empty. Active stack: " . $this->dumpStackItemsAsString());
+    if ($this->stack !== []) {
+      $this->processRootCommit();
+    }
   }
 
   /**

even though, now that I look more closely, there's a risk that this occurs with post-transaction callbacks still to be called. But if we are in this predicament, then it's very possible that those callbacks have been destructed already. Fun.

mglaman’s picture

I'm signing off for the workday, tomorrow AM CDT I'll bump to 10.2 and try the patch.

mondrake’s picture

#17 so drush it's not explicitly opening a transaction, which is likely being opened when saving the user entity by Drupal core. Looks like it's not drush, but definitely seems like restrained on CLI? Is it reproducible at all on webserver PHP?

mondrake’s picture

#17 alternatively it would be interesting to unset $account and gc_collect_cycles() when it's no longer necessary within that method and before returning from it

fago’s picture

> Is it reproducible at all on webserver PHP?
Not for me. Only via CLI, with and without drush.

alexpott changed the visibility of the branch 3405976-transaction-autocommit-during to hidden.

Driskell’s picture

Can you try disabled XDEBUG?
I have this error but only on a test environment. Disabling XDEBUG resolves it so it seems XDEBUG is doing something in my case.

In general within PHP, __destruct should call immediately as a variable leaves scope (it has always worked like this for a long time AFAIK - and is related to the "when no more references"). AFAIK only a circular reference is deferred beyond the scope until a collection happens that can work it out.

It looks like perhaps something is causing an interference here and causing a transaction object to not get immediately collected.

mondrake’s picture

Looking at #17 again, can anyone try this either? This implicitly COMMITS the transaction that saves the user entity just created.

diff --git a/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorage.php b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorage.php
index c65c623fef..9675a12370 100644
--- a/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorage.php
+++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorage.php
@@ -751,6 +751,7 @@ public function delete(array $entities) {
     try {
       $transaction = $this->database->startTransaction();
       parent::delete($entities);
+      unset($transaction);
 
       // Ignore replica server temporarily.
       \Drupal::service('database.replica_kill_switch')->trigger();
@@ -804,6 +805,7 @@ public function save(EntityInterface $entity) {
     try {
       $transaction = $this->database->startTransaction();
       $return = parent::save($entity);
+      unset($transaction);
 
       // Ignore replica server temporarily.
       \Drupal::service('database.replica_kill_switch')->trigger();
@@ -855,6 +857,8 @@ public function restore(EntityInterface $entity) {
       // Insert the entity data in the dedicated tables.
       $this->saveToDedicatedTables($entity, FALSE, []);
 
+      unset($transaction);
+
       // Ignore replica server temporarily.
       \Drupal::service('database.replica_kill_switch')->trigger();
     }
mglaman’s picture

Can you try disabled XDEBUG?

Oh! We have it turned on in CI for code coverage. And Xdebug had a new release on 2023-11-30 https://xdebug.org/download/historical

I'm in trainings but I'll see if I can have someone test and verify this.

Driskell’s picture

@mglaman

I managed to dive into my specific example and did indeed find that a specific exception that was handled somewhere - if I changed how it handled it it seems to work again.

I did produce a reproduction for Xdebug on this: https://bugs.xdebug.org/view.php?id=2222 (ignore the summary - I tend to write fast without thinking and then update after reading - and it's a rather assumptive title! the reproduction is clear though)
So I think this is likely an issue in 3.3.0 of Xdebug.

catch’s picture

If this is an xdebug regression, we could still potentially commit one of the workarounds here with a @todo to remove, or just add this to the 'known issues' for 10.2.0. I'm not sure which is best and it probably depends how quickly it's fixed upstream too.

mglaman’s picture

For our CI image we pinned Xdebug and all is fixed:

- RUN pecl install xdebug \
-   && docker-php-ext-enable xdebug
+ RUN pecl install xdebug-3.2.2

And then when running PHPUnit we've always had XDEBUG_MODE=coverage php vendor/bin/phpunit. So we stopped enabling the extension entirely.

RE #29 I say we mark it as a known issue, it affects all versions of Drupal and will be fixed in a patch release of Xdebug. Or by downgrading to the last minor version.

catch’s picture

Issue tags: +10.2.0 release notes

Tagging for release notes, we can add the snippet directly into the draft.

derickr’s picture

Although this is perhaps caused by a change in Xdebug, the real culprit is lined out in the initial report:

According to https://www.php.net/manual/en/language.oop5.decon.php the object destruction order during shutdown is random:

PHP possesses a destructor concept similar to that of other object-oriented languages, such as C++. The destructor method will be called as soon as there are no other references to a particular object, or in any order during the shutdown sequence.

alexpott’s picture

@derickr thanks for commenting here. I agree that we should probably fix things here. But I also think that xdebug should not be changing the order objects are destructed. This is a behaviour change that can make debugging things really hard.

alexpott’s picture

What's interesting is that I cannot reproduce the bug. I'm on

PHP 8.2.12 (cli) (built: Oct 24 2023 19:22:16) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.2.12, Copyright (c) Zend Technologies
    with Xdebug v3.3.0, Copyright (c) 2002-2023, by Derick Rethans
    with Zend OPcache v8.2.12, Copyright (c), by Zend Technologies

I've got 10.1.x checked out, installed drush and have installed the standard install profile.

I do vendor/bin/drush ucrt test and everything works just fine. Same when I repeat the steps on 10.2.x. I wonder why that is.

Driskell’s picture

@derickr Reading the statement I believe the way Drupal is currently using transaction objects is fine. It assigns one and then allows it to go out of scope to commit automatically.

The statement "The destructor method will be called as soon as there are no other references to a particular object" covers this. The key phrase here is "as soon as" which leaves nothing to assumption. This occurs the moment the method ends, or when the stack unwinds during an exception.

Having said that, perhaps readability is improved by not having this paradigm in Drupal as this maybe relies on more advanced knowledge of object destruction. Though I think one could argue the approach in PHP currently is implied and the most reasonable assumption, and only circular references are the advanced concept.

@alexpott All my tests were with PHP 8.1 so perhaps the behaviour changes in 8.2? In my specific case it was also during menu link building in cache clear so could be related to the menu and content in my case too.

alexpott’s picture

@Driskell tried PHP 8.1 too...

$ vendor/bin/drush cr
 [success] Cache rebuild complete.

$ vendor/bin/drush ucrt test
 [success] Created a new user with uid 2
 --------- ----------- ----------- --------------- -------------
  User ID   User name   User mail   User roles      User status
 --------- ----------- ----------- --------------- -------------
  2         test                    authenticated   1
 --------- ----------- ----------- --------------- -------------

$ php -v
PHP 8.1.25 (cli) (built: Oct 24 2023 21:51:16) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.1.25, Copyright (c) Zend Technologies
    with Xdebug v3.3.0, Copyright (c) 2002-2023, by Derick Rethans
    with Zend OPcache v8.1.25, Copyright (c), by Zend Technologies
mglaman’s picture

I could not reproduce locally – host, VM, Docker, only on our CI instance. I don't know how or why. But rolling back to 3.2.2 did fix it. However, I am on macOS and don't have a Linux device available. But I don't think @fago could reproduce on a Linux machine, either.

@Driskell, do you have information about the environment you could reproduce it with? Since our CI is hosted via a cloud provider, I'm pretty disconnected from knowing more of those details beyond the fact that it's a Linux machine running and executing Docker.

derickr’s picture

It probably depends on which values `xdebug.mode` has. It should include `develop` for Xdebug to cache exceptions, their stack traces, and a reference to local variables (which is why they're kept in scope longer).

mglaman’s picture

Ah, that makes sense. I didn't check what mode docker-php-ext-enable xdebug set. And would explain why PHPUnit didn't crash in XDEBUG_MODE=coverage.

alexpott’s picture

Status: Needs work » Needs review
Issue tags: -10.2.0 release notes

I've created an MR for 11.x/10.2.x that fixes the issue for those branches. We have a bit more to do because of the new TransactionManager.

I've tested this locally on xdebug 3.3.0 with develop in the xdebug.mode setting and without it drush ucrt is broken and with the fix it works as expected.

I've removed the 10.2.0 release notes tag because this affects all versions of Drupal including Drupal 7.

mondrake’s picture

Status: Needs review » Needs work

Well at least for TransactionManager I do not think this is right. We’re in a very unpredictable situation here, it could be the TransactionManager object is already gone when destroying the Connection one.

Also, differently from popTransaction, removeStackItem is actually doing nothing on the db.

To me, here we should just commit the client transaction on the db (note: if there is still a client connection open, which may not be the case if we anticipate that the destruction order is unpredictable), and forget about the stack itself, which is gone anyway after destruction.

So for me #19 is still the best fix if destroying TransactionManager before the Transaction objects. If we are destroying the Connection before anything else, then a straight commit on the client connection.

This seems rather untestable, too?

olegiv’s picture

I have temporarily disabled xdebug for CLI mode – it fixed the issue in my case.
It looks like the issue started after the recent xdebug update on 2023-11-30.

alexpott’s picture

@mondrake if the transaction manager object is gone then we're fine. The stack will have been emptied all ready and the transactions committed. In my opinion, that doesn't matter at all. The problem here is that we have to ensure that the stack is empty before closing the connection. If the manager doesn't exist or the stack is already empty then there is not problem.

I think regardless of xdebug's behaviour we need to fix this in some way because we should not be relying on the order of object desctruction.

alexpott’s picture

Status: Needs work » Needs review

Changed the code to call unpile each member of the stack.

mondrake’s picture

Assigned: Unassigned » mondrake
Status: Needs review » Needs work
Related issues: +#1025314: Transactions should be allowed to be committed explicitly

#44

if the transaction manager object is gone then we're fine. The stack will have been emptied all ready and the transactions committed.

Only if all the individual Transaction objects would have been destructed BEFORE destruction of TransactionManager. Otherwise, the stack would be still active. #15 tells us that.

The problem here is that we have to ensure that the stack is empty before closing the connection.

Why? The state of the stack is irrelevant here IMHO. The TransactionManager is already destroyed or will be destroyed as part of the shutdown anyway. What matters is that if a client transaction is still open on the db at this stage, we should commit it (hence enforcing auto-commit mode) rather than letting it rollback as the standard behaviour would:

When the script ends or when a connection is about to be closed, if you have an outstanding transaction, PDO will automatically roll it back. https://www.php.net/manual/en/pdo.transactions.php

and we need to commit it from the Connection object, if we cannot rely on the TransactionManager to exist.

Working on an alternative proposal, will post a patch.

More in general, I'm more and more convinced we should abandon implicit commits, do #3398767: [PP-1] Allow explicit COMMIT in Transaction objects / #1025314: Transactions should be allowed to be committed explicitly, make core code explicitly commit transactions, and deprecate committing as part of object destruction. We would not be here if we will.

eastdrive’s picture

I'm seeing this when trying to delete nodes in the admin UI, singly or batched.

Drupal 10.1.6
PHP 8.1.6
MySQL 8.0.32

Running locally with docker compose.

PHP Fatal error:  Uncaught Error: Call to a member function inTransaction() on null in /var/www/web/core/modules/mysql/src/Driver/Database/mysql/Connection.php:483
#0 /var/www/web/core/modules/mysql/src/Driver/Database/mysql/Connection.php(415): Drupal\\mysql\\Driver\\Database\\mysql\\Connection->doCommit()
#1 /var/www/web/core/lib/Drupal/Core/Database/Connection.php(1447): Drupal\\mysql\\Driver\\Database\\mysql\\Connection->popCommittableTransactions()
#2 /var/www/web/core/lib/Drupal/Core/Database/Transaction.php(71): Drupal\\Core\\Database\\Connection->popTransaction()
#3 [internal function]: Drupal\\Core\\Database\\Transaction->__destruct()
#4 {main} thrown in /var/www/web/core/modules/mysql/src/Driver/Database/mysql/Connection.php on line 483, referer: https://test-domain.something:888/batch?id=50&op=start
Driskell’s picture

> and we need to commit it from the Connection object, if we cannot rely on the TransactionManager to exist.

I don't think you can assume the transaction is complete. Maybe you're in this destruction because of an error and it is incomplete. This could potentially commit a partial transaction and remove all usefulness of transactions.

> I think regardless of xdebug's behaviour we need to fix this in some way because we should not be relying on the order of object desctruction.

During normal execution there is no issue with destruction order from what I can see. When it goes out of scope and last reference is gone, it is destructed and freed, as per the PHP documentation. Ordering is only involved when you get to shutdown and things are left behind and that in itself, before we even worry about ordering, is a problem - such as in this case where Xdebug is preventing destruction of objects - or if somebody stored a transaction object somewhere.

Perhaps here it just needs a better exception - throw from connection if an open transaction and roll it back - in transaction just ignore missing connection as you'll know it will only be missing if you're in shutdown, and a transaction should never be destructing there. It'll make the error message clearer - you won't get an inTransaction on null - you'll get a "leaked transaction" message which is what Xdebug is doing here. Of course, doesn't workaround the Xdebug issue.

Issue with any change to destruction code is that I don't think you could ever workaround the Xdebug issue safely. If you think about it - if transaction objects are getting held open, every transaction is going to become one. Then you get to shutdown and collection, and you have absolutely no idea if the transaction was successful. And you still have the issue of one massive transaction when it should have been individual.

> More in general, I'm more and more convinced we should abandon implicit commits

I think this makes most sense.

derickr’s picture

Assigned: mondrake » Unassigned

FWIW, I reject that this is an Xdebug **issue**. It might well cause this problem, but Xdebug isn't doing anything wrong here. What happens here is that Xdebug now keeps a reference to the last 8 exceptions, including their stack trace (With local variables). Any of these will therefore have their destructor called when Xdebug releases these exceptions. (A workaround therefore could be to just throw and catch exceptions (I joke)).

> During normal execution there is no issue with destruction order from what I can see.

From an earlier comment is a bit strange, as "destruction order" isn't defined as being reliable (from https://www.php.net/destruct#language.oop5.decon.destructor):

> The destructor method will be called as soon as there are no other references to a particular object, or in any order during the shutdown sequence.

That also means, that your workaround (ie fix) could be to keep a reference yourself? Or as you say, it's better to not rely on ```__destruct``` at all.

From the Xdebug side, I don't know how to fix this; except for disabling the improvements to handling transactions while debugging and through the `xdebug_get_function_stack()` improvements, and I am not really willing to do that. Alternatively, I could add a function to clear out these cached exceptions, but that would also mean that you'd have to conditionally call this in places.

FWIW, I don't actually understand *why* you have these exceptions with references to these connection/transaction objects though. Unless you're using exceptions for code flow, which you surely did not expect?

cheers,
Derick

izus’s picture

In our case we :

1) Disabled xdebug
2) Restarted php-fmp

and it Worked :)

Driskell’s picture

@derickr

> FWIW, I don't actually understand *why* you have these exceptions with references to these connection/transaction objects though. Unless you're using exceptions for code flow, which you surely did not expect?

Please see reproduction case in: https://bugs.xdebug.org/view.php?id=2222

There is nothing in code adding the variable "$destructing" to an exception. Just merely the presence of the exception is causing this. I think you'd be absolutely perfectly fine to store a reference to the exception and anything inside it. Just here it seems to be somehow grabbing everything in scope in each method in the call chain, which I am sure must be unintentional and I hope, fixable.

derickr’s picture

Although there is no code that adds that variable to an exception, that doesn't mean that it doesn't get added.

Each exception has an exception trace (so that you can call `getTrace()` on the exception), and for each stack element it keeps a reference to each of the arguments, and each of the local variables.

The only alternative would be to 1. not keep a "reference" to local variables (which means I can't show them), or 2. convert them to a string representation (which is also not great).

A script to demonstrate what's going on (copied from https://bugs.xdebug.org/view.php?id=2222):

<?php
class destructingclass
{
    public function __destruct()
    {
        echo 'DESTRUCTION', PHP_EOL;
    }
}

function methodcall()
{
    $destructing = new destructingclass();
    try {
        throw new Exception("TEST");
    } catch (Exception $e) {
    }
}

methodcall();

echo "Throwing 10 exceptions: \n";
for ($i = 0; $i < 10; $i++) {
    try {
        echo "Throwning exception $i\n";
        throw new Exception("Dummy");
    } catch (Exception $e) {
    }
}
echo "Done throwing 10 exceptions\n";
echo 'METHOD RETURNED', PHP_EOL;

Outputs:

Throwing 10 exceptions: 
Throwning exception 0
Throwning exception 1
Throwning exception 2
Throwning exception 3
Throwning exception 4
Throwning exception 5
Throwning exception 6
Throwning exception 7
DESTRUCTION
Throwning exception 8
Throwning exception 9
Done throwing 5 exceptions
METHOD RETURNED
Driskell’s picture

@derickr Thanks - I asked my question on the bug tracker as it's probably not for here.

mondrake’s picture

Assigned: Unassigned » mondrake

#49 I am working on a patch and will post soon, so I'll keep this assigned to me.

#48

I don't think you can assume the transaction is complete. Maybe you're in this destruction because of an error and it is incomplete. This could potentially commit a partial transaction and remove all usefulness of transactions.

I agree 200%, but here we need to keep supporting current behavior. Reviewing #3398767: [PP-1] Allow explicit COMMIT in Transaction objects and getting it in would be a first step to remove the implicit behavior on destruction, which is causing this trouble.

And you still have the issue of one massive transaction when it should have been individual.

Yes and no. Yes multiple 'drupal' transaction objects. No in fact there's only one REAL db transaction active, the nesting is a code logic thing, not a corresponding db thing. If you COMMIT that REAL db transaction, you're done vs current behavior. BUT - there's no way to safely execute 'drupal' post-transaction callbacks at this stage, they might have been already destructed.

mondrake’s picture

Assigned: mondrake » Unassigned
Status: Needs work » Needs review
FileSize
8.66 KB

This patch is what I would suggest doing. Covers independently destruction of both Connection and TransactionManager. As noted, not much can be done to try and perform post-transaction callbacks.

In general, my suggestion would be to won't fix this, and start working toward changing from implicit to explicit commits. I think this could happen by D11 if we do not lose time.

Driskell’s picture

I agree with @mandrake that I think best approach is to move to explicit commit.

I previously before this edit put I don’t think the patch is feasible but I think I just realised this auto commit is happening anyway due to the Transaction destruct so this just continues that. (Sorry - a bit slow to this!)

I guess I just still concerned. Such as this collapsing all transactions into a single transaction that is auto committed at the end and I’m sure this might have implications - you could be sending a response and it not be guaranteed when it is received that the transaction committed. More edge case tho now so maybe the patch still reasonable. But perhaps it should be logging some errror at least?

Driskell’s picture

Probably worth an IS update regarding the Xdebug impact and that in normal scenario transaction destruction is safe as it happens way before shutdown?

Wim Leers’s picture

Note that this also affects Drupal 9. See #3406976: GitLab CI pipeline broke due to wodby/php:8.1-dev 4.41.0 installing XDebug 3.3 for example.

EDIT: I see now that has already been said 👍

mondrake’s picture

andreic’s picture

The patch in #55 fixed the same error I had on a 10.1.7 installation.

alfattal’s picture

The patch in #55 failed to apply with the following error:

Unhandled promise rejection with ArgumentCountError: Too few arguments to function React\Promise\Promise::React\Promise\{closure}(), 0 passed in phar:///usr/local/bin/composer/src/Composer/Installer/InstallationManager.php on line 660 and exactly 1 expected in phar:///usr/local/bin/composer/vendor/react/promise/src/Promise.php:277
Stack trace:
#0 phar:///usr/local/bin/composer/src/Composer/Installer/InstallationManager.php(660): React\Promise\Promise::React\Promise\{closure}()
#1 phar:///usr/local/bin/composer/vendor/react/promise/src/Internal/FulfilledPromise.php(47): Composer\Installer\InstallationManager::Composer\Installer\{closure}()
#2 phar:///usr/local/bin/composer/src/Composer/Installer/InstallationManager.php(661): React\Promise\Internal\FulfilledPromise->then()
#3 phar:///usr/local/bin/composer/vendor/react/promise/src/Promise.php(288): Composer\Installer\InstallationManager::Composer\Installer\{closure}()
#4 phar:///usr/local/bin/composer/vendor/react/promise/src/Promise.php(37): React\Promise\Promise->call()
#5 phar:///usr/local/bin/composer/src/Composer/Installer/InstallationManager.php(663): React\Promise\Promise->__construct()
#6 phar:///usr/local/bin/composer/src/Composer/Installer/InstallationManager.php(221): Composer\Installer\InstallationManager->runCleanup()
#7 phar:///usr/local/bin/composer/src/Composer/Installer.php(791): Composer\Installer\InstallationManager->execute()
#8 phar:///usr/local/bin/composer/src/Composer/Installer.php(292): Composer\Installer->doInstall()
#9 phar:///usr/local/bin/composer/src/Composer/Command/InstallCommand.php(147): Composer\Installer->run()
#10 phar:///usr/local/bin/composer/vendor/symfony/console/Command/Command.php(298): Composer\Command\InstallCommand->execute()
#11 phar:///usr/local/bin/composer/vendor/symfony/console/Application.php(1040): Symfony\Component\Console\Command\Command->run()
#12 phar:///usr/local/bin/composer/vendor/symfony/console/Application.php(301): Symfony\Component\Console\Application->doRunCommand()
#13 phar:///usr/local/bin/composer/src/Composer/Console/Application.php(382): Symfony\Component\Console\Application->doRun()
#14 phar:///usr/local/bin/composer/vendor/symfony/console/Application.php(171): Composer\Console\Application->doRun()
#15 phar:///usr/local/bin/composer/src/Composer/Console/Application.php(145): Symfony\Component\Console\Application->run()
#16 phar:///usr/local/bin/composer/bin/composer(88): Composer\Console\Application->run()
#17 /usr/local/bin/composer(29): require('...')
#18 {main}

It also failed in the test for (10.1.x: PHP 8.1 & MySQL 8).

alfattal’s picture

Status: Needs review » Needs work

alexpott changed the visibility of the branch 3405976-transactions to hidden.

ashetkar’s picture

With #3 Patch CI pipeline working fine for Drupal core 10.1.5 but patch #55 failing for drupal core 10.1.5

alexpott’s picture

Status: Needs work » Needs review

I turned @mondrake's patch into an MR against 11.x.

Has anyone got instructions on how to reproduce because I can't reproduce this against the standard install. There is no exception triggered for communication purposes when doing a user create or cache clear.

alexpott’s picture

Hiding the files so the MR is the thing to review.

mondrake’s picture

Added some comments inline.

The more I see this, the sicker I feel... dumping state across objects for use during destruction...

astoker88’s picture

For anyone struggling short term here, whilst the patch in 55 isn't working (Drupal 10.1.7, PHP 8.2
MySQL 8.0) for a lot of folks and work is ongoing - i can confirm the issue is solved for me by deleting the xdebug.ini.

alexpott’s picture

@astoker88 the simplest way to fix this is to ensure that develop is not set in xdebug.mode. You can still debug :)

akoe’s picture

We ran into the same issue with our CI containers that were failing only when run in pipeline. And we weren't able to reproduce this locally with the same containers in any way.
Our stack: MySQL 5.7; PHP 8.1 and Drupal 9 and 10

In the pipeline we could narrow it down to a Drush cache clear for the router cache: drush cache:clear router
and also running the router rebuild method directly triggered the error: drush ev "\Drupal::service('router.builder')->rebuild();"

Thankfully the workaround disabling xdebug via xdebug.mode=off is working for our usecase. And we will be testing patches for D10 also.

alfattal’s picture

@akoe May I ask what patches you're testing for Drupal 10? Are they local patches or from other issues? Thank you!

xjm’s picture

Version: 10.1.x-dev » 11.x-dev

The MR should be made against 11.x FWIW (which means filing the issue against 11.x too nowadays). We'll need to decide based on the eventual solution how far it's safe to backport it. This issue sounds bad enough that could even be backported as far as 10.1 (at release manager discretion) if we are able to come up with a safe fix.

alfattal’s picture

@alexpott Not sure why you changed the status to Needs Review, you opened a MR based on a patch that does NOT apply and thus, it needs works!

alexpott’s picture

@alfattal the MR does apply to 11.x which is where this has to be fixed first. The fix for 10.1.x is a little different because there is no transaction manager. If you want a fix for 10.1.x - maybe the patch in #3 will work for you but the final fix for 10.1.x is likely to look a little different. In my opinion the best fix you can do right now is ensure that develop is not in your xdebug.mode value as then you don't need a patch at all.

YesCT’s picture

Like @mglaman in comment #30 https://www.drupal.org/project/drupal/issues/3405976#comment-15346751

we were able to fix the sudden failing of our build step in behat, unit, and functional phpunit tests by updating the image CI was using in our webhead-ci/Dockerfile with this change:

-  && pecl install xdebug \
+  && pecl install xdebug-3.2.2 \

Thank you folks for sharing what symptoms you were seeing and your workarounds!!

---

We are running D10. We did try a patch and it helped get our unit tests to pass, but we were still getting different errors with behat and functtional phpunit tests with the patch. Pinning xdebug got all our tests passing.

YesCT’s picture

Issue summary: View changes

I added workarounds and the upstream issue to this summary.

Julien Tekrane’s picture

#75 resolved my problem. Thanks alexpott

OIOIOOOI’s picture

Been using Drupal 10.1.6 and stumbled upon an issue that eventually led me to this thread.

Clean installation under PHP8.2 and xdebug 3.3.0, the entities created (nodes/terms) stubbornly were not deleted when doing so in UI.

After enabling the database logging it turned out that after the whole set of DELETE queries, a ROLLBACK was issued.
Step debugging this brought me to the __destruct() method being talked here.

Disabling xdebug extension fixes the issue, however believe the patch in #75 should also do the trick for those who use xdebug.

mferanda’s picture

Disabling xdebug resolved the issue on 9.5.11 and 10.1.7

pgrandeg’s picture

I faced te same problem while deleting content with Drupal 10.1.7 and latest wodby/drupal-php. Disabling xdebug configuration from docker-compose.yml fixed the issue

daffie’s picture

Status: Needs review » Needs work

I see a lot of changes in the MR from @mondrake, which adds code that relies on deprecated code. When we do that we cannot remove the deprecated code. So, can we please not do that. :)

If I understand the problem correctly is that it only a problem when the database connection desctucted, before the Transaction object is destructed. The logical solution for me would be to have Connection->__destruct() check through the transaction manager if there is still an active transaction(s) and if so then to commit that transaction(s). If it is more complicated or if my solution does not work, can you explain why I am wrong.

Driskell’s picture

In a way ideally Xdebug would make the exception tracking it’s doing opt-in or at least have an opt-out (the tracking keeps the transaction objects in scope if exception is thrown and caught so there’s a history of exceptions stored during debugging)

That would help for all use cases and if it’s runtime configurable allow Drupal as a framework to opt out until it removes reliance on the transaction object destructor to commit transactions.

mondrake’s picture

@daffie the issue reported here is present in all Drupal versions, both in new and in deprecated code path, so a fix here would need to cover both paths - yes, that means also fixing deprecated code. At least, if we want to port it back to 10.2 and potentially below. Given this is critical, I assume it's reasonable to expect a backport.

To me, as I said earlier, the future though is to get rid of commit on destruction, and go for explicit commit. At least to me, the path to get there is set: #3398767: [PP-1] Allow explicit COMMIT in Transaction objects, then #3406985: [PP-1] Convert all transactions to explicit COMMIT, then a to-be issue to deprecate commit on destruct.

We are currently blocked by #3384999: Introduce a Connection::executeDdlStatement method.

derickr’s picture

@ #83 : I don't know how to make this opt-in without also not adding a setting. FWIW, relying on a specific destruction order has always a problem with Drupal. It was never something that was guaranteed to happen.

alexpott’s picture

I've just hit this problem while upgrading Thunder to use core 10.2.1 (https://github.com/thunder/thunder-distribution/pull/762)

I'm quite concerned because:

  • The fails only happen with 10.2.x. Thunder is passing on 10.1.x with exactly the same environment. This means the fail is due to the 10.2.x transaction management changes.
  • Xdebug is not installed. This means it could be happening on live sites.
  • The error only occurs during the destructor and does not get into Drupal's logs so it is quite hard to track down.

The good news is that 10.2.x MR fixes the problem.

Carlos Romero’s picture

Hello everyone,
It is happening to me on a drupal 9.5.11 with php 8.0.30 and xdebug 3.3.1, when I deactivate xdebug the error disappears, I have activated it again and with patch #3 it no longer gives an error.

Thanks.

Cyberschorsch changed the visibility of the branch 3405976-transaction-autocommit-during to active.

Cyberschorsch changed the visibility of the branch 3405976-transaction-autocommit-during to hidden.

fvorillion’s picture

I have the same issue on Drupal 10.2.2, with Docker, PHP 8.2.14 and XDebug 3.3.1 enabled with default configuration.

If I force to disable XDebug, the problem was solved :

[xdebug]
xdebug.mode=off

It works also with the good xdebug config for my env :

[xdebug]
xdebug.mode=debug
xdebug.start_with_request=yes
xdebug.discover_client_host=true
xdebug.remote_handler=dbgp
xdebug.max_nesting_level=1500
xdebug.client_host=host.docker.internal

And without any of the two solutions above applied, the MR on 10.2.x branch solved also the problem : https://git.drupalcode.org/project/drupal/-/merge_requests/5774
- composer require cweagans/composer-patches
- add patch on extra :

    "extra": {
        "patches": {
            "drupal/core": {
                "Transaction autocommit during shutdown relies on unreliable object destruction order": "https://git.drupalcode.org/project/drupal/-/merge_requests/5774.patch"
            }
        },
longwave’s picture

@alexpott re #86 I don't think this is specific to 10.2, other users are reporting issues on 9.5 in this issue, and someone at my organisation has run into this today in ddev with xdebug on a 9.5 site - interestingly I work on the same site and have not run into it yet.

In the without-xdebug case I wonder if there is something more complex in Thunder and also the transaction manager changes in 10.2 mean that an object reference is held longer/differently and so the destruct order is accidentally changed, but it's very difficult to work out.

solideogloria’s picture

@fvorillion Just make sure you don't actually use the MR's patch URL in the composer file. MR patch files will change anytime a person commits a change, and the change might break or compromise your site. You should download the patch to a folder in your project and use ./patches/5774.patch or something similar.

solideogloria’s picture

After downloading and applying the current MR patch, I still get errors for any transaction while xdebug is enabled.

The errors also contain the serialized DB connection with a bunch of sensitive info like DB, username, password, so I have redacted that information (even though this is from a local DDEV instance).

$ ddev drush updb
 [success] No pending updates.
PHP Fatal error:  Uncaught TypeError: Drupal\Core\Database\StatementWrapperIterator::__construct(): Argument #2 ($clientConnection) must be of type object, null given, called in /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php on line 577 and defined in /var/www/html/web/core/lib/Drupal/Core/Database/StatementWrapperIterator.php:54
Stack trace:
#0 /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php(577): Drupal\Core\Database\StatementWrapperIterator->__construct()
#1 /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php(843): Drupal\Core\Database\Connection->prepareStatement()
#2 /var/www/html/web/core/lib/Drupal/Core/Config/DatabaseStorage.php(111): Drupal\Core\Database\Connection->query()
#3 /var/www/html/web/core/lib/Drupal/Core/Config/CachedStorage.php(95): Drupal\Core\Config\DatabaseStorage->readMultiple()
#4 /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php(165): Drupal\Core\Config\CachedStorage->readMultiple()
#5 /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php(104): Drupal\Core\Config\ConfigFactory->doLoadMultiple()
#6 /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php(89): Drupal\Core\Config\ConfigFactory->doGet()
#7 /var/www/html/web/core/lib/Drupal.php(411): Drupal\Core\Config\ConfigFactory->get()
#8 /var/www/html/web/core/includes/errors.inc(323): Drupal::config()
#9 /var/www/html/web/core/includes/errors.inc(123): _drupal_get_error_level()
#10 /var/www/html/web/core/includes/bootstrap.inc(202): error_displayable()
#11 /var/www/html/web/core/includes/bootstrap.inc(186): _drupal_exception_handler_additional()
#12 [internal function]: _drupal_exception_handler()
#13 {main}
  thrown in /var/www/html/web/core/lib/Drupal/Core/Database/StatementWrapperIterator.php on line 54

Fatal error: Uncaught TypeError: Drupal\Core\Database\StatementWrapperIterator::__construct(): Argument #2 ($clientConnection) must be of type object, null given, called in /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php on line 577 and defined in /var/www/html/web/core/lib/Drupal/Core/Database/StatementWrapperIterator.php on line 54

TypeError: Drupal\Core\Database\StatementWrapperIterator::__construct(): Argument #2 ($clientConnection) must be of type object, null given, called in /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php on line 577 in /var/www/html/web/core/lib/Drupal/Core/Database/StatementWrapperIterator.php on line 54

Call Stack:
    6.0352   75949336   1. _drupal_exception_handler($exception = class TypeError { protected $message = 'Drupal\\Core\\Database\\Connection::getClientConnection(): Return value must be of type object, null returned'; private string ${Error}string = ''; protected $code = 0; protected string $file = '/var/www/html/web/core/lib/Drupal/Core/Database/Connection.php'; protected int $line = 326; private array ${Error}trace = [0 => [...], 1 => [...], 2 => [...]]; private ?Throwable ${Error}previous = NULL }) /var/www/html/web/core/includes/bootstrap.inc:0
    6.0361   75910456   2. _drupal_exception_handler_additional($exception = class TypeError { protected $message = 'Drupal\\Core\\Database\\Connection::getClientConnection(): Return value must be of type object, null returned'; private string ${Error}string = ''; protected $code = 0; protected string $file = '/var/www/html/web/core/lib/Drupal/Core/Database/Connection.php'; protected int $line = 326; private array ${Error}trace = [0 => [...], 1 => [...], 2 => [...]]; private ?Throwable ${Error}previous = NULL }, $exception2 = class TypeError { protected $message = 'Drupal\\Core\\Database\\StatementWrapperIterator::__construct(): Argument #2 ($clientConnection) must be of type object, null given, called in /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php on line 577'; private string ${Error}string = ''; protected $code = 0; protected string $file = '/var/www/html/web/core/lib/Drupal/Core/Database/StatementWrapperIterator.php'; protected int $line = 54; private array ${Error}trace = [0 => [...], 1 => [...], 2 => [...], 3 => [...], 4 => [...], 5 => [...], 6 => [...], 7 => [...], 8 => [...]]; private ?Throwable ${Error}previous = NULL }) /var/www/html/web/core/includes/bootstrap.inc:186
    6.0361   75910456   3. error_displayable($error = ???) /var/www/html/web/core/includes/bootstrap.inc:202
    6.0361   75910456   4. _drupal_get_error_level() /var/www/html/web/core/includes/errors.inc:123
    6.0361   75910456   5. Drupal::config($name = 'system.logging') /var/www/html/web/core/includes/errors.inc:323
    6.0361   75910456   6. Drupal\Core\Config\ConfigFactory->get($name = 'system.logging') /var/www/html/web/core/lib/Drupal.php:411
    6.0361   75910456   7. Drupal\Core\Config\ConfigFactory->doGet($name = 'system.logging', $immutable = ???) /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php:89
    6.0361   75910672   8. Drupal\Core\Config\ConfigFactory->doLoadMultiple($names = [0 => 'system.logging'], $immutable = TRUE) /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php:104
    6.0361   75910784   9. Drupal\Core\Config\CachedStorage->readMultiple($names = [0 => 'system.logging']) /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php:165
    6.0361   75911624  10. Drupal\Core\Config\DatabaseStorage->readMultiple($names = [0 => 'system.logging']) /var/www/html/web/core/lib/Drupal/Core/Config/CachedStorage.php:95
    6.0361   75912128  11. Drupal\Core\Database\Connection->query($query = 'SELECT [name], [data] FROM {config} WHERE [collection] = :collection AND [name] IN ( :names[] )', $args = [':collection' => '', ':names[]' => [0 => 'system.logging']], $options = []) /var/www/html/web/core/lib/Drupal/Core/Config/DatabaseStorage.php:111
    6.0361   75913112  12. Drupal\Core\Database\Connection->prepareStatement($query = 'SELECT [name], [data] FROM {config} WHERE [collection] = :collection AND [name] IN ( :names__0 )', $options = ['fetch' => 5, 'allow_delimiter_in_query' => FALSE, 'allow_square_brackets' => FALSE, 'pdo' => []], $allow_row_count = ???) /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php:843
    6.0361   75913432  13. Drupal\Core\Database\StatementWrapperIterator->__construct($connection = class Drupal\mysql\Driver\Database\mysql\Connection { REDACTED }, $clientConnection = NULL, $query = 'SELECT "name", "data" FROM "config" WHERE "collection" = :collection AND "name" IN ( :names__0 )', $options = [], $rowCountEnabled = FALSE) /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php:577
larowlan’s picture

alexpott’s picture

@longwave yes I'm aware that this is in not 10.2 only when xdebug is involved but I've got the first instance of this issue being a problem where xdebug is not involved and the situation where this occurs is 10.2 only. I think this is a much much bigger problem as running xdebug in production is not recommended. Thunder doesn't do anything special on the DB layer and yes, investigating and reproducing this issue in the situation is not easy. I wish we had an MR for 10.2.x (and earlier Drupal versions) that had no API change so we can easy fix the problem. And then implement the better solutions being proposed. I think we need a pragmatic solution.

Mile23’s picture

From my MR comment:

It looks like the Connection object can have a reference to the transaction manager, which means it's not being destroyed, right? So if ($this->$transactionManager) { ... then use it, otherwise call connection->commit(). We'd look at the property rather than calling the service method ($this->transactionManager()) because otherwise we run into to the problem mentioned... The service method could end up generating a new transaction manager. Ideal solution: Actually inject the service instead of lazily generating one, so we always have a reference to it and it's not being destroyed before the Connection object.

I'm not sure about the internals of the DB layer, so maybe it's naive to say. :-) But just keeping a reference to things you expect to exist in your class seems like a non-breaking change that could be applied to past versions.

rfay’s picture

Title: Transaction autocommit during shutdown relies on unreliable object destruction order » Transaction autocommit during shutdown relies on unreliable object destruction order (xdebug 3.3+ enabled)

jrglasgow made their first commit to this issue’s fork.

jrglasgow’s picture

I was unable to get either merge request (10.1.x and 11.x) to apply to my 10.2.x sites, I hand applied the patch and created a new branch to be merged into 10.2.x. It is working for me on my local.

Mile23’s picture

Destruct doesn’t happen in order during script shutdown so even if you had a reference to something it might have been destroyed already and become NULL.

Like this:

abstract class Connection {
  public function __construct(object $connection, array $connection_options) {
    // etc. etc.....
    $this->transactionManager = $this->driverTransactionManager();
  }
}

Now you always have a reference to the transaction manager for the life of the object, and the transaction manager will not be garbage-collected before the Connection is destroyed.

If PHP has decided that a referenced object can be destroyed through GC before there are 0 references to it, then we have a problem we can not solve here.

Driskell’s picture

> If PHP has decided that a referenced object can be destroyed through GC before there are 0 references to it, then we have a problem we can not solve here.

With XDebug develop mode since 3.2+ if a function throws then XDebug will capture the stack trace along with all in scope variables in all frames. This means the transaction object never reaches 0 at end of function as XDebug holds a reference. Then end of script occurs and AFAIK XDebug doesn’t free - so end of script cleanup kicks in and picks at random to free (as far as PHP is concerned the only way the objects are still alive at this point is circular reference). So it’s less that they free before hitting 0 and more than they will never reach 0 so PHP just picks them off at random.

I kind of agree that this is somewhat unfixable and even if it was fixed it would be just too complex. The only way forward is probably to remove implicit transaction objects entirely or throw warning of no compatibility for develop mode in XDebug.

alexpott’s picture

Status: Needs work » Needs review

I've opened an MR that uses service destruction and the kernel to work around this bug. It's a more minimal approach to resolving the issue in 10.2 and 10.1 that results in me being to run the \Drupal\Tests\mysql\Kernel\mysql\TransactionTest successfully locally with xdebug 3.3.0 in develop mode. If I run against HEAD it is broken horribly.

Yes it would be great to fix in a more explicit way but Drupal's transaction committing has been implicit for a very long time so I think moving away from the properly is going to take time and a lot of work.

@mondrake what do think about doing something like https://git.drupalcode.org/project/drupal/-/merge_requests/6614 as stop gap and then trying to fix this probably in another issue.

alexpott’s picture

TIL the kernel is not terminated in the interactive installer during installation. Wowzers.

alexpott’s picture

Okay so Thunder's tests pass with the minimal mr with all the installer kernel termination fixes... https://github.com/thunder/thunder-distribution/pull/773

mondrake’s picture

Cool. Left one comment inline in the MR.

alexpott’s picture

Thanks for the review @mondrake. I left replies to your comments on the MR.

mondrake changed the visibility of the branch 3405976-mondrake-approach-10.2.x to hidden.

mondrake changed the visibility of the branch 3405976-mondrake-approach to hidden.

mondrake changed the visibility of the branch 10.1.x to hidden.

mondrake’s picture

Status: Needs review » Reviewed & tested by the community

Tricky stuff. Needs pipeline execution for the other dbs, I can not run those.

While this also fixed some things along, IMHO the future should be along the lines of #84. Hopefully we'll get there, all these dances at destruction time seem a bit fragile.

catch’s picture

Status: Reviewed & tested by the community » Needs review

One comment (in two suggestions) on the MR.

solideogloria’s picture

I still have the error in #93 when using xdebug after applying the latest MR diff to 10.2.3.

Can someone review if this is related or if the error should be a different issue or something?

alexpott’s picture

Thanks for the review @catch - I've made those changes.

@solideogloria can you re-test with the latest MR. If it doesn't work can you give as much information as possible so we can work out why. Thanks!

longwave’s picture

I hate to ask but is there any way we could add some form of test coverage to this, to avoid us refactoring it again in the future and breaking it again?

Also this fix only applies to the default connection via the database service, I assume anyone doing anything with other connections is on their own. Would they leave a dangling transaction to be auto committed? Should we document this for them somehow?

alexpott’s picture

We have fails in Drupal\Tests\standard\FunctionalJavascript\StandardPerformanceTest to work why this code is causing extra queries there... but not on SQLite or MySQL...

solideogloria’s picture

Status: Needs review » Needs work

The latest commit broke something.

PHP Fatal error:  Uncaught Error: Call to a member function get() on null in /var/www/html/web/core/lib/Drupal/Core/DrupalKernel.php:1360
Stack trace:
#0 /var/www/html/web/core/lib/Drupal/Core/DrupalKernel.php(1244): Drupal\Core\DrupalKernel->initializeServiceProviders()
#1 /var/www/html/web/core/lib/Drupal/Core/DrupalKernel.php(937): Drupal\Core\DrupalKernel->compileContainer()
#2 /var/www/html/web/core/lib/Drupal/Core/DrupalKernel.php(494): Drupal\Core\DrupalKernel->initializeContainer()
#3 /var/www/html/web/core/includes/utility.inc(34): Drupal\Core\DrupalKernel->boot()
#4 /var/www/html/vendor/drush/drush/src/Commands/core/CacheRebuildCommands.php(66): drupal_rebuild()
#5 [internal function]: Drush\Commands\core\CacheRebuildCommands->rebuild()

The line of the error:

      // Special-case the database service so that transactions started by other
      // destructable services are closed.
      $this->container->get('database')->destruct();
catch’s picture

Also this fix only applies to the default connection via the database service, I assume anyone doing anything with other connections is on their own.

We could probably loop over all database connections and commit their transactions, but it'd need a diversion from the service approach then.

solideogloria’s picture

I don't know if it's relevant, but I do have external DB connections defined. Let me know if you need any additional info for debugging in addition to what I've given.

alexpott’s picture

FileSize
6.91 KB

@solideogloria can you supply a full stack trace \Drupal\Core\DrupalKernel::initializeServiceProviders() does not call terminate. Plus can you provide more information about what you're doing when things do not work. As far as I can see you are doing a drush cache rebuild... this works fine for me. Are you sure the patch as applied correctly? We're not changing any line near 1360... I'm attaching a patch for 10.2.x that allows me to install Drupal and rebuild the cache using Drush as the MR does not apply to 10.2.x. Please try this instead.

alexpott changed the visibility of the branch 3405976-with-3421164 to hidden.

alexpott’s picture

Status: Needs work » Needs review

The postgres fails are nothing to do with this issue - they're happening in HEAD - see #3420401-10: StandardPerformanceTest fails randomly on MySQL and consistently on Postgres...

@solideogloria / re #118 I'm pretty sure that was a patch applying issue - got exactly the same fail on Thunder (see https://github.com/thunder/thunder-distribution/actions/runs/7920011867/...). The patch in #122 fixes that.

solideogloria’s picture

The MR diff applies to 10.2.3 for me. Note that the .diff can sometimes apply differently than the .patch; maybe that's why.

Full stack trace from using 6614.diff from #118 and clearing the cache w/ Drush.

$ ddev drush cr
PHP Fatal error:  Uncaught Error: Call to a member function get() on null in /var/www/html/web/core/lib/Drupal/Core/DrupalKernel.php:1360
Stack trace:
#0 /var/www/html/web/core/lib/Drupal/Core/DrupalKernel.php(1244): Drupal\Core\DrupalKernel->initializeServiceProviders()
#1 /var/www/html/web/core/lib/Drupal/Core/DrupalKernel.php(937): Drupal\Core\DrupalKernel->compileContainer()
#2 /var/www/html/web/core/lib/Drupal/Core/DrupalKernel.php(494): Drupal\Core\DrupalKernel->initializeContainer()
#3 /var/www/html/web/core/includes/utility.inc(34): Drupal\Core\DrupalKernel->boot()
#4 /var/www/html/vendor/drush/drush/src/Commands/core/CacheRebuildCommands.php(66): drupal_rebuild()
#5 [internal function]: Drush\Commands\core\CacheRebuildCommands->rebuild()
#6 /var/www/html/vendor/consolidation/annotated-command/src/CommandProcessor.php(276): call_user_func_array()
#7 /var/www/html/vendor/consolidation/annotated-command/src/CommandProcessor.php(212): Consolidation\AnnotatedCommand\CommandProcessor->runCommandCallback()
#8 /var/www/html/vendor/consolidation/annotated-command/src/CommandProcessor.php(175): Consolidation\AnnotatedCommand\CommandProcessor->validateRunAndAlter()
#9 /var/www/html/vendor/consolidation/annotated-command/src/AnnotatedCommand.php(387): Consolidation\AnnotatedCommand\CommandProcessor->process()
#10 /var/www/html/vendor/symfony/console/Command/Command.php(326): Consolidation\AnnotatedCommand\AnnotatedCommand->execute()
#11 /var/www/html/vendor/symfony/console/Application.php(1096): Symfony\Component\Console\Command\Command->run()
#12 /var/www/html/vendor/symfony/console/Application.php(324): Symfony\Component\Console\Application->doRunCommand()
#13 /var/www/html/vendor/symfony/console/Application.php(175): Symfony\Component\Console\Application->doRun()
#14 /var/www/html/vendor/drush/drush/src/Runtime/Runtime.php(110): Symfony\Component\Console\Application->run()
#15 /var/www/html/vendor/drush/drush/src/Runtime/Runtime.php(40): Drush\Runtime\Runtime->doRun()
#16 /var/www/html/vendor/drush/drush/drush.php(139): Drush\Runtime\Runtime->run()
#17 /var/www/html/vendor/drush/drush/drush(4): require('...')
#18 /var/www/html/vendor/bin/drush(119): include('...')
#19 {main}
  thrown in /var/www/html/web/core/lib/Drupal/Core/DrupalKernel.php on line 1360

 [warning] Drush command terminated abnormally.

It does look like the .diff applied incorrectly, though it did "succeed". It put the following code in the wrong function

    // Special-case the database service so that transactions started by other
    // destructable services are closed.
    $this->container->get('database')->destruct();
solideogloria’s picture

When using the #122 patch and drush cr, I get the same error in #93

solideogloria’s picture

Here is some additional debugging. The error in #93 occurs for an INSERT INTO "watchdog" query. $this->connection is NULL, and the query is being used to log an error, but since the connection is NULL, an exception occurs instead of the logging succeeding.

Since the actual exception logging fails, here's the call stack prior to the error:

Drupal\Core\Database\Connection->prepareStatement (...\web\core\lib\Drupal\Core\Database\Connection.php: 577)
Drupal\mysql\Driver\Database\mysql\Insert->execute (...\web\core\modules\mysql\src\Driver\Database\mysql\Insert.php: 42)
Drupal\dblog\Logger\DbLog->log (...\web\core\modules\dblog\src\Logger\DbLog.php: 78)
Drupal\Core\Logger\LoggerChannel->log (...\web\core\lib\Drupal\Core\Logger\LoggerChannel.php: 127)
_drupal_log_error (...\web\core\includes\errors.inc: 175)
_drupal_exception_handler (...\web\core\includes\bootstrap.inc: 182)

And the exception it was trying to log:

%type: "AssertionError"
@message: "Transaction $stack was not empty. Active stack: 65cf8c6fcd9c88.03805306\drupal_transaction"
%function: "assert()"
%file: "/var/www/html/web/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php"
%line: 99
@backtrace_string:
"#0 /var/www/html/web/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php(99): assert()
#1 [internal function]: Drupal\Core\Database\Transaction\TransactionManagerBase->__destruct()
#2 {main}"
exception: AssertionError
severity_level: 3
channel: "php"
alexpott’s picture

FileSize
7.37 KB

@solideogloria I've got a couple of questions about #93. Can you confirm that the bug goes away if the php.ini setting xdebug.mode does not contain develop. I.e. it is something like xdebug.mode=debug

Also can you try with the patch attached as this might reveal the error that's causing the error... it makes a small change to _drupal_get_error_level() so errors in error handling don't get in the way.

solideogloria’s picture

Trying with the updated patch and with xdebug.mode=debug, I still get the error. The cache rebuild says is completed, but the error happens after that:

$ ddev drush cr
 [success] Cache rebuild complete.
<h1>Additional uncaught exception thrown while handling exception.</h1><h2>Original</h2><p><em class="placeholder">AssertionError</em>: Transaction $stack was not empty. Active stack: 65cf912f0554b3.60082165\drupal_transaction in <em class="placeholder">assert()</em> (line <em class="placeholder">99</em> of <em class="placeholder">/var/www/html/web/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php</em>). <pre class="backtrace">assert() (Line: 99)
Drupal\Core\Database\Transaction\TransactionManagerBase-&gt;__destruct()
</pre></p><h2>Additional</h2><p><em class="placeholder">TypeError</em>: Drupal\Core\Database\StatementWrapperIterator::__construct(): Argument #2 ($clientConnection) must be of type object, null given, called in /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php on line 577 in <em class="placeholder">Drupal\Core\Database\StatementWrapperIterator-&gt;__construct()</em> (line <em class="placeholder">54</em> of <em class="placeholder">/var/www/html/web/core/lib/Drupal/Core/Database/StatementWrapperIterator.php</em>). <pre class="backtrace">Drupal\Core\Database\StatementWrapperIterator-&gt;__construct() (Line: 577)
Drupal\Core\Database\Connection-&gt;prepareStatement() (Line: 42)
Drupal\mysql\Driver\Database\mysql\Insert-&gt;execute() (Line: 78)
Drupal\dblog\Logger\DbLog-&gt;log() (Line: 101)
Drupal\Core\Logger\LoggerChannel-&gt;log() (Line: 175)
_drupal_log_error() (Line: 182)
_drupal_exception_handler()
</pre></p><hr />PHP Fatal error:  Uncaught TypeError: Drupal\Core\Database\Connection::getClientConnection(): Return value must be of type object, null returned in /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php:326
Stack trace:
#0 /var/www/html/web/core/modules/mysql/src/Driver/Database/mysql/TransactionManager.php(31): Drupal\Core\Database\Connection->getClientConnection()
#1 /var/www/html/web/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php(290): Drupal\mysql\Driver\Database\mysql\TransactionManager->processRootCommit()
#2 /var/www/html/web/core/lib/Drupal/Core/Database/Transaction.php(88): Drupal\Core\Database\Transaction\TransactionManagerBase->unpile()
#3 [internal function]: Drupal\Core\Database\Transaction->__destruct()
#4 {main}
  thrown in /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php on line 326

Fatal error: Uncaught TypeError: Drupal\Core\Database\Connection::getClientConnection(): Return value must be of type object, null returned in /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php on line 326

TypeError: Drupal\Core\Database\Connection::getClientConnection(): Return value must be of type object, null returned in /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php on line 326

Call Stack:
   10.2068   71972432   1. _drupal_exception_handler() /var/www/html/web/core/includes/bootstrap.inc:0
   10.9371   71954480   2. _drupal_exception_handler_additional() /var/www/html/web/core/includes/bootstrap.inc:186
   10.9371   71954480   3. error_displayable() /var/www/html/web/core/includes/bootstrap.inc:202
   10.9371   71954480   4. _drupal_get_error_level() /var/www/html/web/core/includes/errors.inc:123
   10.9371   71954480   5. Drupal::config() /var/www/html/web/core/includes/errors.inc:323
   10.9371   71954480   6. Drupal\Core\Config\ConfigFactory->get() /var/www/html/web/core/lib/Drupal.php:411
   10.9371   71954480   7. Drupal\Core\Config\ConfigFactory->doGet() /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php:89
   10.9371   71954696   8. Drupal\Core\Config\ConfigFactory->doLoadMultiple() /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php:104
   10.9371   71954808   9. Drupal\Core\Config\CachedStorage->readMultiple() /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php:165
   10.9371   71955432  10. Drupal\Core\Cache\ChainedFastBackend->getMultiple() /var/www/html/web/core/lib/Drupal/Core/Config/CachedStorage.php:87
   10.9371   71955432  11. Drupal\Core\Cache\DatabaseBackend->getMultiple() /var/www/html/web/core/lib/Drupal/Core/Cache/ChainedFastBackend.php:167
   10.9371   71956592  12. Drupal\mysql\Driver\Database\mysql\Connection->query() /var/www/html/web/core/lib/Drupal/Core/Cache/DatabaseBackend.php:118
   10.9372   71957640  13. Drupal\mysql\Driver\Database\mysql\Connection->prepareStatement() /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php:843
   11.4784   71958024  14. Drupal\Core\Database\StatementWrapperIterator->__construct() /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php:577
alexpott’s picture

FileSize
7.78 KB

Here's another patch for 10.2.x that has the latest changes that changes thing to do a commit on all open db connections. @solideogloria it's really interesting that disabling xdebug seems to fix your problem but not changing the mode. That's the first reported instance of this. Can you confirm that you only get the errors with xdebug enbled? Also can you try the latest patch attached here as you did mention that you have multiple connections.

solideogloria’s picture

I was trying to change the mode with ini_set('xdebug.mode', 'debug'); in settings.php, but apparently that doesn't work, as ddev drush eval "echo ini_get('xdebug.mode');" still shows debug,develop.

I looked up how to change the mode with DDEV and now I don't get the error after properly setting the mode to debug.

solideogloria’s picture

Using #130, I do still get an error, only when xdebug is enabled with develop. The error is very similar to #93, but I will include it:

$ ddev drush cr
 [success] Cache rebuild complete.
PHP Fatal error:  Uncaught TypeError: Drupal\Core\Database\StatementWrapperIterator::__construct(): Argument #2 ($clientConnection) must be of type object, null given, called in /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php on line 580 and defined in /var/www/html/web/core/lib/Drupal/Core/Database/StatementWrapperIterator.php:54
Stack trace:
#0 /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php(580): Drupal\Core\Database\StatementWrapperIterator->__construct()
#1 /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php(846): Drupal\Core\Database\Connection->prepareStatement()
#2 /var/www/html/web/core/lib/Drupal/Core/Cache/DatabaseBackend.php(118): Drupal\Core\Database\Connection->query()
#3 /var/www/html/web/core/lib/Drupal/Core/Cache/ChainedFastBackend.php(167): Drupal\Core\Cache\DatabaseBackend->getMultiple()
#4 /var/www/html/web/core/lib/Drupal/Core/Config/CachedStorage.php(87): Drupal\Core\Cache\ChainedFastBackend->getMultiple()
#5 /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php(165): Drupal\Core\Config\CachedStorage->readMultiple()
#6 /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php(104): Drupal\Core\Config\ConfigFactory->doLoadMultiple()
#7 /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php(89): Drupal\Core\Config\ConfigFactory->doGet()
#8 /var/www/html/web/core/lib/Drupal.php(411): Drupal\Core\Config\ConfigFactory->get()
#9 /var/www/html/web/core/includes/errors.inc(323): Drupal::config()
#10 /var/www/html/web/core/includes/errors.inc(123): _drupal_get_error_level()
#11 /var/www/html/web/core/includes/bootstrap.inc(202): error_displayable()
#12 /var/www/html/web/core/includes/bootstrap.inc(186): _drupal_exception_handler_additional()
#13 [internal function]: _drupal_exception_handler()
#14 {main}
  thrown in /var/www/html/web/core/lib/Drupal/Core/Database/StatementWrapperIterator.php on line 54

Fatal error: Uncaught TypeError: Drupal\Core\Database\StatementWrapperIterator::__construct(): Argument #2 ($clientConnection) must be of type object, null given, called in /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php on line 580 and defined in /var/www/html/web/core/lib/Drupal/Core/Database/StatementWrapperIterator.php on line 54

TypeError: Drupal\Core\Database\StatementWrapperIterator::__construct(): Argument #2 ($clientConnection) must be of type object, null given, called in /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php on line 580 in /var/www/html/web/core/lib/Drupal/Core/Database/StatementWrapperIterator.php on line 54

Call Stack:
    7.0476   71920152   1. _drupal_exception_handler($exception = class AssertionError { protected $message = 'Transaction $stack was not empty. Active stack: 65cf9bfc172788.99424958\\drupal_transaction'; private string ${Error}string = ''; protected $code = 1; protected string $file = '/var/www/html/web/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php'; protected int $line = 99; private array ${Error}trace = [0 => [...], 1 => [...]]; private ?Throwable ${Error}previous = NULL }) /var/www/html/web/core/includes/bootstrap.inc:0
    8.3110   71904080   2. _drupal_exception_handler_additional($exception = class AssertionError { protected $message = 'Transaction $stack was not empty. Active stack: 65cf9bfc172788.99424958\\drupal_transaction'; private string ${Error}string = ''; protected $code = 1; protected string $file = '/var/www/html/web/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php'; protected int $line = 99; private array ${Error}trace = [0 => [...], 1 => [...]]; private ?Throwable ${Error}previous = NULL }, $exception2 = class TypeError { protected $message = 'Drupal\\Core\\Database\\StatementWrapperIterator::__construct(): Argument #2 ($clientConnection) must be of type object, null given, called in /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php on line 580'; private string ${Error}string = ''; protected $code = 0; protected string $file = '/var/www/html/web/core/lib/Drupal/Core/Database/StatementWrapperIterator.php'; protected int $line = 54; private array ${Error}trace = [0 => [...], 1 => [...], 2 => [...], 3 => [...], 4 => [...], 5 => [...], 6 => [...], 7 => [...], 8 => [...]]; private ?Throwable ${Error}previous = NULL }) /var/www/html/web/core/includes/bootstrap.inc:186
    8.3110   71904080   3. error_displayable($error = ???) /var/www/html/web/core/includes/bootstrap.inc:202
    8.3110   71904080   4. _drupal_get_error_level() /var/www/html/web/core/includes/errors.inc:123
    8.3110   71904080   5. Drupal::config($name = 'system.logging') /var/www/html/web/core/includes/errors.inc:323
    8.3110   71904080   6. Drupal\Core\Config\ConfigFactory->get($name = 'system.logging') /var/www/html/web/core/lib/Drupal.php:411
    8.3110   71904080   7. Drupal\Core\Config\ConfigFactory->doGet($name = 'system.logging', $immutable = ???) /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php:89
    8.3110   71904296   8. Drupal\Core\Config\ConfigFactory->doLoadMultiple($names = [0 => 'system.logging'], $immutable = TRUE) /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php:104
    8.3110   71904408   9. Drupal\Core\Config\CachedStorage->readMultiple($names = [0 => 'system.logging']) /var/www/html/web/core/lib/Drupal/Core/Config/ConfigFactory.php:165
    8.3110   71905032  10. Drupal\Core\Cache\ChainedFastBackend->getMultiple($cids = [0 => 'system.logging'], $allow_invalid = ???) /var/www/html/web/core/lib/Drupal/Core/Config/CachedStorage.php:87
    8.3110   71905032  11. Drupal\Core\Cache\DatabaseBackend->getMultiple($cids = [0 => 'system.logging'], $allow_invalid = FALSE) /var/www/html/web/core/lib/Drupal/Core/Cache/ChainedFastBackend.php:167
    8.3110   71906192  12. Drupal\Core\Database\Connection->query($query = 'SELECT [cid], [data], [created], [expire], [serialized], [tags], [checksum] FROM {cache_config} WHERE [cid] IN ( :cids[] ) ORDER BY [cid]', $args = [':cids[]' => [0 => 'system.logging']], $options = ???) /var/www/html/web/core/lib/Drupal/Core/Cache/DatabaseBackend.php:118
    8.3111   71907240  13. Drupal\Core\Database\Connection->prepareStatement($query = 'SELECT [cid], [data], [created], [expire], [serialized], [tags], [checksum] FROM {cache_config} WHERE [cid] IN ( :cids__0 ) ORDER BY [cid]', $options = ['fetch' => 5, 'allow_delimiter_in_query' => FALSE, 'allow_square_brackets' => FALSE, 'pdo' => []], $allow_row_count = ???) /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php:846
    8.9296   71907624  14. Drupal\Core\Database\StatementWrapperIterator->__construct($connection = class Drupal\mysql\Driver\Database\mysql\Connection { protected $target = 'default'; protected $key = 'default'; protected $logger = NULL; protected $transactionLayers = []; protected $driverClasses = []; protected $statementWrapperClass = 'Drupal\\Core\\Database\\StatementWrapperIterator'; protected $transactionalDDLSupport = FALSE; protected $connection = NULL; protected $connectionOptions = REDACTED; REDACTED...}, $clientConnection = NULL, $query = 'SELECT "cid", "data", "created", "expire", "serialized", "tags", "checksum" FROM "cache_config" WHERE "cid" IN ( :cids__0 ) ORDER BY "cid"', $options = [], $rowCountEnabled = FALSE) /var/www/html/web/core/lib/Drupal/Core/Database/Connection.php:580
solideogloria’s picture

Also, even after commenting out my other DB connections, I still have the error... but now the error only happens roughly every other time I clear the cache..., alternating a success and a failure.

solideogloria’s picture

Even so, the cache clear does succeed sometimes, with all my connections enabled (though I know the connections themselves are being used for queries during a cache clear). So it seems to be something unreliable that causes the error.

Does it perhaps matter which order things are destructed?

It appears that whether I have all my custom connections enabled or not, the error happens roughly 50-60% of the time either way.

solideogloria’s picture

I tried on another site using #130, and it works fine (though I can't reproduce the error on that site). I'm going to try to narrow down if there's a certain patch or something I'm using that is causing an issue.

solideogloria’s picture

I have no idea. I removed all other patches I'm using from both contrib and core, and I still get the issue, and I can reliably reproduce it on the site.

Is there a way to debug this further or look into the transaction stack or destruct order?

solideogloria’s picture

Also, is it possible for the logging of DB credentials during serialization of the connection object to be prevented?

alexpott’s picture

@solideogloria yeah if you can find out what's causing this that'd be awesome. If you could share your composer.json and core.extension configuration with me that might help. Might some contrib that's causing the problem. Or maybe (even better) there's some way you could share your build with me. Are you on slack?

solideogloria’s picture

I am not on slack. I am going to try one more thing. I uninstalled all custom code. That didn't fix it, so it's in core or contrib. I will continue uninstalling modules and clearing the cache afterwards until I see it fixed or run out of modules to uninstall. That will hopefully narrow it down.

I probably won't finish this step until next week.

alexpott’s picture

@solideogloria does the bug only happen when you clear the caches with Drush? Do you all get the error when you clear via the UI. Just wondering if it is something special about using Drush?

alexpott’s picture

I've moved the commits even later during execution - moved them to shutdown functions. This has some interesting effects on kernel tests which mean we need to commit transactions when we close connections - which makes sense to me. That in turn makes \Drupal\KernelTests\Core\Database\DriverSpecificTransactionTestBase::testTransactionManagerFailureOnPendingStackItems() pretty meaningless in its current form.

@solideogloria does this change in approach fix things for you? New 10.2.x patch attached.

solideogloria’s picture

The issue only happens when using Drush. I don't see it when clearing the cache via the UI. I am using Drush 12.

When using #141, I still get the error with the same frequency. Same trace as #132

solideogloria’s picture

Okay, after uninstalling a lot of modules a few at a time and deleting content in bulk, I discovered the source of the issue.

The error stops occurring if I delete all custom menu links by navigating to /admin/modules/uninstall/entity/menu_link_content (route system.prepare_modules_entity_uninstall).

This is content for the "Custom Menu Links" core module (menu_link_content)

solideogloria’s picture

I am going to update this comment with further debugging steps taken.

  1. Commenting out the line $this->menuLinksRebuild(); in MenuRouterRebuildSubscriber::onRouterRebuild fixes the issue.
  2. For the times when the cache rebuild succeeds, the following functions are not called, because of the lock acquired in/on MenuRouterRebuildSubscriber::menuLinksRebuild().
  3. Commenting out the line $this->menuLinkManager->rebuild(); in MenuRouterRebuildSubscriber::menuLinksRebuild fixes the issue.
  4. Commenting out the every line except $definitions = $this->getDefinitions(); in MenuLinkManager::rebuild does not fix the issue.
  5. Traced to $definitions = $this->getDiscovery()->getDefinitions(); in MenuLinkManager::getDefinitions, which is calling DerivativeDiscoveryDecorator::getDefinitions
  6. Returning an empty array in DerivativeDiscoveryDecorator::getDerivatives fixes the issue, if you only return an empty array when it's called for menu links. I used if (count($base_plugin_definitions) === X) { return []; } with X set to the inspected length of the array.
  7. I used if (count($base_plugin_definitions) === X) { return $this->getDerivativesDebug($base_plugin_definitions); }, with getDerivativesDebug being a copy of the original function. That way I can set breakpoints only for this occurrence. I couldn't find a nice way to do it with conditional breakpoints.
  8. I traced the problem to inside the if ($deriver) { block.
  9. Commenting out the foreach loop inside the if statement has no effect, so the issue is with $deriver->getDerivativeDefinitions($plugin_definition);.
  10. The $base_plugin_id the error occurs for:menu_link_content, and only this one. Change the if statement to if ($deriver && $base_plugin_id !== 'menu_link_content') { fixes the issue. This traces the problem to MenuLinkContentDeriver::getDerivativeDefinitions
  11. Traced the problem to the line $plugin_definitions[$menu_link_content->uuid()] = $menu_link_content->getPluginDefinition();. So loading the menu links is not the issue.
  12. I traced the problem to a single menu link, and I now know exactly what causes the issue!
solideogloria’s picture

@alexpott The issue is caused by having a menu link that, in the "Link" field, has a path that is a node's redirect.

To reproduce my issue:

  1. Enable xdebug with develop in the mode. ddev xdebug on
  2. Create a node and give it a path alias /test1. Save.
  3. Edit the node, and change the alias to /test2. Save. This step and the next step can be in any order.
  4. Create a new menu link, with the link going to the string "/test1", not to the node ID, and save it.
  5. drush cache:rebuild
  6. If the cache rebuild worked, try again, on the off chance that the menu wasn't rebuilt due to a wait lock. See #145 item 2. drush cache:rebuild
  7. See an error like #132. The error may include sensitive connection info from settings.php, including database, username, password, etc. Redact any errors before pasting online.

A workaround for the issue

Edit any menu links that have a link to a redirect path and use the canonical path to the node instead.

This still should be fixed for real. I deliberately wanted the menu link to go to the redirect, though I can change it if it causes an issue that won't be fixed.

alexpott’s picture

Thanks @solideogloria - excellent information - I can now reproduce the bug you're seeing and can fix it. Patch incoming!

alexpott’s picture

FileSize
9.66 KB

Here's a patch that should hopefully filx @solideogloria's problem and call terminate consistently on kernel's in core. It's also going to solve locks not being released by drush cache-rebuild - so that will be improved too... going to look for an existing issue about drupal_rebuild not releasing locks because my guess it that there must be one!

Going to drop the shutdown approach because it adds complexity and if you are doing database transaction in shutdown functions then I think that is quite a special case.

alexpott changed the visibility of the branch 3405976-minimal-approach-shutdown-functions to hidden.

alexpott’s picture

Issue summary: View changes

Improved the issue summary.

solideogloria’s picture

I confirmed that the patch #148 fixes the issue.

Would it be possible to write a test following the steps in #146? Or is that not necessary?

alexpott’s picture

@solideogloria we can't test the xdebug mode issue because we don't have xdebug running on gitlabci - but what we can do is test that locks are released after calling drupal_rebuild().

solideogloria’s picture

we can't test the xdebug mode issue because we don't have xdebug running on gitlabci

Ah, that's right. I didn't think of that.

alexpott’s picture

Unfortunately we can't test the locks being release either as that is caused by the transactions not being committed- so everything is caused by PHP's object destruction not occurring as we expect - and we have no way to cause this other than with xdebug in develop mode.

So I don't think we can add a truly meaningful test here. What we can test is that calling terminate or shutdown causes database transactions to be committed.

alexpott changed the visibility of the branch 3405976-minimal-approach-shutdown-functions to active.

alexpott’s picture

After sleeping on this I realised that we can fix this in a more reliable way - we can trigger a shutdown function if a transaction is created. This way we don't need to worry about kernel termination. Re-opened that MR and here's a 10.2.x patch for that approach.

alexpott’s picture

Issue summary: View changes

Discussed the two approaches with @catch and @longwave and we all preferred the shutdown function approach because:

  • It is more self-contained
  • fixes all the situations where this can occur without having to mandate that kernel terminate is called
  • Does not require changes to custom or contrib code.

Updated issue summary accordingly.

alexpott changed the visibility of the branch 3405976-minimal-approach to hidden.

alexpott’s picture

We have MRs for 11.x, 10.2.x and 10.1.x

I think this is as good as we can make it.

longwave’s picture

+1 to the approach and the patch looks good but I haven't had time yet to try and reproduce again with XDebug enabled.

@solideogloria @Driskell if you can confirm that this latest set of patches fix your issues here that would be helpful to getting this committed.

mondrake’s picture

+1, I like this... besides being more self-contained, it will also be simpler to adjust if in the future we will change default commit from implicit to explicit. Left a couple of questions on the MR, not sure how feasible/relevant.

alexpott’s picture

Thanks @mondrake for the review - I've responded on the MR to your questions / points.

mondrake’s picture

Great work... I'd RTBC this but think the honours better be done by someone who can test in the specific context of using xdebug.

solideogloria’s picture

Status: Needs review » Reviewed & tested by the community

I tested #156, and the updated approach fixes the issues. (I tested both the generic issue, and the issue with menu links to redirects)

mferanda’s picture

Tested the patch against Drupal 10.2.x: Successfully fixed the issue.

For my test, I re-enabled xDebug and verified that the old issue was happening. After applying the patch, the issue seemed to be resolved.

  • catch committed b09cf3ba on 10.2.x
    Issue #3405976 by alexpott, jrglasgow, fago, catch, mondrake,...

  • catch committed 61530261 on 10.3.x
    Issue #3405976 by alexpott, jrglasgow, fago, catch, mondrake,...

  • catch committed 26308296 on 11.x
    Issue #3405976 by alexpott, jrglasgow, fago, catch, mondrake,...
catch’s picture

Version: 11.x-dev » 10.2.x-dev
Status: Reviewed & tested by the community » Fixed

I wrote some code for one of the MRs here but none of it got used in the final version so I think I'm fine to commit. I think this is the best we can do, it makes things more robust and it makes the weirdness of the current implementation more explicit with the workaround, which isn't bad thing even if it's not pretty.

Committed/pushed to 11.x, cherry-picked to 10.3.x, and the separate MR to 10.2.x

I haven't committed the 10.1.x MR to 10.1.x, it's in security-only mode so we'd need to do a 'bonus' patch release or bundle the commit with a security release for it to actually be useful to people. Tempted to leave things there and people can apply it manually if they need it, so marking fixed, but if someone feels really strongly about a backport to 10.1.x, please re-open.

solideogloria’s picture

Thank you everyone for helping fix this issue.

catch’s picture

Version: 10.2.x-dev » 11.x-dev
Status: Fixed » Active

This broke pgsql tests, should've thought to kick off an MR run on the other databases. https://git.drupalcode.org/project/drupal/-/jobs/874457

Not rolling back yet since it's explicitly the pgsql transaction coverage that's failing so maybe we just need to fix an assumption? But if it's tricky we should roll back and re-commit with that fixed.

alexpott changed the visibility of the branch 3405976-10.1.x-shutdown-functions to hidden.

alexpott changed the visibility of the branch 3405976-10.2.x-shutdown-functions to hidden.

alexpott changed the visibility of the branch 3405976-minimal-approach-shutdown-functions to hidden.

alexpott’s picture

Status: Active » Needs review

Created two MRs one for 11.x and 10.3.x and one for 10.2.x that fix the postgres test. That particular test is pretty icky - I did nearly remove it at one point :(

alexpott changed the visibility of the branch 3405976-minimal-approach-shutdown-functions to active.

alexpott changed the visibility of the branch 3405976-10.1.x-shutdown-functions to active.

alexpott’s picture

Tests on the postgres mrs re green on all db drivers. Yay

  • catch committed 21ff0c87 on 10.2.x
    Issue #3405976 by alexpott, jrglasgow, fago, catch, mondrake,...

  • catch committed 95c48c33 on 10.3.x
    Issue #3405976 by alexpott, jrglasgow, fago, catch, mondrake,...

  • catch committed df0d8ea5 on 11.x
    Issue #3405976 by alexpott, jrglasgow, fago, catch, mondrake,...
catch’s picture

Version: 11.x-dev » 10.2.x-dev
Status: Needs review » Fixed
Issue tags: -Needs issue summary update

Committing the test change from needs review given HEAD is failing and the adjustments are minimal, agreed the test is not fun.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

derickr’s picture

#131 : You can't set `xdebug.mode` with ini_set, only in php.ini (or related files).

herved’s picture

Hello, and sorry to comment on a closed issue, but is this whole xdebug issue considered work in progress? I see some follow-ups.
I stumbled on a really nasty issue where after submitting a form, status messages would randomly not be displayed.
Session data would not get committed in time (after \Drupal\Core\Session\SessionHandler::write) before the next GET request retrieves the session data.
Or even users would not be created in time/fast enough for the next event to succeed.
All kind of inexplicable events, absolute horror to debug.

Even after updating to core 10.2.4 which includes this issue, the same happens.
Only the workarounds from the IS work (ensure that develop is not in your xdebug.mode value OR pin xdebug in CI image to xdebug 3.2.2).

Edit: We also stumbled on this in Drush https://github.com/drush-ops/drush/pull/5869 causing batches to fail (again, all kind of crazy/inexplicable issues, even on core 10.2.4 or 11.x-dev).

mondrake’s picture

Yes, this fix was not sufficient for Drush, see https://github.com/drush-ops/drush/issues/5892.

Only disabling xdebug worked.

I am repeating myself, but I think this will only be solved by getting rid of the implicit commit on destruct.

NikolaAt changed the visibility of the branch 3405976-transactions to active.

NikolaAt changed the visibility of the branch 3405976-mondrake-approach to active.