Problem/Motivation

In #3364706: Refactor transactions we introduced a new TransactionManager, refactoring the internals of the Database Transaction API.

Now I noticed a couple of edge cases that require fixing:

  1. If you open multiple transactions (e.g. root, savepoint_1, savepoint_2, savepoint_3) and you scope out an intermediate one (e.g. savepoint_1), which is a legitimate operation that translates into a 'RELEASE SAVEPOINT savepoint_1', the later Transaction objects remain hanging and when they get out of scope they try to release their corresponding savepoint but this leads to failures. Also, the transaction stack is left in an inconsistent state.
  2. The same happens if you commit a root transaction while there are still active savepoints. Address that in #3384995: Committing a transaction while there are still active savepoints leaves the stack in inconsistent state.
  3. If a DDL statement breaks an active transaction, we currently can clean up the stack (but we will not be able to do it in the mysqli driver that does not have a way to check if a transaction is active), however the Transaction objects remain in scope and can potentially interfere with later transactions when they go out of scope (e.g. because a savepoint_1 Transaction object opened BEFORE the DDL statement may try to release a savepoint_1 opened AFTER the DDL statement).

Steps to reproduce

Proposed resolution

Adjust the internal TransactionManager stack and the Transaction objects to carry a unique ID, and ensure that when a Transaction object is unpiled or rolled back, both Transaction id and name match the info on the stack.

When unpiling a savepoint, remove from the stack the savepoint items opened after that savepoint.

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

Issue fork drupal-3384960

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

mondrake created an issue. See original summary.

mondrake’s picture

Status: Active » Needs review

Fixes.

mondrake’s picture

Status: Needs review » Needs work

Need to check if 1) from the IS also applies when you commit the root transaction while there are active savepoints.

mondrake’s picture

Issue summary: View changes

Yes, #4 is a case to fix too.

mondrake’s picture

So the fix for #4, which is correct IMHO, leads to a failure in existing tests. This means a change in behaviour, which needs separate discussion and assessment of all ramifications. Filed a separate #3384995: Committing a transaction while there are still active savepoints leaves the stack in inconsistent state issue for that, and reverted the fix here.

mondrake’s picture

Issue summary: View changes
daffie’s picture

@mondrake: When I look at the changes in the MR, we are changing the class variable TransactionManagerBase::stack to be used in order. Only its documentation is saying that it is not a LILO stack and yet after the MR it is going to be a used as a LILO stack. Should we add some order in those save points and changes the class variable to be a real LILO stack?

mondrake’s picture

Well in fact we are not changing, the stack is still not a LIFO one.

A real LIFO (push/pop) stack should be:

Operation            Result
---------            ------
push A               A
push B               A > B
push C               A > B > C
push D               A > B > C > D
pop B                fail, B is not last
pop D                A > B > C
pop C                A > B
pop B                A
pop A                nil

The current API allows to release an item that is in between the stack, but what we have in HEAD now is something like:

Operation            Result
---------            ------
push A               A
push B               A > B
push C               A > B > C
push D               A > B > C > D
unpile B             A > C > D
unpile A             C > D

as you can see, this leaves gaps in the sequence and hanging items once you believe you have released the root item.

With this MR, the logic is changed to

Operation            Result
---------            ------
push A               A
push B               A > B
push C               A > B > C
push D               A > B > C > D
unpile B             A
unpile A             nil

which ensures no gaps.

All this is complicated by the fact that the stack does not contain the Transaction objects themselves, but a non-binding reference to them. Because of how the Transaction implementation is done, such items can only live in the scope of the code that requires them - we cannot put a pointer to them in the stack otherwise they would not be destructed when they get out of scope in the calling code, which is the trigger for the release/commit.

daffie’s picture

@mondrake: According to with this MR the logic changes to:

Operation            Result
---------            ------
push A               A
push B               A > B
push C               A > B > C
push D               A > B > C > D
unpile B             A
unpile A             nil

And I agree that this is the right change.

My point is that when I look at the implementation of unpile(), it is using the class variable TransactionManagerBase::stack is a LIFO stack.

The code from the method unpile() is:

  public function unpile(string $name, string $id): void {
    // If the $id does not correspond to the one in the stack for that $name,
    // we are facing an orphaned Transaction object (for example in case of a
    // DDL statement breaking an active transaction), so there is nothing more
    // to do.
    if (!isset($this->stack()[$id]) || $this->stack()[$id]->name !== $name) {
      return;
    }

    // If unpiling a savepoint, but that does not exist on the stack, the stack
    // got corrupted.
    if ($name !== 'drupal_transaction' && !$this->has($name)) {
      throw new TransactionOutOfOrderException();
    }

    // Release the client transaction savepoint in case the Drupal transaction
    // is not a root one.
    if (
      $this->has($name)
      && $this->stack()[$id]->type === StackItemType::Savepoint
      && $this->getConnectionTransactionState() === ClientConnectionTransactionState::Active
    ) {
      // If we are not releasing the last savepoint but an earlier one, all
      // subsequent savepoints will have been released as well. The stack must
      // be diminished accordingly.
      while (($i = array_key_last($this->stack())) !== $id) {
        $this->removeStackItem($i);
      }
      $this->releaseClientSavepoint($name);
    }

    // Remove the transaction from the stack.
    $this->removeStackItem($id);

    // If this was the last Drupal transaction open, we can commit the client
    // transaction.
    if (
      $this->stackDepth() === 0
      && $this->getConnectionTransactionState() === ClientConnectionTransactionState::Active
    ) {
      $this->processRootCommit();
    }
  }

The while loop is popping StackItems of the stack until it reaches the required StackItem. It is using the class variable TransactionManagerBase::stack is a LIFO stack! Can we at least document the class variable TransactionManagerBase::stack that it is used as a LIFO stack. Maybe do some more to make sure that something else does not changes the order in the class variable. The transactions themselfs are not LIFO, just the class variable TransactionManagerBase::stack.

The rest of the patch is great.

mondrake’s picture

Ah I see you point now @daffie, thanks for that. Tried to clarify in the docblock.

mondrake’s picture

'descoping' looks like a valid word to me, https://en.wiktionary.org/wiki/descoping

daffie’s picture

Priority: Normal » Critical
Status: Needs review » Reviewed & tested by the community

All changes look good to me.
All my remarks have been addressed.
For me it is RTBC.

As the base transactions have already been merged with the D11 branch and without this fix there is the possibility of data loss for a site. I am changing the priority to critical.

@mondrake: Thank you for working on this!

catch’s picture

Status: Reviewed & tested by the community » Fixed

This is tricky stuff but it's great to see the documentation and test coverage improve as well as the code clean-up, feels like major refurbishment to the database system of the past year or two.

Committed/pushed to 11.x, thanks!

  • catch committed 70b607c3 on 11.x
    Issue #3384960 by mondrake, daffie: Strengthen TransactionManager
    
catch’s picture

catch’s picture

Status: Fixed » Needs work

Something wrong here:

    Install site\n
    TypeError:
    Drupal\Core\Database\Transaction\TransactionManagerBase::removeStackItem():
    Argument #1 ($id) must be of type string, int given, called in
    /builds/project/drupal/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php

See https://drupal-gitlab-job-artifacts.s3.us-west-2.amazonaws.com/c0/78/c07...

catch’s picture

Reverted for now.

  • catch committed 8c0b451f on 11.x
    Revert "Issue #3384960 by mondrake, daffie: Strengthen...
catch’s picture

Rebasing this on 11.x should hopefully show the test failure in the MR pipeline to help track down the differences.

mondrake’s picture

Sorry in which job is there a failure? is it possible to get a backtrace? the link does not work for me

catch’s picture

Sorry I think MR pipelines only allow access if you get push access, this should be changing overnight though according to slack.

Found a DrupalCI one, different to the test I linked but same error:

https://dispatcher.drupalci.org/job/drupal8_core_regression_tests/86504/...

1)  ✖ NightwatchAssertError
16:29:58    Command failed: php ./scripts/test-site.php install --setup-file "core/tests/Drupal/TestSite/TestSiteOliveroInstallTestScript.php" --install-profile "minimal"  --base-url http://php-apache-jenkins-drupal8-core-regression-tests-86504/subdirectory --db-url mysql://drupaltestbot:drupaltestbotpw@172.18.0.4/jenkins_drupal8_core_regression_tests_86504 --json
16:29:58 PHP Fatal error:  Uncaught TypeError: Drupal\Core\Database\Transaction\TransactionManagerBase::removeStackItem(): Argument #1 ($id) must be of type string, int given, called in /var/www/html/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php on line 228 and defined in /var/www/html/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php:145
16:29:58 Stack trace:
16:29:58 #0 /var/www/html/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php(228): Drupal\Core\Database\Transaction\TransactionManagerBase->removeStackItem(6503259739987)
16:29:58 #1 /var/www/html/core/lib/Drupal/Core/Database/Transaction.php(88): Drupal\Core\Database\Transaction\TransactionManagerBase->unpile('savepoint_1', '6503259739987')
16:29:58 #2 /var/www/html/core/lib/Drupal/Core/Menu/MenuTreeStorage.php(810): Drupal\Core\Database\Transaction->__destruct()
16:29:58 #3 /var/www/html/core/lib/Drupal/Core/Menu/MenuTreeStorage.php(143): Drupal\Core\Menu\MenuTreeStorage->saveRecursive('filter.tips_all', Array, Array)
16:29:58 #4 /var/www/html/core/lib/Drupal/Core/Menu/MenuLinkManager.php(158): Drupal\Core\Menu\MenuTreeStorage->rebuild(Array)
16:29:58 #5 /var/www/html/core/lib/Drupal/Core/EventSubscriber/MenuRouterRebuildSubscriber.php(90): Drupal\Core\Menu\MenuLinkManager->rebuild()
16:29:58 #6 /var/www/html/core/lib/Drupal/Core/EventSubscriber/MenuRouterRebuildSubscriber.php(78): Drupal\Core\EventSubscriber\MenuRouterRebuildSubscriber->menuLinksRebuild()
16:29:58 #7 [internal function]: Drupal\Core\EventSubscriber\MenuRouterRebuildSubscriber->onRouterRebuild(Object(Drupal\Component\EventDispatcher\Event), 'routing.route_f...', Object(Drupal\Component\EventDispatcher\ContainerAwareEventDispatcher))
16:29:58 #8 /var/www/html/core/lib/Drupal/Component/EventDispatcher/ContainerAwareEventDispatcher.php(111): call_user_func(Array, Object(Drupal\Component\EventDispatcher\Event), 'routing.route_f...', Object(Drupal\Component\EventDispatcher\ContainerAwareEventDispatcher))
16:29:58 #9 /var/www/html/core/lib/Drupal/Core/Routing/RouteBuilder.php(197): Drupal\Component\EventDispatcher\ContainerAwareEventDispatcher->dispatch(Object(Drupal\Component\EventDispatcher\Event), 'routing.route_f...')
16:29:58 #10 /var/www/html/core/lib/Drupal/Core/ProxyClass/Routing/RouteBuilder.php(83): Drupal\Core\Routing\RouteBuilder->rebuild()
16:29:58 #11 /var/www/html/core/includes/install.core.inc(1878): Drupal\Core\ProxyClass\Routing\RouteBuilder->rebuild()
16:29:58 #12 /var/www/html/core/includes/install.core.inc(707): install_finished(Array)
16:29:58 #13 /var/www/html/core/includes/install.core.inc(578): install_run_task(Array, Array)
16:29:58 #14 /var/www/html/core/includes/install.core.inc(121): install_run_tasks(Array, NULL)
16:29:58 #15 /var/www/html/core/lib/Drupal/Core/Test/FunctionalTestSetupTrait.php(290): install_drupal(Object(Composer\Autoload\ClassLoader), Array)
16:29:58 #16 /var/www/html/core/tests/Drupal/TestSite/Commands/TestSiteInstallCommand.php(235): Drupal\TestSite\Commands\TestSiteInstallCommand->doInstall()
16:29:58 #17 /var/www/html/core/tests/Drupal/TestSite/Commands/TestSiteInstallCommand.php(225): Drupal\TestSite\Commands\TestSiteInstallCommand->installDrupal()
16:29:58 #18 /var/www/html/core/tests/Drupal/TestSite/Commands/TestSiteInstallCommand.php(123): Drupal\TestSite\Commands\TestSiteInstallCommand->setup('minimal', 'Drupal\\TestSite...', 'en')
16:29:58 #19 /var/www/html/vendor/symfony/console/Command/Command.php(326): Drupal\TestSite\Commands\TestSiteInstallCommand->execute(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
16:29:58 #20 /var/www/html/vendor/symfony/console/Application.php(1063): Symfony\Component\Console\Command\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
16:29:58 #21 /var/www/html/vendor/symfony/console/Application.php(320): Symfony\Component\Console\Application->doRunCommand(Object(Drupal\TestSite\Commands\TestSiteInstallCommand), Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
16:29:58 #22 /var/www/html/vendor/symfony/console/Application.php(174): Symfony\Component\Console\Application->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
16:29:58 #23 /var/www/html/core/scripts/test-site.php(24): Symfony\Component\Console\Application->run()
16:29:58 #24 {main}
16:29:58   thrown in /var/www/html/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php on line 145
16:29:58 
16:29:58    ✖ 17) Olivero/oliveroSearchFormTest
16:29:58 
mondrake’s picture

Status: Needs work » Needs review

So this fragment of the backtrace

16:29:58 #0 /var/www/html/core/lib/Drupal/Core/Database/Transaction/TransactionManagerBase.php(228): Drupal\Core\Database\Transaction\TransactionManagerBase->removeStackItem(6503259739987)
16:29:58 #1 /var/www/html/core/lib/Drupal/Core/Database/Transaction.php(88): Drupal\Core\Database\Transaction\TransactionManagerBase->unpile('savepoint_1', '6503259739987')

gave me the hint on the problem: variable type strictness. Either PHP internal storage of array keys is not type safe (i.e. if the key is representing a number it is stored as an int even if passed in as a string), or array_last_key() returns an int on the same circumstance. Both cases would be PHP's headaches though, so here I just made comparisons looser and ensured the value returned by array_last_key is always cast to string.

Strict typing can be best friend or worst enemy, even to PHP itself...

daffie’s picture

Status: Needs review » Reviewed & tested by the community

The problem with the testbot has been addressed.
Back to RTBC.

longwave’s picture

Re #23 see https://www.php.net/manual/en/language.types.array.php

Additionally the following key casts will occur:

Strings containing valid decimal ints, unless the number is preceded by a + sign, will be cast to the int type.

  • catch committed 6950cac3 on 11.x
    Issue #3384960 by mondrake, daffie, longwave: Strengthen...
catch’s picture

Status: Reviewed & tested by the community » Fixed

Committed/pushed to 11.x, thanks!

mondrake’s picture

@longwave thanks for #25.

Another puzzle is why uniqid() is returning ints, quite evidently on some PHP configurations only. In any of my testing I was getting an hex string.

Maybe we should think about using something alternative to uniqid(), there quite some literature about it… I picked it because it’s low in computation and we only need uniqueness within each request.

Thanks also @catch for #22.

mfb’s picture

uniqid() is based purely on the time with microseconds, so uniqid() won't be unique if you call it more than once per microsecond or the system time changes due to NTP clock synchronization etc. It should be possible to calculate how often it returns all decimal ints if someone is inclined :)

catch’s picture

It's already used elsewhere in the database API, but we should use the $more_entropy second argument in TransactionManagerBase.

*
Query/SelectExtender.php:    $this->uniqueIdentifier = uniqid('', TRUE);
Query/SelectExtender.php:    $this->uniqueIdentifier = uniqid('', TRUE);
Query/Query.php:    $this->uniqueIdentifier = uniqid('', TRUE);
Query/Query.php:    $this->uniqueIdentifier = uniqid('', TRUE);
Schema.php:    $this->uniqueIdentifier = uniqid('', TRUE);
Schema.php:    $this->uniqueIdentifier = uniqid('', TRUE);
Transaction/TransactionManagerBase.php:    $id = uniqid();

Opened #3387695: Use more entropy in TransactionManager

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.