
Problem/Motivation
In #3364706: Refactor transactions we introduced a new TransactionManager, refactoring the internals of the Database Transaction API.
Now I noticed a couple of edge cases that require fixing:
- If you open multiple transactions (e.g. root, savepoint_1, savepoint_2, savepoint_3) and you scope out an intermediate one (e.g. savepoint_1), which is a legitimate operation that translates into a 'RELEASE SAVEPOINT savepoint_1', the later Transaction objects remain hanging and when they get out of scope they try to release their corresponding savepoint but this leads to failures. Also, the transaction stack is left in an inconsistent state.
- The same happens if you commit a root transaction while there are still active savepoints. Address that in #3384995: Committing a transaction while there are still active savepoints leaves the stack in inconsistent state.
- If a DDL statement breaks an active transaction, we currently can clean up the stack (but we will not be able to do it in the mysqli driver that does not have a way to check if a transaction is active), however the Transaction objects remain in scope and can potentially interfere with later transactions when they go out of scope (e.g. because a savepoint_1 Transaction object opened BEFORE the DDL statement may try to release a savepoint_1 opened AFTER the DDL statement).
Steps to reproduce
Proposed resolution
Adjust the internal TransactionManager stack and the Transaction objects to carry a unique ID, and ensure that when a Transaction object is unpiled or rolled back, both Transaction id and name match the info on the stack.
When unpiling a savepoint, remove from the stack the savepoint items opened after that savepoint.
Remaining tasks
User interface changes
API changes
Data model changes
Release notes snippet
Issue fork drupal-3384960
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
- 3384960-strengthen-transactionmanager
changes, plain diff MR !4696
Comments
Comment #3
mondrakeFixes.
Comment #4
mondrakeNeed to check if 1) from the IS also applies when you commit the root transaction while there are active savepoints.
Comment #5
mondrakeYes, #4 is a case to fix too.
Comment #6
mondrakeSo the fix for #4, which is correct IMHO, leads to a failure in existing tests. This means a change in behaviour, which needs separate discussion and assessment of all ramifications. Filed a separate #3384995: Committing a transaction while there are still active savepoints leaves the stack in inconsistent state issue for that, and reverted the fix here.
Comment #7
mondrakeComment #8
daffie CreditAttribution: daffie at Finalist commented@mondrake: When I look at the changes in the MR, we are changing the class variable TransactionManagerBase::stack to be used in order. Only its documentation is saying that it is not a LILO stack and yet after the MR it is going to be a used as a LILO stack. Should we add some order in those save points and changes the class variable to be a real LILO stack?
Comment #9
mondrakeWell in fact we are not changing, the stack is still not a LIFO one.
A real LIFO (push/pop) stack should be:
The current API allows to release an item that is in between the stack, but what we have in HEAD now is something like:
as you can see, this leaves gaps in the sequence and hanging items once you believe you have released the root item.
With this MR, the logic is changed to
which ensures no gaps.
All this is complicated by the fact that the stack does not contain the Transaction objects themselves, but a non-binding reference to them. Because of how the Transaction implementation is done, such items can only live in the scope of the code that requires them - we cannot put a pointer to them in the stack otherwise they would not be destructed when they get out of scope in the calling code, which is the trigger for the release/commit.
Comment #10
daffie CreditAttribution: daffie at Finalist commented@mondrake: According to with this MR the logic changes to:
And I agree that this is the right change.
My point is that when I look at the implementation of
unpile()
, it is using the class variable TransactionManagerBase::stack is a LIFO stack.The code from the method
unpile()
is:The while loop is popping StackItems of the stack until it reaches the required StackItem. It is using the class variable TransactionManagerBase::stack is a LIFO stack! Can we at least document the class variable TransactionManagerBase::stack that it is used as a LIFO stack. Maybe do some more to make sure that something else does not changes the order in the class variable. The transactions themselfs are not LIFO, just the class variable TransactionManagerBase::stack.
The rest of the patch is great.
Comment #11
mondrakeAh I see you point now @daffie, thanks for that. Tried to clarify in the docblock.
Comment #12
mondrake'descoping' looks like a valid word to me, https://en.wiktionary.org/wiki/descoping
Comment #13
daffie CreditAttribution: daffie at Finalist commentedAll changes look good to me.
All my remarks have been addressed.
For me it is RTBC.
As the base transactions have already been merged with the D11 branch and without this fix there is the possibility of data loss for a site. I am changing the priority to critical.
@mondrake: Thank you for working on this!
Comment #14
catchThis is tricky stuff but it's great to see the documentation and test coverage improve as well as the code clean-up, feels like major refurbishment to the database system of the past year or two.
Committed/pushed to 11.x, thanks!
Comment #16
catchComment #17
catchSomething wrong here:
See https://drupal-gitlab-job-artifacts.s3.us-west-2.amazonaws.com/c0/78/c07...
Comment #18
catchReverted for now.
Comment #20
catchRebasing this on 11.x should hopefully show the test failure in the MR pipeline to help track down the differences.
Comment #21
mondrakeSorry in which job is there a failure? is it possible to get a backtrace? the link does not work for me
Comment #22
catchSorry I think MR pipelines only allow access if you get push access, this should be changing overnight though according to slack.
Found a DrupalCI one, different to the test I linked but same error:
https://dispatcher.drupalci.org/job/drupal8_core_regression_tests/86504/...
Comment #23
mondrakeSo this fragment of the backtrace
gave me the hint on the problem: variable type strictness. Either PHP internal storage of array keys is not type safe (i.e. if the key is representing a number it is stored as an int even if passed in as a string), or
array_last_key()
returns an int on the same circumstance. Both cases would be PHP's headaches though, so here I just made comparisons looser and ensured the value returned byarray_last_key
is always cast to string.Strict typing can be best friend or worst enemy, even to PHP itself...
Comment #24
daffie CreditAttribution: daffie at Finalist commentedThe problem with the testbot has been addressed.
Back to RTBC.
Comment #25
longwaveRe #23 see https://www.php.net/manual/en/language.types.array.php
Comment #27
catchCommitted/pushed to 11.x, thanks!
Comment #28
mondrake@longwave thanks for #25.
Another puzzle is why
uniqid()
is returning ints, quite evidently on some PHP configurations only. In any of my testing I was getting an hex string.Maybe we should think about using something alternative to
uniqid()
, there quite some literature about it… I picked it because it’s low in computation and we only need uniqueness within each request.Thanks also @catch for #22.
Comment #30
mfbuniqid() is based purely on the time with microseconds, so uniqid() won't be unique if you call it more than once per microsecond or the system time changes due to NTP clock synchronization etc. It should be possible to calculate how often it returns all decimal ints if someone is inclined :)
Comment #31
catchIt's already used elsewhere in the database API, but we should use the $more_entropy second argument in TransactionManagerBase.
Opened #3387695: Use more entropy in TransactionManager