Problem/Motivation
The method Drupal\Core\Database\Connection::nextId()
is programmed that it only works when it is used for a single sequence. It is not used in core since 10.2 #3337513: Fix batch process race conditions by making ‘bid (batch id)’ auto-increment. When a contrib or custom module starts using the method and sets the sequence ID for getting a new ID, the change is very high that sooner then later that the database will throw an error that insert a record with an existing ID.
Proposed resolution
Deprecate the method Drupal\Core\Database\Connection::nextId()
and the {sequences}
table. Modules should use autoincrement tables where necessary.
Remaining tasks
review/commit
User interface changes
no
API changes
-
The method
Drupal\Core\Database\Connection::nextId()
is deprecated and removed in Drupal 11; -
The method
Drupal\mysql\Driver\Database\mysql\nextIdDelete()
is deprecated and removed in Drupal 11; -
The
sequences
table is deprecated and removed in Drupal 12; -
Calling
KernelTestBase::installSchema()
for the table{sequences}
is deprecated.
Data model changes
Comment | File | Size | Author |
---|---|---|---|
#146 | 2665216-146.patch | 77.55 KB | andypost |
#146 | interdiff.txt | 707 bytes | andypost |
Comments
Comment #2
dawehnerComment #4
jibranComment #6
mohit_aghera CreditAttribution: mohit_aghera as a volunteer and at Axelerant commentedRe-rolling patch for 8.1.x-dev
Comment #7
mohit_aghera CreditAttribution: mohit_aghera as a volunteer and at Axelerant commentedComment #10
heykarthikwithuComment #11
heykarthikwithuComment #12
dawehner@heykarthikwithu Please ensure to provide interdiffs in the future.
Comment #15
dimaro CreditAttribution: dimaro at La Drupalera by Emergya commentedComment #16
dimaro CreditAttribution: dimaro at La Drupalera by Emergya commentedRerolled #11.
Simple rebase fixed it.
Comment #19
dimaro CreditAttribution: dimaro at La Drupalera by Emergya commentedTagging this issue.
Comment #20
jofitz CreditAttribution: jofitz at ComputerMinds commentedRe-roll.
Comment #22
jofitz CreditAttribution: jofitz at ComputerMinds commentedComment #24
dimaro CreditAttribution: dimaro at La Drupalera by Emergya commentedComment #25
ashishdalviWe will work on it at DCM2017
Comment #26
shashikant_chauhan CreditAttribution: shashikant_chauhan as a volunteer and at Iksula commentedRerolled the patch.
Comment #28
vijaycs85Comment #33
andypostreroll
Comment #34
andypostComment #36
andypostClean-up all new usage and tune a bit ID generation
Also removed useless method and renamed
ensureSequuencesTableExists()
Comment #38
andypostreroll & fix few new ones
Comment #40
andypostit should fix most of failed tests
Comment #42
andypostFix one more test
Comment #46
daffie CreditAttribution: daffie commentedQuick point: I do not think that PostgreSQL uses the sequences table.
Comment #47
andypost@daffie sadly all databases doing it, and onlyto generate user id
Comment #48
daffie CreditAttribution: daffie commentedThat maybe the case, only the PostgreSQL driver does not use the sequences table to store the sequence.
Comment #49
andypost@daffie as I see PG using sequence from this table instead of the table(
Comment #50
daffie CreditAttribution: daffie commented#2664322: key_value table is only used by a core service but it depends on system install has landed.
Comment #51
andypostre-roll but it needs work for pgsql driver, just wanna make sure what bot will tell
Comment #53
daffie CreditAttribution: daffie commentedThis patch should fix the testbot failures for MySQL.
Comment #54
daffie CreditAttribution: daffie commentedThis should fix the testbot failures for SQLite.
Comment #55
andypost@daffie Thank you a lot!
btw I'm thinking about implementation... most of databases does not need table to create sequence, the only usage of this table is user entity and batch, see #3028709: Change behavior of PostgreSQL to not share sequence for batch and users
Moreover this API (
\Drupal\Core\Database\Connection::nextId()
via sequences table) is more like workaround for limitations of each supported database so naming is still a question for me-
getSequencesSchema()
should it remain public? as we can't define sequence with schema-
ensureSequencesTableExists()
should it remain mention "table"? as other databases can create just sequences for batch and usersProbably both needs better docblocks
So only this thing left to decide, pgsql needs only named sequence but creates a table
Comment #56
daffie CreditAttribution: daffie commented@andypost: Thank you for the review!
Changes:
1. Changed the method
getSequencesSchema()
from public to protected.2. Updated the docblock of the method
ensureSequencesTableExists()
.3. Removed the create table part from the PostgreSQL driver.
Comment #57
daffie CreditAttribution: daffie commentedFixing the single coding standard violation.
Comment #58
andypostUsed to check for nextId() usage in contrib
There's only 7 usages, mostly database drivers (mongodb, oracle, sqlserv)
Maybe decouple this api into new service and deprecate the
nextId()
method in connection?It could use
key_value
asequences
collection instead of database specific implementationComment #61
daffie CreditAttribution: daffie commentedAdded CR and updated the Is for the fact that calling KernelTestBase::installSchema() for the table sequences is deprecated.
Rerolled the patch and added the deprecation.
Comment #62
mondrakeHow about not adding this to the base Connection class, but just to the MySql and SQLite implementations? Assuming that each RDBMS should have its own way to manage sequences. Yes, it's a bit boilerplate for MySql and SQLite, but would be cleaner in general.
EDIT - BTW then you won't even need a schema, just a 'CREATE TABLE' statement that can be driver specific at that point.
Comment #63
daffie CreditAttribution: daffie commented@mondrake Thank you for your review!
I can move the code into a new trait called "sequencesTrait.php" and add to trait to the connection classes of MySQL and SQLite. I that a good solution?
I know I can do that and I did not because in similar situations in lib/Drupal/Core have used the same split as in this patch. This solution is also used by Batch, Cache, Config, Flood, KeyValue, Lock, Menu, Queue and Routing.
I am only a bit worried about a BC break. This happens when you use a contrib database driver to create a new site. The problem that I do not see how this can be fixed in core. Maybe it would be better to fix this in the contrib database drivers. What do you think?
Comment #64
andypostI see no reason in trait and the table at all, it used to store only 1 row which could be stored in key value table - no extra API needed for that
Comment #65
mondrake#64 I'm not sure. That would mean replacing a DB feature (using sequences where supported by the RDBMS, like in PGSql and Oracle, or emulating them, like in MySql and SQLite) with a service which is much more overhead. IIUC this issue's focus is on cleaning up the DB API layer.
Separate discussion IMHO is if we need
Connection::nextId()
at all in core: it seems to be used only in UserStorage and form.inc, and mentioned in KeyValueEntityStorage but not actually used. If we can find alternatives there, we might deprecate nextId altogether, but I'd say this is for a separate issue.Comment #66
daffie CreditAttribution: daffie commentedMaybe it is better to look if we can deprecate the
Connection::nextId()
functionality.We have 2 instances in core that use the functionality:
I think we can replaced it here by just generating a random integer.
I do not understand we just do not use a serial field for the user id. It is an entity like all other entities. Or is it for user id zero is the anonymous user. Is that the problem? If so, is it still valid?
Can we solve it in another way, something specific in the user module and deprecate it in the database system? Let the user module store the value in the keyvalue table.
It would be great to remove this from the database system.
@andypost and @mondrake: Thank you for your help!
Comment #67
mondrakeBTW, this
$entity->uid->value = $this->database->nextId($this->database->query('SELECT MAX([uid]) FROM {' . $this->getBaseTable() . '}')->fetchField());
is rather bad because if anything else than UserStorage is using the sequence generated by nextId() and expecting it to be incrementing, this statement may (ok, in edge situations) reset the sequence to a lower value if the max(id) of the user base table is lower than the current value of the sequence.
So... now I am +1 for deprecating instead.
Comment #68
daffie CreditAttribution: daffie commented@mondrake If you like the patch then I will update the title, summary and the CR. I have deprecated the method
Connection::nextId()
and the table sequences. The table can be removed in D10. The usage of the method have been replaced in form.inc and userstorage.php.Comment #69
mondrakeIMHO #68 makes sense; we need an IS update to reflect the new direction of the issue.
Maybe overkilling here: would it make sense to use Lock::acquire() to lock the ID while in use by the batch?
Not sure using
@deprecated
is correct here. Maybe just open a followup for the removal in D10 and use @todo to refer to it. BTW I think in D10 we should also have an update function that will actually remove the{sequences}
table in live sites.I think this can be injected, by overriding the
::createInstance
and::__construct
methods.Is this still needed?
expectDeprecation
should be moved to after the block checking if the test should be skipped.Comment #70
daffie CreditAttribution: daffie commentedUpdated the title, the IS and the CR.
Comment #71
daffie CreditAttribution: daffie commented@mondrake: Thank you for your review.
For 69.1: I have changed the solution to use the keyvalue store to store the batch ID. The Lock::acquire() look to me as a bit of a overkill and such a lock does expire. Where a keyvalue store one does not.
For #69.2: I am not sure what the correct way is to deprecate a table. I do not think we have one. My idea was that the release manager would do a code base search for "@deprecated" before he/she would release D10.0.0 and therefor the comment would be found and fixed. I have less confidence that a @todo will be fixed in the D10.0.0 release. Technically I think that a @todo is better then a @deprecated. Maybe I should ask a release manager about this. ;-)
For #69.3: Done!
For #69.4: Removed!
For #69.5: Done!
Comment #72
mondrakeAssuming this is green, it's RTBC for me now.
I was wondering whether we should also deprecate
Connection::makeSequenceName()
. It's only used internally by PgSql so maybe it does no longer need to be public API. But that could be a followup.Comment #73
daffie CreditAttribution: daffie commentedFor the official BC policy (https://www.drupal.org/about/core/policies/core-change-policies/drupal-8...):
Comment #74
mondrakeComment #75
mondrakeAdjusting title, we are not deprecating
KernelTestBase::installSchema()
altogether.Comment #76
mondrakeEdited CR and filed #3221101: Deprecate Drupal\Core\Database\Connection::makeSequenceName(), make it internal to PostgreSQL .
Comment #77
alexpottWe need to make this optional and trigger a deprecation if it is not passed in. Just in case.
This change is not necessary - it is a legacy test so triggering a deprecation won't cause the test to fail and then we'll keep test coverage of system_schema()
Comment #78
andypost@alexpott Looking at #1650930: Use READ COMMITTED by default for MySQL transactions I'm not sure 4) is "highly performant"
It makes 3 queries instead of using auto-increment
https://git.drupalcode.org/project/drupal/-/blob/9.3.x/core/lib/Drupal/C...
Comment #79
alexpott@andypost the three queries don't happen in usually... normally it is just
$new_id = $this->query('INSERT INTO {sequences} () VALUES ()', [], ['return' => Database::RETURN_INSERT_ID]);
and an insert is definitely quicker than what we're doing with state... we're doing a select and update on every user creation. And the cleanup of the sequences table does not occur on user time but at shutdown.Comment #80
catchReading the recent comments got me thinking about trying to use
NO_AUTO_VALUE_ON_ZERO
and autoincrement for users.@alexpott found #838992: Change the uid field from integer to serial by leveraging NO_AUTO_VALUE_ON_ZERO on MySQL which I'll move to Drupal 9 now.
If that's possible, I think we'd need:
1. An issue to change the setting
2. An issue specifically for user storage to switch to autoincrement away from sequences
3. This issue minus the user bits.
Comment #81
alexpott#80 sounds like a great plan - also looking at the Batch stuff I;'m not sure what that uses this either - the queue table is a serial and maybe the memory stuff could create it's own ID... I think we should delegate ID creation to the batch backend and not use key value.
Comment #82
andypostUsed to dig history a bit
Looks first attempt by @chx is #49836: Getting rid of the sequences table, using db_last_insert_id() instead of db_next_id() and #55516: Remove database locking
Then #356074: Provide a sequences API
For #80 there's #2886441: Sequences API (mysql) kills MySQL server performance due to cleanup phase
Comment #83
Ghost of Drupal PastThe current implementation of the sequences API, while it has it's own problems, is atomic.
The new patch, on the other hand is race prone.
Eliminating the uses of this in core is valuable and adding a note discouraging users from it is also great. Moving the cleanup to cron to increase performance would be a separate fix. But I would not remove this useful API as there is no replacement for it.
Comment #84
Ghost of Drupal PastWhile it certainly returns only one sequence of integers it never reuses them so there's no way this can happen. If a module starts using this API where existing IDs exist the API provides for this. I do not understand this.
I do not understand this either. Every nextId changes the sequence value. That's what nextId does. It is the only thing it does.
So I think this issue should revert to #61 and also some other fixes as mentioned already would be valuable:
Comment #88
gidarai CreditAttribution: gidarai at Finalist commentedI have rerolled the patch for 10.1.x
Comment #89
gidarai CreditAttribution: gidarai at Finalist commentedfixing coding standards
Comment #91
gidarai CreditAttribution: gidarai at Finalist commentedStyle guide fix (previous comments where not added because of spam protection)
Comment #93
andypostNeeds codestyle fixes and deprecation version
it should be 10.1.0 and removed from 11.0.0
Comment #94
gidarai CreditAttribution: gidarai at Finalist commentedCode style fixes and deprecation version changes
Comment #95
gidarai CreditAttribution: gidarai at Finalist commentedmissing part of previous patch added to current patch
Comment #98
daffie CreditAttribution: daffie commentedThis text is not right. The part "The table is now lazy loaded and therefore will be installed automatically when used." can be removed.
Comment #99
_utsavsharma CreditAttribution: _utsavsharma at OpenSense Labs for DrupalFit commentedAddressed #98.
Please review.
Comment #100
gidarai CreditAttribution: gidarai at Finalist commentedFixed failing test (NextIdTest)
Comment #101
gidarai CreditAttribution: gidarai at Finalist commentedRemoved white space before a closing "}"
Comment #102
catchI think it's probably fine to start the batch ID from 0 again rather than having an update to copy over from sequences, but just noting in case we think it would be worth it? Everything looks good here otherwise.
We'll need an explicit Drupal 11 (or even Drupal 12?) follow-up to add an update to drop the table.
Comment #103
mondrakeThe only concern I have here is that if by chance this happens in the context of a transaction, and the transaction gets rolled back, the counter would be rolled back as well? But I think it's the same with the current API anyway. In the past I saw doctrine/dbal supporting a sequence proxy - but they were opening a separate connection just for the purpose of not meddling with transactional operations.
Comment #104
gidarai CreditAttribution: gidarai at Finalist commentedAdded the requested comment by @catch.
Comment #105
longwave@chx's concerns about race conditions here still apply, as far as I can see.
If modules that previously used sequences are expected to do this themselves, should we add an explicit API that can safely increment a value and return it? Other key value stores such as Memcache and Redis offer an atomic increment operation, for example. This could be part of the keyvalue API or a separate service.
Comment #106
longwaveAlternatively, why do non-progressive batches even need an ID, if it is never stored? If we can remove that then we can make
bid
a serial column and then we never need sequences in core at all.Comment #107
gidarai CreditAttribution: gidarai at Finalist commentedChanged patch to use the lock service to prevent batches from getting the same ID when multiple batch processes try to get a new batch ID by locking the batch process.
Comment #108
longwaveSeems unlikely, but we need to do something if we fail to acquire the lock, as we won't have an ID to use.
Comment #109
daffie CreditAttribution: daffie at Finalist commentedThe concern has been addressed.
Back to RTBC.
Comment #110
catchThis won't work - it would mean no batch ID is assigned at all if another process has a lock, which is as or more likely than the original race condition.
Comment #111
daffie CreditAttribution: daffie at Finalist commentedAs the solution with the lock service does not work, lets try the solution from @mondrake. See: https://www.drupal.org/project/drupal/issues/3257824#comment-14883465.
Comment #112
mondrakeHere's an updated version of the patch referenced in #111.
Unfortunately there is a problem with the update path here - the update itseltf is run in a batch, so using the new code with the old table before it is changed by the upate function is failing.
Comment #113
catchIt's not pretty, but for similar problems, we've hard-coded the database queries in the code that actually runs updates so that it's guaranteed to run before any updates do at all. Obviously this will run every time until that code is removed again in Drupal 11, so needs to be re-entrant and ideally do a minimal check so it can return early.
Comment #114
mondrakeConfused about why this is not failing on SQLite too.
Comment #115
mondrakeLet's see this. I think, however, we need a separate issue to do this one, and postpone this one on that.
Comment #116
mondrakeComment #117
mondrakeTried understanding why #112 was not failing on SQLite, and it looks like when an int is defined as primary key, it is automatically incremented when no value is passed for the PK in INSERT.
The {batch} table is created like
Then inserting
Yields for
SELECT * FROM "test35615194"."testmk2"
You never stop learning...
Anyway, for the purpose here this is just fine.
Comment #118
gidarai CreditAttribution: gidarai at Finalist commentedThis issue is postponed on #3337513: Fix batch process race conditions by making ‘bid (batch id)’ auto-increment.
Comment #119
gidarai CreditAttribution: gidarai at Finalist commentedChanged status to 'postponed' (didn't do it before)
Comment #120
catch#3337513: Fix batch process race conditions by making ‘bid (batch id)’ auto-increment is in.
Comment #121
andypostRe-roll for #107 and few changes
The test
core/modules/mysql/tests/src/Kernel/mysql/NextIdTest.php
is running only on mysql, so probably needs other base test to useComment #122
daffie CreditAttribution: daffie at Finalist commentedTiny nitpick: We are going to remove the sequences table in Drupal 12 not Drupal 11.
Comment #123
andypostFiled new CR (restored from revision) https://www.drupal.org/node/3349345 and said for removal in 12.0 and updated all deprecations
Comment #124
andypostUpdated IS
Comment #125
andypostand the last place
Comment #126
daffie CreditAttribution: daffie at Finalist commented@andypost: The plan is to remove the method Connection::nextId() in Drupal 11 and to remove the table "sequences" in Drupal 12. We wait with dropping the sequences table because we cannot deprecate a database table. That is the plan for now. Better ideas are welcome!
Comment #128
andypostI see no reason to hack into db layer to try throw when table is used, so need to revert 12 to 11 and keep in
KernelTestBase
Comment #129
andypostHere's fix for #128
Comment #130
daffie CreditAttribution: daffie at Finalist commentedComment #131
andypostbtw drush needs fix for batch too https://github.com/drush-ops/drush/blob/2b890ab5c2b5d70cae7b17ebda63ad57...
Comment #132
mondrake#3337513: Fix batch process race conditions by making ‘bid (batch id)’ auto-increment got reverted, this needs to wait for that to be fixed
Comment #133
mondrakeBlocker is in.
Comment #134
andypostre-roll for 10.2
Comment #135
andypostComment #136
daffie CreditAttribution: daffie at Finalist commentedThe testbot is failing.
Comment #137
andypostNew
WorkspaceAssociationTest
usage fixedComment #138
mondrakeFor
PostgreSQLMySql, shouldn'tpublic function nextIdDelete()
in its Connection class be deprecated as well?Comment #139
andypostI bet you meaning Mysql driver because there's only usage of this function could be found, which was added in #3129043: Move core database drivers to modules of their own
Added deprecation for #138
Comment #140
mondrake#139 - yes, and I need more coffee. Thanks!
Comment #141
andypostComment #142
daffie CreditAttribution: daffie at Finalist commentedAll code changes look good to me.
All deprecations have testing.
All use of the deprecated methods have been removed.
I have updated the IS and the CR for the deprecation of
Drupal\mysql\Driver\Database\mysql\nextIdDelete()
.For me it is RTBC.
Comment #144
daffie CreditAttribution: daffie at Finalist commentedComment #145
mondrakeNeeds a reroll. And, BTW, I think MySql Connection::$needsCleanup property should be deprecated too.
Comment #146
andypostTest
core/modules/file/tests/src/Kernel/Views/RelationshipUserFileDataTest.php
is fixed via #2628230: Adding File Usage "File" relationship results in broken/missing handlerAdded deprecation for #145 as the property protected, I don't think we need magic getter/setter to deprecate it
Comment #147
daffie CreditAttribution: daffie at Finalist commentedBack to RTBC.
Comment #148
catchRead back through this issue and saw all the comments relating to the batch and users table that we eventually split out to other issues with much better implementations, nice to finally get to the point where all this issue is doing is adding deprecations!!
Committed 6204ba9 and pushed to 11.x. Thanks!
Comment #150
catchComment #151
andypostThank you! I renamed follow-up from D12 to D11 - #3335756: Drop sequences table in Drupal 12
Comment #152
andypostCR looks ready to be published but I think we should remove the table in D12