Problem/Motivation
The sequences API was introduced in #356074: Provide a sequences API to solve the problem that some tables have no serial fields (and that serial-field implementations were broken on some databases at the time).
The MySQL driver implements two methods, nextId() and nextIdDelete().
nextId() inserts a row into the sequences table, reads the last insert ID, and then sets $this->cleanup = TRUE.
On __destruct() of the connection, all values except the last one inserted are deleted using:
$max_id = $this->query('SELECT MAX(value) FROM {sequences}')->fetchField();
// We know we are using MySQL here, no need for the slower db_delete().
$this->query('DELETE FROM {sequences} WHERE value < :value', array(':value' => $max_id));
This creates a scalability problem: when many clients insert new users simultaneously (which calls db_next_id()), many MySQL processes end up deleting from the sequences table at the same time. The resulting stampede kills throughput completely.
In Drupal 6 the uid was a serial field instead.
In Drupal 8 it seems that db_next_id() is no longer used for user accounts, but this should be clarified.
Proposed resolution
Run the cleanup during the cron job.
That is a little messy, because the sequences table depends on the database type in use ('mysql', 'sqlite', ...), but it should be doable, as the sequences schema is already installed by the system module.
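A minimal sketch of that cron cleanup, assuming Drupal 7's hook_cron() and db_query(); the module name 'mymodule' is a placeholder, not an existing module:

```php
/**
 * Implements hook_cron().
 *
 * Sketch only: deletes all sequence values below the current maximum,
 * mirroring what nextIdDelete() does at request shutdown today, but
 * moved out of the request path into cron.
 */
function mymodule_cron() {
  $max_id = db_query('SELECT MAX(value) FROM {sequences}')->fetchField();
  db_query('DELETE FROM {sequences} WHERE value < :value', array(':value' => $max_id));
}
```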
In addition, or as an alternative, we could check the size of the table during __destruct() (a row count is usually pretty quick), and if it exceeds a certain limit:
lock() => delete() => unlock(), though that could get messy during __destruct().
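One way to sketch the guarded variant, assuming Drupal 7's lock API (lock_acquire()/lock_release()); the lock name is illustrative, and a process that fails to acquire the lock simply skips the cleanup instead of blocking inside __destruct():

```php
// Sketch: run the delete only if no other process is already cleaning up.
if (lock_acquire('sequences_cleanup')) {
  $max_id = $this->query('SELECT MAX(value) FROM {sequences}')->fetchField();
  $this->query('DELETE FROM {sequences} WHERE value < :value', array(':value' => $max_id));
  lock_release('sequences_cleanup');
}
// If lock_acquire() returns FALSE, another process holds the lock;
// skip the cleanup rather than wait.
```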
Another possibility would be to add a random jitter to the threshold to decrease the probability of two processes wanting to clean up the table at the same time.
E.g. limit = 100000, jitter = 1000 would mean the requests are spread out randomly, so probably only 1 or 2 trigger the cleanup at the same time, which has the advantage of not needing a lock.
And 100000 BIGINTs need around 780 kB of storage space, so the overhead should be negligible.
(In detail: db_query('SELECT COUNT(*) FROM {sequences}')->fetchField() - rand(0, $jitter) > $limit.)
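The jittered threshold check could look like the following pure-PHP helper (the function name and default values are illustrative, not an existing API):

```php
<?php
/**
 * Decides whether this request should clean up the sequences table.
 *
 * Hypothetical helper: subtracting a random jitter from the row count
 * spreads the moment at which concurrent processes cross the limit,
 * so usually only one or two of them trigger the cleanup.
 */
function sequences_should_cleanup($row_count, $limit = 100000, $jitter = 1000) {
  return ($row_count - mt_rand(0, $jitter)) > $limit;
}
```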
Remaining tasks
- Discuss
- Implement
User interface changes
- None
API changes
- Only internal changes
Data model changes
- None
Comments
Comment #2
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commented

Comment #3
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commented

The same problem exists in Drupal 8 (found the code in core/modules/user/src/UserStorage.php), so moving the issue there, but downgrading it to major, as it can be circumvented there.
Hm, it can be circumvented in Drupal 7 as well by setting a uid explicitly in pre_save().
It remains a tricky issue.
Comment #8
andypost

Comment #14
larowlan

#838992: Change the uid field from integer to serial by leveraging NO_AUTO_VALUE_ON_ZERO on MySQL