This is NOT a full sequences API, not at all, but it's enough for some uses. actions_aid and simpletest_test_id duplicate code that, moreover, belongs in the DB layer. This patch solves that. Further work: we can reuse this for users.uid, as that should not be serial because of the known problems. The PostgreSQL folks can create a version which uses proper sequences. The MySQL folks can support more than the single serial column MySQL allows -- like a timestamp and a serial. Sorry for the hasty writeup.

Files: 
Comment | File | Size | Author
#109 | 356074comment-2223588-extended.patch | 1.79 KB | Josh Waihi
  Passed: 14715 passes, 0 fails, 0 exceptions View
#108 | 356074comment-2223588-extended.patch | 1.79 KB | Josh Waihi
  Fetch test file: failed to retrieve [356074#comment-2223588-extended.patch] from project client. View
#106 | 356074-cleanup.patch | 487 bytes | Josh Waihi
  Passed: 14733 passes, 0 fails, 0 exceptions View
#103 | sequences.patch | 1.7 KB | mfb
  Failed: Failed to apply patch. View
#102 | sequences.patch | 1.75 KB | mfb
  Failed: Failed to apply patch. View
#98 | sequences_new.patch | 18.57 KB | chx
  Failed: Failed to apply patch. View
#94 | sequences_new.patch | 14.65 KB | chx
  Failed: Failed to apply patch. View
#92 | sequences_new.patch | 14.65 KB | chx
  Failed: Failed to install HEAD. View
#91 | sequences_new.patch | 0 bytes | chx
  Passed: 14758 passes, 0 fails, 0 exceptions View
#78 | sequences_new.patch | 14.32 KB | chx
  Failed: Failed to install HEAD. View
#74 | sequences_new.patch | 14.08 KB | chx
  Failed: Failed to install HEAD. View
#70 | sequences_new.patch | 13.76 KB | chx
  Failed: 3784 passes, 14 fails, 90 exceptions View
#67 | sequences_new.patch | 13.27 KB | chx
  Failed: Failed to install HEAD. View
#63 | sequences_new.patch | 13.2 KB | chx
  Failed: Failed to install HEAD. View
#62 | sequences_new.patch | 12.99 KB | chx
  Failed: Failed to install HEAD. View
#59 | sequences_new.patch | 10.29 KB | chx
  Failed: 13645 passes, 17 fails, 6 exceptions View
#55 | sequences_new.patch | 11 KB | chx
  Failed: Failed to install HEAD. View
#44 | sequences_new.patch | 12.88 KB | chx
  Failed: Failed to install HEAD. View
#43 | sequences_new.patch | 21.3 KB | chx
  Failed: Failed to apply patch. View
#40 | sequences_new.patch | 12.93 KB | chx
  Failed: Failed to install HEAD. View
#34 | sequences_new.patch | 18.04 KB | chx
  Failed: Failed to apply patch. View
#32 | sequences_new.patch | 17.67 KB | chx
  Failed: 11692 passes, 1 fail, 0 exceptions View
#30 | sequences_new.patch | 19.83 KB | chx
  Failed: 11396 passes, 90 fails, 2499 exceptions View
#28 | 356074-sequences-api.patch | 20.32 KB | Damien Tournoud
  Failed: Failed to install HEAD. View
#26 | sequences_new.patch | 19.67 KB | chx
  Failed: 11384 passes, 99 fails, 2529 exceptions View
#24 | sequences_new.patch | 17.95 KB | chx
  Failed: Failed to apply patch. View
#19 | new_sequences.patch | 7.88 KB | chx
  Failed: Failed to install HEAD. View
#17 | new_sequences.patch | 7.89 KB | chx
  Failed: Failed to run tests. View
#15 | new_sequences.patch | 7.92 KB | chx
  Failed: Failed to install HEAD. View
#12 | sequences_new.patch | 7.94 KB | chx
  Failed: Failed to apply patch. View
#10 | sequences_new.patch | 10.1 KB | chx
  Failed: Failed to apply patch. View
new_sequences_api.patch | 6.68 KB | chx
  Failed: Failed to apply patch. View

Comments

Damien Tournoud’s picture

This looks great, and is useful.

I never really understood why we moved to an auto-incremented column in D6. It breaks important things (most notably: split-ID-space staging scenarios, and DB import/export using some broken MySQL software) while not being that much of a DX improvement.

Status: Needs review » Needs work

The last submitted patch failed testing.

Crell’s picture

Status: Needs work » Needs review

Didn't we spend a month in D6 getting rid of this?

chx’s picture

Once again: actions_aid and simpletest_test_id already use this. This patch saves code.

Crell’s picture

So are you seeing this as a secondary system, or are you actually going back to manual sequences instead of auto_inc fields? The former I'm OK with; the latter not so much.

chx’s picture

Of course a secondary system. I thought that was clear, as the doxygen says "Use this function if for some reason you can't use a serial field, normally a serial field with db_last_insert_id is preferred.".

Status: Needs review » Needs work

The last submitted patch failed testing.

Crell’s picture

If a secondary system, then why is it integrated directly into the DB layer? :-)

In any case, db_insert() should never ever be called from a method inside the DB layer itself. Internally everything happens with $connection->insert() (and friends). (See nextId()).

Also in the same method, $sequence is passed in but never used. Methinks that is likely not the intended behavior.

chx’s picture

Status: Needs work » Needs review

Rerolled against HEAD, made the change in the nextId() method that Crell asked for, doxygen'd the update, fixed the test runner script, and added a small test. It's in the DB layer because, for example, the PostgreSQL maintainers might decide that they want to use native sequences for this, and then they will need the name of the sequence. The fact that we provide a degenerate default implementation is a minor detail; users should still name their sequences.
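The generic strategy under discussion, one shared {sequences} table advanced with a single UPDATE, can be sketched outside Drupal. Below is a minimal, hypothetical simulation using Python's sqlite3 (SQLite's scalar MAX() plays the role of SQL's GREATEST(); the table name and seeding logic are illustrative, not the patch's actual code):

```python
import sqlite3

def next_id(conn, existing_id=0):
    """Advance a single-row sequence counter and return the new value.

    One UPDATE both increments the counter and guarantees the result is
    larger than a caller-supplied existing id.
    """
    cur = conn.execute(
        "UPDATE sequences SET value = MAX(value, ?) + 1", (existing_id,))
    if cur.rowcount == 0:
        # First use: the counter row does not exist yet, so seed it.
        conn.execute("INSERT INTO sequences (value) VALUES (?)",
                     (existing_id + 1,))
    return conn.execute("SELECT value FROM sequences").fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequences (value INTEGER NOT NULL)")
print(next_id(conn))      # 1
print(next_id(conn))      # 2
print(next_id(conn, 10))  # 11; ids at or below 10 are never issued again
```

In a real multi-connection setup the read-back would have to happen inside the same transaction as the UPDATE (or the statement would have to return the value directly) to stay race-free, which is exactly the kind of per-driver detail the API leaves to each database backend.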

chx’s picture

FileSize
10.1 KB
Failed: Failed to apply patch. View

Crell’s picture

Status: Needs review » Needs work

Why in the world are you calling Database::getActiveConnection() from within a connection object? $this->insert(). That's all you need.

The singleton calls should also be banished to the outside of the database system. :-)

I don't understand the purpose of the db_affected_rows() call in simpletest_clean_results_table(). Since that function is slated for execution anyway, it shouldn't be used. If you convert the delete query to the new API that should give you the value you want.

chx’s picture

Status: Needs work » Needs review
FileSize
7.94 KB
Failed: Failed to apply patch. View

catch’s picture

Status: Needs review » Needs work

Minor but

Retrieves an unique id from a given sequence. Use db_last_insert_id is preferred.
+ *
+ * Use this function if for some reason you can't use a serial field,
+ * normally a serial field with db_last_insert_id is preferred.

Can just drop "Use db_last_insert_id is preferred."

Crell’s picture

Actually that's wrong too, because db_last_insert_id() is deprecated. If using a serial field, the insert->execute() method will return the new ID.

chx’s picture

Status: Needs work » Needs review
FileSize
7.92 KB
Failed: Failed to install HEAD. View

#391340: Job queue API uses this so rerolled.

Status: Needs review » Needs work

The last submitted patch failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
7.89 KB
Failed: Failed to run tests. View

system_install function collision.

Status: Needs review » Needs work

The last submitted patch failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
7.88 KB
Failed: Failed to install HEAD. View

Keeping up with HEAD.

Status: Needs review » Needs work

The last submitted patch failed testing.

dww’s picture

#391340: Job queue API is no longer using this directly, but it needs to so that we can have sane garbage collection of the consumer id (at least, a pointer to this issue was chx's answer to my objection about the lack of garbage collection over there).

Also, I think there were some spots in project* during porting to D6 where auto_inc wasn't sufficient, and I wished we still just had the {sequences} table -- but of course I don't remember the details right now. ;) Anyway, I'm all in favor in general of an API like this, especially if there are already 2 or 3 spots in core that are basically duplicating this code for various reasons...

That said, some thoughts reading patch #19:

A) Don't we need a schema update to drop the {simpletest_test_id} table, not just remove it from the schema definition?

B) This patch doesn't do any garbage collection, either. ;)

C) The doc changes from #13 still aren't in here.

D) Is there an issue where actions_aid is discussed and why it can't just use an auto_inc directly? I know there *are* cases where that's not good enough and you'd need something like this, but it's certainly not obvious from reading this patch (and its comments). Seems like it'd be helpful to add some of that explanation in the comments, both to justify committing this patch, and to help potential users of this API decide if they need it or not.

chx’s picture

A) there was no simpletest_test_id in D6 core.

B) not yet.

C) need fixing

D) actions_aid already works like this, as does simpletest, and soon the job queue will too. MySQL can't have two automatic columns (like a serial and a timestamp), so there is more than one reason for this patch.

chx’s picture

chx’s picture

Status: Needs work » Needs review
FileSize
17.95 KB
Failed: Failed to apply patch. View

There is cleanup now. Documentation change added. And for motivation I added users.uid to this. To avoid the horrors with importing users, I have added a safeguard.

Damien Tournoud’s picture

I would like to extend this API with the ability to claim a particular sequence number: if that number is greater than the current sequence number, the sequence gets updated.

That would allow us to solve the "I need to insert an entry at a particular sequence number" problem properly. PostgreSQL sequence numbering is perfectly brain-dead in that regard (see #495956: system.install incorrectly assumes ANON and AUTH RID are always sequential).

chx’s picture

FileSize
19.67 KB
Failed: 11384 passes, 99 fails, 2529 exceptions View

A lot, lot better patch.

Status: Needs review » Needs work

The last submitted patch failed testing.

Damien Tournoud’s picture

Status: Needs work » Needs review
FileSize
20.32 KB
Failed: Failed to install HEAD. View

- fixed the default implementation: we don't need FOR UPDATE (and this is not supported by SQLite anyway), and we can use direct SQL queries there
- fixed the MySQL implementation: we can use direct SQL queries there (untested)
- fixed the logic inside user_save()

Status: Needs review » Needs work

The last submitted patch failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
19.83 KB
Failed: 11396 passes, 90 fails, 2499 exceptions View

Status: Needs review » Needs work

The last submitted patch failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
17.67 KB
Failed: 11692 passes, 1 fail, 0 exceptions View

Removing uid 1 placeholder caused the grief.

Status: Needs review » Needs work

The last submitted patch failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
18.04 KB
Failed: Failed to apply patch. View

Hm, so we want user_save() to work with an arbitrary uid passed in? Sure, we can do that. I guess it makes sense when importing, or when working together with another DB, etc. And as noted in a comment, there is no security risk.

chx’s picture

To list a few reasons why we are doing this:

  1. MySQL can only have one automated value per table. If you want both an automatic timestamp and a serial, you can't have them.
  2. Sometimes you just need a unique integer.
  3. MySQL has no way to query the current value of a sequence. It is not even stored anywhere: the server reads the MAX value from the serial column on startup and uses that to seed the auto increment. (Which IMO is a serious bug in MySQL, 'cos if you delete the last few rows and then restart the server, it will reissue already-used serial values. Ick.)
  4. PostgreSQL has no way to safely (i.e., without race conditions) set the value of a sequence to an arbitrary value larger than its current one. You would need to call nextval in a loop, or read the current value and then run a setval so large that it exceeds any values generated between the currval and the setval call. It's ugly.
  5. Furthermore, in PostgreSQL a serial column reads the sequence only if no value is specified; it is the act of using the default that advances the sequence. I.e., if you specify a uid when inserting one user and then save another user without specifying the uid, it will break; you need to manually advance the sequence.
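Point 3 matches how InnoDB behaved before MySQL 8.0 made the counter persistent: the auto-increment value is reinitialized from the column's MAX at server startup. A toy illustration of the reissue hazard, with made-up data:

```python
# Rows with serial ids 1..5; the last two get deleted.
ids = [1, 2, 3, 4, 5]
ids = ids[:3]                    # table now holds [1, 2, 3]

# On restart the server seeds auto-increment from MAX(id), so the
# "next" id collides with values that were already handed out.
next_after_restart = max(ids) + 1
print(next_after_restart)        # 4, even though 4 and 5 were used before
```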

Crell’s picture

I'm feeling scope creep here. Originally this thread was for providing a sequence API for edge cases that just need an ID, but won't necessarily be using it in a primary key sense. Now it sounds like you want to roll back all of the work done in D6 to switch from Drupal-managed sequences TO auto-increment. What is the goal here now?

chx’s picture

A complete rollback is a separate issue. I changed users because they were hacked to make auto-increment work, and this seemed like a total fit. And yes, I want to roll back, but that's a followup.

Edit: #204411: Anonymous user id breaks on MySQL export/import (xID of Zero get's auto-incremented on external xSQL events) is the issue I fixed with users.uid moved to this API.

Crell’s picture

Status: Needs review » Needs work

From the docblock, it looks like nextId()'s second parameter should be named $minimum_id, not $existing_id. It should also be optional (probably with a 0 default). The default should not live only in the utility function.

Why doesn't the base nextId() use a merge query? For that matter why is it writing insert and update queries literally? (There may be a good reason, but it's not documented so I have no idea what that is.)

There's some stray whitespace tweaks in the patch, it looks like. :-(

The MySQL nextId() implementation relies on insert failing with an exception. That will also break any active transactions, which is not acceptable. This is why Merge queries exist. :-) Please use them, or document why we cannot.

chx’s picture

note: we want to use GREATEST instead of that IF.

chx’s picture

Status: Needs work » Needs review
FileSize
12.93 KB
Failed: Failed to install HEAD. View

Fixed the docblock, explained why we do not use a merge query, moved the whitespace fixes to another issue; MySQL falls back to the parent implementation if transactions are enabled.

David Strauss’s picture

I will think this over. I prefer the (1) get ID, (2) INSERT workflow from a DX perspective. I will, however, probably want table-per-sequence for sequences in MySQL. It reduces the real-time next-value operation to one query; we can clean up old IDs on cron.

Berdir’s picture

Just wondering: most db_* functions allow you to specify the database with an $options array; see http://api.drupal.org/api/function/db_select/7 for example.

Would that make sense for db_next_id() too? I'm not sure; maybe if you add stuff to another Drupal installation, for whatever reason....

chx’s picture

FileSize
21.3 KB
Failed: Failed to apply patch. View

Well, we had good code for these cases earlier, so I reintroduced the try-UPDATE-then-INSERT approach and verified that, indeed, if another transaction wants to write the same row, it will wait until this transaction commits.

chx’s picture

FileSize
12.88 KB
Failed: Failed to install HEAD. View

Opsie! Patch mixup :)

David Strauss’s picture

One note: this does remove our ability to stagger keys in MySQL circular replication setups. Table-per-sequence would not.

Berdir’s picture

-  // Inserting uid 0 here confuses MySQL -- the next user might be created as
-  // uid 2 which is not what we want. So we insert the first user here, the
-  // anonymous user. uid is 1 here for now, but very soon it will be changed
-  // to 0.
-  db_query("INSERT INTO {users} (name, mail) VALUES('%s', '%s')", '', '');
+  db_query("INSERT INTO {users} (uid, name, mail) VALUES (0, '%s', '%s')", '', '');
   // We need some placeholders here as name and mail are uniques and data is
-  // presumed to be a serialized array. Install will change uid 1 immediately
-  // anyways. So we insert the superuser here, the uid is 2 here for now, but
-  // very soon it will be changed to 1.
-  db_query("INSERT INTO {users} (name, mail, created, status, data) VALUES('%s', '%s', %d, %d, '%s')", 'placeholder-for-uid-1', 'placeholder-for-uid-1', REQUEST_TIME, 1, serialize(array()));
-  // This sets the above two users uid 0 (anonymous). We avoid an explicit 0
-  // otherwise MySQL might insert the next auto_increment value.
-  db_query("UPDATE {users} SET uid = uid - uid WHERE name = '%s'", '');
-  // This sets uid 1 (superuser). We skip uid 2 but that's not a big problem.
-  db_query("UPDATE {users} SET uid = 1 WHERE name = '%s'", 'placeholder-for-uid-1');
+  // presumed to be a serialized array. 
+  db_query("INSERT INTO {users} (uid, name, mail, created, status, data) VALUES (1, '%s', '%s', %d, %d, '%s')", 'placeholder-for-uid-1', 'placeholder-for-uid-1', REQUEST_TIME, 1, serialize(array()));

Can you convert that to DBTNG while you're changing it anyway? Then I can remove it from #491556: DBTNG the rest and these issues will hopefully not conflict with each other.

Status: Needs review » Needs work

The last submitted patch failed testing.

chx’s picture

note that staggering the keys can easily be set up in a settings.php override. that's not a problem, not at all.

David Strauss’s picture

"note that staggering the keys can easily be set up in a settings.php override"

No, it can't. The key staggering happens on the MySQL server, and IP-based failover systems are transparent to Drupal. Drupal wouldn't be able to know when to switch incrementation schemes.

chx’s picture

Huh?? I thought the auto-increment increments can be set up uniformly -- of course, the sequence starting points cannot be, but meh, just set those up manually; it's not like you bring up a cluster with an empty database often...

David Strauss’s picture

@chx I'm not really sure what you're saying, especially by "I thought that auto increment increments can be set up uniform."

chx’s picture

I meant: all tables use the same auto-increment increment, even if it's not 1. So having a centralized setting for that is enough.
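For reference, the server-wide MySQL settings involved in key staggering are auto_increment_increment (the step size) and auto_increment_offset (each server's starting point); because they apply to every auto-increment column uniformly, a shared sequence table staggers the same way as everything else. An illustrative my.cnf fragment for a two-master setup (values are examples, not a recommendation):

```ini
# my.cnf on server 1
[mysqld]
auto_increment_increment = 2
auto_increment_offset    = 1

# my.cnf on server 2 would use the same increment
# with auto_increment_offset = 2
```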

David Strauss’s picture

Just talked with chx. This is far smaller in scope than I thought and is actually a good idea. We do need to prime this for starting at "2" for the sake of the user table, so I'll leave it on "needs work" for now.

(I had thought on initial reading that this would take us back to the D5 sequences method, which is very much not the case.)

David Strauss’s picture

Tagging.

chx’s picture

Status: Needs work » Needs review
Issue tags: -chx-and-david-performance-sprint
FileSize
11 KB
Failed: Failed to install HEAD. View

We do not need to prime it, because we take care of imported users there.

chx’s picture

Tagging back

Status: Needs review » Needs work

The last submitted patch failed testing.

chx’s picture

OK, so the current implementation uses a D5-like approach, but far back in time we had an implementation that used a single sequence for everything. We will revert to that; David Strauss agreed to this. The advantage is that you can use the MySQL auto-increment settings without problems. The only disadvantage is that the IDs will not be continuous, but they are not continuous after deletions anyway, so that's not a real problem.

chx’s picture

Status: Needs work » Needs review
FileSize
10.29 KB
Failed: 13645 passes, 17 fails, 6 exceptions View

Status: Needs review » Needs work

The last submitted patch failed testing.

David Strauss’s picture

INSERT IGNORE can mask problems other than duplicate keys. Use a try/catch block and look for an error code/exception specific to duplicate-key insertion failure. Alternatively, use INSERT ... ON DUPLICATE KEY UPDATE [something with no effect].

Also, the code in the user module still specifies a sequence name.
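The table-backed strategy being debated here can also be sketched: keep a dedicated table whose only meaningful column is the auto-increment key, insert a throwaway row, and read back the generated id; pre-inserting existing_id (tolerating a duplicate) guarantees the next id is larger. A hypothetical simulation using SQLite's AUTOINCREMENT; real MySQL would use LAST_INSERT_ID() and the duplicate-tolerant INSERT David describes rather than SQLite's INSERT OR IGNORE:

```python
import sqlite3

def auto_inc_next_id(conn, existing_id=0):
    """Allocate the next id from a dedicated auto-increment table."""
    if existing_id:
        # Duplicate-tolerant insert: ensures the counter is at least
        # existing_id without failing if that row already exists.
        conn.execute(
            "INSERT OR IGNORE INTO seq (value) VALUES (?)", (existing_id,))
    # Inserting NULL lets the engine pick the next key.
    cur = conn.execute("INSERT INTO seq (value) VALUES (NULL)")
    return cur.lastrowid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seq (value INTEGER PRIMARY KEY AUTOINCREMENT)")
print(auto_inc_next_id(conn))      # 1
print(auto_inc_next_id(conn, 40))  # 41
```

Old rows can be purged periodically without disturbing the counter while the server is running; note, though, the earlier point about what MySQL does with that counter across restarts.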

chx’s picture

Status: Needs work » Needs review
FileSize
12.99 KB
Failed: Failed to install HEAD. View

Well, the actions tests were assuming that aid is 1. That's no longer the case.

chx’s picture

FileSize
13.2 KB
Failed: Failed to install HEAD. View

Rolled in David's request, with a lengthy comment about it.

David Strauss’s picture

Status: Needs review » Reviewed & tested by the community

This looks good. Clean, simple, and does what we need. (And it supports key staggering with multiple MySQL servers.)

David Strauss’s picture

Status: Reviewed & tested by the community » Needs review

Wait, system_update_7040() looks like it might need work.

David Strauss’s picture

Status: Needs review » Needs work

chx’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
13.27 KB
Failed: Failed to install HEAD. View

system_update_7040 needed a little love still.

David Strauss’s picture

Indeed, RTBC.

Crell’s picture

Status: Reviewed & tested by the community » Needs work

The MySQL version of nextId() needs its parameter to have a default value.

Shouldn't the base version of nextId() catch any errors and handle the transaction itself? Nothing in the calling routine is going to know what transaction object to roll back in case there's a problem.

It should be noted in docblocks that these are *not* atomic operations.

I've read the MySQL implementation three times now. I still do not grok what's going on there. The comment tells me that INSERT IGNORE can cause problems. OK, great, I would never have thought to use INSERT IGNORE. How does ODKU help here? WHY are we doing that in the first place? I don't follow.

The $new_id query is doing something quite weird that I don't understand without a comment. Does that even, um, work?

I also read the comment for the delete query several times. I still don't understand what's going on there. Of course we keep the new_id in the table. The auto-inc just set it, didn't it?

Also, all of these long comment blocks would be much easier to read if they have a proper blank line before them.

It also looks like we're merging the keyspace for users, actions, and anything else that uses the sequences system. Eeek! OK, I get that we're not supposed to assume sequential values for a surrogate key, but that just makes me feel wrong inside. Users are a first-class entity like nodes and, now, various other things. Why do they all use auto-inc in their own keyspaces... EXCEPT for users, which now suddenly use a different API entirely to get their keys? That feels like a ticking time bomb that is going to screw someone up sooner or later. I dislike "clever hacks", as they usually need to be replaced later by something not quite so clever.

chx’s picture

Status: Needs work » Needs review
FileSize
13.76 KB
Failed: 3784 passes, 14 fails, 90 exceptions View

Moved the deletion to happen once per page, and the existing_id handling to happen only if necessary.

Status: Needs review » Needs work

The last submitted patch failed testing.

David Strauss’s picture

Status: Needs work » Needs review

Agreed that some of the comments could stand improvement. Also, chx has posted an updated patch in #70 with the optimizations I suggested offline.

It should be noted in docblocks that these are *not* atomic operations.

What do you mean by "atomic" here? Sequences don't follow the same rules as normal data changes. Moreover, this sequence API behaves with the same atomicity rules as auto-increment sequences do. If you BEGIN a transaction in MySQL, INSERT a row into a table with a serial column, and ROLLBACK, you'll notice that the auto-increment on the table *keeps* its increase. (And there are good reasons for this.)

Why do they all use auto-inc in their own keyspaces... EXCEPT for Users which now suddenly use a different API entirely to get its keys?

I have to object to calling it a shared key space. It's not sharing a key space any more than everything else that uses unsigned integers "shares a key space." You're only objecting to the aesthetics of appearing to skip some values in the sequence for each consumer of the sequence.

We make an exception for users because the Users table does even clever-er things with magic 0 and 1 uid values. See the installer for all the hacks we need to support auto-increment on the Users table in MySQL with these magic values.

That feels like a ticking time bomb to me that is going to screw someone up sooner or later. I dislike "clever hacks", as they usually need to be replaced later by something not quite so clever.

The only way to separate the sequences (and preserve support for staggered MySQL sequences) is to use a table for each sequence. This means sequence tables would be created on the fly. *Some* administrators like to lock down DDL events happening in production unless planned. (It's not such a bad idea.) A change to have a table per sequence would require table creation privileges at runtime or, worse, require developers to implement a sequences info hook.

Even more fun is the fact that DDL events kill transactions in MySQL. Getting a sequence value could kill a transaction in progress if we create tables on the fly.

The "clever hack" of skipping keys seems like the lesser evil, especially when it simplifies the code while maintaining correctness.

David Strauss’s picture

Status: Needs review » Needs work

Cross-post with the test bot.

chx’s picture

Status: Needs work » Needs review
FileSize
14.08 KB
Failed: Failed to install HEAD. View

Aaaaaaaand more comments (and a bugfix).

Status: Needs review » Needs work

The last submitted patch failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
14.32 KB
Failed: Failed to install HEAD. View

agentrickard’s picture

Crell asked me to give this a look, since I have the same id = 0 auto-increment problem in Domain Access (D6), having copied the logic from {users} for use there. I also have a module (Domain Prefix) that allows on-the-fly table copying, which can cause issues when dealing with the {users} table. See the patch in #445386: Copying "users" table fails, which solves this.

I thought moving away from {sequences} in D6 was a Good Idea, and #35 aside, I still think it's a good idea.

If I read this right, then modules other than user and actions cannot use this table unless we are ok with _not_ objecting to the aesthetics of appearing to skip some values in the sequence for each consumer of the sequence.

I object to this, both in contrib and in core. It's confusing and counter-intuitive for the developer. (And a seeming regression from the D5 sequences days of key/value pairs.)

That said, I don't have a whole lot of stake here (transactions, multiple unique keys, simply not my call). But this looks like a single-use-case API which breaks expected behavior. As a developer, I expect sequential integrity (no skipped keys) in the IDs for my tables. So if this goes in, I just won't use it for my modules.

I can also see a lot of administrator confusion, when our clients ask us "Why are there users 1, 2, 3, 4 and then nothing until 26? We never deleted any users?" This, I think, could be a big problem.

AFAIK, the autoincrement 0 problem on MySQL only bites us when doing a db copy from one to another. It may be better (and simpler) to run a check for the absence of UID 0 and adjust the table appropriately. (I have some code in Domain Prefix that handles this when copying tables.)

Crell’s picture

@agentrickard: Is "clients who think they know more than they actually do" the only problem in practice? I don't like the non-sequential keys either, but technically neither MySQL nor Postgres guarantees them in the first place.

David Strauss’s picture

I refuse to design around "clients who think they know more than they actually do." We don't make technical decisions based on whether uninformed parties think we've made the correct technical decision.

chx’s picture

If I read this right, then modules other than user and actions cannot use this table unless we are ok with not objecting to the aesthetics of appearing to skip some values in the sequence for each consumer of the sequence.

Seriously, that's not a valid argument at all. I have heard at DC DC that some people develop and roll out sites in a way that leaves holes between development and production. If a client asks, just tell them that the numbers above 1 are meaningless.

agentrickard’s picture

Hey, I'll back off that point if it's a naive one. I'm just saying, "this makes me uncomfortable".

I would argue, David, that we, in fact, _do_ make technical decisions based on "uninformed parties'" comfort all the time. Otherwise, we'd all log in with our uids and be forced to use 16-digit randomized passwords, for instance.

The best argument so far is that autoincrement keys themselves can't be trusted to be sequential (or can be altered). That had not occurred to me. If that's the case, consider my objection withdrawn.

agentrickard’s picture

However, I still might argue that the scope of this patch is far larger than the problem it tries to solve.

Do we have sequence issues in cases other than table copying, where AUTO_INCREMENT won't allow zeros? I see some mention of this in #35, but I don't get the use case. Automated timestamps plus a sequential ID? We fix that now by manually entering the timestamp (which is easier than the patch, isn't it)?

I mean, this is more code than the problem it fixes. Isn't it?

David Strauss’s picture

Otherwise, we'd all login with our uids and force 16-digit randomized passwords, for instance.

No, that's called a usability issue. Skipping UIDs has never and will never affect the way users interact with the system.

chx’s picture

Well, actions has a use case. simpletest used to have a use case where each test got an id and the simpletest_test_id table was not used for anything else; we have since added some data there, so that case is now moot. The queue had a similar use case where every consumer got a unique id. Although we have optimized that out, it can certainly be expected that more "I need a unique id in the db" cases similar to simpletest and queue will appear.

Crell’s picture

Status: Needs review » Reviewed & tested by the community

I am not 100% convinced here that this isn't going to introduce as many problems as it solves. However, chx and David have made a compelling argument so I am not going to block this patch. I also will not fully endorse it, though. Kicking it upstairs for final decision. :-) (Yes, I am going to pass the buck because as DB maintainer I don't have a job description!)

agentrickard’s picture

I'm with Crell. I don't quite get it, but I won't block it, either.

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

Dries’s picture

+++ includes/database/database.inc	2009-10-14 15:36:27 +0000
@@ -1047,6 +1047,41 @@ abstract class DatabaseConnection extend
+   *  Use this function if for some reason you can't use a serial field.

I find this comment to be very limited. In #35, a lot of good reasons were mentioned, and they are all lost in the phpDoc. I'm not suggesting we copy them, but I do think we can do a better job helping people understand why and when to use this.

It might also be useful to mention when not to use the sequences API.

+++ includes/database/database.inc	2009-10-14 15:36:27 +0000
@@ -1047,6 +1047,41 @@ abstract class DatabaseConnection extend
+    $stmt = $this->query('UPDATE {sequences} SET value = GREATEST(value, :existing_id) + 1', array(

I'm not a big fan of abbreviating things to $stmt but it is consistent with what we do elsewhere. Ignore for now.

+++ includes/database/mysql/database.inc	2009-10-14 16:23:36 +0000
@@ -68,6 +68,41 @@ class DatabaseConnection_mysql extends D
+      // To ensure the new id will be larger than the existing, INSERT the
+      // existing id into the table. It is possible that it's in there, so we
+      // want to ignore duplicate key errors. In MySQL, INSERT IGNORE is a
+      // solution, but that can mask problems other than duplicate keys so we
+      // use INSERT ... ON DUPLICATE UPDATE in such a way that the UPDATE does
+      // not do anything.

This comment makes me scratch my head. It should be made more clear.

chx’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
0 bytes
Passed: 14758 passes, 0 fails, 0 exceptions View

chx’s picture

FileSize
14.65 KB
Failed: Failed to install HEAD. View

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
14.65 KB
Failed: Failed to apply patch. View

Bah, the system.install update numbers changed.

cweagans’s picture

Status: Needs review » Reviewed & tested by the community

Good to go.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed.
Thanks!

chx’s picture

Status: Fixed » Reviewed & tested by the community

The commit never happened.

chx’s picture

FileSize
18.57 KB
Failed: Failed to apply patch. View

system_update_7043 conflict fixed.

webchick’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +Needs Documentation

It looks like Dries missed this, so I committed it.

Please document changes in module upgrade guide.

webchick’s picture

Well that'll teach me to commit patches without looking at them. :P

1. This patch added an extra system_update_7043, which means test bot failed 10,000 patches in under a second. ;) Fixed.

2. When I ran the update I got:

Failed: SQLSTATE[42000]: Syntax error or access violation: 1075 Incorrect table definition; there can be only one auto column and it must be defined as a key

webchick’s picture

Issue tags: +Needs tests

...and this also broke SimpleTest's browser-based testing. Batch API spews errors about $max_id = db_select('SELECT MAX(value) FROM {sequences}')->fetchField(); that should be db_query().

So now, we need tests too. :P~

Also, why is only mysql's database.inc changed and not also pgsql and sqlite?

mfb’s picture

Status: Needs work » Needs review
FileSize
1.75 KB
Failed: Failed to apply patch. View

Fix the system_update_7044() errors.

mfb’s picture

FileSize
1.7 KB
Failed: Failed to apply patch. View

Just tested this and noticed it was still broken: db_delete() did not call execute(). Re-rolled to make it quite a bit simpler.

Dave Reid’s picture

Priority: Normal » Critical
Status: Needs review » Reviewed & tested by the community

Tested and seems to work fine. HEAD (update.php) is broken without this.

webchick’s picture

Status: Reviewed & tested by the community » Needs work

Thanks, committed to HEAD.

I'd still like to see test coverage of that query that caused SimpleTest to break.

Josh Waihi’s picture

Status: Needs work » Needs review
FileSize
487 bytes
Passed: 14733 passes, 0 fails, 0 exceptions View

Yeah, the original nextId() function isn't built correctly. I don't know if my patch is the correct solution; I'm not in the loop here. But this helps PostgreSQL get back to passing tests.

Basically, $new_value isn't set, so it throws notices and the function doesn't work.

Damien Tournoud’s picture

The default implementation is broken anyway: we can't, by contract, write a value into a serial column.

For the default implementation, I'm ready to settle with:

do {
  $new_value = $this->query("INSERT INTO {sequences} () VALUES ()", array(), array('return' => Database::RETURN_INSERT_ID));
  $this->query("DELETE FROM {sequences} WHERE value < :value", array(':value' => $new_value));
}
while ($new_value <= $existing);

return $new_value;

It will be very slow if $existing is big, but what can you do about it?
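The proposed loop can be simulated with Python's sqlite3 to show both the catch-up behaviour and why a large $existing hurts: it costs one INSERT round-trip per skipped id. This is a sketch under those assumptions; next_id() is an illustrative name, not Drupal code.

```python
import sqlite3

# AUTOINCREMENT ensures ids keep growing even after rows are deleted.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequences (value INTEGER PRIMARY KEY AUTOINCREMENT)")

def next_id(conn, existing=0):
    # Keep inserting until the auto-generated id passes $existing;
    # the DELETE trims the table back to (roughly) a single row.
    while True:
        new_value = conn.execute(
            "INSERT INTO sequences DEFAULT VALUES").lastrowid
        conn.execute("DELETE FROM sequences WHERE value < ?", (new_value,))
        if new_value > existing:
            return new_value

print(next_id(conn))     # 1
print(next_id(conn, 5))  # 6 -- looped five inserts to get past 5
```

The linear cost of the loop is exactly the "very slow if $existing is big" caveat above.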

For PostgreSQL, I'm considering using sequences, this way:

if ($existing) {
  $this->query("ALTER SEQUENCE seq MINVALUE :value", array(':value' => $existing));
}

return $this->query("SELECT nextval('seq')")->fetchField();
Josh Waihi’s picture

FileSize
1.79 KB
Fetch test file: failed to retrieve [356074#comment-2223588-extended.patch] from project client. View

I've put in the changes as DamZ suggested. However, the PostgreSQL solution failed:

drupal7=# SELECT nextval('sequences_value_seq');
 nextval 
---------
       2
(1 row)

drupal7=# alter SEQUENCE sequences_value_seq MINVALUE 9;
ERROR:  START value (2) cannot be less than MINVALUE (9)

So I've used RESTART WITH instead:

<DamZ> fiasco: you cannot really use RESTART, because sequence numbers can be pre-cached into the backends
<DamZ> fiasco: so there is no guarantee that the number you will RESTART with has not been used yet

It's the way PostgreSQL is designed. If that is also its flaw, PostgreSQL should address it, not us.

Josh Waihi’s picture

FileSize
1.79 KB
Passed: 14715 passes, 0 fails, 0 exceptions View

Resubmitting the patch because it can't be accessed with the '#' in the filename.

chx’s picture

Please start a separate issue for the PostgreSQL one. There is no way you can pull it off (I asked in #postgresql, and listed this in #35 as one of the very reasons this patch exists), and we should not let a bugfix be held up by PostgreSQL.

chx’s picture

Also note that all the existing_id magic is there so the sequence automatically catches up after mass imports; if you loop over that, it will be lots of fun. Or not. In 2006 I helped with a certain site that webchick knows well, which was importing 300,000 users...

Josh Waihi’s picture

we should not let a bugfix be held up by PostgreSQL

Yes, we should. chx, this is not MSSQL or some other DB we don't support; Drupal needs to work on more than just MySQL.

My patch above still stands unless someone has a better idea.

This patch is crucial for PostgreSQL at the moment, as it fixes a bunch of tests. In fact, tests virtually won't run without it.

Crell’s picture

I'd really prefer that follow-on patches happen in new issues if possible, since this was already committed. Not that they don't get addressed, but that we don't play ping-pong in a single thread.

As for the best way to support this API in Postgres, I defer to Damien and Josh as they actually know how Postgres works.

chx’s picture

Josh, then let's concentrate on making the default work; the PostgreSQL-specific implementation is a separate issue. The default should work on pgsql too. If we can do a better one, that's good, but if not, oh well, that's why we have a default.

Josh Waihi’s picture

Status: Needs review » Closed (fixed)

OK, closing this off and continuing it in #633678: Make sequence API work on non-MySQL databases