Sorry about creating a new critical issue, this should be easy to fix :)

This came on in #761976: DatabaseSchema_pgsql::changeField() is broken.

Currently, the upgrade test does this (to create the D6 database structure including content):

db_create_table('some_table', $table_definition_with_serial_column);
db_insert('some_table)
  ->fields(array(
    'serial_column' => 1
  // ....

The problem here is that InsertQuery automatically tries to fetch the last_insert_id after the row was inserted. It does that by using CURRVAL() on the column. The problem here is that if we explicitly set a value to the serial column, we never call NEXTVAL() and because of that, we don't have a "current value". So the currval() call fails.

Proposed fix:
- Catch the exception, check error code (55000) and in case it's that, set the $last_insert_id to NULL.

I also tried to automatically get the value from the passed in $serial_column but that is quite complex and I'm not sure if it's worth it. the value is passed by the caller so we can imho safely assume that he will not use the return value :)

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Berdir’s picture

Status: Active » Needs review

Forgot to set the status.

however, there might be a bigger problem here, so don't RTBC it just yet...

Debugging...

Berdir’s picture

Assigned: Unassigned » Berdir
Berdir’s picture

FileSize
1.61 KB

Follow-up problem:

After inserting a row with a manually set value, trying to insert a row without setting the serial column fails with a unique key constraint error.

A possible fix is to automatically update the sequence when a value is inserted into the serial column.

The attached patch does this. I also just found out about #445214: dwr() fails when a NULL value is passed in a serial column (on PostgreSQL and SQLite) which fixes the same bug inside dwr() but doesn't contain the part with the auto-updating...

Damien Tournoud’s picture

Status: Needs review » Needs work

This looks good to me. Could you just please move the try/catch into an override of PDO::lastInsertId()?

Berdir’s picture

Status: Needs work » Needs review
FileSize
1.96 KB

Ok, updated the patch.

I've created a simple test script, can someone convert this into a test?

db_drop_table('currval_test');

db_create_table('currval_test', array(
  'fields' => array(
    'serial_field' => array(
      'type' => 'serial',
      'not null' => TRUE,
    ),
    'normal_field' => array(
      'type' => 'varchar',
      'length' => 64,
      'not null' => TRUE,
      'default' => '',
    ),
  ),
  'primary key' => array(
    'serial_field',
  ),
));

//db_query("SELECT NEXTVAL('{blocks}_bid_seq')");
$last_insert_id = db_insert('currval_test')->fields(array(
    'serial_field',
    'normal_field',
  ))
  ->values(array(
    'serial_field' => '1',
    'normal_field' => 'user',
  ))
  ->execute();

$last_insert_id = db_insert('currval_test')
  ->fields(array('normal_field' => 'bla'))
  ->execute();

Should be simple, we only need to verify that this doesn't fail with an exception and that $last_insert_id is > 1 (Not == 2!)

Berdir’s picture

Assigned: Berdir » Unassigned
Issue tags: +Needs tests
jhedstrom’s picture

Assigned: Unassigned » jhedstrom

I'll write the tests.

bjaspan’s picture

Assigned: jhedstrom » Unassigned

re #4: I think I disagree that the try/catch should be in lastInsertId(). If you call lastInsertId() in a situation in which there is no valid last insert id, shouldn't it return an exception? The problem here is that InsertQuery::query() is calling lastInsertId() in just such a situation. InsertQuery::query() could be smart enough to detect that there is no last insert id and not try to return it, or if it does not want to deal with that alternatively it can catch the exception when it occurs and handle it. If anyone else calls lastInsertId() under the same conditions, they should have to make the same choice.

bjaspan’s picture

@jhedstrom: I did not mean to unassign the issue from you, it was a cross-post, but I cannot re-assign you, so you'll have to.

jhedstrom’s picture

Assigned: Unassigned » jhedstrom
Damien Tournoud’s picture

re #4: I think I disagree that the try/catch should be in lastInsertId(). If you call lastInsertId() in a situation in which there is no valid last insert id, shouldn't it return an exception? The problem here is that InsertQuery::query() is calling lastInsertId() in just such a situation. InsertQuery::query() could be smart enough to detect that there is no last insert id and not try to return it, or if it does not want to deal with that alternatively it can catch the exception when it occurs and handle it. If anyone else calls lastInsertId() under the same conditions, they should have to make the same choice.

No, typically lastInsertId() doesn't raise an exception. It does on PostgreSQL but not on the other database engines. This is something we want to solve directly there. Plus, we are only catching a very specific exception.

jhedstrom’s picture

Assigned: jhedstrom » Unassigned
Status: Needs review » Needs work
FileSize
2.66 KB

Here's a test. It currently fails because when a manual serial is inserted after nextval has been called, lastInsertId returns the previous auto-incremented ID instead of the new one (in the testcase, 5 is returned instead of the expected 7).

Berdir’s picture

Status: Needs work » Needs review
FileSize
3.6 KB

Ok, new patch:

- Merge and extend the tests from #12 (first insert the id 10, then 7, make sure $last_insert_id returns the correct value in both cases)
- Updated pgsql driver to manually get and return the value if passed in
- Handle racing conditions by setting the sequence to the bigger value of nextval() and max($serial_field)

Test passes on MySQL and PostgreSQL, someone should verify this on SQLite...

bjaspan’s picture

+++ includes/database/pgsql/database.inc
@@ -188,6 +188,30 @@ class DatabaseConnection_pgsql extends DatabaseConnection {
+        // This fail if a value was inserted into the serial column and

Nit: "fails"

+++ includes/database/pgsql/query.inc
@@ -69,6 +69,26 @@ class InsertQuery_pgsql extends InsertQuery {
+    if (!empty($table_information->serial_fields)) {
+      ¶
+      // If the first serial field is in the insert fields, update the

Nit: whitespace.

+++ includes/database/pgsql/query.inc
@@ -69,6 +69,26 @@ class InsertQuery_pgsql extends InsertQuery {
+      if (in_array($table_information->serial_fields[0], $this->insertFields)) {
+        $last_insert_id = $insert_values[array_search($table_information->serial_fields[0], $this->insertFields)];

This does an array search twice. Because FALSE is not a valid value for a serial column, we can replace this with one call to array_search, then test !== FALSE.

+++ includes/database/pgsql/query.inc
@@ -69,6 +69,26 @@ class InsertQuery_pgsql extends InsertQuery {
+      foreach ($table_information->serial_fields as $index => $serial_field) {

So, ummmm... I wanted to see if we could avoid the in_array on every element of serial_fields, because that is O(N^2) (though granted we almost never have more than one serial field, in fact I'm not sure even that is legal).

However, I cannot find anywhere in HEAD that uses or defines serial_fields at all. Does this property even exist?

+++ modules/simpletest/tests/database_test.test
@@ -561,6 +561,27 @@ class DatabaseInsertTestCase extends DatabaseTestCase {
+    // Test a manual insert.

It seems like we should have a test for inserting an explicit value, then verifying that an auto-increment value does not cause a unique key constraint.

Powered by Dreditor.

Berdir’s picture

Assigned: Unassigned » Berdir
FileSize
4.37 KB

- Fixed the typo and whitespace
- Switched the code to a single array_search call
- Added a missing part that adds the $serial_fields information

marvil07’s picture

I am updating the patch as the styling suggestions, but this still needs review about the other logic suggestions.

BTW I tried the relevant tests on postgres and sqlite3 and they work fine in both databases types.

marvil07’s picture

for the record, pay attention to the patch in #15, since my patch is a subset of that :-p

Berdir’s picture

Assigned: Unassigned » Berdir
FileSize
4.73 KB

Found the problem for the failing test. $key needs to be checked type safe against FALSE, if ($key) doesn't work...

This passes on postgresql.

Berdir’s picture

FileSize
4.7 KB

And now without debug calls...

puregin’s picture

Ran the database tests on the last patch with sqlite; everything passed except DatabaseTransactionTestCase->testCommittedTransaction() and DatabaseTransactionTestCase->testTransactionRollBackSupported() which give General error: 5 database is locked.

bjaspan’s picture

Status: Needs review » Reviewed & tested by the community

The patch in #19 looks good to me, and the new tests pass on mysql and (bedir asserts) pgsql, thus fixing the pgsql issue. The new tests apparently reveal an existing bug in the sqlite driver. I believe our policy is not to hold up a commit due to problems on non-mysql drivers. Thus, RTBC.

Damien Tournoud’s picture

Status: Reviewed & tested by the community » Needs review

SQLite is not affected by this patch. #20 is probably due to an old version of SQLite being used.

I'm not confortable with this part of the patch:

+          $new_value = $this->connection->query("SELECT NEXTVAL('" . $table_information->sequences[$index] . "') AS nextval, MAX(" . $serial_field . ") AS max FROM {" . $this->table . "}")->fetchObject();
+          // Set the current sequence value to the bigger of those two values.
+          $curval = $new_value->nextval > $new_value->max ? $new_value->nextval : $new_value->max;
+          $this->connection->query("SELECT SETVAL('" . $table_information->sequences[$index] . "', :curval)", array(':curval' => $curval));

... as this is prone to race conditions. At the minimum we should use only one query here, but we should study if an atomic solution is possible.

puregin’s picture

I reran the database transaction tests with sqlite 3.6.10, and am getting the same errors. I will try to update to 3.7.x and repeat

puregin’s picture

Same errors with sqlite 3.7.0.1

bjaspan’s picture

I found that "SELECT SETVAL('foo', (SELECT 30))" sets the counter to 30. So now I just need to find the syntax for selecting the max of (curval and max of column values) in a single expression... but at the moment I cannot access my pgsql server.

Damien Tournoud’s picture

We discussed that with @berdir today, and the only way we can think of to make this purely atomic is to implement a while loop:

if (in_array($serial_field, $this->insertFields)) {
  // Load the next sequence value and the max value of the table.
  while ($this->connection->query("SELECT curval('" . $table_information->sequences[$index] . "') > (SELECT MAX(" . $table_information->sequences[$index] . ") FROM {" . $this->table . "})")->fetchField()) {
    $this->connection->query("SELECT nextval('" . $table_information->sequences[$index] . "')");
  }
}

It will be slow and clunky, but it's the only way I can see to avoid race-conditions. Most of the cases, it should not be a concern because this feature (inserting values into a serial field) is only used during imports, and most of the time the loop should only be used once.

bjaspan’s picture

okay

chx’s picture

Wasn't this the problem with db_next_id? can't we reuse that code?

Berdir’s picture

After thinking about it, I think there is a very simple, possible race condition:

If an INSERT query tries to insert a new row without a value set between another INSERT added a row with a manually set serial value which equals nextval(), it explodes :)

@chx: That might work, but I'm not sure how to do it exactly. Basically, we'd have to block other INSERTS while inserting rows which have a serial column set. Maybe we could simply lock the whole table for writes and unlock it after we have updated the sequence? Would stop other INSERTS *before* they call nextval()?

bjaspan’s picture

What if we always called setval('seq', 1, false) after creating a table with a sequence? This ensures that currval() and nextval() will work, and the "false" argument means that the next call to nextval() will return 1 instead of 2 so we won't change expected behavior.

We could do a similar trick during update on all such tables that have not yet been inserted into.

bjaspan’s picture

My suggestion in #30 would avoid the initial problem, generating an exception when calling db_insert() with an explicit value for a serial column that has never been auto-inserted into, but does nothing to help the followup problem of the key collision when the sequence catches up to explicitly inserted value.

chx just approached me in the coder lounger to ask why db_next_id() doesn't solve the problem. berdir is concerned about concurrency issues, but the implementation of nextId() for pgsql seems to address that by getting an advisory lock on the sequence and incrementing it until it exceeds a specified value. When we insert an explicit value, we could call db_next_id($explicit_value) to make sure that particular explicit value is exceeded by the sequence.

Oh, now I see berdir's concern. While nextId() is iterating, unrelated inserts can insert their own value at whatever the then-currval of the sequence is, which might happen to be exactly the explicit value we just inserted. This can happen because insert queries do not get the advisory lock that nextId() does. So, can we resolve this by having insert queries with explicit values for serial columns get the same advisory lock that nextId() does?

Here's another option: When we are about to insert an explicit value, increment the sequence *before* the insert until it exceeds the value we are inserting. Then do the insert. If two different inserts are running concurrently, presumably for different values (if they are trying to insert the same explicit value one of them would fail in any case), the sequence will first be incremented past the first one (which will then insert) and then past the second one (which will then insert), and they will both be happy.

tpfeiffer’s picture

Just discussed with berdir the following approach: *Before* the INSERT (say with id 42), do the following:

SELECT setval('bla_id_seq', GREATEST(MAX(id), 42)) from bla;
INSERT INTO bla(id, ...) VALUES(42, ...);

This will

  • make the INSERT itself conflict only if there is already a row with id 42 (which must not happen anyway)
  • any INSERT coming after the setval() start with id>=43
  • .

Patch follows.

Berdir’s picture

FileSize
2.67 KB

Patch for the suggestion in #32.

I also have a working implementation of the suggestion in #31 in case there is a reason that this doesn't work.

bjaspan’s picture

You and bedir appear to have hit on the same idea I suggested in #31 in the last paragraph, "Here's another option...". So obviously I like the approach. :-)

+++ includes/database/pgsql/query.inc
@@ -43,6 +43,24 @@ class InsertQuery_pgsql extends InsertQuery {
+            // Force $last_insert_id to the specified value. This is only done
+            // if $index is 0.
+            if ($index == 0) {
+              $last_insert_id = $serial_value;
+            }

This confused me at first b/c I thought index 0 referred to the serial column's position in the table, not its position in the $table_information->serial_fields array. I suggest the comment should say: "This is only done for the first serial column in the table, because that is the column whose newly inserted value is returned as the last insert value."

Assuming that statement is true, of course...

+++ includes/database/pgsql/query.inc
@@ -43,6 +43,24 @@ class InsertQuery_pgsql extends InsertQuery {
+            $this->connection->query("SELECT setval('" . $table_information->sequences[$index] . "', GREATEST(MAX(" . $serial_field . "), :serial_value)) FROM {" . $this->table . "}", array(':serial_value' => (int)$serial_value));

See my suggestion in #30. If we use the optional third arg false to setval(), the next call to nextval() will return the value we are setting here.

Actually, now that you've gone and read #30, what we actually need to do is document for future that we CANNOT use that approach! I just realized it reintroduces a race condition. If we did that here, and another insert into the table occurred before the actual insert query lower in this function, the value we set here would be returned in the other query and the other query's value would be returned here. Wouldn't that be a fun bug to debug?

Let's document why we must do it the way this patch does it, and not using the third arg to setval().

I expect this will be RTBC when the comments are improved. Actually, I'll go improve them now and re-post.

Powered by Dreditor.

bjaspan’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
4.72 KB

New patch. The only changes are comments, so I'm marking RTBC based on the code in the previous patch.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, pgsql-serial-insert-890090-35.patch, failed testing.

Berdir’s picture

Fatal error in trigger other actions?

Lets try again...

Berdir’s picture

Status: Needs work » Needs review
Berdir’s picture

Status: Needs review » Reviewed & tested by the community

Was RTBC before.

Damien Tournoud’s picture

Status: Reviewed & tested by the community » Needs review

Same issue.

setval(sequence, GREATEST(MAX(field), value))

Is not concurrency safe. setval() never is, as far as I know. I believe that #26 is the only way forward.

Status: Needs review » Needs work

The last submitted patch, pgsql-serial-insert-890090-35.patch, failed testing.

Berdir’s picture

Status: Needs work » Needs review
FileSize
3.13 KB

Discussed with DamZ and we came to the conclusion that using explicit values for a serial column is prone for concurrency issues anyway so we simply document that it is not safe.

Berdir’s picture

Status: Needs review » Reviewed & tested by the community

This was RTBC like three times already.. :)

Regarding SQLite, I can confirm two things:
- I get two Database Locked errors in the Transaction tests also, with sqlite 3.7.0
- These errors have nothing to do with this issue.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Committed to HEAD.

Status: Fixed » Closed (fixed)
Issue tags: -Needs tests

Automatically closed -- issue fixed for 2 weeks with no activity.