One of the problems in simple test runs in PostgreSQL is this:

INSERT INTO simpletest348799node_revision (nid, vid, uid, title, body, teaser, log, timestamp, format) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5, :db_insert_placeholder_6, :db_insert_placeholder_7, :db_insert_placeholder_8) - Array ( ) SQLSTATE[55000]: Object not in prerequisite state: 7 ERROR: currval of sequence "simpletest348799node_revision_vid_seq" is not yet defined in this session

This was a db_insert call made from drupal_write_record, which as far as node_revision is concerned, comes from too places:

/**
 * Helper function to save a revision with the uid of the current user.
 *
 * Node is taken by reference, because drupal_write_record() updates the
 * $node with the revision id, and we need to pass that back to the caller.
 */
function _node_save_revision(&$node, $uid, $update = NULL) {
  $temp_uid = $node->uid;
  $node->uid = $uid;
  if (isset($update)) {
    drupal_write_record('node_revision', $node, $update);
  }
  else {
    drupal_write_record('node_revision', $node);
  }
  $node->uid = $temp_uid;
}

From what I can gather, the insert query returns the last insert id, the PDO extension, as explained here: http://nz.php.net/manual/en/pdo.lastinsertid.php#84530 executes "SELECT CURRVAL('seq_name')" as a means to retrive the last insert id. In a PostgreSQL session, the current value of a sequence is not set until nextval() has been called on the sequence in question. Since in this SQL query 'vid' is being explcitly set, the default option 'nextval()' does not execute and so currval has no value, hence the error.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Josh Waihi’s picture

There is a simliar issue with unique key contraints using drupal_write_record with the node table also.

Damien Tournoud’s picture

Title: values for serial columns in INSERT queries should not be explicitly set. » dwr() fails when a NULL value is passed in a serial column (on PostgreSQL and SQLite)

This is due to the change in drupal_write_record() that got in with #303965: Enhanced data import - node_save() and user_save() (see the description in #44).

In a nutshell: callers should not pass NULL values for serial columns in drupal_write_record(). It used to be harmless, because dwr() used to ignore all values passed in serial columns. The change that got in with #303965: Enhanced data import - node_save() and user_save() removes that, and revealed bugs in the calling functions at several places in core (including the field API, and apparently _node_save_revision()!), but *only on PostgreSQL and SQLite*. MySQL considers that setting a NULL value in an autoincrement column is the same as not passing the value.

We will have to fix the calling functions themselves. But, because the behavior is inconsistent across engines, we will probably have to throw an explicit exception in dwr() when a NULL value is set for a serial column.

Josh Waihi’s picture

Issue tags: +PostgreSQL Surge, +PostgreSQL

Thanks Damian, spent a while trying to track down the commit that did this. Tagging Issue.

Josh Waihi’s picture

submitting Damian's patch it improves the test on PostgreSQL. I'm happy with it being committed.

Josh Waihi’s picture

Status: Active » Reviewed & tested by the community

setting to RTBC

yched’s picture

Status: Reviewed & tested by the community » Needs work

Fixing incoming NULL values is one thing.
The bug described in #303965-50: Enhanced data import - node_save() and user_save() (which actually breaks node_save() on SQLite) is a separate one, *not* caused by $node->vid = NULL, but by $node->vid = 0.

The patch fixes it because, although the comment claims 'Ignore NULL values', it actually relies on empty($object->$field).

Either we state that 0 can never be an intended value for a serial, and the fix here unbreaks node_save() - but we need to update the comment so that it is clear we want to catch NULL and 0, and some good soul doesn't revert it back later to 'only check NULL' based on the comment.
Or we accept explicit 0 in serials, just like any other explicit value - then this patch needs to check only for NULL, and node_save() needs its own fixing (unset($node->vid) before _node_save_revision() on node or new revision creation).

Damien Tournoud’s picture

The bug described in #303965-50: Enhanced data import - node_save() and user_save() (which actually breaks node_save() on SQLite) is a separate one, *not* caused by $node->vid = NULL, but by $node->vid = 0.

Good point!

I really believe that we should use the NO_AUTO_VALUE_ON_ZERO SQL mode on MySQL: MySQL standard behavior (assuming 0 in an autoincrement column means "whatever the next value is") is inconsistent with PostgreSQL and SQLite, which makes it harder to write proper cross-platform code when developing on MySQL.

Crell’s picture

I'd love to use NO_AUTO_VALUE_ON_ZERO on MySQL. IME, though, that's a server-level setting that you don't always have access to. A lot of the shared servers I work on won't let me set it, as it's a server-wide property.

Admittedly I've not tried setting it per-connection through Drupal, so it's worth trying; but we should be careful that we don't inadvertently raise the DB permission requirements in the process.

Damien Tournoud’s picture

@Crell: I don't see anything in http://dev.mysql.com/doc/refman/5.0/en/server-sql-mode.html that could imply that NO_AUTO_VALUE_ON_ZERO is any different from TRADITIONAL and ANSI that we already use. It seems to be a per-session parameter.

Crell’s picture

Hm, OK. Let's give it a shot. If you roll a patch I can test it on a server I know has been problematic in the past. If that works, we can commit that. Even if it isn't the ideal fix for this issue it's something we should be doing anyway. :-)

yched’s picture

+1 for making MySQL break on the same stuff that other db backends.
Note that once all dbs break consistently, we'll still need to decide what's the right fix :-) - see 2nd part of comment #6

Josh Waihi’s picture

FileSize
986 bytes
Josh Waihi’s picture

Status: Needs work » Needs review

Here is the patch that implements NO_AUTO_VALUE_ON_ZERO, so this should fail the test, it will give us an idea as to where we need to fix things in HEAD

Status: Needs review » Needs work

The last submitted patch failed testing.

Josh Waihi’s picture

Josh Waihi’s picture

Status: Needs work » Needs review

this will hopefully make mysql fail less, though I'm not expecting 100%

Status: Needs review » Needs work

The last submitted patch failed testing.

Josh Waihi’s picture

Status: Needs work » Needs review
FileSize
3.21 KB

This issue with dwr() leads to more and more problems, such as:

$test_id = db_insert('simpletest_test_id')->useDefaults(array('test_id'))->execute();
// Generates SQL: INSERT INTO simpletest_test_id (test_id) VALUES (default);

MySQL will insert the first value as 0 then after produce this error:ERROR 1062 (23000): Duplicate entry '0' for key 1
However, providing NULL as the the value will still work for MySQL, even though, this is a NOT NULL column.
in PostgreSQL, this statement works fine, the sequence works as expected and using NULL instead fails because it is a NOT NULL column.

Status: Needs review » Needs work

The last submitted patch failed testing.

yched’s picture

   // Generate the node table query and the node_revisions table query.
   if ($node->is_new) {
+    unset($node->nid);
     drupal_write_record('node', $node);

reverts #303965: Enhanced data import - node_save() and user_save()...

Josh Waihi’s picture

Status: Needs work » Active

@yched #303965: Enhanced data import - node_save() and user_save() shouldn't have been commited anyway as it broke HEAD.Ok, so where to from here, we need a solution, an approach, I don't know much about dwr() or the changes that have happened lately. If some one can advise on the approach, I'm happy to write the patch.

Josh Waihi’s picture

Ok, after reading #303965: Enhanced data import - node_save() and user_save() and understanding why it was done. I still can't see a clear solution. The problem with this implementation is this; if you explicitly set a serial columns value, then 1) the sequence should be updated to prevent collisions 2) the PDO driver is still going to want to return the last insert id which will not be set for the session because pgsql uses CURRVAL('my_seq') which has no value untill NEXTVAL('my_seq') is called.

Hmm, maybe we can prevent the latter from happening by detecting if a serial column has a value set and preventing the driver from attempting to retrieve the last insert id.

Josh Waihi’s picture

Status: Active » Needs review

I've got a solution which changes the pgsql driver to allow inserts to return values other than the last_insert_id and alters dwr() so serial columns are removed before an insert query unless they have a value explicitly set in them. Haven't tested in MySQL - will see what the testbot comes up with

Status: Needs review » Needs work

The last submitted patch failed testing.

Josh Waihi’s picture

Status: Needs work » Needs review
FileSize
3.73 KB

forgot to attach patch

chx’s picture

Status: Needs review » Reviewed & tested by the community

node multiple load dies an early death (10 passes, 0 fails, and 1 exception) without the patch and passes (38 passes, 0 fails, and 0 exceptions) on SQLite.

Josh Waihi’s picture

just cleaned up the pgsql insert class a bit, everything still the same.

Josh Waihi’s picture

er, hung over, rolled the wrong stuff.

webchick’s picture

Status: Reviewed & tested by the community » Needs review

I'd like Moshe to give this a once-over. He authored the patch that re-worked drupal_write_record() which ended up causing this issue, and also was the original author of the function.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

Sorry, I have no expertise to know if this is a proper fix or not. The 'enhanced data import' patch went in with tests so i am certain that this patch does not break it, nor does it break node_save() and user_save() in general. So, I personally would be comfortable going ahead with it.

Dries’s picture

Status: Reviewed & tested by the community » Needs work

What once was a fairly simple function, is now becoming increasingly complex. ;-)

+    // If a serial column does exist with no value (i.e. 0) then remove it as
+    // the database will insert the correct value for us.
+    elseif (isset($serial) && array_key_exists($serial, $fields)) {
+      unset($fields[$serial]);
+    }

Why would one pass a serial with value zero? If you do, I think it should be interpreted as "I want to insert a value with ID 0". I don't think silently modifying the parameters of a call is appropriate. If 0 is not a valid value, let's return an error message instead.

+      // If the serial value was set before the insert, then the database
+      // wouldn't have returned the last insert id, so we populate it instead.
+      if (array_key_exists($serial, $fields) && $fields[$serial] > 0) {
+        $object->$serial = $fields[$serial];
+      }
+      else {
+        $object->$serial = $last_insert_id;
+      }

I think this is reversed. If the database returns a serial ID, we should return that one because it is the only one that is guaranteed to be correct. I don't like how this code makes assumptions about the underlying database behavior -- it is prone to mistakes. We could return 'A' while the database actually returned 'B' which data loss and/or data inconsistencies as a result. I don't want my abstraction layer to 'guess' what the database does, I want to get the database to return me what really happened. If we start making assumptions about how the database works, we'll end up adding more and more special cases.

Josh Waihi’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
3.82 KB

@Dries,

+    // If a serial column does exist with no value (i.e. 0) then remove it as
+    // the database will insert the correct value for us.
+    elseif (isset($serial) && array_key_exists($serial, $fields)) {
+      unset($fields[$serial]);
+    }

at this point, if the serial column is present but does not have a value set, it will equal 0 because serial columns get type casted before hand. Therefore the serial column will equal 0 or greater, it can not be NULL.
On top of that, our databases handle 0 values differently for serial columns, PostgreSQL will let you insert 0 as a value, whilst MySQL will auto increment to the next available value, so I don't think we should allow 0 values for serial columns anyhow.

+      // If the serial value was set before the insert, then the database
+      // wouldn't have returned the last insert id, so we populate it instead.
+      if (array_key_exists($serial, $fields) && $fields[$serial] > 0) {
+        $object->$serial = $fields[$serial];
+      }
+      else {
+        $object->$serial = $last_insert_id;
+      }

I've re-written the patch so this is logic is based upon the Database return type.

yched’s picture

Two reasons why a serial column might get to this point with a 0 value :
- as Josh Waihi says, $fields['serial_colmun'] = NULL does get typecasted to 0 a few lines above. As I said in #6 above, I'm not sure it's the most common case - how much code calls d_w_r() with a serial vlaue explicitly set to NULL ? At least, it is *not* the reason node_save is broken on SQLite.
- current node_save workflow, described in #303965-50: Enhanced data import - node_save() and user_save(), calls d_w_r('node_revision') with vid = 0, and expects it will insert [next_vid]

IMO the main question to answer is "is it valid to call d_w_r with a serial set to 0 while you in fact expect the [next_id] value to be inserted ?". If no, node_save workflow should be fixed.

@Dries : "What once was a fairly simple function, is now becoming increasingly complex. ;-)"
Well, if the code in there gets more complex to handle db backends subtleties so that I don't have to care about them, that's fine IMO.

Josh Waihi’s picture

I've changed a few array_key_exists calls to isset calls because they are faster.
It's worth noting that while handling 0 values for serials is something that should be dealt with in the DB layer, that would require loading the entire schema for every insert we did which is a huge performance hit. dwr() already loads the schema which makes it an ideal place to handle this.
With this patch, NULL, 0 and not set will all insert the next serial value which is consistent IMO for all databases.

Josh Waihi’s picture

thanks to chx for pointing out a few optimization improvements

Josh Waihi’s picture

removed $fields[$serial] > 0 because that is stupid in an if statement

chx’s picture

Yeah that's fine now.

Josh Waihi’s picture

Found another issue in the postgresql drive, incorrect constant name.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Committed to HEAD. Thanks!

Yes, it's too bad this function gets more complex, but yched has a good point at #33. It's nice to get this logic encapsulated in one place so it doesn't have to be repeated in every X_save() function. This also helps fix some PostgreSQL failures.

Status: Fixed » Closed (fixed)
Issue tags: -PostgreSQL Surge, -PostgreSQL

Automatically closed -- issue fixed for 2 weeks with no activity.