When trying to install module via authorize.php, the following error occured to me, along with its code line and the data in question:

Notice: unserialize(): Error at offset 221 of 225 bytes in .../includes/batch.queue.inc on line 27

$item->data = unserialize($item->data);

a:2:{i:0;s:35:"update_authorize_batch_copy_project";i:1;a:4:{i:0;s:5:"devel";i:1;s:13:"ModuleUpdater";i:2;s:75:".../sites/default/private/temp/update-extraction/devel";i:3;O:15:"FileTransferSSH":5:{s:11:"

So the serialized object is truncated somehow. It turns out that serializing private and protected class members involves including null character:

Note: Object's private members have the class name prepended to the member name; protected members have a '*' prepended to the member name. These prepended values have null bytes on either side. (PHP: serialize).

And here is the relevant object and its serialization where some strange character is replaced with %:

array(2) {
  [0]=>
  string(35) "update_authorize_batch_copy_project"
  [1]=>
  array(4) {
    [0]=>
    string(5) "devel"
    [1]=>
    string(13) "ModuleUpdater"
    [2]=>
    string(75) ".../sites/default/private/temp/update-extraction/devel"
    [3]=>
    object(FileTransferSSH)#17 (5) {
      ["username:protected"]=>
      string(3) "ogi"
      ["password:protected"]=>
      string(7) "password"
      ["hostname:protected"]=>
      string(9) "localhost"
      ["port:protected"]=>
      int(22)
      ["jail"]=>
      string(24) "..."
    }
  }
}
object(InsertQuery_pgsql)#30 (9) {
  ["table:protected"]=>
  string(5) "queue"
  ["delay:protected"]=>
  NULL
  ["insertFields:protected"]=>
  array(3) {
    [0]=>
    string(4) "name"
    [1]=>
    string(4) "data"
    [2]=>
    string(7) "created"
  }
  ["defaultFields:protected"]=>
  array(0) {
  }
  ["insertValues:protected"]=>
  array(1) {
    [0]=>
    array(3) {
      [0]=>
      string(17) "drupal_batch:13:0"
      [1]=>
      string(381) "a:2:{i:0;s:35:"update_authorize_batch_copy_project";i:1;a:4:{i:0;s:5:"devel";i:1;s:13:"ModuleUpdater";i:2;s:75:".../sites/default/private/temp/update-extraction/devel";i:3;O:15:"FileTransferSSH":5:{s:11:"%*%username";s:3:"ogi";s:11:"%*%password";s:7:"password";s:11:"%*%hostname";s:9:"localhost";s:7:"%*%port";i:22;s:4:"jail";s:24:"...";}}}"
      [2]=>
      int(1264011582)
    }
  }

It's PHP 5.2.6 and PostgreSQL 8.3.9 on Debian 5.0 Lenny. I guess it has something to do with PostgreSQL and I decided it's worth to raise severity of this bug to critical.

CommentFileSizeAuthor
#114 690746-fix-cache-tables-array.patch1.53 KBStevel
PASSED: [[SimpleTest]]: [MySQL] 21,251 pass(es). View
#111 690746-follow-up.patch738 bytesDamien Tournoud
PASSED: [[SimpleTest]]: [MySQL] 21,186 pass(es). View
#105 690746.patch16.74 KBDamien Tournoud
PASSED: [[SimpleTest]]: [MySQL] 20,999 pass(es). View
#101 serialize.patch24.68 KBjbrown
PASSED: [[SimpleTest]]: [MySQL] 20,930 pass(es). View
#97 serialize.patch22.41 KBjbrown
PASSED: [[SimpleTest]]: [MySQL] 20,921 pass(es). View
#96 serialize.patch22.76 KBjbrown
PASSED: [[SimpleTest]]: [MySQL] 20,929 pass(es). View
#92 serialize.patch22.77 KBjbrown
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in system.install. View
#89 serialize.patch22.25 KBjbrown
PASSED: [[SimpleTest]]: [MySQL] 20,922 pass(es). View
#77 serialize.patch17.03 KBjbrown
PASSED: [[SimpleTest]]: [MySQL] 20,428 pass(es). View
#75 serialize.patch17.83 KBjbrown
PASSED: [[SimpleTest]]: [MySQL] 20,420 pass(es). View
#72 690746-serialize.patch21.05 KBCrell
PASSED: [[SimpleTest]]: [MySQL] 20,430 pass(es). View
#70 690746-serialize.patch21.05 KBCrell
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in system.install. View
#68 690746-serialize.patch21.05 KBCrell
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in block.install. View
#62 serialize.patch23.3 KBjbrown
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in system.install. View
#59 serialize.patch20.27 KBjbrown
PASSED: [[SimpleTest]]: [MySQL] 20,050 pass(es). View
#55 serialize.patch19.73 KBjbrown
PASSED: [[SimpleTest]]: [MySQL] 20,065 pass(es). View
#53 serialize.patch19.75 KBjbrown
PASSED: [[SimpleTest]]: [MySQL] 20,048 pass(es). View
#47 serialize.patch19.12 KBjbrown
FAILED: [[SimpleTest]]: [MySQL] 20,062 pass(es), 2 fail(s), and 2 exception(es). View
#40 serialize2.patch8.05 KBMichaelCole
PASSED: [[SimpleTest]]: [MySQL] 19,998 pass(es). View
#35 serialize.patch7.16 KBMichaelCole
PASSED: [[SimpleTest]]: [MySQL] 19,950 pass(es). View
#26 d7cvs-queue-blob.diff982 bytesogi
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in system.install. View
#21 d7cvs-queue-blob-2.diff1.28 KBogi
PASSED: [[SimpleTest]]: [MySQL] 17,896 pass(es). View
#7 d7cvs-queue-blob.diff697 bytesogi
PASSED: [[SimpleTest]]: [MySQL] 17,547 pass(es). View
#6 d7a1-queue-blob.diff429 bytesogi
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch d7a1-queue-blob.diff. View
#4 queue-blob.diff420 bytesogi
FAILED: [[SimpleTest]]: [MySQL] Invalid patch format in queue-blob.diff. View
Members fund testing for the Drupal project. Drupal Association Learn more

Comments

ogi’s picture

Confirmed that the string got truncated in the PHP PDO layer. This is the PostgreSQL log:

2010-02-03 18:43:43 EET LOG:  execute pdo_pgsql_stmt_09c0a1ac: INSERT INTO queue
 (name, data, created) VALUES ($1, $2, $3)
2010-02-03 18:43:43 EET DETAIL:  parameters: $1 = 'drupal_batch:34:0', $2 = 'a:2
:{i:0;s:35:"update_authorize_batch_copy_project";i:1;a:4:{i:0;s:4:"atom";i:1;s:1
3:"ModuleUpdater";i:2;s:74:".../sites/default/private/temp/
update-extraction/atom";i:3;O:15:"FileTransferSSH":5:{s:11:"', $3 = '1265215423'
ogi’s picture

I checked the source of both PDO pgsql and PDO mysql.

PDO pgsql calls PQescapeStringConn for escaping strings but

...The parameter from points to the first character of the string that is to be escaped, and the length parameter gives the number of bytes in this string. A terminating zero byte is not required, and should not be counted in length. (If a terminating zero byte is found before length bytes are processed, PQescapeStringConn stops at the zero; the behavior is thus rather like strncpy.)...

so any zero truncates string.

In contrast, PDO mysql calls mysql_real_escape_string for escaping strings and it does encode the NUL character:

...Characters encoded are NUL (ASCII 0), “\n”, “\r”, “\”, “'”, “"”, and Control-Z (see Section 8.1, “Literal Values”)...

For completeness, I checked PDO sqlite and it's like PostgreSQL: sqlite3_snprintf is called and NUL characters are stop signs:

...The %q option works like %s in that it substitutes a null-terminated string from the argument list. But %q also doubles every '\'' character. %q is designed for use inside a string literal. By doubling each '\'' character it escapes that character and allows it to be inserted into the string...

So where should NUL character be escaped (PHP PDO or Drupal) and do all DB allow storing such character in text data type? I don't think PHP developers will accept using logic that is different than native DB API, so this leaves fixing it to Drupal. PostgreSQL should be able to store NUL character but I'm skeptical if SQLite allows such things.

ogi’s picture

Thinking a bit about the problem, a nice solution is to invent drupal_serialize and drupal_unserialize which encode/decode the NUL character to something like '\0'. These functions make sense only when serialized object is stored in some storage (DB or file).

Another possibility is to use BLOB instead of text data type for cases when serialized objects are stored. I don't know if this solves the problem though.

ogi’s picture

FileSize
420 bytes
FAILED: [[SimpleTest]]: [MySQL] Invalid patch format in queue-blob.diff. View

I've just changed data field of queue table to be bytea (PostgreSQL's blob) and it works :-) The attached patch is tested with a new Drupal+PostgreSQL installation. Do I need to add schema update to the patch?

ogi’s picture

Status: Active » Needs review
ogi’s picture

FileSize
429 bytes
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch d7a1-queue-blob.diff. View

Second try for the patch.

ogi’s picture

Version: 7.0-alpha1 » 7.x-dev
FileSize
697 bytes
PASSED: [[SimpleTest]]: [MySQL] 17,547 pass(es). View

Third try, against HEAD.

Damien Tournoud’s picture

Component: postgresql database » database system
Status: Needs review » Needs work

This is unfortunate. If we have to do this, let's change all the TEXT columns that are used to store serialized data into BLOB, and promote that to a rule.

Promoting to the database layer, we need to fix that for every database engine.

ogi’s picture

Changing to BLOB is required only if serialized data contain objects with protected or private members. If it's just nested arrays or objects with only public members, TEXT is OK.

Crell’s picture

BLOB fields make any sort of searching or indexing or treatment of that field impossible. It also complicates field handling on some databases. I really don't want to go down that route if we can avoid it.

How many "serialized" fields do we have? I know of cache, session, and bits of the Field system.

Oh yeah, and PHP, you suck. PostgreSQL, you suck, too.

Damien Tournoud’s picture

@Ognyan Kulev: we need to promote clear and simple rules.

"All serialized data should be stored in BLOB fields" is clear and simple.

"All serialized data that might have zero values in them should be stored in BLOB fields, otherwise TEXT is ok" is unclear and confusing.

ogi’s picture

@Crell: refusing to store NUL "character" in TEXT column is sensible and I welcome it. And it's natural for such kind of fields (serialized data structures) not to be indexable or processable other than set/get.

@Damien Tournoud: "All serialized data should be stored in BLOB fields" sounds OK to me. I'm willing to make a patch but I just need to known the policy about this schema change: a) should there be e.g. system_update_70xx-like functions and b) should it preserve old contents because my PostgreSQL didn't allow me to ALTER ... ALTER COLUMN but I had to ALTER ... DROP; ALTER ... ADD. I guess the answer to both of these is Yes. Another thing that is unknown to me is the comment "@} End of "defgroup updates-6.x-to-7.x" * The next series of updates should start at 8000." So this update function should be in updates-6.x-to-7.x, isn't it?

mainpc’s picture

I agree. If it can contain NULL or is just an array of bytes (i.e. raw serialization), then it is binary and should be treated as such by everything. If it is text (and has a specific character encoding), then it should never contain NULL (at least, without escaping). If we need to use text functions against these fields, then we should not be using serialization and should be normalizing the data.

Josh Waihi’s picture

Blob fields for serialized data makes sense, its serialized therefore there is no need to index or search on such fields. Also for schema updates, dropping cache column to re-create them won't matter since we can rebuild those easily enough.

@Ognyan Kulev because this is a change to Drupal 7, you need to write an update function called system_update_70XX in system.install that alters the columns your changing to blob.

Nice work BTW :)

Crell’s picture

There are other serialized fields besides cache that we cannot drop, like Field API configuration and sometimes even the variable table.

Josh Waihi’s picture

serialized columns will have a key from which they index on, we can build an array in PHP from which we can store the data until it can be inserted back into the new blob field. Unless we can do a manual CAST conversion in the db?

Crell’s picture

That could work, possibly, but I'm concerned about the memory usage of that approach. It would have to be in 3 separate update hooks: Add new column, port data over one record at a time (batched), and delete old column.

ogi’s picture

Title: text column type in PostgreSQL doesn't reliably hold serialized object » text column type doesn't reliably hold serialized objects in non-MySQL RDBMS

With single UPDATE ... SET column_blob=column::bytea statement I managed to copy all data from one text column to another bytea column in PostgreSQL 8.3. Such statements surely exist in MySQL and SQLite but the problem is that they will be SQL driver specific and system_update_* would have hard-coded (supposedly) 3 different SQL-specific statements. Is this acceptable?

Crell’s picture

I'd prefer not to if we can avoid it, but maybe.

For MySQL we can probably just do a straight read since BLOBs are almost TEXT and require no special processing. I've no idea about SQLite, though, other than making schema changes to SQLite tables is a PITA.

Although... No one is running Drupal 6 on SQLite, so do we even need that update hook?

Josh Waihi’s picture

Well, technically - no. We currently not supporting any version of Drupal that runs on SQLite until Drupal 7.0 is released. Considering the nature, if need be - a per database solution will be fine.

ogi’s picture

Status: Needs work » Needs review
FileSize
1.28 KB
PASSED: [[SimpleTest]]: [MySQL] 17,896 pass(es). View

I've made a patch and tested it (install without patch, look {queue}, patching, update.php, check if schema is changed and data is still in {queue}) with the following DBs: MySQL 5.0.51a, PostgreSQL 8.3.9, SQLite 3.5.9 (Debian Lenny).

It's hard for me to judge if {queue}.data is the only column that needs changing. Only columns that may hold serialized objects need to be changed. Can anybody help with that?

Crell’s picture

The queue table doesn't exist in D6, so it wouldn't need to be changed anyway. :-)

In core, there's the cache tables (which don't need data retained as we flush them anyway), sessions (same), and variable. There's also, and this one is ugly, most of the configuration for fields. That has to migrate from CCK's tables in D6 to core's tables in D7. I am not sure what else is happening there, though, so those may be covered elsewhere already by other upgrade processes.

The tables themselves would need to be updated in the schema either way, even if we don't preserve data.

ogi’s picture

Status: Needs review » Needs work

There's ALTER TABLE name ALTER [ COLUMN ] column [ SET DATA ] TYPE type [ USING expression ] statement in PostgreSQL. I'll make a new patch soon and submit it after testing, including with upgrade from D6.

Damien Tournoud’s picture

As far as I know, we have everything in place for:

  $data_column_schema = array('description' => 'A collection of data to cache.', 'type' => 'blob', 'not null' => FALSE, 'size' => 'big');
  db_change_field('queue', 'data', 'data', $data_column_schema);

To work on all databases.

Which problem have you bumped into with that, Ognyan?

andypost’s picture

Subscribe.

ogi’s picture

FileSize
982 bytes
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in system.install. View

I now see that there is functionality in pgsql/schema.inc for casting on changing fields but unfortunately it gives SQL error - I wonder if it has ever worked because there's no such syntax in PostgreSQL's ALTER TABLE. The error message is SQLSTATE[42601]: Syntax error: 7 ERROR: syntax error at or near "data" LINE 1: ALTER TABLE queue ALTER data SET data = CAST(data_old as byt... ^ and the relevant code is:

    $this->connection->query("ALTER TABLE {" . $table . "} ALTER $field SET $field_new = CAST(" . $field . "_old as " . $typecast . ")");

    $this->addField($table, "$field_new", $spec);

    $this->dropField($table, $field . '_old');

    // Rename the column if necessary.
    if ($field != $field_new) {
      $this->connection->query('ALTER TABLE {' . $table . '} RENAME "' . $field . '" TO "' . $field_new . '_old"');
    }

Anyway, I'm attaching the simplified patch.

Josh Waihi’s picture

I'm wondering if its a reserved word issue? http://www.postgresql.org/docs/7.3/static/sql-keywords-appendix.html the table says its not but "data" is a reserved word in the SQL standard. That PHP code doesn't make sense to me. Shouldn't it be more like:

$this->addField($table, "$field_new", $spec);
$this->connection->query("UPDATE {" . $table . "} SET $field_new = CAST(" . $field . "_old as $type_cast)");
// set not null here if the case.
$this->dropField($table, $field . '_old');

or maybe I'm just seeing it out of context

andypost’s picture

It's a casting trouble. A working solution I found http://bytes.com/topic/postgresql/answers/174532-cast-text-bytea

drupal=# alter table queue add column data2 text;
ALTER TABLE
drupal=# update queue set data2=data
UPDATE 1
drupal=# alter table queue alter data2 type bytea using decode(replace(data2, '\\', '\\\\'), 'escape');
ALTER TABLE
ogi’s picture

@andypost In my experience and as I wrote above, UPDATE forum_gtree SET gid2=gid::bytea; works fine in 8.3. The thread you point is from 2005 and perhaps things were different then.

andypost’s picture

@Ognyan Kulev anyway we should choose a method for altering

1) add new column, copy data, drop old column
2) alter table queue alter data type bytea using data::bytea;

Both this changes require some decoding table to generate type-transformation

andypost’s picture

Issue tags: +Needs tests

Tagging, tests for serialization also needed at #718636: Cache does not unserialize serialized data in certain conditions which already fixed

jbrown’s picture

Title: text column type doesn't reliably hold serialized objects in non-MySQL RDBMS » Text column type doesn't reliably hold serialized variables

This problem can also be triggered in MySQL like this:

drupal_set_message(hash_file('md5', 'CHANGELOG.txt', TRUE));
drupal_exit();

resulting in

PDOException: SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\x90O\xA2\x12\xD8(...' for column 'session' at row 1: INSERT INTO {sessions} (uid, cache, hostname, session, timestamp, sid) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5) ON DUPLICATE KEY UPDATE uid=VALUES(uid), cache=VALUES(cache), hostname=VALUES(hostname), session=VALUES(session), timestamp=VALUES(timestamp); Array ( [:db_insert_placeholder_0] => 1 [:db_insert_placeholder_1] => 0 [:db_insert_placeholder_2] => 127.0.0.1 [:db_insert_placeholder_3] => batches|a:1:{i:1;b:1;}test|b:1;messages|a:1:{s:6:"status";a:1:{i:0;s:16:"�O��(�@���I�\";}} [:db_insert_placeholder_4] => 1269578171 [:db_insert_placeholder_5] => 0ae09c3f5feb76b49cfe9b5d133f172e ) in _drupal_session_write() (line 169 of /home/jonny/sites/drupal-7.x-dev/includes/session.inc).

This definitely needs to be fixed before D7 is released so we are not making major schema changes after.

I would be very interested to see the performance implications of a drupal_serialize() that wrapped gzdeflate() around serialize().

MichaelCole’s picture

Writing test today.

MichaelCole’s picture

Additional crash cases:

// Crash cases:
// 1)
// drupal_set_message("This freaky data is not a crash: " . hash('md5', 'lorem ipsum', TRUE));
// drupal_exit();

// 2)
// variable_set("dontcrash", "This freaky data is not a crash: " . hash('md5', 'lorem ipsum', TRUE));

// 3)
$name = "dontcrash";
$value = "This freaky data is not a crash: " . hash('md5', 'lorem ipsum', TRUE);
db_merge('variable')->key(array('name' => $name))->fields(array('value' => serialize($value)))->execute();

MichaelCole’s picture

Status: Needs work » Needs review
FileSize
7.16 KB
PASSED: [[SimpleTest]]: [MySQL] 19,950 pass(es). View

(New patch. Patch in #21 only covered queue table.)

What's the problem?

serialize($binarydata) will fail when inserted into a 'text' field. For example:

  //   variable_set("dontcrash", "Please dont crash" . hash('md5', 'too late', TRUE));
  $name = "dontcrash";
  $value = "Please dont crash" . hash('md5', 'too late', TRUE);
  db_merge('variable')->key(array('name' => $name))->fields(array('value' => serialize($value)))->execute();

What's the solution?

Convert:
'type' => 'text',
to:
'type' => 'blob',
'size' => 'big', // if data is bigger than 16k (mysql), make big

What fields were changed (current size)?

variable.value (big)
sessions.session (big)
actions.parameters (big)
batch.description (big)
menu_router.load_functions (16k) - comments say serialize
.to_arg_functions (16k) - comments say serialize
.access_arguments (big)
.page_arguments (big)
menu_links.options(big)
queue.data (big)
system.info (big)

What fields were skipped?

cache.headers (assumed cache knew what it was doing)
menu_router.description (description, not a serial)
menu_router.include_file (filename, not a serial)

Test plan:

Tell yer testbot to put that in it's pipe and smoke it!

Status: Needs review » Needs work

The last submitted patch, serialize.patch, failed testing.

cleaver’s picture

Status: Needs work » Needs review

I reviewed the test at the code sprint and it looks good. The patch does apply successfully and the test does not fail locally.

Note, this is a special test case. The patch is to the system.install, so the patch must be applied BEFORE installing Drupal--ie. while the database is still empty.

Testing manually gets the desired result -- test fails if the code patch is not applied. The test succeeds if installing using the patched system.install code.

Not sure how to get this so that testbot will like it.

MichaelCole’s picture

"The test did not complete due to a fatal error. Completion check text.test 73 TextFieldTestCase->testTextfieldWidgets()"

Hmmm... What's up with that testbot?

Tests pass on my local dev machine.
- updated CVS
- ran just textfield, then all field tests

Reviewed test for impact, and didn't see anything that should have caused the test to fail to run.

Line 73 is a single "test" function that runs two large tests (text and text_long). Resubmitting patch with fix to break these two tests into two test functions.

We'll get more info on next run, or see if it's an intermittent test harness problem.

MichaelCole’s picture

#35: serialize.patch queued for re-testing.

MichaelCole’s picture

FileSize
8.05 KB
PASSED: [[SimpleTest]]: [MySQL] 19,998 pass(es). View

This new patch is the SAME CODE, but with a slight modification to the failed test to help with debugging. Test passes on my local computer, but did not complete due to "fatal error" on testbot. This patch will help isolate that error if testbot continues to fail.

MichaelCole’s picture

Status: Needs review » Reviewed & tested by the community

Previously reviewed and approved: serialize.patch

serialize.patch failed on testbot problem. Resubmitted to testbot and passed. See #35

serialize2.patch is the SAME CODE, but added some debug statements to help diagnose testbot failure. This patch can be IGNORED.

cleaver’s picture

Status: Reviewed & tested by the community » Needs review

Michael, I think the last successfully tested patch will be the one applied.

Might want to re-upload the patch from #35 to be sure.

cleaver’s picture

Status: Needs review » Reviewed & tested by the community

Forgot that it changes the status... changing back, this patch is good.

Crell’s picture

Status: Reviewed & tested by the community » Needs work

Can someone confirm please that we don't need a complete definition array for the update function? I think we do actually need a complete definition with null and description keys.

Also, queue is new in Drupal 7 so we don't need to update that one.

catch’s picture

Issue tags: +D7 upgrade path

Tagging.

jbrown’s picture

#35 There are serialized fields in other core modules.

dblog
-----
watchdog.variables

field
-----
field_config.data
field_config_instance.data

filter
------
filter.settings

image
-----
image_effects.data

rdf
---
rdf.mapping

system
------
variable.value
actions.parameters
batch.batch
cache.headers
menu_router.load_functions
menu_router.to_arg_functions
menu_router.access_arguments
menu_router.page_arguments
menu_links.options
queue.data
sessions.session
system.info

user
----
user.data
jbrown’s picture

Status: Needs work » Needs review
FileSize
19.12 KB
FAILED: [[SimpleTest]]: [MySQL] 20,062 pass(es), 2 fail(s), and 2 exception(es). View

Try this.

It converts all the fields in #46 to 'blob'.

Fields that store serialized data now have 'serialize' => TRUE even if they are not written with drupal_write_record() .

drupal_write_record() will only serialize data for 'blob' fields.

andypost’s picture

Status: Needs review » Needs work
Issue tags: -D7 upgrade path

Confirm that db_change_field() should use full field definition. For example http://api.drupal.org/api/function/block_update_7003/7

Also all field constraints should be dropped (if any) http://api.drupal.org/api/function/db_change_field/7

jbrown’s picture

Status: Needs work » Needs review

I implemented the full field definition for the update functions. This is required because the mysql driver uses ALTER TABLE {' . $table . '} CHANGE which needs the same full column definition as CREATE TABLE: http://dev.mysql.com/doc/refman/5.1/en/alter-table.html

column_definition clauses use the same syntax for ADD and CHANGE as for CREATE TABLE. See Section 12.1.17, “CREATE TABLE Syntax”.

serialized fields don't have any constraints

andypost’s picture

Status: Needs review » Needs work
Issue tags: -Needs tests +D7 upgrade path

Really needs a bit of work...

+++ modules/filter/filter.install	26 Apr 2010 20:21:46 -0000
@@ -422,6 +422,23 @@
+  ¶

trailing whitespace

+++ modules/system/system.install	26 Apr 2010 20:21:47 -0000
@@ -2398,6 +2409,100 @@
+  ¶

same 4 times

+1 to RTBC after re-roll

jbrown’s picture

Upgrading from D6 needs to be tested with mysql and postgres.

jbrown’s picture

@andypost: I don't know what trailing whitespace you are referring to.

jbrown’s picture

Status: Needs work » Needs review
FileSize
19.75 KB
PASSED: [[SimpleTest]]: [MySQL] 20,048 pass(es). View

Fix testDrupalWriteRecord() .

andypost’s picture

Status: Needs review » Needs work

@jbrown please try to review your patch with dreditor and you'll see what I'm talking about, there should not be spaces on blank lines. Dreditor will show them as red colored

+++ modules/filter/filter.install	26 Apr 2010 20:57:30 -0000
@@ -422,6 +422,23 @@
+  db_change_field('filter', 'settings', 'settings', $spec);
+}
+  ¶ // two spaces here
+/**
  * @} End of "defgroup updates-6.x-to-7.x"

this is a trailing whitespace

Powered by Dreditor.

jbrown’s picture

Status: Needs work » Needs review
FileSize
19.73 KB
PASSED: [[SimpleTest]]: [MySQL] 20,065 pass(es). View

Thanks @andypost !

andypost’s picture

Status: Needs review » Reviewed & tested by the community

Now it's good to go

jbrown’s picture

Status: Reviewed & tested by the community » Needs work

Not quite yet!

Upgrading from D6 needs to be tested with mysql and postgres.

jbrown’s picture

Status: Needs work » Needs review
jbrown’s picture

FileSize
20.27 KB
PASSED: [[SimpleTest]]: [MySQL] 20,050 pass(es). View

This patch fixes creation of queue table during upgrade.

Todo: alter cache tables during upgrade.

catch’s picture

Issue tags: +Needs Documentation

We can drop and recreate cache tables, should be quicker than altering them if they have stuff in them. Modules which maintain their own cache table are going to have to do this too, so this will have to be documented in the 6-7 upgrade guide.

jbrown’s picture

Status: Needs review » Needs work

Are cache tables currently dropped during D6->D7 upgrade? Is there an issue for this?

jbrown’s picture

Status: Needs work » Needs review
FileSize
23.3 KB
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in system.install. View

Updated patch with cache tables dropped and recreated.

Crell’s picture

Status: Needs review » Needs work

I'm confused. Why are we converting to use drupal_write_record() more? Blob fields are handled transparently by the DB drivers.

I am also not wild about the serialized flag in schema. That's not information about the table; it's information about how Drupal may or may not use those fields. I am not convinced those belong together; to date the schema is almost Drupal-agnostic in its definitions.

jbrown’s picture

The main purpose of this patch is for serialized variables to be reliably stored in the db.

The two places where I have used drupal_write_record() turn 45 lines into 2 lines and make the code much cleaner.

This patch does not introduce the 'serialize' schema option - whether or not that is a good idea is a separate discussion. Although, perhaps this patch should not have 'serialize' for fields that are not written by drupal_write_record().

andypost’s picture

Status: Needs work » Needs review

@Crell This patch about storing serialized data and not about handling blob fields.

Postgres could not be tested until #761976: DatabaseSchema_pgsql::changeField() is broken

Patch seems good

YesCT’s picture

#62: serialize.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +Needs Documentation, +D7 upgrade path

The last submitted patch, serialize.patch, failed testing.

Crell’s picture

Status: Needs work » Needs review
FileSize
21.05 KB
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in block.install. View

Hrm. OK, that's what I get for reading patches too quickly. Yeah, serialize is already there. It sucks, but we can't change that now.

It looks like the patch just broke on a change to the schema of cache tables, so here's a simple reroll. Bot, you cool?

Status: Needs review » Needs work

The last submitted patch, 690746-serialize.patch, failed testing.

Crell’s picture

Status: Needs work » Needs review
FileSize
21.05 KB
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in system.install. View

Grumble grumble duplicate function names grumble grumble. I think I got them all.

Status: Needs review » Needs work

The last submitted patch, 690746-serialize.patch, failed testing.

Crell’s picture

Status: Needs work » Needs review
FileSize
21.05 KB
PASSED: [[SimpleTest]]: [MySQL] 20,430 pass(es). View

I guess I didn't. *sigh*

dhthwy’s picture

You've got a few comments past the 80 character limit. Plus your comments for functions need to begin with a 3rd person verb as far as I understand (ie: s/convert/converts).

Don't hate me tho. I thought I'd point that out before aspilicious and sun jump you and steal your candy.

Damien Tournoud’s picture

Status: Needs review » Needs work
     // Build array of fields to update or insert.
-    if (empty($info['serialize'])) {
+    if ($info['type'] != 'blob' || empty($info['serialize'])) {
       $fields[$field] = $object->$field;
     }
     elseif (isset($object->$field)) {

Em. Let's not do that. If I set serialize, I want the data serialized. End of story. The fact that some databases cannot handle it in some cases is totally irrelevant.

Other then that, I see no compelling reasons to move to drupal_write_record() in those places. Let's only fix one bug at a time, please remove those. This issue is *only* about converting text columns to blob column, and adding a test.

Speaking of which:

+  /**
+   * Test that we can insert serialized data into important fields.
+   */
+  function testInsertSerializedData() {
+    // Alternative crash cases:
+    // 1)
+    //   drupal_set_message("Please dont crash" . hash('md5', 'too late', TRUE));
+    //   drupal_exit();
+    // 2)
+    //   variable_set("dontcrash", "Please dont crash" . hash('md5', 'too late', TRUE));
+    $name = "dontcrash";
+    $value = "Please dont crash" . hash('md5', 'too late', TRUE);
+    try {
+      db_merge('variable')->key(array('name' => $name))->fields(array('value' => serialize($value)))->execute();
+      $this->pass(t('Success! No PDO exception thrown.'));
+    } catch (Exception $e) {
+      $this->fail(t('Fail! Your serialized blobs are quite distraught.'));
+    }
+
+  }

This:

  1. Doesn't really do what it says it does. It more tests for inserting binary data then anything.
  2. Doesn't have a lot of added value (we already know for a fact that storing binary data in a TEXT column cannot reliably work, and we already have tests for inserting binary data in BLOB columns).

I vote for just removing it.

jbrown’s picture

Status: Needs work » Needs review
FileSize
17.83 KB
PASSED: [[SimpleTest]]: [MySQL] 20,420 pass(es). View

Implemented recommendations in #74.

For {cache_block} and {cache_update} I just modified the update functions that drop the headers field. Those functions now just drop and recreate the tables.

Is it okay to use drupal_get_schema_unprocessed('system', 'cache') in an update function?

Damien Tournoud’s picture

Thanks for the new patch, we are getting there. Some additional remarks:

  • Please remove the new 'serialize' => TRUE entries in the schemas.
  • Ok to simply drop/create the cache tables, that makes a lot of sense.
  • We already have an update function somewhere in the system module to crop the headers column of the cache_* tables, could you just modify that instead of the code at the bottom of system_update_7055()?
jbrown’s picture

FileSize
17.03 KB
PASSED: [[SimpleTest]]: [MySQL] 20,428 pass(es). View

Done.

I don't think we can use drupal_get_schema_unprocessed() in an update function for the reasons discussed here: http://drupal.org/node/150220

Josh Waihi’s picture

Status: Needs review » Needs work

Agreed, I don't think you can use drupal_get_schema_unprocessed()

jbrown’s picture

chx’s picture

You can't ever use existing schema in the code in the update because the the schema in hte code might change and then your update lands you in a different place and various users will have a different schema with the same schema version which is a terrible mess. Write out the definition.

jbrown’s picture

Yeah - I'll reroll this with the definition once the blocker is resolved.

catch’s picture

Why is the postgres issue a blocker?

jbrown’s picture

See #65

db_change_field() needs to work on postgres

Damien Tournoud’s picture

I agree, nothing blocks this issue. The issue with PostgreSQL schema implementation doesn't block testing of this on MySQL and SQLite.

jbrown’s picture

I wasn't aware that we were breaking HEAD for postgres.

catch’s picture

HEAD is already broken for postgres, we don't support head to head updates (see http://drupal.org/project/head2head), and we already have db_change_field() calls in core - so this patch doesn't cause any harm that's not already there.

andypost’s picture

@catch As reported at #32 mysql affected too!

Damien Tournoud’s picture

Please move ahead with this patch. There is nothing blocking it. PostgreSQL doesn't implement the API correctly, and this problem is dealt with in the other issue. Nothing prevent us to move forward here.

jbrown’s picture

Status: Needs work » Needs review
FileSize
22.25 KB
PASSED: [[SimpleTest]]: [MySQL] 20,922 pass(es). View

I haven't actually tested this yet on a D6=>D7 upgrade.

jbrown’s picture

jbrown’s picture

Hang on - the each cache table has its own description.

jbrown’s picture

FileSize
22.77 KB
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in system.install. View
catch’s picture

+  // 'text' column type doesn't reliably hold serialized variables. Convert to 'blob'. See http://drupal.org/node/690746
+  // 'headers' column is no longer used

This is longer than 80 chars and we rarely reference issues in code comments.

Just "Convert text to blob since this stores serialized data, remove headers column." should be fine I think.

Leaving CNR for the bot and apart from that it looks fine.

jbrown’s picture

I'll run it through a D6=>D7 upgrade tomorrow and check it with the schema module.

Damien Tournoud’s picture

Looks good to me too.

jbrown’s picture

FileSize
22.76 KB
PASSED: [[SimpleTest]]: [MySQL] 20,929 pass(es). View

Fixes foreach typo in system.install

jbrown’s picture

FileSize
22.41 KB
PASSED: [[SimpleTest]]: [MySQL] 20,921 pass(es). View

Updated comments.

Damien Tournoud’s picture

Status: Needs review » Reviewed & tested by the community

Carefully reviewed the patch. Everything we need seems to be there.

catch’s picture

rfay’s picture

subscribe

jbrown’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
24.68 KB
PASSED: [[SimpleTest]]: [MySQL] 20,930 pass(es). View

Fixed schema upgrade code in the image module.

andypost’s picture

This changes are correlate with #831332: drupal_get_schema_unprocessed() called in update functions

Maybe better to sync changes in cache tables by producing one update function.

Stevel’s picture

Status: Needs review » Needs work

#831332: drupal_get_schema_unprocessed() called in update functions got committed, meaning this patch doesn't apply any more.

Why are the cache tables droppend and recreated instead of doing a db_change_field like on the other the converted fields?

@andypost: I've posted an idea for achieving that (and a reason) in #837342: Simultaneous updating required for modules with a cache bin when the cache schema changes.

andypost’s picture

@Stevel The reason to drop/create cache tables is because cache tables could be big and once this change happens in _update_N so cache should be cleared anyway. So actually no reason to keep data in cache tables. Contrib modules that have cache tables should care about cleaning and tracking changes in cache tables.

Damien Tournoud’s picture

Status: Needs work » Needs review
FileSize
16.74 KB
PASSED: [[SimpleTest]]: [MySQL] 20,999 pass(es). View

Full reroll.

I kept the drop / recreate pattern for cache tables, which simplify everything.

This looks ready to go in for me.

catch’s picture

Looks ready to me too, would mark RTBC but I haven't tested it at all.

Status: Needs review » Needs work
Issue tags: -Needs Documentation, -D7 upgrade path, -schema change

The last submitted patch, 690746.patch, failed testing.

Damien Tournoud’s picture

Status: Needs work » Needs review
Issue tags: +Needs Documentation, +D7 upgrade path, +schema change

#105: 690746.patch queued for re-testing.

Damien Tournoud’s picture

Status: Needs review » Reviewed & tested by the community

RTBC based on #106 and my previous attempt on #98.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD. Thanks.

Damien Tournoud’s picture

Status: Fixed » Needs review
FileSize
738 bytes
PASSED: [[SimpleTest]]: [MySQL] 21,186 pass(es). View

Sorry about that.

catch’s picture

Status: Needs review » Reviewed & tested by the community
Stevel’s picture

Status: Reviewed & tested by the community » Needs work

no, all cache tables have their own descriptions, so the array definition which holded these definitions is gone here, I've seen it somewhere recently, looking up where.
edit: found it in #101:

+  $cache_tables = array(
+    'cache' => 'Generic cache table for caching things not separated out into their own tables. Contributed modules may also use this to store cached items.',
+    'cache_form' => 'Cache table for the form system to store recently built forms and their storage data, to be used in subsequent page requests.',
+    'cache_page' => 'Cache table used to store compressed pages for anonymous users, if page caching is enabled.',
+    'cache_menu' => 'Cache table for the menu system to store router information as well as generated link trees for various menu/page/user combinations.',
+  );

Then we need $schema['description'] = $description in each run of the for loop.

Stevel’s picture

Status: Needs work » Needs review
FileSize
1.53 KB
PASSED: [[SimpleTest]]: [MySQL] 21,251 pass(es). View

New patch that sets the correct description for each cache table.

catch’s picture

Status: Needs review » Reviewed & tested by the community
Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD. Thanks.

Status: Fixed » Closed (fixed)
Issue tags: -Needs Documentation, -D7 upgrade path, -schema change

Automatically closed -- issue fixed for 2 weeks with no activity.