I tried to update a "sections" module and i've implemented a schema change from get_next_id to auto_increment.

1. the schema file is ready and working well if i install from scratch
2. i found this VERY GREAT new D6 function db_update_field() and tried db_update_field($ret, 'sections_data', 'sid');
3. this should cause automatically change the sid field from int(10) to serial. Here is the schema.

$schema['sections_data'] = array(
    'fields' => array(
      'sid'        => array('type' => 'serial', 'not null' => TRUE),
      'name'       => array('type' => 'varchar', 'length' => 255, 'not null' => TRUE, 'default' => ''),
      'status'     => array('type' => 'int', 'not null' => TRUE, 'default' => 0, 'size' => 'tiny'),
      'path'       => array('type' => 'text', 'not null' => FALSE),
      'theme'      => array('type' => 'varchar', 'length' => 255, 'not null' => TRUE, 'default' => ''),
      'visibility' => array('type' => 'int', 'not null' => TRUE, 'default' => 0, 'size' => 'tiny'),
      'weight'     => array('type' => 'int', 'not null' => TRUE, 'default' => 0)
    ),
    'primary key' => array('sid')
  );

4. but on a update i get the following errors on MySQL 5.0

* user warning: Incorrect table definition; there can be only one auto column and it must be defined as a key 
query: ALTER TABLE sections_data CHANGE sid `sid` INT NOT NULL auto_increment in 
C:\Inetpub\wwwroot\drupal6\includes\database.mysqli.inc on line 154.

update.php error

* Failed: ALTER TABLE {sections_data} CHANGE sid `sid` INT NOT NULL auto_increment

It looks like the schema API does not understand the "primary key" configuration in the sections.schema and therefor the field update fails with db_update_field().

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

hass’s picture

Component: database system » mysql database

checked db_update_field function and saw PGSQL have this build in, the mysql code not. So i move to mysql database only.

ChrisKennedy’s picture

Are you using db_update_field in a module update? If so it's a bad idea because if a user needs to upgrade a field through multiple steps the first call to db_update_field() will change it to the current schema definition and the intermediate updates will be broken.

Anyway, should the index & key code from pgsql just be moved over into database.mysql-common.inc?

hass’s picture

Please take a look here http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/sections/se... i think this update way is correct and will work well if this bug is fixed.

chx’s picture

Status: Active » Needs review
FileSize
1.06 KB

Yup, this is a serious problem. Now I add primary key right next to auto_increment and take care to avoid ERROR 1068 (42000): Multiple primary key defined on table creation.

chx’s picture

FileSize
1.15 KB

A few words added to comments.

moshe weitzman’s picture

I think chris is right though. in general it is a bad idea to call db_update_field().

moshe weitzman’s picture

and the patch looks fine to me. needs testing.

bjaspan’s picture

Re: #2 and #3: Chris is right and hass is wrong. You cannot use db_update_field() in a hook_update_N() function. See http://drupal.org/node/150220 for an explanation.

bjaspan’s picture

Component: mysql database » database system
Status: Needs review » Needs work

I've gone around in circles on the patch several times but am now convinced it is not yet right. There are several interlocking and complicated issues.

On MySQL, auto_increment columns must be the primary key. Prior to this patch:

1(a). Creating a table with a type serial column without explicitly specifying the serial column as the pkey fails (correctly).

1(b). It is impossible to convert an integer column that is not already the pkey into a serial column because db_change_field() does not automatically declare the new serial column as the pkey.

With this patch:

2(a). Creating the table will succeed (incorrectly) resulting a table that does not match the schema because the table will have a pkey but the schema will not.

2(b). It is possible to convert an integer column that is not already the pkey into a serial column because db_change_field() silently adds the primary key specification.

I assert that 2(a) is a serious problem (the schema must match the database 100% of the time) and that 2(b) is an unimportant "benefit" (you can just make the integer column the pkey with db_add_primary_key() before changing it to be type serial).

On pgsql, serial columns do NOT have to be the pkey. By allowing 2(a), this patch also makes bugs likely in which tables will have a primary key on mysql but not on pgsql, based on the same schema. Yuck. 1(a) essentially means that Schema API requires that serial columns be the pkey on all databases which is what we are stuck with if we want to support MySQL.

It gets worse, though. Read this part of the db_change_field() documentation:

 * Change a field definition.
 *
 * IMPORTANT NOTE: On some database systems (notably PostgreSQL),
 * changing a field definition involves adding a new field and
 * dropping an old one. This means that any indices, primary keys and
 * sequences (from serial-type fields) that use the field to be
 * changed get dropped.  For database portability, you MUST drop them
 * explicitly before calling db_change_field() and then re-create them
 * afterwards.  Use db_{add,drop}_{primary_key,unique_key,index} for
 * this purpose.

So, suppose you want to convert a non-pkey int column to type serial. On mysql, you must use db_add_primary_key() BEFORE db_change_field() or the conversion will not work. On pgsql, you must use db_add_primary_key() AFTER db_change_field() because db_change_field() causes the existing pkey specification to get dropped. But you can't call db_add_primary_key() after db_change_field() on mysql because the pkey already exists.

In short, I do not know how to add a new column or alter an existing column to be type serial in a way that works on both mysql and pgsql.

I wonder if I've confused myself. I think the answer is special-case code in db_change_field() to handle type serial, but I need to think about this after sleeping (woke up at 5am this morning, ick!) before being confident of anything.

hass’s picture

@bjaspan: About #8, i'm with you if you say it's not good practice to use db_update_field() here... but in this special case it works well and i'm changing my mind about the way - to be save for future. But nevertheless this is bad for database compatibility. I have no PgSQL experience and no install next to me and finally i don't want to spend time on this. If i change the module_update_N code and remove db_update_field() i'm now made to think about database type detection (currently undocumented in http://drupal.org/node/150220!) and i must use different API functions for different DB types to get the same DB schema :-(((. Additional i must use DB type specific schema api functions when i don't know that i must use them or not. Nor do i know anything about DB2 and Oracle... (and MsSQL will be no problem).

This is not what we want with schema api, isn't it?

chx’s picture

Title: db_update_field throws errors on auto_increment » beta breaker: db_update_field throws errors on auto_increment

We need a solution to this otherwise no modules can be upgraded to Drupal 6. My solution, so far would be to document it in the upgrade documentation: you need to create a new table with the same schema as the old table with one difference, make the serial a serial. Use INSERT...SELECT to move data this is fast and safe and then swap the tables.

I happen to have table copy and table swap code for mysql/pgsql in http://drupal.org/node/80963#comment-298658

chx’s picture

Note that it's possible that the linked is not reusable 100% because we might be better off if we allow people to define the schema to be created for the temporary/new table because the Drupal 5 schema might be different from the Drupal 6 schema. But the db_replace_table is sound.

chx’s picture

I might have overcomplicated -- just create a db specific function which changes a field to a serial?

chx’s picture

system_update 6002 and 6019 and 6022 and 6023 and 6025 are wrong then? Esp 6019.

chx’s picture

Actually not, those are db_change_field commands not db_update_field and those are primary keys already so all is well in core...

hass’s picture

As i remember... db_change_field() does not change an integer into serial db type independed. Maybe a bug of db_change_field()...

In the risk of asking somewhat stupid... i understand well why i should not use db_update_field() in a specific update_N hook, but i must ask - where i can use this type of function if not there? I have no idea where i should use it if not inside an update process. Nobody should/will alter a table in production or only temp tables - but why should i do such senseless actions!??? And if i say it's bad practice i can move my db_update_field() call later from sections_update_3 to sections_update_4 and i'm fine too.

bjaspan’s picture

I believe I have a solution and am testing it.

Gábor Hojtsy’s picture

Eagerly awaiting a solution.

bjaspan’s picture

Title: beta breaker: db_update_field throws errors on auto_increment » beta breaker: db_change_field and type 'serial' columns
Status: Needs work » Needs review
FileSize
9.41 KB

hass: This issue actually has nothing to do with db_update_field(), the problem is with db_change_field() and type serial columns. I am actually going to submit an issue to remove db_update_field() from core as there is no legitimate use for in D6; it is experimental code. If you think there is a flaw in the Schema API that prevents you from doing something in a portable way, then by all means submit a new issue about it (like you did with this one).

Everyone:

First, I was wrong. On MySQL, auto_increment columns do not need to be the primary key, they must only be in any key or index.

Second, the crux of the problem is that MySQL's requirement for auto_increment columns means that the column must be created and assigned to a key in a single operation. I therefore changed db_change_field()'s interface to allow that. Beyond that I'll let the patch and function documentation speak for itself.

I have tested this somewhat but not yet fully. Feedback welcome.

Frando’s picture

I agree with bjaspan that we should remove db_update_field for D6. While it's in theory a great function, it is of little use as it doesn't play nicely with our current update system. Leaving it in will cause lots of confusion, as we'll have a function that appears to be a solution for updates, but is not at the same time (we already state in the docs that you should not use it).

Regarding db_change_field: I like bjaspan's patch. It would be cool if we could just drop all keys and indexes of the field right in db_change_field, but I'm not sure whether that is possible in a simple way (read: without database inspection). Again, relying on the schema is a bad idea here, because that function really should be update-safe.

bjaspan’s picture

http://drupal.org/node/173566: Remove db_update_field.

Frando is right that database inspection would make db_change_field() more useful and is what we need in core to make things like db_update_field() actually work. He is also right that we may end up completely revamping our approcah to updates once we have it. Database inspection is in schema.module but will not be in D6.

Frando’s picture

FileSize
10.44 KB

While reviewing bjaspan's patch I corrected some code-style mistakes here and there and simplified the documentation from a "user's" point of view (so that the usage instructions are at the beginning and the explanation/reasons seperated below).

The code looks good.

Patch attached.

bjaspan’s picture

Frando's doc changes are a big improvement. One nit: he didn't update "and you want to change **T.c** to be type serial". New patch attached.

bjaspan’s picture

Sorry, here's the new patch.

chx’s picture

FileSize
13.36 KB

I think we want to fix update 6019 while at it. I fixed the uid 0 bug here as well and made sure all tables have proper auto inc fields. We do not surprises people upgrading from old Drupals or whatnot.

chx’s picture

*we do not want surprises.

hass’s picture

This part of the patch looks wrong, while it seems not DB independent. Why not using db_change_field() here? In the following example you have DB depended logic. This is not what we want with schema api. Everything should be DB independed. Correct me if i'm wrong, please.

   switch ($GLOBALS['db_type']) {
     case 'pgsql':
-      $ret[] = update_sql("ALTER TABLE {node} DROP CONSTRAINT {node}_pkey");
-      $ret[] = update_sql("ALTER TABLE {node} ADD PRIMARY KEY (nid)");
       $ret[] = update_sql("ALTER TABLE {node} ADD CONSTRAINT {node}_nid_vid_key UNIQUE (nid, vid)");
       db_add_column($ret, 'blocks', 'bid', 'serial');
       $ret[] = update_sql("ALTER TABLE {blocks} ADD PRIMARY KEY (bid)");
@@ -3070,8 +3068,6 @@ function system_update_6016() {
       break;
     case 'mysql':
     case 'mysqli':
-      $ret[] = update_sql('ALTER TABLE {node} DROP PRIMARY KEY');
-      $ret[] = update_sql('ALTER TABLE {node} ADD PRIMARY KEY (nid)');
       $ret[] = update_sql('ALTER TABLE {node} ADD UNIQUE KEY nid_vid (nid, vid)');
       $ret[] = update_sql("ALTER TABLE {blocks} ADD bid int NOT NULL AUTO_INCREMENT PRIMARY KEY");
       $ret[] = update_sql("ALTER TABLE {filters} ADD fid int NOT NULL AUTO_INCREMENT PRIMARY KEY");

And is a primary key drop. As i remember this is MySQL specific and not valid for PgSQL. Maybe i'm wrong, but i thought db_change_field() knows that it must drop first - i don't know if i must drop a primary key in PgSQL, DB2, Oracle.

+      db_drop_primary_key($ret, 'users');
+      db_change_field($ret, 'users', 'uid', 'uid', array('type' => 'serial', 'unsigned' => TRUE, 'not null' => TRUE), array('primary key' => 'uid'));
chx’s picture

hass, that part is already in core in case you have not realized. We are not rewriting every update in D6 to schema API -- lots of work, little gain.

Again you are wrong in the place where you assume that schema API knows *anything* -- what you are talking of is called ntrospection and it's in schema module but not in D6 as it's written above. I happen to know that all these fields are going to be primary keys and also that these tables had a primary key before hence I drop the primary key but if you would have read the issue you would know that it's not necessary that the primary key is the serial therefore, no automatism.

However, I have a question to bjaspan / frando. What about db_add_field , here? It needs the same treatment, doesn't it?

hass’s picture

I understand that we may not like to upgrade D5 install code to the new schema api if not absolutely required, but D6 code should be altered to schema api... it's new code and should follow the new rules and not old outdated rules... maybe some refactoring is required. I wonder much why module and theme maintainers are made to upgrade their module and theme code, but this rules are not mandatory for core. As module and theme maintainer i feel there is no big gain, too... but i must upgrade.

Strange upgrade rules out there!

chx’s picture

a significant portion of Drupal 6 updates already use schema API. Those that don't, again, do not worth the bothering to be rewritten, if you are a contrib author, then you are writing the update for the first time and you use the schema API. Or have you ported your module before May 25 and now need to recode? I sincerely doubt.

bjaspan’s picture

hass: Use of Schema API is not mandatory. Modules (and core, for that matter) can continue using db-specific code if necessary, it is just much less pleasant. And there is just no motivation to rewrite existing D6 updates if they already work. We all have very limited time.

chx: re db_add_field(), yes, good catch, it needs a $new_keys argument too. I'll do this later today if no one beats me to it.

ChrisKennedy’s picture

Actually I wrote up a patch last week to convert the d6 non-schema updates to schema: http://drupal.org/node/171477

Dries’s picture

As far as I can tell, this change looks sane to me. Has it been well-tested on PostgreSQL? :)

Code style is a bit inconsistent: we use $field_new and $new_keys. The location of 'new' seems a bit random.

There is also a missing space in ADD '.implode.

bjaspan’s picture

I have lightly tested it by using the $new_keys argument to db_change_field on mysql and pgsql. My full testing will include installing D6 and upgrading D5 to D6 with mysql and pgsql and verifying that schema.module reports no conflicts, but I have not done that yet.

hass’s picture

Additional i found two functions that confused me today. I think this should be cleaned up and db_change_column removed from the core, too.

http://api.drupal.org/api/function/db_change_column/6
http://api.drupal.org/api/function/db_change_field/6

bjaspan’s picture

hass has a point that the difference between db_change_column() and db_change_field() is confusing. For the record:

- db_change_field() is part of Schema API that adds a column to a table.

- db_change_column() is a pgsql-only function for adding a column to a table and should not be used by any new code starting with D6 (with the caveat that Schema API is not mandatory).

We cannot remove db_change_column() unless we rewrite all the code that uses it. At the very least, we should clearly document that it is deprecated for new code.

bjaspan’s picture

@#28: I added the $keys_new parameter to db_add_field() per chx's suggestion.

@#33: I have now tested this patch on mysql and pgsql by installing a new system and upgrading D5 to D6 as well as testing the new functionality of of db_add_field() and db_change_field() on both databases. After the upgrade, schema.module reports 3 inconsistencies on mysql and 4 on pgsql, but none of them are related to this patch (see http://drupal.org/node/173982).

I think this is done.

chx’s picture

You forgot the ready patch :(

bjaspan’s picture

ooops.

chx’s picture

Left out my changes which are relevant (all serial problems in 6019).

chx’s picture

Next try...

chx’s picture

Bdragon tests, reports utter failure and I find a missing space in db_change_field for MySQL and a missing array() in system.install. So far, number of bugs, bjaspan:chx , 1:1 :D

chx’s picture

Ah well, those table that are autoincrement in D5 can't be and need not be redone..

bdragon’s picture

Status: Needs review » Reviewed & tested by the community

Tested in mysql 5.0.45, upgrading from DRUPAL-5 to HEAD, finally works.

I had to disable the blogapi and path modules, they were causing spurious warnings and gumming up the system. (Then again, I just noticed I had meta redirects disabled in webdeveloper, so I was having to copy the meta refresh url manually.. That was part of it. No matter what I tried though, before disabling those two modules, I could not get it to kick off the updates. The warnings were relating to the addition of per-language aliases.)

After updating, reenabling the modules worked ok, no more warnings. Everything carried over fine.

chx’s picture

Those two will be covered in a separate issue , they have nothing to do with serials, here, blogapi has no tables and path has no serial at all. It might be that (parts of 6005) needs to live in update.php but then again, that's a separate issue.

Gábor Hojtsy’s picture

Let's get another test before commit :)

bjaspan’s picture

Status: Reviewed & tested by the community » Needs work

chx's latest patch is not quite right. It gets batch.bid and boxes.bid wrong; batch.bid is accidentally left out of the new list of tables to convert to type serial, so it remains as type int, and boxes.bid gets changed to be unsigned whereas its schema does not declare it to be so. Both discrepancies are reported by schema.module's comparison.

I'll fix these tomorrow; it is too late for me to trust myself to write code now. :-)

bjaspan’s picture

Questions regarding the latest patch:

boxes.bid gets changed to be unsigned whereas its schema does not declare it to be so. Should we change the schema to declare boxes.bid as unsigned or change the update not to make it unsigned?

batch.bid is left out of the new list of tables to convert to type serial, so it remains as type int. When I try to add it to the list, do db_change_field() is called on it, the update.php process dies. I'm guessing this is because we are using the batch table during update.php. However, this has not been a problem at any time since I wrote the D5 to D6 upgrade patch in July. Does anyone know why this is happening?

I've attached a new patch that leaves boxes.bid as unsigned and attempts to update batch.bid but kills update.php instead. Can someone that knows the batch API figure this out?

chx’s picture

Status: Needs work » Needs review
FileSize
17.53 KB

Bleh. Then change the batch definition and do not fsck with its update.

Dries’s picture

1. db_change_column is only used in .install files to it might actually be fairly easy to deprecate that? How long do we intend to keep that function around? It might be a good idea to rename it to _db_change_column() to indicate that it has been deprecated? Thoughts?

2. If you are adding a type 'serial' field, you MUST specify at least one key or index including it in this array. It would be great to add a sentence explaining why this needs to be done. We explain that later for db_add_field() to it might be sufficient to add a "see also for more information".

3. The documentation doesn't explain why you have to pass the keys when changing the field -- and why you can't do this in two steps. Let's make sure developers understand why this is important because it is not self-explanatory.

A bit more massaging around the documentation so it is accessible to newbies, and this patch should be ready to go in.

chx’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
17.53 KB

CCK might want to use db_change_field. I added a note to db_add_field. The documentation doesn't explain why you have to pass the keys when changing the field -- and why you can't do this in two steps. It does:

On MySQL, all type 'serial' fields must be part of at least one key or index as soon as they are created. You cannot use
db_add_{primary_key,unique_key,index}() for this purpose because the ALTER TABLE command will fail to add the column without a key or index specification. The solution is to use the optional $keys_new argument to create the key or index at the same time as field.

this is in the db_change_field doxygen. Questions answered, patch working --> ready.

Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Needs review

Indeed, db_change_field properly documents this, although I needed to read the block you quoted twice to understand what is suggested. BUT as Dries pointed out, $key_new is not documented equally at all places, and there is no see also at the places (here, namely db_add_field()), where it is poorly documented.

chx’s picture

FileSize
17.61 KB

If you look at the patch numbers, i upped 49 again.

bjaspan’s picture

I will look at this patch on a plane later today.

I realized while waking up this morning that batch is new with Drupal 6 and we should not be doing any updates for it, just changing the table spec. It looks like chx beat me to this realization...

bjaspan’s picture

New patch attached. Changes since chx's in #53:

- I now use Schema API to create the batch table instead of separate mysql/pgsql code. This makes it clearer that we are creating it correctly (though chx' change was fine).

- boxes.bid schema was changed but only updated on mysql; fixed for pgsql.

- The change from http://drupal.org/node/140666 in system_update_6016() which changed node's primary key from nid,vid to nid was removed by #53. He added it back for mysql in 6019 but not pgsql; now fixed.

With this patch, D5 to D6 upgrades once again run cleanly and result in zero schema mismatches on mysql and pgsql. I think this is RTBC.

I can't think of a better demonstration of why we need automated testing. :-)

chx’s picture

Status: Needs review » Reviewed & tested by the community

Very nice work, thanks!

Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Fixed

All, right, committed. Thanks for your work.

Anonymous’s picture

Status: Fixed » Closed (fixed)