I received the following error msg when updating the db to beta3:

user warning: Duplicate entry '4102-thumbnail' for key 1 query: ALTER TABLE image ADD PRIMARY KEY (nid, image_size) in /nfs/c14/h01/mnt/415588/domains/example.com/html/includes/database.mysql-common.inc on line 374.

Is this a problem that needs to be resolved or a primary key addition attempt that is inconsequential? Please advise. Any help would be appreciated.

Thanks-

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

joachim’s picture

Category: support » bug
Status: Active » Postponed (maintainer needs more info)

What version were you on before?
We have several reports open for this sort of problem coming from D5.

superflyman’s picture

@joachim

I was upgrading from 6.x-1.0-alpha4.

joachim’s picture

Do you happen to have your old database from alpha 4?
If so, can you find out what has has for nid 4102 in the {image} table?

superflyman’s picture

Ok, I found an older db and searched for 4102. Here's what I found:

INSERT INTO `image` VALUES ('4102','22679','_original');

Is this the right row?

joachim’s picture

Have you got any other nids of 4102 in that table?
The problem appears to be that the 4102 / thumbnail combination occurs more than once.
Which it shouldn't, as there is only 1 thumbnail for each node.
So somehow we're getting multiple data being put in.

superflyman’s picture

I don't know how I missed this but I did... Okay, here are the right rows I think:

INSERT INTO `image` VALUES ('4102','22823','thumbnail');
INSERT INTO `image` VALUES ('4102','22824','thumbnail');

jriedel’s picture

I just ran into this today when updating the image module on a site. I went and looked into the DB for the nid in the image table. It had dups for both the thumbnail and the preview sizes. I was not sure which ones to get rid of, so I thought, I'll just re-upload the image then delete the problem one. I went to the that image node/XXXX and then it hit me.

I just clicked the buton to rebuild the images.

Check the database again the dups were gone.

Issued the ALTER TABLE by hand and everyting is now happy.

I'm wondering if this could have happened because of a race condition with two requests trying to building the images at the same time.

joachim’s picture

Title: Error on Database Updates for 6100 when upgrading to Beta3 » Error on Database Updates for 6100 when upgrading to Beta3: 'duplicate entry' for primary key
Status: Postponed (maintainer needs more info) » Active

Aha!
So if we rebuilt all images before that upgrade, or told people to do that, then we'd be okay...
that's valuable information!

So the options are:
a. Just tell people to rebuilt all images before this upgrade
b. rebuild all derivatives as part of 6100 (would require a batch within a batch, is that possible?)
c. rebuild just the bad ones (how do we detect them? would still perhaps require a sub-batch to make sure large sites don't time out on the operation?)
d. clean up the duplicates by hand with SQL

As for the matter of what causes the duplicate rows to get there in the first place -- I think jriedel is right, it's some sort of race condition.
It's not top priority to fix it though: this system will no longer exist for D7, the duplicates don't harm normal site operation, and the flow around calling a derivative rebuild is spaghetti.
If someone has a sudden insight into what's causing it then great, otherwise, let's not lose sleep over it -- I reckon we should focus on making the upgrade error go away.

jriedel’s picture

@joachim

Thinking about it a bit, I have good idea how I happened to get the duplicates.

On the front page of the site I have a custom block that provides a random image out of a few selected galleries.

The image I had trouble with was part of a batch that was put on the site with Image Import. I suspect that as I imported the images then went to the gallery to check them out, someone hitting the front page of the site caused the derivative to be created as I was doing the same thing in the gallery.

As to the a - d options.

a, on a large site, yuck.
b, on a large site, just as bad as a.
c, Least amount of trouble. Maybe just give people a list of those nid and have them go to each image and do a rebuild. Should be a something like SELECT COUNT(image_size), image_size, nid from image group by nid, image_size having count(image_size) > 1;
d, Leaves files from derivative images hanging around and no way to get rid of them later. Also no good way to tell wihich row should be deleted.

The first thing is a good warning about the problem, and how one can fix the images. Then look at a way to do it all as part of the upgrade. I do agree that proper locking around the build derivatives part would not be an easy task. Also once this last update is in, the index will prevent it from happening again.

Dogtag’s picture

I'm sorry for being so dense, but how do we "rebuild image derivatives"? I'm not even sure what that means, exactly.

My site has images that that were uploaded via CCK ImageField. They are displayed in their respective nodes (one per node right now), using an ImageCache preset, and in a Views Slideshow, in a block, using a different ImageCache preset.

Thanks.

joachim’s picture

> My site has images that that were uploaded via CCK ImageField
Then you are not concerned with this module.

And in answer to your question: either the node admin page, or VBO.

Dogtag’s picture

Ah, yes looking at the dependencies, I see the only reason I got the Image module was because I had tried the Image FUpload module, which required it. I forgot to remove the Image module after removing FUpload.

Thanks.

superflyman’s picture

@joachim

So, in my case with the following dups:

INSERT INTO `image` VALUES ('4102','22823','thumbnail');
INSERT INTO `image` VALUES ('4102','22824','thumbnail');

Can I just delete the first row from the database?

Or should I just try and rebuild that image node thru the GUI? Would that fix it?

joachim’s picture

@superflyman

you could:
a) remove one of those rows yourself
b) do a 'regenerate derivatives' on all image nodes at admin/content/node
c) figure out the right SQL command to remove all duplicates of that form, so we can incorporate it into the update code :)

emmajane’s picture

Status: Active » Needs review
FileSize
1.84 KB

I had a similar problem with D5. In the D6 module the two insert queries (Lines 276 and 281) have "group by" clauses, so I'm not sure if my fix will work. I added
on duplicate key update nid=files.nid; however, the MySQL documentation says this won't work when there is a GROUP BY (please see: http://dev.mysql.com/doc/refman/5.1/en/insert-select.html).

I don't have D6 using this module installed, but if someone does, can you update lines 276 and 281 to see if it works? I've also uploaded a patch against BETA3.

$ret[] = update_sql("INSERT INTO {image} SELECT DISTINCT f.nid, f.fid, f.filename FROM {files} f INNER JOIN {node} n ON f.nid = n.nid WHERE n.type = 'image' AND f.filename" . $cond . " GROUP BY nid,filename on duplicate key update nid=files.nid");

joachim’s picture

I've reported on several trial runs I've done here: #207557: update 5-1 to 5-2 gives user warning: Duplicate entry '611-thumbnail' for key 1 query in image_update_5200. Could really use help from someone who knows their SQL!

joachim’s picture

FileSize
2.4 KB

Here is a simple debug module.

Install it along with devel.

* Looks at the queries originating from image module in the current page view,
* and if any of them are other than SELECT (ie, they alter data), then
* check the image table for duplicates. If they are found, dump debug output.

The only way we are going to find out where the duplicates come from is by catching them in the act. Even then, I'm not sure we'll get it -- I don't know what kind of conditions could cause this, but if it could be wacky things like two users causing a derivative regen at the same time, it's going to be hard t catch.

joachim’s picture

I just had to do a similar data migration where there were duplicate rows in an old table that caused duplicate key errors in a new table.

Using the IGNORE keyword worked:

INSERT IGNORE INTO term_user (uid, tid) SELECT uid, tid FROM term_user_OLD
joachim’s picture

Status: Needs review » Closed (duplicate)

I'm closing this as a duplicate of #207557: update 5-1 to 5-2 gives user warning: Duplicate entry '611-thumbnail' for key 1 query in image_update_5200.

Please reopen and explain if I am wrong.

I'm going to commit a patch for this problem over there. Please look at that and try the dev version once it's into that.