Not LIVE FROM THE MINNESOTA SEARCH SPRINT, but found on the plane ride home...

The {search_dataset} has a unique key named "sid_type". Because of the query in search_wipe, which is called on every single sid that gets indexed, this unique key should be an index.

    db_query("DELETE FROM {search_dataset} WHERE sid = %d AND type = '%s'", $sid, $type);

I think that this was an index at one time. And I think that this might of gotten mixed up in the conversion to schema api.

This patch should be backported to 6.x.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

douggreen’s picture

And here's the patch...

Damien Tournoud’s picture

This sid_type should in fact be a primary key.

The change from index to unique was consciously made during the 6.x development cycle. It lead to some problems (such as collations problems solved in #218403: Duplicate entry errors in search indexer), but it's a good choice.

douggreen’s picture

I'm not too clear on the differences here. But I do know that the current "unique key" implementation is significantly impacting this DELETE which is done on every single node that is indexed. If you know a better way, please offer the patch. Thanks!

Damien Tournoud’s picture

I know a better way: just make this key a primary key.

Anonymous’s picture

@Damien: How is Primary key better than a Unique key in this instance?

douggreen’s picture

A unique key is just a constraint, whereas a primary key is an index. You get no performance gain from constraints, but you do from indexes.

Damien Tournoud’s picture

@douggreen: In fact, for MySQL the difference is not very clear. MySQL also creates an index for UNIQUE KEY, but it's not clear to me if there is prefix and suffix compression for these indexes, nor whereas MySQL can use it to lookup values. Moreover, PRIMARY KEYS are clustered (the physical order of rows change), whereas other index types are non-clustered.

For the {search_dataset} particular case, a PRIMARY KEY is what makes sense anyway.

douggreen’s picture

I do know that the unique key in mysql is giving terrible performance on the DELETE FROM {search_dataset} WHERE sid = %d AND type = '%s', so it surprises me to hear that "MySQL also creates an index for UNIQUE KEY."

robertDouglass’s picture

Subscribe.

lilou’s picture

FileSize
1.41 KB

Reroll patch #1 and add description to update function.

Dries’s picture

Status: Needs review » Fixed

Committed to CVS HEAD. Thanks.

Damien Tournoud’s picture

Status: Fixed » Active

Sorry, why removing the unique constraint? It makes total sense!

Damien Tournoud’s picture

Assigned: douggreen » Unassigned
Status: Active » Needs review
FileSize
1.95 KB

Here is a revert of #10, changing the unique constraints in search_index and search_dataset by primary keys.

Dries’s picture

Status: Needs review » Fixed

I think you're right. Committed the follow-up patch.

Damien Tournoud’s picture

Status: Fixed » Needs review
FileSize
596 bytes

Sorry, finally forgot to drop the redundant 'word' index in the schema.

mfb’s picture

Priority: Normal » Critical
Status: Needs review » Needs work

There's an update script in system.install running on the search tables, but search module and its tables may not even be installed! This causes errors at update.php. The update script should be in search.install instead.

Damien Tournoud’s picture

Status: Needs work » Needs review
FileSize
2.27 KB

Oups.

webchick’s picture

Status: Needs review » Needs work

This no longer applies because of some whitespace changes in #303889: Impossible to update D6 -> D7 but should be a quick re-roll.

chx’s picture

Status: Needs work » Needs review
FileSize
2.1 KB
kbahey’s picture

Status: Needs review » Reviewed & tested by the community

I tested this on top of #46 in http://drupal.org/node/303889, and that combination made update.php work again.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Third time's a charm? :)

Damien Tournoud’s picture

I'm pretty sure this issue will came back to haunt us. A ghost will step in, and say "this is an *horrible* idea, please revert". The only question is: when will that happen?

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.