Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Not LIVE FROM THE MINNESOTA SEARCH SPRINT, but found on the plane ride home...
In search_index (the function that scores HTML), a SQL to UPDATE the {search_index} is done, and if that fails then an INSERT is done. However, since search_wipe is called just before this loop, there will never be any matching items for the UPDATE and it will always fail. The loop also can't introduce any duplicate records because the array key is the word. Thus I don't think we ever need to do the UPDATE. And the attached patch will half the number of queries needed for every word inserted into the index!
Comment | File | Size | Author |
---|---|---|---|
#19 | 257912.patch | 1.46 KB | douggreen |
#15 | 257912.patch | 1.46 KB | douggreen |
#11 | 257912_1.patch | 1.38 KB | Damien Tournoud |
#9 | 257912.patch | 1.21 KB | douggreen |
#5 | 257912.patch | 1.21 KB | douggreen |
Comments
Comment #1
Damien Tournoud CreditAttribution: Damien Tournoud commentedHum. And what about the big comment in front of the
UPDATE
that says:This
UPDATE
is required, so this is by design.Comment #2
douggreen CreditAttribution: douggreen commentedAh, I think you're right on the collate comment. But if that's the only case where we need to UPDATE, we should try the INSERT first and then do the UPDATE if the INSERT fails. The duplicate INSERT will fail because of the word_sid_type unique key. I think that this approach will result in one query instead of two 99% of the time.
Comment #3
Damien Tournoud CreditAttribution: Damien Tournoud commentedRight, we should invert those queries, because the INSERT will infrequently fail.
Comment #4
douggreen CreditAttribution: douggreen commentedComment #5
douggreen CreditAttribution: douggreen commentedAnd since scores are floats and not numbers the %d in the UPDATE should be fixed.
Comment #6
Damien Tournoud CreditAttribution: Damien Tournoud commentedPlease also fix the type of the sid column in the
UPDATE
query (it should be an int). Oh this was messy, and the fact that I fixed that bug on the day of the release of D6 is not an excuse.Comment #7
douggreen CreditAttribution: douggreen commentedWhat's wrong with the type? It's a string, which is correct.
Comment #8
Damien Tournoud CreditAttribution: Damien Tournoud commentedThe sid is an int column:
But there the
%d
in theUPDATE
query is in quotes.Comment #9
douggreen CreditAttribution: douggreen commentedI didn't notice the quotes around '%d'... The attached patch fixes those too.
Comment #10
moshe weitzman CreditAttribution: moshe weitzman commentedLooks good to me. This is a big performance win, and should be backported IMO.
Comment #11
Damien Tournoud CreditAttribution: Damien Tournoud commentedOoops. We overslept on that one.
Here is a slightly modified version of douggreen's patch (with an updated comment).
Please review!
Comment #12
douggreen CreditAttribution: douggreen commented@Damien, thanks for fixing the comment, I obviously missed that!
Is it at all possible to keep the comment on two lines, or did it break 80 characters that way?
Please remove the @ in front of db_query. I'm not sure why and when people need to add this, but I don't think it's needed.
Comment #13
douggreen CreditAttribution: douggreen commentedI replaced
@db_query
. withdb_query
. I tried to make that comment fit on 2 lines, but couldn't make it terse yet explanatory. But while working on it, I modified Damien's text. I'm not going to be stubborn on what the comment says, it's the code here that's really important. As moshe points out, we might want to backport this, so let's try to agree on the comment verbiage, and get this RTBC!Comment #14
moshe weitzman CreditAttribution: moshe weitzman commented@douggreen - did you mean to attach a patch?
Comment #15
douggreen CreditAttribution: douggreen commenteddoh!
Comment #16
Damien Tournoud CreditAttribution: Damien Tournoud commentedHum. The comment looks good. But the @ in front of the db_query is required. The INSERT should fail with a duplicated key warning.
Comment #17
douggreen CreditAttribution: douggreen commentedWhat does the @ do in php? I thought it was redundant. The only place in core that there is an @db_query is in modules/node/node.module line +194, and I kinda assumed that this was a mistake.
Comment #18
Damien Tournoud CreditAttribution: Damien Tournoud commentedThe @ prefix suppresses error reporting (http://www.php.net/manual/en/language.operators.errorcontrol.php).
Comment #19
douggreen CreditAttribution: douggreen commentedI think that this is ready now... new patch attached restoring the '@' sign. Damien's patch (#11) above was good enough too. My latest patch should be identical to his, just with a slightly different comment.
The reason this patch has take so long is that it just took some education on my part to understand why we needed @db_query!
Comment #20
Damien Tournoud CreditAttribution: Damien Tournoud commented@douggreen, I like your comment better.
Marking #19 as (very well) reviewed. We finally made it!
Comment #21
catchstill applies cleanly.
Comment #22
Dries CreditAttribution: Dries commentedTested, reviewed and committed. Thanks.
Comment #23
moshe weitzman CreditAttribution: moshe weitzman commentedpatch applies just fine to D6.
Comment #24
Gábor HojtsyThanks, committed to 6.x.
Comment #25
Anonymous (not verified) CreditAttribution: Anonymous commentedAutomatically closed -- issue fixed for two weeks with no activity.