I've added `ini_set('memory_limit', '1800M');` to the top of the function so it will work. The update takes a LONG time (15+ minutes) to run, and in MySQL SHOW PROCESSLIST shows the status as "System lock". I have 173,721 unpublished nodes in this database.
| Comment | File | Size | Author |
|---|---|---|---|
| #20 | drupal-2174551-18-taxonomy-update-7011-fix.patch | 1.64 KB | David_Rothstein |
| #14 | drupal-2174551-14-taxonomy-update-7011-fix.patch | 2.54 KB | mikeytown2 |
| #13 | drupal-fix_taxonomy_update_7011-2174551-13.patch | 1.29 KB | hswong3i |
Comments
Comment #1
mikeytown2 CreditAttribution: mikeytown2 commented

Comment #2
mikeytown2 CreditAttribution: mikeytown2 commented

Comment #3
heddn commented

To help with memory consumption, it should probably be rewritten as a count() query, then loop over the results using db_query_range() to page through a thousand rows at a time.
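A paged rewrite along those lines might look roughly like this. This is only a sketch of the suggestion, not the committed fix; the page size and variable names are illustrative, and the code assumes a bootstrapped Drupal 7 site.

```php
<?php
// Sketch of the count() + db_query_range() suggestion: page through the
// unpublished nids a thousand at a time instead of loading them all into
// one array. Illustrative only; not the patch that was committed.
function taxonomy_update_7011() {
  $count = db_query('SELECT COUNT(*) FROM {node} WHERE status = 0')->fetchField();
  for ($offset = 0; $offset < $count; $offset += 1000) {
    $nids = db_query_range('SELECT nid FROM {node} WHERE status = 0 ORDER BY nid', $offset, 1000)->fetchCol();
    if ($nids) {
      db_delete('taxonomy_index')
        ->condition('nid', $nids, 'IN')
        ->execute();
    }
  }
}
```

Because the deletes touch {taxonomy_index} rather than {node}, the paging window over {node} stays stable across iterations.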
Comment #4
David_Rothstein CreditAttribution: David_Rothstein commented

Ugh, yeah. That's a lot of unpublished nodes. Hopefully most sites don't have anywhere near that many.
We could probably change the update function to run as a batch (with $sandbox), and also the initial query could be made to join against the {taxonomy_index} table (so it only pulls node IDs that actually need deleting; for many Drupal 7 sites this will be 0 anyway).
Comment #5
David_Rothstein CreditAttribution: David_Rothstein commented

I added this to the "known issues" section of the release notes and release announcement for now.
Comment #6
greggles commented

@mikeytown2 - did you run the update via update.php or drush updb? Do you have a memory limit on drush?
Comment #7
mikeytown2 CreditAttribution: mikeytown2 commented

First try was via update.php; that failed. Next was drush updb, which failed as well until I gave PHP a lot of RAM. Luckily I ran the update again from the CLI (drush) and that worked after waiting over 15 minutes for the query to complete.
Comment #8
Garrett Albright CreditAttribution: Garrett Albright commented

For the record, and to save others from having to look the function up:
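The code block originally attached to this comment has been lost. The function at issue is `taxonomy_update_7011()` in taxonomy.install; approximately as it shipped at the time (reproduced from memory, so check taxonomy.install for the exact code), it was:

```php
<?php
// taxonomy_update_7011() as originally committed (approximate): it loads
// every unpublished nid into a single array, then issues one DELETE with
// a giant IN() list -- the source of the memory and locking problems
// reported above.
function taxonomy_update_7011() {
  // Remove taxonomy index entries for unpublished nodes.
  $nids = db_query('SELECT nid FROM {node} WHERE status = 0')->fetchCol();
  if (!empty($nids)) {
    db_delete('taxonomy_index')
      ->condition('nid', $nids, 'IN')
      ->execute();
  }
}
```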
Comment #9
Jamie Holly CreditAttribution: Jamie Holly commented

So converting this to the regular ole' SQL syntax that MySQL would ultimately get, using @mikeytown2's 173,721 nodes, you end up with something like:
And just keep adding to that ### until you have 173,721 values in your IN statement.
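The SQL originally shown in this comment is missing; the shape of the statement Jamie is describing would be something like this (the literal nid values are placeholders):

```sql
-- Illustrative expansion of the query MySQL ultimately receives: one
-- DELETE whose IN() list inlines every one of the 173,721 unpublished
-- nids. The values shown are placeholders.
DELETE FROM taxonomy_index
WHERE nid IN (1, 2, 3, /* ...173,717 more values... */ 173721);
```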
Two areas I can see that this could be greatly improved upon. The first is using a batch, as @David_Rothstein suggested.
The second is eliminating any nodes that aren't in taxonomy_index on the initial query:
That would really help out sites with a lot of unpublished nodes that also don't have any taxonomy assigned.
The other option is a sub query, but I'm not sure about support in other databases. This should perform better though (If Mikey has a copy of that database around and could test, that would be great):
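The subquery variant Jamie describes would be roughly the following (a sketch; as he notes, support and performance vary by database):

```sql
-- Subquery variant (sketch): let the database filter unpublished nodes
-- itself, avoiding the huge client-built IN() list entirely.
DELETE FROM taxonomy_index
WHERE nid IN (SELECT nid FROM node WHERE status = 0);
```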
Comment #10
mikeytown2 CreditAttribution: mikeytown2 commented

The first query returns 161,262 nodes on one of our test servers with an older DB backup, so in this case it most likely doesn't help reduce the total number of nids to remove.

The second query returns an empty result (which is correct, as this is not a D6->D7 site).
Comment #11
mikeytown2 CreditAttribution: mikeytown2 commented

We would want an INNER JOIN, correct?
This gives back an empty result.
Comment #12
Jamie Holly CreditAttribution: Jamie Holly commented

HAHA you're right. That's what I get for typing without the proper amount of coffee in me.
We would want nodes where an entry exists in taxonomy_index. No need for the NULL check either.
```sql
SELECT n.nid
FROM node AS n
INNER JOIN taxonomy_index AS t
  ON n.nid = t.nid
WHERE n.status = 0;
```
That way if there isn't a corresponding taxonomy_index then it won't even get returned.
Comment #13
hswong3i CreditAttribution: hswong3i commented

A quick fix that only executes 50 nids per delete, against the Git 7.x branch:
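The patch itself is attached above rather than inlined, but the approach in #13 amounts to something like the following sketch (the chunk size of 50 is from the comment; the rest is illustrative):

```php
<?php
// Sketch of the #13 approach: split the nid list into chunks of 50 and
// issue one db_delete() per chunk, so no single IN() list gets huge.
// Illustrative only; see the attached patch for the real change.
function taxonomy_update_7011() {
  $nids = db_query('SELECT nid FROM {node} WHERE status = 0')->fetchCol();
  foreach (array_chunk($nids, 50) as $chunk) {
    db_delete('taxonomy_index')
      ->condition('nid', $chunk, 'IN')
      ->execute();
  }
}
```

Note that this still loads every unpublished nid into memory up front, which is part of why reviewers preferred a batched rewrite.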
Comment #14
mikeytown2 CreditAttribution: mikeytown2 commented

Changed this so it will use batch processing. It also uses the new query from #12. Tested it using the old query and the update takes about 30 seconds to run going through 161k records; put in the new query and it doesn't try to delete anything. The limit parameters should probably be passed to the query properly instead of via string concatenation.
Comment #16
mikeytown2 CreditAttribution: mikeytown2 commented

14: drupal-2174551-14-taxonomy-update-7011-fix.patch queued for re-testing.
Comment #17
mikeytown2 CreditAttribution: mikeytown2 commented

Thinking about this more, I should make the limit always be 0, 5000 and run it until $nids comes back empty. I'll wait until this first patch passes before adjusting it... that won't happen till Monday, so feel free to modify it.
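Combined with the join from #12, that pattern would look roughly like this multi-pass $sandbox update (a sketch; the committed patch may differ in its details):

```php
<?php
// Sketch of the approach described above: always select the first 5000
// nids that still have {taxonomy_index} rows, delete those rows, and let
// the update system call the function again until the query returns
// nothing. Deleting the rows is what shrinks the result set, so a fixed
// 0,5000 window never skips anything. Illustrative only.
function taxonomy_update_7011(&$sandbox) {
  $nids = db_query_range('SELECT DISTINCT n.nid FROM {node} n INNER JOIN {taxonomy_index} t ON n.nid = t.nid WHERE n.status = 0', 0, 5000)->fetchCol();
  if ($nids) {
    db_delete('taxonomy_index')
      ->condition('nid', $nids, 'IN')
      ->execute();
  }
  // An empty result means every unpublished node's index rows are gone.
  $sandbox['#finished'] = empty($nids) ? 1 : 0;
}
```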
Comment #18
hswong3i CreditAttribution: hswong3i commented

I am questioning the performance of the delete query: is it really so heavy that it requires background batch processing? IMHO, a delete query shouldn't be that memory-expensive just because we put everything into a single IN () statement and so end up with a huge result matrix (i.e. running out of memory). Shouldn't #13 be enough to fix this issue?
Comment #19
hswong3i CreditAttribution: hswong3i commented

13: drupal-fix_taxonomy_update_7011-2174551-13.patch queued for re-testing.
Comment #20
David_Rothstein CreditAttribution: David_Rothstein commented

Rerolled to fix a few things; this should now be working correctly, I believe.
Comment #21
David_Rothstein CreditAttribution: David_Rothstein commented

Cross-posted there, but I would say, since we don't know how many nodes the site has, batch processing is still a good idea.
If I understand mikeytown2's comment above correctly, this still takes 30 seconds to run in his situation (or rather, it would if all the unpublished nodes came from an upgraded Drupal 6 site with this issue), which is long enough to warrant a batch.
Comment #22
mikeytown2 CreditAttribution: mikeytown2 commented

@David_Rothstein
That is correct. It takes 30 seconds to issue the delete command for 160k rows in the DB.
Patch in #20 works for me so going to RTBC it as well.
Comment #23
prakharcse CreditAttribution: prakharcse commented

Hi, I want to upgrade my Drupal 6.30 to the latest version of Drupal 7. Could somebody please guide me through this migration?
Comment #24
Garrett Albright CreditAttribution: Garrett Albright commented

prakharcse, you can learn more about upgrading Drupal core here.
Comment #25
marcingy CreditAttribution: marcingy commented

Also ran into this issue (on a non-D6->D7 site) with a lot of unpublished nodes (751,357). The patch here does the trick for me and prevents a nasty fatal error.
Comment #26
Kristen Pol commented

#20 is working for me.
Comment #27
David_Rothstein CreditAttribution: David_Rothstein commented

Committed to 7.x - thanks! http://drupalcode.org/project/drupal.git/commit/5be1de3
I'll also update the 7.26 release notes and release announcement to mention this is now fixed.
Comment #29
vipcar CreditAttribution: vipcar commented

20: drupal-2174551-18-taxonomy-update-7011-fix.patch queued for re-testing.
Comment #30
marcingy CreditAttribution: marcingy commented

This is already committed.
Comment #32
marcingy CreditAttribution: marcingy commented

Comment #33
David_Rothstein CreditAttribution: David_Rothstein commented

Drupal 7.27 was a security release, so this is now scheduled to go out in Drupal 7.28 instead.