I inherited a very large site with over 400K field collection items (and over half a million nodes). I started the update at 2:15 today, and it's still running nearly four hours later. I can tell it's working, as I'm now down to ~370K items, but I'm very concerned about how long this is taking, and there's no way to know how far along the process is. Is there any way this can be optimized?


Comments

mikemccaffrey’s picture

I am seeing this issue as well. When running update 7002, my system is either hanging or crashing with a memory allocation error.

It seems the cause is the LEFT JOINs the update performs between field_collection_item and the various field_data_field_something tables. The individual field_data tables don't have indexes on their field_something_value columns, so joining on them is very computationally intensive.

You may want to add the indexes yourself before running this update as a short-term fix. It might also be good to have the update add the indexes itself before performing the join.
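For anyone who wants to try the short-term fix in the meantime, this is roughly what it looks like with the Drupal 7 schema API. The table and column names below are only examples; substitute the field_data tables your field collections actually use.

<?php
// Rough sketch of the pre-update workaround: add a plain index on the
// value column of one field_data table so the update's LEFT JOIN can use
// it. 'field_my_collection' is a placeholder field name.
$table = 'field_data_field_my_collection';
$column = 'field_my_collection_value';

if (!db_index_exists($table, $column)) {
  db_add_index($table, $column, array($column));
}
?>

You could run something like that via drush php-eval (or a small custom update) for each field collection field before kicking off update 7002.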

seanr’s picture

Status: Active » Needs review
FileSize
1.77 KB

The attached patch definitely speeds up this update.

Status: Needs review » Needs work

The last submitted patch, 2: field_collection_update_7002-2344513-2.patch, failed testing.

seanr’s picture

Status: Needs work » Needs review

Fixed patch base path. BTW, all credit for this belongs to https://www.drupal.org/u/astierler

seanr’s picture

FileSize
1.21 KB
seanr’s picture

Status: Needs review » Reviewed & tested by the community

We've tested this at least a couple dozen times in a CI setup and have had no problems. It definitely speeds up the update substantially. Marking RTBC.

jmuzz’s picture

Status: Reviewed & tested by the community » Needs work

Comments should end in a full stop.

Can we resolve the doubt around deleting the index afterwards? On the one hand, many sites have already run this update, so in theory it shouldn't be changed in a way that leaves newly updated sites in a different state from those that updated earlier. On the other hand, there is reason to keep this index, as indicated in #2213795: Add index to FIELDNAME_value column on field_collection field tables.
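If the eventual decision is to remove the index at the end of the update, a guarded drop (sketched below with placeholder table and index names, purely as an illustration) at least avoids errors on sites where the index was never added:

<?php
// Illustration only, with placeholder names: remove the temporary index
// at the end of the update, but only if it is actually present, so the
// drop doesn't fail on sites that never had it.
$table = 'field_data_field_my_collection';
$index = 'field_my_collection_value';

if (db_index_exists($table, $index)) {
  db_drop_index($table, $index);
}
?>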

Josh Waihi’s picture

Status: Needs work » Needs review
FileSize
1.71 KB

With large amounts of data, you can also run into PHP memory issues when attempting to retrieve all the data at once. I've revised the patch to perform deletes in batches of 1000. I've also changed the drupal_set_message() call to watchdog(), which is more visible in drush log output.
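To make the approach concrete, here is a sketch of just the chunking, not the revised patch itself; $orphan_ids stands in for whatever list of item IDs the update has already collected.

<?php
// Sketch only: delete orphaned field_collection_item rows in chunks of
// 1000 so PHP never holds the whole set at once, and log progress with
// watchdog() so it shows up in the drush log output.
foreach (array_chunk($orphan_ids, 1000) as $chunk) {
  db_delete('field_collection_item')
    ->condition('item_id', $chunk, 'IN')
    ->execute();
  watchdog('field_collection', 'Deleted @count orphaned field collection items.', array('@count' => count($chunk)), WATCHDOG_INFO);
}
?>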

itsekhmistro’s picture

Hi, the patch in comment #8 doesn't follow Drupal practice: big DB updates should be processed in batches.

Attached are updated patches that perform field_collection_update_7002 in batches.
Note: it is highly recommended to run this update with drush for performance reasons.
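For reference, the practice in question is the $sandbox-based batching that core supports in hook_update_N(). The sketch below only shows the skeleton; the counting query and the per-pass work are placeholders, not the actual update 7002 logic.

<?php
// Skeleton of a batched hook_update_N(); the real query and per-pass
// processing belong where the comments indicate.
function field_collection_update_7002(&$sandbox) {
  if (!isset($sandbox['progress'])) {
    $sandbox['progress'] = 0;
    $sandbox['max'] = db_query('SELECT COUNT(item_id) FROM {field_collection_item}')->fetchField();
  }

  $limit = 1000;
  // Process the next $limit items here, starting from $sandbox['progress'].
  $sandbox['progress'] += $limit;

  // A value below 1 tells the update system to call this function again;
  // 1 (or more) means the update is finished.
  $sandbox['#finished'] = empty($sandbox['max']) ? 1 : min(1, $sandbox['progress'] / $sandbox['max']);

  if ($sandbox['#finished'] >= 1) {
    return t('Field collection items processed in batches.');
  }
}
?>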