Running cps_update_7107() on a site with a large number of changesets took several hours (at least - I tried it on a dev version of the site and after several hours it never finished). In addition (and one of the main reasons it's so slow), it resulted in tons and tons of entities being added to changesets that look like they don't belong there.

The update function as currently written basically adds every entity to every changeset. But what cps_add_untracked_entities() does (and what this update function is trying to approximate) is add every entity that existed at the time the changeset was published. Later entities should not be added.

It doesn't seem possible to go back and do exactly what cps_add_untracked_entities() would have done for each changeset, but we can get a lot closer. The patch I'm posting here gets the updates down to about 2 minutes, and orders of magnitude fewer entities added.

As described in the code comments, this still isn't perfect. It will still add some entities to certain changesets that don't belong there. Additionally, the (existing) code for determining which revision of the entity to add doesn't seem quite right either. But this seems to be about as close as I can figure out how to get.

The main consequence of getting this wrong seems to be that if a published changeset is ever reverted, we will go back to an archived version of the site that might not be 100% accurate. Some content might be there that shouldn't be, and the wrong version of some content might be shown. So this is a problem that might not be too serious, but in theory could be very serious. However at least this patch should get it closer to accurate than it previously was.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

David_Rothstein’s picture

Status: Active » Needs review
FileSize
6.91 KB

Here is the patch.

alexjarvis’s picture

Tiny tweak to patch, just replaces else if => elseif.

alexjarvis’s picture

Fabianx’s picture

Priority: Normal » Major
Status: Needs review » Reviewed & tested by the community

RTBC, marking major

catch’s picture

This looks good to me.

David_Rothstein’s picture

One caveat on this, if you've recently enabled a new CPS module (like cps_file_entity) and then run this update function right afterwards, it turns out that for the new entity type (files in this case) it goes back to the old behavior - inserts a record for every entity in every changeset. That's not really correct, I guess.

This is still an improvement over the previous update function, though.

Our workaround was just to change the order in which things happen - make sure this update function ran before the cps_file_entity module was enabled.

merlinofchaos’s picture

Status: Reviewed & tested by the community » Fixed

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

The last submitted patch, 1: cps-update-7107-fix-2452199-1.patch, failed testing.

The last submitted patch, 2: cps-update-7107-fix-2452199-2.patch, failed testing.

Status: Closed (fixed) » Needs work

The last submitted patch, 2: cps-update-7107-fix-2452199-2.patch, failed testing.

catch’s picture

Status: Needs work » Closed (fixed)