Drupal 10, the latest version of the open-source digital experience platform with even more features, is here.In #1033072: migrate interface takes forever (litterally) to load, I have a fix to improve the performance on huge hosting_package_instance tables. However, those huge tables are usually a sign of zombie entries more than huge deployments (although we do want to allow that so the index fix probably should stay anyways).
I wonder what is the proper way to cleanup such old installs. In #934864: platform verification tasks take forever, and then some more, omega8cc refers to that problem and specifically to those two comments: http://drupal.org/node/907248#comment-3429476 and this: http://drupal.org/node/907248#comment-3436884
I am not sure how does comment relate to the zombies, but I'd be curious about how to fix older installs.
Anyone?
| Comment | File | Size | Author |
|---|---|---|---|
| #11 | hosting-clean_up_zombie_packages-1034520-11.patch | 1.1 KB | ergonlogic |











Comments
Comment #1
Anonymous (not verified) CreditAttribution: Anonymous commentedIs there a clear discrepancy, i.e many more entries in hosting_package vs hosting_package_instance? Maybe we can delete anything that doesn't have an instance of that package? Although I suppose that is unsafe (unless we then force a Verify of platforms afterward in case we delete packages on platforms that haven't had sites installed on them).
Not really sure :s
Comment #2
omega8cc CreditAttribution: omega8cc commentedThis can be tricky, because you can have two categories of zombies there: duplicates and those "normal" - left from no longer used and/or modified platforms.
I'm afraid the only reliable and secure method would be to purge them all (all records) completely and re-verify all platforms. I didn't check the code yet, but I hope it will be enough.
[EDIT] Ah, full purge would require re-verifying also all sites. Not the best in the world idea, probably :(
Comment #3
bwood CreditAttribution: bwood commentedsubscribe
Comment #4
ergonlogicWe could build an array of active sites and platforms using _hosting_get_platforms() and hosting_get_sites_by_status(). We could then delete any entry in hosting_package_instance where the rid isn't among those nids.
The suggestion in #1 should work for hosting_package, since we add instances for platforms (without sites) too.
Presumably it would be safe to delete any entries in hosting_package_languages where there isn't a corresponding iid in hosting_package_instance, but I don't know if that's even a problem.
We could perhaps start calling hosting_package_instance_sync() with just the $rid argument in post site and platform delete hooks. Since there won't be any packages passed in, it should just delete any instances associated to rid. We could then do the one-off purge in an update hook.
I'm making this a feature request, and setting as 'needs review' to get opinions. Since I've just been working with the packages system, I'd be happy to add this fix, if we agree that it's a valid path to pursue.
Comment #5
anarcat CreditAttribution: anarcat commentedI wonder if we shouldn't do what is suggested in #2036283: Make Aegir aware of site-specific packages, that is TRUNCATE the _instance table and rebuild it. But instead of doing a verify on each site (which will trash the cache), why couldn't do a lightweight version somehow?
Maybe we could work on a temporary table or something... We *do* have all the data anyways, it's just that there's too much of it. Now, within #2036283: Make Aegir aware of site-specific packages we actually change the database structure, so that may require an actual verify to get more data from the backend, but that's a different story.
Comment #6
ergonlogicWe could start here, since it should at least avoid additional build-up of zombie entries. Then we could work out how to do a one-time cleanup.
Comment #7
anarcat CreditAttribution: anarcat commentedWhat I am worried is that there is some data in the instance table that is not associated with a package or even a platform, so even if we call it with $rid, it will not clean anything up...
Comment #8
omega8cc CreditAttribution: omega8cc commentedYes, there is usually a ton of duplicated leftovers, so we probably need to re-build it from scratch, which in turn should trigger re-verify for platforms and sites :/
Comment #9
ergonlogicRight, but I don't think that doing a one-time cleanup will fix this in the long run. We don't currently remove package instances when platforms or sites are deleted:
So, what I'm suggesting in #6 is that we, at least, start cleaning up after ourselves systematically.
Comment #10
ergonlogicAlso, while we define hosting_package_instance_delete(), we don't actually call it from anywhere. We could change it to delete an array of iid's, or write a new hosting_package_instances_delete() to do it. Then:
Orphans instances should get deleted by doing this, regardless of whether they have an rid or not. This would be pretty easy to do in a hook_update(), without having to verify everything.
Comment #11
ergonlogicHere's a patch for #6
Comment #12
anarcat CreditAttribution: anarcat commentedOooh, I see. Of course we need to cleanup after ourselves! :)
I see what you mean, that's a good way to go!
Comment #13
ergonlogicI merged the patch in #11 in 210d94f. There may be more cleanup like that we can do too.
I'll work on #10, unless anyone has better suggestions?
Comment #14
anarcat CreditAttribution: anarcat commentedgo.
Comment #15
mrfelton CreditAttribution: mrfelton commentedManually applied the patch in #11 to my aegir 6.10 install (it didn't apply cleanly), truncated my hosting_package_instance table and reverified all platforms and sites. My hosting_package_instance has reduced in size from 2297248 rows to just 72665, and aegir is far speedier now taking about 5 seconds to load the migrate or clone modal dialog, vs the 30 or so it took before hand. I have confirmed that deleting a platform now actually deletes entries from the hosting_package_instance table which should help to keep its size down. Thanks for the tips, shame this didn't make it into the recent 1.10 release though.
Comment #16
ergonlogicWe don't have a patch for this anymore :)