My hosting_task_log table is currently soaring over 200MB (!!), with log entries dating back to when I first installed Aegir. Is there some mechanism to clear this table or set the maximum age or number of entries? In the short-term, is it safe to manually empty this table?

Comments

Steven Jones’s picture

You'd potentially lose logs associated with old task runs, but if you don't care about that data, then it should be okay to delete.

Maybe we should clear up 'old' node revisions, for some definition of 'old'?

Dane Powell’s picture

I've taken a crack at this but I'm getting hung up on the rather complex world of Drush / Provision architecture, and I could use a pointer.

Basically, I've created a task called 'Delete tasks' that you can run on a site to (duh) delete tasks and task logs. In order for this to work, I've created a provision-task_delete Drush command that takes a list of nids as an argument and deletes those tasks in the Aegir database using node_delete(). The problem is that the Drush command is getting called in the context of the affected site, rather than the Aegir database, so node_delete is actually running on that site's database. How can I run node_delete() on the Aegir database?

Steven Jones’s picture

Going via the backend seems an unnecessary step to me; the front end is a standard Drupal site after all. Would something like the Revision deletion module suffice for the garbage collection?

Dane Powell’s picture

Thanks for the tip - I released my module at https://drupal.org/project/hosting_task_gc

Initially I wanted to make it as general as possible, which is why I tried to implement the provision task. I changed my mind and ended up making it dead-simple, since that gets 90% of the job done with 10% of the effort :) It's only processed 1/3 of the deleted sites in my Aegir installation, and already my {hosting_task_log} table has shrunk from 1GB to 700MB!

Basically, the module deletes any tasks (entries in {hosting_task}) associated with deleted sites. It also deletes any 'orphaned' task logs (entries in {hosting_task_log}). This seems like a sane initial solution - later on, it might make sense to allow for deletion of old logs on live sites, for some definition of 'old'. But like I said, this simple solution solves a big part of the problem (for me anyway).

The orphaned task logs occur due to a possible design flaw in Aegir: when tasks get deleted, their associated task logs are not deleted. There should probably be a hook_delete to take care of this. However, in order for this to work, an 'nid' field would need to be added to {hosting_task_log}.
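
To make the orphan problem concrete, here is a minimal sketch using Python's sqlite3 module and an in-memory database. It is not Aegir's code: the table names come from this thread, but the column layouts are simplified assumptions, and the helper names (build_demo_db, count_orphaned_logs, delete_orphaned_logs) are hypothetical. It treats a log row as orphaned when its vid no longer matches any row in hosting_task.

```python
import sqlite3


def build_demo_db():
    """Create a tiny stand-in for the two tables (assumed, simplified columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE hosting_task (vid INTEGER PRIMARY KEY, nid INTEGER NOT NULL);
        CREATE TABLE hosting_task_log (
            lid INTEGER PRIMARY KEY, vid INTEGER, message TEXT);
        INSERT INTO hosting_task (vid, nid) VALUES (1, 10), (2, 11);
        -- vid 3 belongs to a task that has already been deleted.
        INSERT INTO hosting_task_log (vid, message) VALUES
            (1, 'a'), (1, 'b'), (2, 'c'), (3, 'orphan 1'), (3, 'orphan 2');
    """)
    return conn


ORPHAN_CONDITION = (
    "NOT EXISTS (SELECT 1 FROM hosting_task t "
    "WHERE t.vid = hosting_task_log.vid)"
)


def count_orphaned_logs(conn):
    """Count log rows whose task revision no longer exists."""
    sql = "SELECT COUNT(*) FROM hosting_task_log WHERE " + ORPHAN_CONDITION
    return conn.execute(sql).fetchone()[0]


def delete_orphaned_logs(conn):
    """Delete orphaned log rows and return how many were removed."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "DELETE FROM hosting_task_log WHERE " + ORPHAN_CONDITION)
    return cur.rowcount
```

Counting before deleting is a cheap way to see how much of the table is dead weight before touching a production database.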

I'm leaving this issue open if you want to 'fix' this problem, but feel free to close it as 'won't fix'.

Steven Jones’s picture

Title: Periodically clean up table hosting_task_log » hosting_task_log entries are not deleted when task is
Project: Hosting » Hostmaster (Aegir)
Version: 6.x-0.4-alpha3 » 6.x-1.7
Category: feature » bug

The orphaned task logs occur due to a possible design flaw in Aegir: when tasks get deleted, their associated task logs are not deleted. There should probably be a hook_delete to take care of this. However, in order for this to work, an 'nid' field would need to be added to {hosting_task_log}.

That's an Aegir bug if we're not cleaning up our tables as we go.

Dane Powell’s picture

Status: Active » Needs review
FileSize
2.77 KB

I think the attached patch should fix this. It deletes task logs in hook_delete() and hook_delete_revision(), and adds the necessary nid field to the {hosting_task_log} table. However, it is completely untested. My only Aegir installation is a production one, so I can't really test this myself.
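
The cleanup idea can be sketched outside Drupal. The following is only an analogy in Python with sqlite3, not the attached patch: delete_task and delete_task_revision are hypothetical stand-ins for what hook_delete() and hook_delete_revision() would do, and the column layouts are simplified assumptions.

```python
import sqlite3


def build_demo_db():
    """Tiny stand-in schema where the log table already carries nid."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE hosting_task (vid INTEGER PRIMARY KEY, nid INTEGER NOT NULL);
        CREATE TABLE hosting_task_log (
            lid INTEGER PRIMARY KEY, nid INTEGER, vid INTEGER, message TEXT);
        -- Task node 10 has two revisions (vid 1 and 2); node 11 has one.
        INSERT INTO hosting_task (vid, nid) VALUES (1, 10), (2, 10), (3, 11);
        INSERT INTO hosting_task_log (nid, vid, message) VALUES
            (10, 1, 'first run'), (10, 2, 'second run'), (11, 3, 'other task');
    """)
    return conn


def delete_task_revision(conn, vid):
    """Analogue of hook_delete_revision(): drop one revision and its logs."""
    with conn:
        conn.execute("DELETE FROM hosting_task_log WHERE vid = ?", (vid,))
        conn.execute("DELETE FROM hosting_task WHERE vid = ?", (vid,))


def delete_task(conn, nid):
    """Analogue of hook_delete(): drop a whole task node and all its logs."""
    with conn:
        conn.execute("DELETE FROM hosting_task_log WHERE nid = ?", (nid,))
        conn.execute("DELETE FROM hosting_task WHERE nid = ?", (nid,))


def log_count(conn):
    """Return the number of rows left in the log table."""
    return conn.execute("SELECT COUNT(*) FROM hosting_task_log").fetchone()[0]
```

Deleting by nid is what makes the new column necessary: with only vid in the log table, removing a whole node would require looking up every revision first.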

A couple of other notes: update 6006 might be VERY resource-intensive, as it has to update quite a huge number of rows. Not sure if that's a problem or what to do about it. Also, I think updates should be counting like 61xx, not 60xx. Don't know if that's worth fixing at this point or not.

Steven Jones’s picture

Assigned: Unassigned » Steven Jones

Thanks for the patch, I can review it in a bit.

anarcat’s picture

Keep in mind that to be complete, this patch would need to remove the *existing* orphaned entries in a hook_update_N() function so that we clean up existing installs... no?

Dane Powell’s picture

Sorry, I forgot to mention in my summary in #6 that the patch does delete existing orphaned entries. Update 6006 adds the nid field to {hosting_task_log}, populates the nid field, then removes orphaned entries.
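
The three steps of the update (add the column, populate it, remove orphans) can be sketched as follows. This is a Python and sqlite3 illustration under simplified, assumed column layouts, not the code of update 6006; build_legacy_db and run_update are hypothetical names. In this sketch, orphans are simply the rows that still have no nid after the populate step.

```python
import sqlite3


def build_legacy_db():
    """Stand-in for the pre-update schema: the log table has no nid column."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE hosting_task (vid INTEGER PRIMARY KEY, nid INTEGER NOT NULL);
        CREATE TABLE hosting_task_log (
            lid INTEGER PRIMARY KEY, vid INTEGER, message TEXT);
        INSERT INTO hosting_task (vid, nid) VALUES (1, 10), (2, 11);
        -- vid 99 has no matching task, so its log row is an orphan.
        INSERT INTO hosting_task_log (vid, message) VALUES
            (1, 'a'), (1, 'b'), (2, 'c'), (99, 'orphan');
    """)
    return conn


def run_update(conn):
    """Add nid, populate it from hosting_task, then drop unmatched rows."""
    with conn:
        # Step 1: add the new column (NULL for every existing row).
        conn.execute("ALTER TABLE hosting_task_log ADD COLUMN nid INTEGER")
        # Step 2: copy nid from the task that owns each log's revision.
        conn.execute("""
            UPDATE hosting_task_log
               SET nid = (SELECT t.nid FROM hosting_task t
                           WHERE t.vid = hosting_task_log.vid)
        """)
        # Step 3: rows left with a NULL nid had no matching task.
        cur = conn.execute("DELETE FROM hosting_task_log WHERE nid IS NULL")
    return cur.rowcount
```

On a table with a very large number of rows, the populate step touches every row once, which is where the resource concern raised earlier in the thread comes from.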

Steven Jones’s picture

So I set up a test site with about 450k log entries, and tested out your patch. We can improve it in a couple of ways: first, by adding an index on the vid column (and on the nid column while we're there); second, by using a simpler query to remove the orphaned task logs.

I've added a follow up commit that adds those indexes, and adds the nid to the add log function.
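
As a rough illustration of those two changes, here is a sketch in Python with sqlite3. It is not the follow-up commit: the index names, the simplified columns, and the helpers add_log_indexes, add_log and index_names are all assumptions made for the example.

```python
import sqlite3


def build_demo_db():
    """Stand-in log table with both vid and nid, but no indexes yet."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE hosting_task_log (
            lid INTEGER PRIMARY KEY, nid INTEGER, vid INTEGER, message TEXT)
    """)
    conn.commit()
    return conn


def add_log_indexes(conn):
    """Index the columns used to look up and delete logs (names are made up)."""
    with conn:
        conn.execute("CREATE INDEX IF NOT EXISTS hosting_task_log_vid "
                     "ON hosting_task_log (vid)")
        conn.execute("CREATE INDEX IF NOT EXISTS hosting_task_log_nid "
                     "ON hosting_task_log (nid)")


def add_log(conn, nid, vid, message):
    """Write a log row that records nid alongside vid; return its lid."""
    with conn:
        cur = conn.execute(
            "INSERT INTO hosting_task_log (nid, vid, message) VALUES (?, ?, ?)",
            (nid, vid, message))
    return cur.lastrowid


def index_names(conn):
    """Return the set of index names defined on the log table."""
    rows = conn.execute("PRAGMA index_list(hosting_task_log)").fetchall()
    return {row[1] for row in rows}
```

Storing nid at write time means new log rows never need the backfill that existing rows required.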

I'll let it sit for a few days and then get this in.

Dane Powell’s picture

Looks reasonable, thanks for following up.

Steven Jones’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
7.62 KB

Here's another little follow-up, which I'm going to commit.

Steven Jones’s picture

Status: Reviewed & tested by the community » Fixed

Pushed into both 6.x-1.x and 6.x-2.x

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

  • Commit a2469cc on 6.x-2.x, dev-ssl-ip-allocation-refactor, dev-1205458-move_sites_out_of_platforms, 7.x-3.x, dev-588728-views-integration, dev-1403208-new_roles, dev-helmo-3.x authored by Dane Powell, committed by Steven Jones:
    Issue #1185690 by Dane Powell: Apply patch for hosting_task_log()...
  • Commit 94a0a91 on 6.x-2.x, dev-ssl-ip-allocation-refactor, dev-1205458-move_sites_out_of_platforms, 7.x-3.x, dev-588728-views-integration, dev-1403208-new_roles, dev-helmo-3.x by Steven Jones:
    Issue #1185690 by Steven Jones: Fixed hosting_task_log() entries are not...