I am posting this mostly since I have been pondering this for a long time. I do not yet have any plans or thoughts or a patch on how to approach this, which is why I am tagging this issue for 2.x.

I think we all have a serious love/hate relationship with the Aegir task queue. It works for us, so we use it. It's a brilliant thing, but it also sucks in a lot of ways. The developer experience is actually worse than the user experience.

The way drush @hostmaster hosting-tasks jumps to drush @site provision-verify is both brilliant and painful. When you are trying to write some customization it is nearly impossible to figure out where you are in the chain.

If we can make drush @hostmaster hosting-dispatch optional and allow other systems like Jenkins to run our queues, it would make the tasks and logging much more reliable and robust and, more importantly, able to integrate with other systems.

If we treat tasks as distinct entities in 7.x, this wouldn't be too tough, I think.

Comments

anarcat’s picture

I agree with you. I started looking into this a long time ago, but never wrote any code. See http://community.aegirproject.org/node/284 for my evaluation of queueing systems. My idea back then was to make hostmaster just a frontend to a queueing system, which means it could handle a lot more than the current "Drush and folks" installations (see #1044678: alien platform support).

Not sure that is doable in the 2.x timeline though.

helmo’s picture

Version: 7.x-2.x-dev » 7.x-3.x-dev

Moving to 7.x-3.x version tag (7.x-2.x never existed, we went for 6.x-2.x).

helmo’s picture

Issue summary: View changes

adding a note about tasks being entities

ergonlogic’s picture

Issue summary: View changes

I think the solution here could involve #2714065: Make the task queue a service, since our services system makes pluggability pretty easy to implement. It also doesn't make assumptions about where services run, and so would presumably also help towards moving away from the current hub-and-spoke model.

Basically, core could ship with a TaskQueue feature defining the base object and a default Database implementation, along with a TaskQueueRunner feature that provides Cron and QueueDaemon as default implementations. hosting_add_task() et al. could then instantiate the proper object and call the appropriate methods.

Then alternatives would be pretty straightforward to implement as well. Of course, at this point, these services would only make sense to run on master servers, so we may want to add a shim to limit the availability of such services to master servers, until we figure out the mesh model in a little more depth.
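To make the shape of that proposal a bit more concrete, here is a minimal sketch of a base queue object, a default implementation, and a runner. It is written in Python purely for illustration; Aegir itself is PHP, and none of this is existing Aegir code. The class names TaskQueue and TaskQueueRunner are borrowed from the comment above, while InMemoryQueue (standing in for the Database implementation), the method names, and the limit parameter are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Callable, Optional


class TaskQueue(ABC):
    """Base object: the only interface callers and runners depend on."""

    @abstractmethod
    def add(self, task_id: int) -> None:
        """Queue a task by its ID."""

    @abstractmethod
    def claim(self) -> Optional[int]:
        """Return the next task ID, or None if the queue is empty."""


class InMemoryQueue(TaskQueue):
    """Hypothetical stand-in for the default Database implementation (FIFO)."""

    def __init__(self) -> None:
        self._items = deque()

    def add(self, task_id: int) -> None:
        self._items.append(task_id)

    def claim(self) -> Optional[int]:
        return self._items.popleft() if self._items else None


class TaskQueueRunner:
    """Runs queued tasks; a Cron or QueueDaemon runner would call run_once()."""

    def __init__(self, queue: TaskQueue, execute: Callable[[int], None]) -> None:
        self.queue = queue
        self.execute = execute  # assumption: how a task is executed is injected

    def run_once(self, limit: int = 5) -> int:
        """Execute at most `limit` tasks and return how many were run."""
        ran = 0
        while ran < limit:
            task_id = self.queue.claim()
            if task_id is None:
                break
            self.execute(task_id)
            ran += 1
        return ran
```

With this split, an alternative backend only has to implement add() and claim(), and an alternative runner only has to decide when to call run_once().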

jon pugh’s picture

Category: Feature request » Support request
Priority: Major » Minor
Status: Active » Closed (works as designed)

I should have posted this back in November!

I created a module that wires up Jenkins to run Hostmaster tasks.

https://github.com/opendevshop/hosting_task_jenkins/

  1. On hook_insert of the task, it pings Jenkins via its REST API to run a predefined Jenkins job, with the task NID as the argument.
  2. The Jenkins job is configured to SSH in as aegir@server_master and run drush @hostmaster hosting-task $NID
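The two steps above can be sketched roughly as follows. This is not the module's code (the module is a Drupal module); it is a small Python illustration of the flow. It assumes a parameterized Jenkins job triggered through Jenkins' buildWithParameters endpoint, and the base URL, the job name "aegir-task", and the parameter name NID are placeholders. A real request would also need Jenkins credentials, which are left out here.

```python
import shlex
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_jenkins_trigger(base_url: str, job: str, task_nid: int) -> Request:
    """Build (but do not send) the POST asking Jenkins to run the job.

    Assumption: the job is parameterized and takes the task NID as "NID".
    """
    query = urlencode({"NID": task_nid})
    url = f"{base_url.rstrip('/')}/job/{quote(job, safe='')}/buildWithParameters?{query}"
    return Request(url, data=b"", method="POST")


def drush_command(task_nid: int) -> str:
    """Command the Jenkins job would run over SSH as the aegir user."""
    return shlex.join(["drush", "@hostmaster", "hosting-task", str(task_nid)])
```

For example, build_jenkins_trigger("https://jenkins.example.com", "aegir-task", 123) yields a POST request for the hypothetical job, and drush_command(123) gives the command line from step 2 with $NID filled in.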

So, essentially, now that I know a little bit more about this system, I can speak more about where to go from here...

The hosting queue system is already "pluggable", essentially. The thing that runs drush @hostmaster hosting-task or hosting-tasks is arbitrary.

Regarding the discussion of externalizing the task queue: if tasks are being run on server_master, then whatever runs them needs to bootstrap @hostmaster to save metadata. If they are being run elsewhere, or even tracked in a different queue system, we need a basic REST API in @hostmaster so that they can report back their metadata.
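As a rough idea of what "reporting back" could look like from an external runner, here is a hedged Python sketch. No such REST API exists in @hostmaster as far as this thread goes, so the endpoint path, the JSON field names, and the status string are all hypothetical.

```python
import json
from typing import Iterable
from urllib.request import Request


def build_task_report(hostmaster_url: str, task_nid: int, status: str,
                      log_lines: Iterable[str]) -> Request:
    """Build (but do not send) a report of a finished task's metadata.

    Hypothetical: the /hosting/task/<nid>/report path and the payload
    fields are invented for illustration only.
    """
    payload = {"nid": task_nid, "status": status, "log": list(log_lines)}
    body = json.dumps(payload).encode("utf-8")
    url = f"{hostmaster_url.rstrip('/')}/hosting/task/{task_nid}/report"
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})
```

The point is only that an external runner needs some way to hand the task's status and log back to hosting_task; the exact transport and payload are open questions.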

Either way, we need to keep hosting_task pretty much as is. Storing the metadata about tasks in Drupal is really useful. We can come up with a lot of clever ways to run tasks, and we can even make sure that it is possible to sync hosting_task with some other queue, but I think that Aegir should always work out of the box with as few dependencies as possible.

Supporting the possibility of an external queue like Rabbit or Celery is good (and I think, API-wise, Drupal itself already allows us to do that), but I do not think we should make it a part of core Aegir.