I am posting this mostly since I have been pondering this for a long time. I do not yet have any plans or thoughts or a patch on how to approach this, which is why I am tagging this issue for 2.x.
I think we all have a serious love/hate relationship with the Aegir task queue. It works for us, so we use it. It's a brilliant thing, but it also sucks in a lot of ways. The developer experience is actually worse than the user experience.
The way drush @hostmaster hosting-tasks jumps to drush @site provision-verify is both brilliant and painful. When you are trying to write some customization it is nearly impossible to figure out where you are in the chain.
If we can make drush @hostmaster hosting-dispatch optional, and allow other systems like Jenkins to run our queues, it would make the tasks and logging much more reliable and robust and, more importantly, able to integrate with other systems.
If we treat tasks as distinct entities in 7.x, this wouldn't be too tough, I think.
Comments
Comment #1
anarcat commented
I agree with you, and I started looking into this a long time ago, but never wrote any code. See http://community.aegirproject.org/node/284 for my evaluation of queueing systems. My idea back then was to make hostmaster just a frontend to a queuing system, which means it could handle a lot more than the current "Drush and folks" installations (see #1044678: alien platform support).
Not sure that is doable in the 2.x timeline though.
Comment #2
helmo commented
Moving to the 7.x-3.x version tag (7.x-2.x never existed; we went for 6.x-2.x).
Comment #2.0
helmo commented
Adding a note about tasks being entities.
Comment #3
ergonlogic
I think the solution here could involve #2714065: Make the task queue a service, since our services system makes pluggability pretty easy to implement. It also doesn't make assumptions about where services run, and so would presumably also help towards moving away from the current hub-and-spoke model.
Basically, core could ship with a TaskQueue feature defining the base object, and a default Database implementation, along with a TaskQueueRunner feature that implements Cron and QueueDaemon as default implementations.
hosting_add_task() et al. could then instantiate the proper object and call the appropriate methods. Alternatives would then be pretty straightforward to implement as well. Of course, at this point, these services would only make sense to run on master servers, so we may want to add a shim to limit the availability of such services to master servers, until we figure out the mesh model in a little more depth.
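A rough sketch of that split, written in Python purely for illustration (Aegir itself is PHP). Every class, method, and function name here is hypothetical, and an in-memory queue stands in for the default Database implementation:

```python
from abc import ABC, abstractmethod
from collections import deque


class TaskQueue(ABC):
    """Hypothetical base object that a TaskQueue feature would define."""

    @abstractmethod
    def add(self, task_type, target):
        """Queue a task and return its id."""

    @abstractmethod
    def claim(self):
        """Return the next queued task, or None when the queue is empty."""


class InMemoryTaskQueue(TaskQueue):
    """Stand-in for the default Database implementation (assumption)."""

    def __init__(self):
        self._tasks = deque()
        self._next_id = 1

    def add(self, task_type, target):
        task = {"id": self._next_id, "type": task_type, "target": target}
        self._next_id += 1
        self._tasks.append(task)
        return task["id"]

    def claim(self):
        return self._tasks.popleft() if self._tasks else None


class TaskQueueRunner:
    """Drains a queue. A cron job and a queue daemon would differ only in
    when and how often they call run()."""

    def __init__(self, queue, execute):
        self.queue = queue
        self.execute = execute  # callable that performs a single task

    def run(self, limit=5):
        done = []
        while len(done) < limit:
            task = self.queue.claim()
            if task is None:
                break
            self.execute(task)
            done.append(task["id"])
        return done


def add_task(queue, task_type, target):
    """Plays the role hosting_add_task() would: delegate to the configured backend."""
    return queue.add(task_type, target)
```

Swapping in another backend would then mean writing one more TaskQueue subclass, with the runner left untouched.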
Comment #4
jon pugh
I should have posted this back in November!
I created a module that wires up Jenkins to run Hostmaster tasks.
https://github.com/opendevshop/hosting_task_jenkins/
drush @hostmaster hosting-task $NID
So, essentially, now that I know a little bit more about this system, I can speak more about where to go from here...
The hosting queue system is already "pluggable", essentially. The thing that runs drush @hostmaster hosting-task or hosting-tasks is arbitrary.
Regarding the discussion of externalizing the task queue: if tasks are being run on server_master, then whatever runs them needs to bootstrap @hostmaster to save metadata. If they are being run elsewhere, or even tracked in a different queue system, we need a basic REST API in @hostmaster so that they can report back their metadata.
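To make the "report back" idea concrete, here is a minimal sketch of what an external runner might do, in Python for illustration only. The drush command is the one quoted above; the report fields are assumptions, and no such REST endpoint exists in @hostmaster today, so the sketch stops at building the JSON payload rather than sending it anywhere:

```python
import json
import subprocess


def build_command(task_nid):
    # Command taken from this thread: drush @hostmaster hosting-task $NID
    return ["drush", "@hostmaster", "hosting-task", str(task_nid)]


def run_task(task_nid, runner=subprocess.run):
    """Run one task and return a JSON report of what happened.

    The field names in the report are hypothetical; a real API would
    define its own schema.
    """
    result = runner(build_command(task_nid), capture_output=True, text=True)
    report = {
        "task": task_nid,
        "exit_code": result.returncode,
        "log": result.stdout,
    }
    return json.dumps(report)
```

The runner argument exists so the command execution can be replaced, for example by a Jenkins-side wrapper or a test double, without changing how the report is built.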
Either way, we need to keep hosting_task pretty much as is. Storing the metadata about tasks in Drupal is really useful. We can come up with a lot of clever ways to run tasks, and we can even make sure that it is possible to sync hosting_task with some other queue, but I think that Aegir should always work out of the box with as few dependencies as possible.
Supporting the possibility of an external queue like Rabbit or Celery is good (and I think, API-wise, Drupal itself already allows us to do that), but I do not think we should make it a part of core Aegir.