Scenario:

Suppose you have a particular test plugin which takes a very long time to run, relative to other tests. In order to ensure that it doesn't block testing, you want to give that particular plugin type a dedicated worker ... but also want that worker to be able to process other plugin types when it would otherwise be idle.

Currently (afaik), dedicating a worker to a plugin type is an all-or-nothing action ... a worker has its own 'worker_category' value, and pulls only from that queue. This architecture, similar to PIFR, works well for creating dedicated worker 'environments' (i.e. you wouldn't ever want a mysql bot to help offload a postgres one), but is less ideal for different 'plugins'.

Proposal:

If we were to change the 'worker_category' variable to be an array, rather than a string, we could then define multiple 'categories' per worker. The order of elements within the array would define their priority within the set of categories supported. When fetching a job, the worker would attempt to fetch from the queue associated with the first category in the array. If this came back empty, the worker would then attempt to fetch from the queue associated with the next category in the array instead of going straight into the 'sleeping for one minute' waiting period.
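The fetch order described above can be sketched from the worker's side. This is a minimal illustration in Python rather than the project's PHP; `fetch_next_job`, `fetch_from_memory`, and the in-memory `queues` dict are hypothetical names standing in for whatever call the worker actually makes to claim a job, and the category names are made up for the example.

```python
from collections import deque


def fetch_next_job(fetch, categories):
    """Try each category in priority order; return (category, job) or None.

    `fetch` is any callable that takes a category name and returns a job,
    or None when that category's queue is empty (hypothetical interface).
    """
    for category in categories:
        job = fetch(category)
        if job is not None:
            return category, job
    # Every configured queue was empty: only now would the worker sleep.
    return None


# In-memory stand-in for the server-side queues (assumption for the demo).
queues = {"slow_plugin": deque(), "default": deque(["job-a", "job-b"])}


def fetch_from_memory(category):
    queue = queues.get(category)
    return queue.popleft() if queue else None


# 'worker_category' as an ordered array: dedicated queue first, then fallback.
worker_category = ["slow_plugin", "default"]

# The dedicated queue is empty, so the worker falls through to 'default'.
first = fetch_next_job(fetch_from_memory, worker_category)

# A slow-plugin job arrives; it wins the next fetch despite 'job-b' waiting.
queues["slow_plugin"].append("job-slow")
second = fetch_next_job(fetch_from_memory, worker_category)
```

Here the worker only reaches its idle wait when every category in its array comes back empty, which is the behaviour the proposal asks for.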

In addition to the scenario above, this would also allow us to do things like set up different queues for core versus contrib, and configure some workers to prioritize each category of tests. In the absence of any outstanding contrib tests, the contrib worker would then pick up core tests ... but any contrib patches which came in while that core test is being executed would be bumped to the front of the line for that worker once the core patch was complete, even if there were other core tests queued up before them. (Given that contrib tests often take far fewer resources than a core test, letting them run immediately results in more efficient processing from an 'average wait time' perspective.)
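The 'average wait time' argument can be checked with a little arithmetic. The sketch below is illustrative only: the 60-minute core run and 5-minute contrib run are assumed figures, not measurements, and `average_wait` is a hypothetical helper. It compares a single worker taking two waiting tests in arrival order against the same worker taking the contrib test first.

```python
def average_wait(durations):
    """Average time each job waits before starting, run back to back."""
    waits = []
    elapsed = 0
    for duration in durations:
        waits.append(elapsed)  # this job starts once all earlier ones finish
        elapsed += duration
    return sum(waits) / len(waits)


# Assumed run times in minutes (illustrative, not measured).
CORE_MINUTES = 60
CONTRIB_MINUTES = 5

# Arrival order: the queued core test runs first, the contrib test waits.
fifo_wait = average_wait([CORE_MINUTES, CONTRIB_MINUTES])

# Contrib prioritized: the short test runs first, the core test waits.
contrib_first_wait = average_wait([CONTRIB_MINUTES, CORE_MINUTES])
```

Under these assumed durations the average wait drops from 30 minutes to 2.5 minutes, because the long test is delayed only by the length of the short one.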

Comments

boombatower’s picture

Seems like a nice thing to have. I would probably modify the services API to allow an array to be passed as the category, or add a new method, instead of making multiple calls, since that seems like a waste if we really use this.

This will require some sort of mechanism on the backend to allow jobs to be placed in specific queues. We can simply make a property for this, but I am not sure if we should allow any string or try to define queue somehow.

One of my longstanding things to consider is to provide an info hook for plugins which allows them to define their default queue (we could also add a hook to add custom queues). My original design was to eliminate as many custom hooks as possible and just make plugin building look like writing a Drupal module. As such it automatically figures out the queue and such based on module name, but that is rather restrictive and not flexible. Changing all that would be a fair amount of work and I am not 100% sure it is related to this.

I am going to be focusing on pift rewrite since none of these features in this queue are required to launch (other than commit one).

jthorson’s picture

> I am going to be focusing on pift rewrite since none of these features in this queue are required to launch (other than commit one).

Fair enough ... my preference would be to nail down the foundational architecture (which I'd put at around 90% right now), and then move on; but so long as it's built in such a way to support this in the future without serious refactoring, I'm game.

But before going full bore on PIFT, I'd encourage some attention on http://drupal.org/node/1666146 ... 100% of the jobs I run end with 'job crashed' because of it.

jthorson’s picture

Project: Worker (Sandbox) » Worker
Version: » 7.x-1.x-dev
Component: Miscellaneous » Code

jthorson’s picture

Project: Worker » Conduit
Attached file (status: new, size: 3 KB)

Pretty simple, really!

jthorson’s picture

Status: Active » Needs review

boombatower’s picture

Status: Needs review » Needs work

+++ b/includes/api.inc
@@ -9,15 +9,15 @@
+ * @param $categories
+ *   Comma separated list of job categories from which to claim an item from.

Why not just use an array? Since this is also intended to be used as an API for extensions, a comma-separated list seems less than ideal.

This would obviously apply throughout patch.

+++ b/includes/queue.inc
@@ -45,28 +45,31 @@ function conduit_queue_create($job) {
+  foreach ($queues as $category) {

I think the looping should be done in conduit_api_claim() and keep queue as a 1:1 interface for the queue API.
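A rough sketch of that layering, written in Python for illustration rather than as Conduit's PHP: the queue layer only ever deals with one category per call, and the API-level claim function owns the loop over categories. `CategoryQueue` and `api_claim` are hypothetical stand-ins, not the actual conduit_queue_* or conduit_api_claim() implementations.

```python
from collections import deque


class CategoryQueue:
    """Queue layer: a 1:1 interface, one category per call."""

    def __init__(self):
        self._items = {}

    def create(self, category, job):
        self._items.setdefault(category, deque()).append(job)

    def claim(self, category):
        """Return the oldest job in this one category, or None if empty."""
        items = self._items.get(category)
        return items.popleft() if items else None


def api_claim(queue, categories):
    """API layer: loop over categories in priority order."""
    for category in categories:
        job = queue.claim(category)
        if job is not None:
            return job
    return None


queue = CategoryQueue()
queue.create("core", "core-1")

# 'contrib' is empty, so the claim falls through to 'core'.
claimed = api_claim(queue, ["contrib", "core"])
```

Keeping the loop out of the queue layer means the queue code stays a thin wrapper around single-queue operations, while the priority policy lives in one place.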

jthorson’s picture

> Why not just use an array? Since this is also intended to be used as an API for extensions, a comma-separated list seems less than ideal.

Really, the only reason I left it a string was to keep the patch local to a single project, rather than having an issue in Conduit and a duplicate one in Worker (thus having to co-ordinate and commit to both projects simultaneously).

This is one benefit of the combined server/client project setup in PIFR ... you can modify the communications exchange on both sides with a single patch. At least in the early development phases, I still believe a single repository would help simplify things.

> I think the looping should be done in conduit_api_claim() and keep queue as a 1:1 interface for the queue API.

Makes sense.

jthorson’s picture

Issue summary: View changes

Updated issue summary.