I've set the cron queue in Barracuda hostmaster to run every minute, and it seems that the aegir user's crontab:
*/1 * * * * /var/aegir/drush/drush.php '@hostmaster' hosting-dispatch
...is indeed running every minute. The '/admin/hosting/queues' interface also seems to think it's running every minute (it never says more than '1 minute ago' for the cron queue).
Sites seem to get a cron run (per elysia or '/admin/reports/status') every 4-10 minutes, but certainly not every minute.
Sites use the elysia_cron module to manage cron, and certain cron functions really do need to run every minute (although these are few and far between; most are hourly).
Comment | File | Size | Author |
---|---|---|---|
#16 | hosting-1405904-parallel-queue.patch | 9.17 KB | Steven Jones |
#6 | Aegir-Cron-CPU-usage.png | 33.38 KB | Steven Jones |
Comments
Comment #1
omega8cc commented:
I'm not sure it is related to Barracuda, as the main Hostmaster instance runs a vanilla Aegir system, including cron settings.
Note that Aegir will never run cron for *all* sites every minute if the number of sites is higher than the Aegir default threshold (calculated internally). This has nothing to do with BOA anyway, I'm afraid, unless you believe that this mod affects anything here: http://drupalcode.org/sandbox/omega8cc/1074912.git/blob/HEAD:/modules/ho...
Please try to debug this, and maybe move the issue to the Hostmaster queue.
Comment #2
obrienmd commented:
Thank you, will debug further with that information and probably move to hostmaster.
Comment #3
obrienmd commented:
OK, I did a bit more poking around, and see that this is caused by 'count' being set to 4 when hosting_cron_queue is called.
However, in the Aegir UI, it says '20 sites per [configurable time period]'. I have 20 sites on my aegir instance, and because of elysia_cron, running cron on all 20 every minute is not a problem (and needs to happen). Is there a reason that the Aegir UI is not showing an accurate # of sites that are going to run cron per hosting cron run?
Comment #4
helmo (as a volunteer and at Initfour websolutions) commented:
I'm not sure this is still relevant ... but any new development will have to be on 7.x-3.x or 4.x
--
This is a templated response, please re-open or comment if you think it's in error.
Comment #5
Steven Jones at ComputerMinds commented:
So the stuff around 'batch' queues like the cron queue is really weird!
The maths does work out but the labels being used are super confusing.
Suppose I have 100 sites and I want cron to run on them once a day, at the moment, the maths will mean that:
This means that over the 24 hours 102 cron invocations will indeed happen.
However, the cron runs will peak at the 4 hour points, and nothing will happen in between, unless it does actually take 4 hours to run those 17 crons.
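The arithmetic behind those figures can be sketched roughly as follows. This is only an illustration, assuming the batch queue splits the day into 6 runs spaced 4 hours apart (inferred from the 17-per-run and 102-total figures quoted here); the function name `batch_schedule` is hypothetical and is not part of Aegir.

```python
import math


def batch_schedule(total_sites, period_seconds, runs):
    """Return (items_per_run, seconds_between_runs, total_invocations).

    Hypothetical helper: models a 'batch' queue that processes a fixed-size
    chunk of sites at evenly spaced points in the period.
    """
    # Round up so that every site gets at least one cron run per period.
    items_per_run = math.ceil(total_sites / runs)
    interval = period_seconds // runs
    return items_per_run, interval, items_per_run * runs


items, interval, total = batch_schedule(100, 86400, 6)
print(items, interval // 3600, total)
```

Rounding up is what pushes the daily total slightly above the number of sites: each of the 6 runs handles 17 sites, so a couple of sites get an extra cron run.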
The stuff about 'threads' is a bit misleading I think. There's no real sense in which the cron queue is ever going to spin up 6 threads and execute cron on them in reality.
I reckon we could do a better job of spreading the load over the period selected. Not sure of the exact maths required though :)
(I have a Mathematics degree)
Comment #6
Steven Jones at ComputerMinds commented:
See this image (Aegir-Cron-CPU-usage.png):
This is basically showing that the Aegir cron implementation on this server runs cron at roughly 2-hour intervals with relatively big peaks, rather than spreading it out into the massive gaps in between.
Comment #7
memtkmcc commented:
We have replaced the stock cron module with hosting_advanced_cron, ported to D7. It is currently tweaked to run with BOA and should be improved to make it work with vanilla Aegir, but maybe it is worth a look?
Comment #8
Steven Jones at ComputerMinds commented:
It looks like that module still relies on the code in Aegir's dispatcher to start cron at specified intervals, so it would still need addressing in hosting.
Comment #9
memtkmcc commented:
Yes, correct. We don't experience these spikes only because people usually define a custom cron schedule per site, and the module doesn't allow cron to run for too many sites at once, while globally we set
hosting_queue_cron_frequency 1
and
hosting_cron_default_interval 86400
It is rather a side note, because I realize that such a change would require Aegir 4.x, while the fix for the current logic can be implemented in 3.x
Comment #10
Steven Jones at ComputerMinds commented:
So maybe we'll get rid of all this anyway once we have #422962: Add Hosting Tasks Schedule module squared away, but here's a very initial attempt at implementing a (badly named) 'parallel' queue that spreads the load evenly over the entire time span as much as possible.
Going back to my earlier example:
But now:
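As a rough picture of what "spreading the load evenly" could look like, here is a minimal sketch. It is not the algorithm from the attached patch: it simply assumes the dispatcher fires once a minute (1440 runs per day, as with the crontab in the original report) and hands each run its fair share of the sites. The function name `spread_counts` is hypothetical.

```python
def spread_counts(total_sites, runs):
    """Number of sites to process on each run, summing exactly to total_sites.

    Hypothetical helper: run i gets the difference between the cumulative
    share due after run i + 1 and the share due after run i.
    """
    return [
        (i + 1) * total_sites // runs - i * total_sites // runs
        for i in range(runs)
    ]


counts = spread_counts(100, 1440)
print(sum(counts), max(counts))
```

Under these assumptions no run ever handles more than one site, and the daily total is exactly 100 rather than 102, because nothing is rounded up per batch.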
Comment #11
ergonlogic commented:
As you mention, "parallel" doesn't seem to be a good name for this. Maybe "spread", "diffuse", or one of its synonyms? "Equalized"? "Flattened"?
There are a couple magic numbers in there. Can we convert those to descriptive constants? 86400 is obviously the number of seconds in a day, but 120 seems arbitrary. Maybe it's just me, but I'm having some trouble actually parsing what it's supposed to be doing there. Could we add some docs?
Also, this appears to be a change in the default behaviour. If we re-interpret this as a bug report, I think we can justify that change. Otherwise, we may have to hold off until Aegir 4.x (which would be a shame).
Comment #12
Steven Jones at ComputerMinds commented:
Spread, I like it.
Here's a patch that:
(We can then change the cron queue type to be spread in 4.x)
Comment #13
Steven Jones at ComputerMinds commented:
Sorry, bad patch.
Comment #14
ergonlogic commented:
There appears to still be one stray reference to "parallel":
* @see HOSTING_QUEUE_TYPE_PARALLEL
Otherwise, the code looks great! I like the new hooks. It'd be pretty trivial to add a little "hosting_cron_spread" module that invokes that hook, to allow users to opt in to the new behaviour right away.
I'm leaving this in "Needs review" for the time being, since I haven't had time to actually test this. +1
Comment #15
Steven Jones at ComputerMinds commented:
Oh, good spot, here's that change.
Comment #16
Steven Jones at ComputerMinds commented:
Gah, missed another one.
Comment #18
helmo (as a volunteer and at Initfour websolutions) commented:
It looks good, and as it does not change the current default ...
Committed.
PS: The 'type from batch to spread' change might also make for a nice example in the hosting.api.php file.