Problem/Motivation
hook_cron() is fairly dated and limited concept. One is that you can only have a single implementation per module, so if you need to do multiple things, you need to put it all in the same function. Second, there is no built-in scheduling, every cron hook in core runs at the same time.
Most of the time these days, when I write a cron it's a single line that calls out to a service, or I skip hook_cron() entirely and just directly set up a drush command or service call in an external cron system (using platform.sh definitions for example) when I need to have some specific execution times.
Modules like ultimate_cron and in D7 elysia_cron try to work around this but run into quite a few limitations that decreases DX. ultimate_cron provides the ability to set up schedules per cron job, multiple cron jobs per module (but those you need to register by manually creating config entities which confuses many people), per-job logging, multiple threads and so on. There is apparently a pretty big need for that, ultimate_cron has 40k installations, despite only having an alpha release and being barely maintained by me.
There's a very old existing issue for the scheduling aspect, I deliberately created a new one: #154043: Cron should not remain monolithic (Implement a scheduling hook for cron).
Steps to reproduce
Proposed resolution
I've discussed this a bit with @longwave and I can see two ways forward for improving cron.
1) Custom tagged services
We implement our own system using tagged services and an interface. We could start with a basic 1:1 replacement and then start to add more features on top of that, like crontab-like scheduling mechanism. At least the first step is very simple, straight-forward to deprecate and convert to. Scheduling gets a bit more complicated. Ultimate Cron could build on top of that by reusing the discovery and adding more features on top of that.
2) Symfony Scheduler/Messenger
We integrate the new Scheduler component in Symfony 6.3: https://symfony.com/blog/new-in-symfony-6-3-scheduler-component. That's a lot more involved. The scheduler depends on the Messenger component. Which is comparable to our queue system. So we would pretty much need to replace not just hook_cron() but also the entire queue system. At first, this dependency sounded pretty weird to my, why do you need queues to have a cron system? But I think it can be best explained to see those scheduled things as time-based queue items, whereas the regular queue/messenger entries are data-based. That opens up a lot of powerful features for sites that need it, like async/parallel execution of cron jobs with multiple workers. Something that ultimate_cron tries to do with threads.
That said, there are a lot of things to figure out. For example, Symfony relies on worker processes that keep running, we don't have that and I'm pretty much certain we don't want to require that, so we still need to have a single cron URL/command that you can start in repeating intervals. But I assume that should be doable. Symfony supports immediate processing and async through transports, which is more like our queues, the way things are quite different to how we manage it at the moment. You have handlers (queue plugin), messages (queue items), transports (storage/channel for messages going through, and then routing that sets up which messages are sent to which transport. I think we could have an AsyncMessageInterface and all messages that implement that are processed through a single async transport
One interesting bit is that the symfony mailer also integrates with the messenger component. So having both of those makes it very easy for a site that sends a lot of mails or needs to send them to through some slow-ish transport to send them all async through a worker.
Remaining tasks
User interface changes
API changes
Data model changes
Release notes snippet
| Comment | File | Size | Author |
|---|---|---|---|
| #59 | 3383487-nr-bot.txt | 90 bytes | needs-review-queue-bot |
| #57 | 3383487-nr-bot.txt | 90 bytes | needs-review-queue-bot |
| #48 | 3383487-nr-bot.txt | 90 bytes | needs-review-queue-bot |
| #34 | ultimate_cron_ui.png | 30.34 KB | berdir |
Issue fork drupal-3383487
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
kim.pepperSounds very interesting!
Comment #3
berdirInvestigating the scheduler a bit more, I feel like it's probably a bit early for the scheduler. Also looks like each scheduler is its own transport and needs to be run as a worker, I'm not sure if that fits into Drupal. Plus, it's experimental and the documentation is still literally a 404 page.
Comment #4
andypostComment #5
berdir@andypost: queue issues are only relevant if we would go with the Symfony solution, which I think is fairly unrealistic at this point. And if we do, every single issue that touches the queue system would be "related" as we'd completely replace.
Comment #6
andypost@Berdir yes, I mean option 2 looks more viable to me and there's already lots of attempts to decompose cron, so queue seems the most straightforward approach. Moreover SF invested a lot into messenger's architecture
Comment #8
berdirpushed a proof of concept to the merge request, a bit weird with reusing the existing CronInterface and didn't test yet at all, but seems simple enough :)
Comment #9
berdir> @Berdir yes, I mean option 2 looks more viable to me
I don't think it was ever more "viable", it would be far more complex and so much more work. It would be more flexible and open up more possibilities, but the current state of that component (zero documentation, experimental) makes this a simple choice for me. Note that we still have the option to look into replacing our queue system separately with the messenger component, then *afterwards* we could think about the scheduler.
Especially for the initial step of just converting to tagged services without extra features. That's very straight forward. Still haven't actually tested anything yet, but I've extend the proof of concept to fully replace system_cron().
Some notes on that:
* Yes I know there are some missing docblocks, documentation that needs to be updated, dependency injection to be done. That's all not very relevant right now, what we should discuss the overall API.
* As mentioned, I reused the existing CronInterface for it, but that's really a CronRunnerInterface (which I'm not sure really should exist at all). So we should probably introduce a new Interface that's very similar to that one.
* some system_cron() things are called on interfaces/official API's others only on specific implementations. I took some shortcuts with that for now, for example Flood has it on the interface but I'm just doing the cron thing on the database implementation. We should possibly deprecate some garbage collection interface methods and just let implementations handle it. For now, flood for example would need a separate service that calls that method on the flood service.
* Lazy loading those services would be nice, but that only really gets interesting when we support scheduling.
* BC: This will break installations using ultimate_cron completely, not easy to handle that. We could keep system_cron() and so on around but somehow flag it so that the implementation in core doesn't actually call it? Or we just live it, there are only a few alternative cron runner implementations.
Setting to needs review to get some feedback on that.
Comment #10
smustgrave commentedSeems to have some test failures.
Comment #11
aaronmchaleWhile we're modernizing Cron, could we also add a Console command for running Cron through the
drupalutility? (Probably a follow-up/separate issue, but figured if we have momentum behind doing something with Cron it might be a good idea to do at the same time)Comment #12
berdir> Seems to have some test failures.
It's a proof of concept patch converting a single hook to the proposed new patch. It's far from complete for much bigger reason than that but I'm looking for architecture review, so IMHO needs review is an appropriate status.
> While we're modernizing Cron, could we also add a Console command for running Cron through the drupal utility?
The public API to run cron is unaffected with the currently proposed approach and there is no reason to add a console command in this issue as well, this is going to be more than enough on its own. If you want that, feel free to open a separate issue.
Comment #13
mfbOne suggestion I have would be making it easy for contrib modules to monitor cron jobs, by e.g. firing an event when a cron job starts and stops. But, to keep this issue smaller that could likewise be a separate feature request.
Comment #14
berdirWow, I just couldn't understand why there's an exception thrown when it's called directly as a service but not on 11.x I was debugging through the installer and that site directory protection code, and I just could not see a difference.
Turns out there is no difference. The very same exception is also thrown on 11.x in that test, the only difference is that all exceptions are caught and "logged" for cron hooks, but I didn't have that part yet for services.
The problem is a conflict between the test user agent stuff, which stores the info in a .htkey file inside the sites directly, and the site directory protection logic, which runs before in the installer which makes that directly read-only. Which then later on the test reverts again (I guess?). I really wish we could just remove all that strange protection code (#3091285: Composer scaffolding fails when permissions on default.settings.yml or default.settings.php is not writable.).
But it's not the problem of this issue, I just accidentally exposed it.
I do want to move this forward, but before I do the work of converting all those cron hooks to services, I'd like some feedback on the general idea and architecture. Specifically, as mentioned, I think this should be new interface and method, looking for suggestions on how to name that.
Comment #15
berdirI converted a few more that were easy to move to an existing service. Looking at the others, I'm a bit unsure if we should really replace all of them in this issue and deprecate hook_cron() already. It's going to be a big issue and many of them are tied to existing issues to move functions into services, so we'd have to make up new services only to deprecate them again later.
Then again, actually getting all those issues in is likely going to take years if it ever happens. But we can always convert the leftovers to services any time we want.
Comment #16
kim.pepperThere is likely a bit of overlap here with cache clearing cron runs in #3014752: Convert drupal_flush_all_caches() function to a Cache Clearer service
Comment #17
larowlanHaha yes cron's pokemon exception handling has bitten me many times in kernel tests. So many times I wrote myself some notes 😭 https://www.previousnext.com.au/blog/making-your-drupal-8-kernel-tests-f...
Comment #18
dpi@Berdir, on Symfony Scheduler, I've found it works quite well!
See https://git.drupalcode.org/project/sm_scheduler + Documentation
Sure, I'm not so sure running a worker is something we've agreed apon. But how much different is it to many of us who already have an infrastructure level cron task. The equivalent would be running the Symfony Messenger consume command with a short time limit.
I feel like Scheduler is certainly the most modern approach proposed here.
Comment #19
berdirthat does look interesting and pretty much like I expected it would from reading the docs, but there are several things that sound like pretty hard blockers:
* having to run a consume process is a no-go and definitely not the same as a cron. Even on more advanced hosting like platform.sh, that implies a separate worker container, while a cron you just define and it runs within the environment. And regular webhosting won't be able to do that at all. That's why we ship with the ability to run cron through a URL (so you can do it externally and some webhostings support specifically that) and automated_cron. So unless we find a way to run consume on-cron or on the fly like automated_cron and the idea to run queues in destruct/shutdown as well, there is no way that we'll get that into core.
* I'm confused by the whole name thing. Having to explicitly specify the names really doesn't work IMHO, we can't require you to customize that command every time you install a module? Having the ability to do so seems like a powerful feature for very few sites that process a large amount of messages but I'd expect we need something like a consume --all and maybe then to be able to separate one of them a consume --except. The counterpart to that with the current queue system is to have queue definitions that do not run on cron.
Comment #20
dpiI'm sure a unified everything/all transport could be developed, which generates for all transports.
Like I said I think you can configure it to a time limit, so long as its done with the message its currently consuming. Just like
drush queue:run. The main thing with Scheduler is we'd need to make it produce messages for the time period the process was not actively running. Because normally scheduler wont "catch up" with messages that were not produced during the time the transport wasn't running. I dont think its especially challenging. Record time of process last run... simulate generation of messages for a time period... continue to generate messages until the process exits. Repeat.Comment #21
dpiThe consume process is just one way of running transports. You could do this programmatically with code, e.g. request terminate: https://git.drupalcode.org/project/sm_scheduler/-/blob/1.x/tests/src/Ker... . This code fragment is a slim version of what
\Symfony\Component\Messenger\Workerwould normally do during the consume process.Catch-up would be the main concern.
Comment #22
aaronmchale> > While we're modernizing Cron, could we also add a Console command for running Cron through the drupal utility?
>
> The public API to run cron is unaffected with the currently proposed approach and there is no reason to add a console command in this issue as well, this is going to be more than enough on its own. If you want that, feel free to open a separate issue.
That makes sense, yes I agree it would be better as a separate issue then.
Comment #23
berdir> Catch-up would be the main concern.
Yes, and if we do need to do that ourselves then how much exactly do we benefit in the end and how interchangeable will the system really be?
For me, based on what I've seen so far, the answer is fairly clear, also as (not very active but sole) maintainer of ultimate_cron.
1. Convert hook_cron() to tagged services + an interface is a straight forward step that allows us to remove another hook. It also allows each module and component to provide an arbitrary amount of cron jobs, ultimate_cron can discover them as well instead of the current manual step and just like now, allow to customize their execution.
2. Add features as deemed useful, for example the ability to define a desired schedule/frequency. ultimate_cron can support that as the default configuration and replace the current hardcoded assumptions for core. Sure, that will need exactly the same scheduling and catch-up logic, but that works in both ways and we could still use that if we switch in later steps.
....
N. If/after core successfully replaces its queue system with the messenger component, then we can look into using the scheduler component as well, and we'd IMHO lose very little, those tagged services can be converted to message handlers, the scheduling component from step 2 can be reused and so on. And ultimate_cron can go to its ultimate destination, the trash bin :)
Comment #24
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
This does not mean that the patch needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #25
berdirI fixed flood tests, although that's kinda weird. The testCleanup() method is only testing the memory backend aka the default in kernel tests. there is a database test as well, but not sure how useful it is to test the memory backend like that. It does point to a problem though, that it would break if someone were to replace the service with something that either doesn't add the tag or doesn't have the interface.
redis.module does that for example.
Also, Drupal\Tests\dblog\Kernel\DbLogTest::testDbLogCron is a problem too. that counts the hook implementations and then asserts the number of log entries that were created. First, seems wrong that the test coverage for that is in db log module, second, with services, it's a lot more and we don't have a way to know how many to expect.
Comment #26
larowlanLooking good, left some comments that largely mirror what you've already said in #8 and #9
Comment #27
berdirThanks for the feedback. Yeah, a separate Interface and method definitely makes sense, suggestions on the method name? Not sure what we'd call a service that has a has a cron method. I don't think CronRunner makes much sense that really seems like what the current CronInterface should be called (or CronScheduler or something). Looking at the symfony messenger/scheduler components I also don't really see an obvious concept/name we could use from there as the concepts are very different.
CronSubscriberInterface maybe, it's similar to the event system, instead of subscribing to an event, you subscribe to cron scheduling?
Do you also agree with this being an intermediate step where we introduce this as a parallel system to the hook and don't deprecate anything yet? As mentioned earlier, there are a lot of cron hooks related to things that are being converted to services in other issues, forcing just that part to a service now will just complicate that and then instead we can just update the relevant issues to directly use this system instead of going through hook_cron(). search/statistics stuff for example.
Then we can decide later if/when we want to deprecate the hook.
Comment #28
berdir> The issue summary talks about stuff like more advanced scheduling etc (e.g. only running every N cron runs etc)
Pretty sure that should be a follow-up. We can easily expand this and introduce annotations or something to allow implementations to say that they want to be called once per hour or once per day or so. And for now we can let ultimate_cron handle that, it just needs to support this new interface as well and then it can continue to function as it does now.
Comment #29
larowlanCronSubscriberInterface feels as good as any :)
I think an intermediate step makes sense, it would make it easier to make progress in increments
Comment #30
berdirAdded CronSubscriberInterface and new title, as we don't replace hook_cron() just yet.
For more scheduling features, we could then go back to using #154043: Cron should not remain monolithic (Implement a scheduling hook for cron) for that.
I'll also add a follow-up to deprecate hook_cron() and will prepare a change record.
Comment #31
dwwFWIW, +1 to
CronSubscriberInterfaceand the new scope / plan. This all sounds really good. Excited to see movement and improvements to this highly important but often neglected part of core.Comment #32
berdirOk, test are now passing, after fixing a lot of stupid bugs and typos ;)
Also created a change record: https://www.drupal.org/node/3414595.
To do:
* Follow-up issue to deprecate hook_cron() (or consider it, or at least convert more of core to services)
* documentation
* ultimate_cron issue to add support for this. I do want to test if discovery of this is working correctly there and how to handle that.
Comment #33
berdirAlso, I put the new Service directly into Drupal\Core as CronInterface and Cron are also there. Feels a bit weird, but also weird to create a new namespace just fro this new interface when the existing code is not there?
Comment #34
berdirUltimate Cron support: #3414863: Support CronSubscribers, it's not very pretty, but ultimate_cron already supports service methods, so I'm just converting them to use service:runCron as callback, with an assumption that the first part until the dot might be a module, so then assigning to that:
You could basically already do that with any service, but this allows to autodiscover those services.
Comment #35
kim.pepperI don't want bikeshed, but I really think
service:runCron()is the wrong name. We are not 'running cron' when calling this method. Instead, we are running when a cron event occurs. Something likeonCron()is better IMO.Comment #36
joachim commented+1 to #35.
> Feels a bit weird, but also weird to create a new namespace just fro this new interface when the existing code is not there?
I'd say put it in a new namespace, then file a follow-up to move the cron-ish things that are currently top-level. Otherwise, we'd adding to the classes that will need BC handling when we move it all later.
Comment #37
longwaveIt would also be great to use autoconfiguration, so you only have to implement the interface - the tag would be automatically applied to matching services, ie. in CoreServiceProvider:
Comment #38
berdirKeep in mind that with ultimate_cron, or if we add support in core eventually for some kind of cron rules/scheduling, there isn't really a single/global cron run anymore. does onCron() still make sense then?
Looking at https://en.wikipedia.org/wiki/Cron, it could also be runCronJob() or so?
Happy to pick whatever name we decide on, also for the interface, just pointing that out.
Comment #39
longwaveSo another thought that has come to mind that might help decide here: in a way, right now, we are kinda just replicating the event dispatcher, although we want to extend this in the future so some events become dispatched only when a time condition is met. It's possible one service might want to respond to different time schedules with different methods. How do we envisage that working? Similar to the getSubscribedEvents() method of EventSubscriberInterface? Or with method attributes? Or something else?
Comment #40
kim.pepper@longwave Sounds like you are suggesting something that looks closer to Symfony Messenger / Scheduler.
Comment #41
longwaveYeah, but also thinking how we get from where we are to where we want to be, without adding something new that we immediately end up deprecating or replacing.
Maybe it's enough here to have cron as an event, and if you want to run code on every cron run, you subscribe to that? Otherwise I'm not sure what a new interface and custom dispatcher adds?
Comment #42
dpiAgreed. I dont think we're aiming high enough.
-
Not to derail, just following on my thoughts on Symfony Messenger from October @ #18. I've published thoughts on Messenger + Scheduler along with a bunch of relevant posts:
https://www.previousnext.com.au/blog/symfony-messenger/post-4-scheduling...
Comment #43
berdir> So another thought that has come to mind that might help decide here: in a way, right now, we are kinda just replicating the event dispatcher, although we want to extend this in the future so some events become dispatched only when a time condition is met. It's possible one service might want to respond to different time schedules with different methods. How do we envisage that working? Similar to the getSubscribedEvents() method of EventSubscriberInterface? Or with method attributes? Or something else?
For scheduling instructions, I was thinking about using attributes:
#[CronSchedule(cron: '0 * * * *')]. I didn't think about the use case for multiple cron jobs on a single service, I think that's pretty rare, but once we have the attribute we could probably just check every method on the class for having such an attribute and support multiple if we need/want to?IMHO a separate method would complicate this quite bit in its current state as you'd always need two methods.
Benefits compared to a regular event:
* There is no need for an event class, there is no input/output
* We don't want cron jobs to be able to stop execution, so we'd need to prevent that.
* Easier discovery and ability to call the subscribers directly. As shown in #34 this is useful for ultimate_cron now and it would also be quite feasible to add optional support for that in core too with that attribute.
In regards to messenger/scheduler component, my thoughts are above, for example #23. This is IMHO an improvement that benefits both core and ultimate_cron and it's pretty simple and doable now. Getting the whole messenger and scheduler components in core and replacing our queue and cron systems completely sounds like a huge undertaking and something that I won't really be to help with. Already in the middle of such an undertaking with the mailer component and that's more than enough for me.
We could for example decide that we will need deprecate hook_cron(), so we won't force anyone to convert just for the sake of it, and then if messenger/scheduler really gets into core and replaces the queue system, we could deprecate both in favor of that. But I feel like there are a lot of option questions to be resolved before that is possible.
Comment #44
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #45
dww3 conflicts during interactive rebase. 1 and 3 were trivial to resolve by leaving use statements merged and alphabetized. #2 was ever so slightly more complicated, but still pretty trivial. Attaching the finished service definition as a .txt file.
Comment #46
dwwFWIW, strongly +1 to #35, #36, etc that
onCron()would be a much better name than the currentrunCron(), but I don't want to go changing that without a +1 from @berdir first. #37 seems unaddressed, although the replies to the other concerns about event, etc, seem solid.This is probably more accurately NW at this point, but I wanted to resolve the conflicts and get us back to a green MR in the meanwhile.
Comment #47
dwwp.s. Not to bikeshed too much, but if
onCron()is objectionable for some reason, I'd pickdoCronJob()overrunCron()orrunCronJob(). /shrugComment #48
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #49
jonathanshawI understand the concerns with
runCron(), it suggest initiating a whole global cron in the way the\Drupal::service('cron')-run()does now. But I also understand Berdir's concern thatonCron()still suggests a single global cron event, which we are moving away from.I think there's merit in talking of a "xxxCronJob" as @dww suggests or "xxxCronTask".
But how about just
execute()?Comment #50
longwaveRe #49 we already have \Drupal\Core\Executable\ExecutableInterface, not sure if we want to reuse that here or not.
Comment #52
sakthi_dev commentedResolved the conflicts but still needs to address the above comments.
Comment #53
jonathanshawRe #50 \Drupal\Core\Executable\ExecutableInterface is very much part of the plugin API, whereas tagged services are used here.
Comment #54
berdirFixed some problems with the rebase.
I have no strong opinion on the name, if onCron() is acceptable for others, then fine by me. Renamed accordingly, also moved the interface, added autoconfig and removed all hardcoded tags. Not sure if the one in KernelTestBase is needed or not.
Comment #55
berdirAuto configuration makes tests very unhappy:
Reverted that for now.
Comment #56
berdirOne thing that might work to avoid that error is that some cron tasks should be on their own service that just uses the respective factory, although that means we need to keep the respective garbage collection APIs, or we refactor it further and make garbage collection the responsibility of specific implementations, but that would likely need quite a few additional deprecations and changes (and might still need separate cron services, not sure).
Thoughts?
Whatever we do, autoconfig does seem to be a bit fragile...
Comment #57
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #58
berdirRebased. Still needs feedback on #56.
Comment #59
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #60
berdirRebased, removed statistics.module changes.
Comment #61
catchComment #62
catchPostponing this on #3442009: OOP hooks using attributes and event dispatcher - this will cover multiple implementations in the same module.
Comment #63
berdirEvaluated this using #3442009: OOP hooks using attributes and event dispatcher hooks.
* Cron implementations can be now be in services and a module can have multiple, so system_cron() for example can be split up. One limitation specifically with that it is that hooks are currently only supported in modules, so we still need services in system module that call out to the respective component. Opened #3481903: Support hooks (Hook attribute) in any registered service for that.
* Discovery for alternative cron executions like ultimate_cron. I will have to investigate this, I guess it should be possible to use invokeAllWith() to find such services and then define a cron job for them.
* Additional instructions and features, like how frequently an implementation should be called. Not directly possible, but I suppose it would be possible to add additional attributes, on top of the hook or replace the hook and deprecate that, with very few other changes. So moving things to a service as a regular hook still helps us move toward a future with more cron functionality in core.
Comment #64
nicxvan commentedJust a heads up multiple implementations may be more complex than initially thought due to https://www.drupal.org/project/drupal/issues/3484754
Comment #65
nicxvan commentedThe issue this was postponed on is in.
Needs a rebase though.