Problem/Motivation

Drupal has two ways to handle long-running tasks:

1. Queue API. By default, queues are run on cron. Sites often run queues via Jenkins on different schedules, and there are projects like 'waiting queue' which aim to minimise the time before queue items are picked up.

2. Batch API. Designed for the browser (although Drush can also run batches), it completely blocks the user's interaction with the site while it's running. Batches are user-specific, so if one gets interrupted, that's pretty much the end of it. It does provide feedback, so you know when it's finished.

From personal experience, the Queue API is great to work with and expectations are very clear. Batch API is hard to work with and debug.

In the UI, Batch API always blocking isn't necessarily optimal. Bulk operations on entities for example could be put into a queue, and there could be a progress bar in a block somewhere which runs down the queue, but allows the user to click around the rest of the site while that's happening. There are cases though like updates from the UI where things really do need to block and require strict ordering.

Proposed resolution

In the automated_cron module, introduce the possibility to process queues instantly on the terminate event. The following two configuration options will be supported:
- Maximum number of items: the maximum number of items per queue that will be processed instantly. If more items than this are added to the queue, the remainder will be processed on the next run of the queue or cron. Only the queues to which items were added in that session will be processed. If this value is set to zero, instant queue processing is disabled.
- Maximum concurrent processing: if multiple submissions create queue items, the maximum number of concurrent queue processing runs can be configured. This ensures that the site is not overloaded with queue processing. The default is 1.
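
A rough sketch of the concurrency cap, written in Python purely for illustration (this is not Drupal code, and the class and method names are hypothetical). It assumes a run that cannot get a slot is simply skipped, leaving its items for the next queue or cron run. A real site would need a lock shared across PHP processes; the in-process semaphore here only stands in for that.

```python
import threading


class InstantRunnerGate:
    """Caps how many end-of-request queue runs may be active at once."""

    def __init__(self, max_concurrent=1):
        # Hypothetical setting mirroring "maximum concurrent processing".
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def try_run(self, run_queues):
        # Non-blocking acquire: if every slot is taken, skip instant
        # processing and leave the items for the next queue or cron run.
        if not self._slots.acquire(blocking=False):
            return False
        try:
            run_queues()
            return True
        finally:
            self._slots.release()
```

With the default of 1, a second run that starts while the first is still active returns False straight away instead of waiting.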

Remaining tasks

User interface changes

API changes

Data model changes

Cron jobs fall into a couple of main categories:

- things that have to be run periodically and don't care about site events - mainly garbage collection like session and cache.
- batch processing of things - search indexing of new nodes, purging of deleted stuff.

For the latter case, these are increasingly moving to the queue API, although it's not 100% consistent in core.

Issues like #943772: field_delete_field() and others fail for inactive fields, and the one I can't find about indexing nodes for searching immediately, might be helped by a poor man's queue runner.

Drupal 7 has a poor man's cron. Currently the implementation is very basic: 1px gif/xhr requests were causing Drupal to be bootstrapped twice each request, and at one point there was a proposal to do ob_flush() during a shutdown function, but this didn't catch on, so we ended up just running cron inline instead, which is sucky but I argued for that in the end.

With the queue runner, it'd be more a case of setting $_SESSION['run_queues'] after a form submit and checking that on the next page; if it's set, add the 1px gif or whatever to that page, which hits /queue_run with a token. This would only ever be triggered by form submissions, so it wouldn't have the page caching issues of cron runs.
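
One way to picture the token check on /queue_run, sketched in Python rather than Drupal code. It assumes the token is an HMAC over the session ID and the queue name, keyed with a site secret. The function names and the choice of HMAC-SHA256 are assumptions for illustration; the issue does not say how the token would be built.

```python
import hashlib
import hmac


def queue_run_token(secret: bytes, session_id: str, queue_name: str) -> str:
    """Builds a token tied to one session and one queue (hypothetical scheme)."""
    message = f"{session_id}:{queue_name}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def is_valid_token(secret: bytes, session_id: str, queue_name: str, token: str) -> bool:
    """Checks a token from the request in constant time."""
    expected = queue_run_token(secret, session_id, queue_name)
    # Compare as bytes so a non-ASCII token cannot raise a TypeError.
    return hmac.compare_digest(expected.encode(), token.encode())
```

Binding the token to the session means a URL copied from one user's page would not trigger a queue run for anyone else.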

Things it could be useful for:

- field deletion
- mass deletes of other stuff
- operations that trigger menu rebuilds or similar expensive operations, that don't necessarily have to happen inline with the request - just very shortly afterwards.
- indexing nodes in the core search module immediately after posting instead of waiting for cron.

Comments

catch’s picture

Issue tags: +Performance

We don't need to use 1px gifs, between these two we can issue http requests from within the request out to the Drupal site then bail without checking the result, which takes around 1ms.

https://github.com/biznickman/PHP-Async

http://drupal.org/project/httprl

berdir’s picture

tsphethean’s picture

So I've started having a think about this, but for the async in Guzzle we'll need to add the Async plugin to core (http://guzzlephp.org/guide/plugins.html#async-plugin). Will that be a problem, and is it too late in the cycle to get this in before feature freeze?

Was also thinking this queue runner would need to be able to be disabled (in the performance section or elsewhere?) as some use cases may want to prevent anything running "randomly"?

For implementation, I was thinking of extending hook_cron_queue_info() to add a flag for whether the queue specified should be executed by this queue runner. There is hook_queue_info which has been exposed by the contrib Queue UI module (http://drupal.org/project/queue_ui) but I guess getting that into core is pushing it?

berdir’s picture

I don't think that is a problem at all, the issue linked above in fact already adds that plugin.

tsphethean’s picture

Ok great, so for working on it I'll apply the patch from #1599622: Run poormanscron via a non-blocking HTTP request and we can make that a dependency.

I think I was editing my post as you replied, do you think identifying the queues to be run in queue runner should be configured in hook_cron_queue_info()?

tsphethean’s picture

One further thought on this, why would we only want to spawn these queue runner processes from a form submit? Looking at how the poormanscron will work (and works in D7) we'll be spawning a request to /cron/%process every time we reach the cron processing threshold, is there any reason why we wouldn't want to do the same for queues?

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

andypost’s picture

Version: 8.1.x-dev » 8.2.x-dev
Related issues: +#1599622: Run poormanscron via a non-blocking HTTP request

Version: 8.2.x-dev » 8.3.x-dev

Version: 8.3.x-dev » 8.4.x-dev

Version: 8.4.x-dev » 8.5.x-dev

catch’s picture

Title: Add a 'poor mans queue runner' to core » Add a browser-based queue runner
Component: base system » ajax system
Issue summary: View changes
Related issues: +#1797526: Make batch a wrapper around the queue system

Tried updating the issue summary.

Version: 8.5.x-dev » 8.6.x-dev

catch’s picture

I think the first thing to do here would be to implement something which just helps to run-off a few queue items in-between cron jobs, for #2878119: Whether queued or not, update the media thumbnail and metadata before beginning the entity save database transaction and similar issues.

So:

- Add an item to the queue that we want processing asap

- Add a session flag with the queue name we want to run off.

- Each request, look for that session flag, and if it's there, render a 1px gif

- The 1px gif links to a URL with the queue name and a token. Configuration/settings sets the number of items to process (default to 5 or 10?) or do it a bit like batch where we do as many as we can within a second or two.
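
The "as many as we can within a second or two" option could look roughly like this. It is a Python sketch, not Drupal code; the function name, the deque standing in for a queue, and the injectable clock are all assumptions made so the example is self-contained.

```python
import time
from collections import deque


def run_for(queue, process, budget_seconds=1.0, clock=time.monotonic):
    """Processes items until the queue is empty or the time budget is spent."""
    deadline = clock() + budget_seconds
    done = 0
    # The deadline is checked before each item, so a slow item can still
    # overrun the budget; the check only stops new items from starting.
    while queue and clock() < deadline:
        process(queue.popleft())
        done += 1
    return done
```

Compared with a fixed item count, a time budget adapts to cheap and expensive queue items alike, at the cost of being less predictable about how many items each request clears.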

dawehner’s picture

@catch
Do you think this should be a module, just like the automated_cron module?

catch’s picture

@dawehner a new module sounds good to me.

ndobromirov’s picture

Why not use a dedicated service that is triggered after the response has been sent? The user will already have the response, and Drupal will start processing the queue item that was just added (almost in sync). This would make the system useful in decoupled scenarios, not only when we render markup and 1px .GIF files.

catch’s picture

Version: 8.6.x-dev » 8.7.x-dev

Version: 8.7.x-dev » 8.8.x-dev

Version: 8.8.x-dev » 8.9.x-dev

Version: 8.9.x-dev » 9.1.x-dev

Version: 9.1.x-dev » 9.2.x-dev

Version: 9.2.x-dev » 9.3.x-dev

Version: 9.3.x-dev » 9.4.x-dev

catch’s picture

Title: Add a browser-based queue runner » Add an 'instant' queue runner with Fibers
Version: 9.4.x-dev » 10.0.x-dev
Component: ajax system » base system
Related issues: +#3257726: [meta] Use Fibers for concurrency in core

Given the queue API can't provide feedback, we can skip the browser altogether now that Fibers is available: #3257726: [meta] Use Fibers for concurrency in core

berdir’s picture

Can a fiber live longer than the main thread or whatever you want to call it? That's why we use that terminate stuff that can run after the response has been sent and the connection closed. If this blocks and keeps the connection open and therefore not sending out the response then that seems problematic and unpredictable?

catch’s picture

Title: Add an 'instant' queue runner with Fibers » Add an 'instant' queue runner

It can't, and I got overexcited opening this issue before fully reading through the docs. If we were to try to use it for this, it would end up needing to be something like Guzzle's non-blocking HTTP request (which might get refactored on top of a Fiber). Fibers don't allow for simultaneous execution by themselves.

andypost’s picture

There's already a working solution, usable for 9.4 too, via ReactPHP: https://mglaman.dev/blog/using-reactphp-run-drupal-tasks

Nowadays both Amp and ReactPHP work with the Symfony microkernel: https://uvinum.engineering/how-we-are-adding-async-php-to-our-stack-3bb7...

catch’s picture

Issue summary: View changes

@andypost amp or reactphp might help on the cli, but this requires running a daemon waiting for queue items to come in, which isn't going to be an option on a lot of hosting environments. That's more what https://www.drupal.org/project/advancedqueue and similar are trying to do.

What I am thinking about here is a module we can ship with core, similar to automated_cron, that will run some queues down in the browser. It is not as good as a waiting queue, but it would allow us to do things like add functionality like https://www.drupal.org/project/image_style_warmer to core.

When we originally added automated cron to core, one of the ideas was to add a 1px gif following POST requests. This was rejected at the time because it was not guaranteed to run often enough on a site without much authenticated activity, but I think it would be enough here since we're explicitly hoping to execute queue items created by things like saving an entity form.

catch’s picture

After #3295790: Post-response task running (destructable services) are actually blocking; add test coverage and warn for common misconfiguration there's a possible path forwards here. I think we could add it to automated_cron

1. Automated cron adds a decorator for the queue service, this keeps a record of any queue that has an item added during a request.

2. That same decorator implements DestructableInterface and runs some (configurable via container parameter?) number of queue items from the list of queues it has at the end of the request, so it could do 1 item each from five queues or 5 items from one queue or similar.

We can't guarantee that the queue items that get processed will be the ones that were created during the request, but some queue items will be processed and anything that isn't gets picked up by the next cron run.
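
The "1 item each from five queues or 5 items from one queue" behaviour falls out of a simple round-robin over the queues touched during the request. The following is a Python sketch under that assumption, not Drupal code; the function name, the dict of deques, and the single shared budget are hypothetical.

```python
from collections import deque


def run_round_robin(queues, touched, process, budget=5):
    """Takes one item at a time from each touched queue until the budget runs out."""
    # Only queues that had an item added during the request take part.
    pending = deque(name for name in touched if queues.get(name))
    processed = []
    while pending and budget > 0:
        name = pending.popleft()
        item = queues[name].popleft()
        process(name, item)
        processed.append((name, item))
        budget -= 1
        # Re-queue the name only while that queue still has items left.
        if queues[name]:
            pending.append(name)
    return processed
```

Items are claimed from the front of each queue, so, as noted above, they are not necessarily the ones the current request created.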

This would allow us to fix long-standing issues like #504012: Use a queue for node create/update indexing, that issue could continue to mark the node for reindex, but also add a queue item that will reindex it when it runs, then automated cron picks that queue item up and reindexes it at the end of the request.

It would also open up the possibility to bring the functionality of https://www.drupal.org/project/image_style_warmer and similar modules into core: we just create the queue item, and if you have automated cron it might handle it, but it would also work if you have a dedicated permanent queue runner or frequent drush jobs just for running queues.

Version: 10.0.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

sukr_s’s picture

I'm proposing the following implementation in the automated_cron module:
- The instant queue processor will have the following configuration options:
  - Min number of items to trigger the queue. Setting this to zero will disable instant queue processing.
  - Max number of concurrent queue processing runs.
- Implement a queue.database manager extending the core queue.database, which will track the items being added to the queue. All queuing functions will be delegated to the core implementation.
- Add a new route with no access check that will process the queue.
- At the end of a request, if the configured number of queue items is reached, async queue processing will be triggered using the mechanism in https://github.com/biznickman/PHP-Async. I've been using this approach and it works quite well.

This will not use any browser based implementation or 1x1 gifs.

Any thoughts, suggestions, edits or objections? I'll try to implement the same shortly.

catch’s picture

@sukr_s when we originally implemented automated cron in core, we looked at making asynchronous http requests to a Drupal route, but eventually gave up on it because some hosting environments do not cleanly support making http requests to themselves. This is why automated_cron uses the terminate event, to run after the response has been sent to the browser but before script termination. The post response logic works a lot more consistently after #3295790: Post-response task running (destructable services) are actually blocking; add test coverage and warn for common misconfiguration.

So I would just directly process the queue items in the terminate event listener without trying to deal with async http requests. This also avoids having to worry about an access-free route for processing the queue items, which we'd probably have to lock down with a token or similar.

> - Min number of items to trigger the queue. Setting to zero will disable instant queue processing
> - Max number of concurrent queue processing

Why not a maximum number of queue items to process, and if there's one or more queue item added during the request, process them at the end? If the maximum number is 0, nothing ever gets processed. I don't see what a minimum number of items gets us.

sukr_s’s picture

@catch

> So I would just directly process the queue items in the terminate event listener without trying to deal with async http requests

With this suggestion, I'm assuming that processing in the event listener will be non-blocking, otherwise async call is better. Will go with your suggestion for now.

> Why not a maximum number of queue items to process,

Do you mean not processing more than the maximum number of items in the queue at a time? I was thinking of setting the time limit to zero so that all the items in the queue will be processed. Otherwise, in each terminate event we will have to check the remaining number of items in the queue and trigger again, which would mean additional db calls in the terminate event.

> and if there's one or more queue item added during the request, process them at the end?

Yes that's the current thought as well that the process queue call would be done in the terminate event to avoid multiple calls in cases where multiple items are added to the queue.

> I don't see what a minimum number of items gets us.

It was for cases where there is a need to process in a batch instead of processing immediately; perhaps an unwanted frill.

catch’s picture

> With this suggestion, I'm assuming that processing in the event listener will be non-blocking, otherwise async call is better. Will go with your suggestion for now.

Yes it's non-blocking - the terminate event runs after the response is sent to the browser. Prior to Drupal 10.2-ish this only worked on certain server configurations but since the issue linked above it works on nearly all.

> Do you mean not to process more than the maximum number of items in the queue at a time. I was thinking of setting the time limit to zero so that all the items in the queue will be processed. Otherwise in each terminate event we will have to check the remaining number of items in the queue and trigger again, which would mean additional db calls in the terminate event.

The queue could potentially have thousands of items in it, that could lead to OOM errors etc if we tried to process everything. The way I thought of this working was that we would track which queue and how many items are added during a request, probably in a class property of the decorator or similar.

Say someone submits a node form, and that adds one queue item to two different queues, A and B, then in the terminate event, we'd process one item from queue A and one item from queue B. Obviously there's no guarantee that these are the same queue items at all, but this feature is mostly intended for lower traffic sites where that's more likely to be the case.

However, if submitting the node form added 500 items to the queue (e.g. it adds a queue item for every person following an issue or something), we'd only process the configured maximum number of items then stop (anything else would have to eventually be picked up by cron, which also processes queue items).
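
The tracking described above can be sketched as follows. This is Python for illustration only, not the merge request's code; the class name, the per-queue cap, and the `on_terminate` hook name are assumptions. It counts items added per queue during a request, then processes that many from each queue at the end, up to the cap.

```python
from collections import Counter, deque


class TrackingQueues:
    """Wraps named queues and counts items added during the current request."""

    def __init__(self, max_items=5):
        self.queues = {}
        self.added = Counter()
        # Hypothetical per-queue cap; 0 disables instant processing.
        self.max_items = max_items

    def create_item(self, name, item):
        self.queues.setdefault(name, deque()).append(item)
        self.added[name] += 1

    def on_terminate(self, process):
        """Runs at the end of the request, after the response is sent."""
        processed = Counter()
        for name, count in self.added.items():
            # Process as many items as were added, up to the cap. These are
            # whatever sits at the front of the queue, not necessarily the
            # items this request created.
            for _ in range(min(count, self.max_items)):
                if not self.queues[name]:
                    break
                process(name, self.queues[name].popleft())
                processed[name] += 1
        self.added.clear()
        return processed
```

In the node form example, one item added to each of queues A and B means one item processed from each; with 500 items added and a cap of 5, the remaining 495 wait for cron.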

sukr_s’s picture

Issue summary: View changes
Status: Active » Needs review

catch’s picture

Status: Needs review » Needs work

Left some comments/questions on the MR.

sukr_s’s picture

Status: Needs work » Needs review

Addressed MR Comments. Moved common code to a trait as discussed with catch on slack.

amateescu’s picture

Adding a related issue that could benefit from this.

smustgrave’s picture

Status: Needs review » Needs work

Left some comments on the MR, mostly small stuff.

sukr_s’s picture

Status: Needs work » Needs review

smustgrave’s picture

Status: Needs review » Needs work

Some points were addressed but not all. Left some comments on the MR.

I want to make sure I can still review and mark this, so I didn't go down the rabbit hole.

sukr_s’s picture

Status: Needs work » Needs review

Resolved the last 2 review comments. The signature is now consistent with the new Cron class signature.

needs-review-queue-bot’s picture

Status: Needs review » Needs work

The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

sukr_s’s picture

Status: Needs work » Needs review

needs-review-queue-bot’s picture

Status: Needs review » Needs work

sukr_s’s picture

Status: Needs work » Needs review

needs-review-queue-bot’s picture

Status: Needs review » Needs work

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.

catch’s picture

Took a detailed look at the MR, I think there is still a lot of complexity here that is probably not needed and noticed some more things:

1. For search_api instant indexing + https://www.drupal.org/project/image_style_warmer they might want to add their items to the instant queue runner for sites that don't necessarily have automated_cron enabled.

2. Adding all queue items to the instant runner in automated cron might be heavy-handed, not all queue items benefit from being processed immediately.

3. I don't think the concurrency protection is necessary, this only impacts requests that actually add items to the queue (and less likely to happen if we don't do #2 by default).

4. The trait feels a bit too high level, I think extracting the code that runs a single queue item would be more re-usable, and allow dropping the semaphore usage.

These are fairly large changes from the current approach so going to start a new MR here.

catch’s picture

Issue tags: +Needs tests

Not sure we need to do the automatic cron integration here, that could be its own issue - add an extra configuration key for whether to instant-process queue items and ideally we'd find a way to decorate all the queue backends without having to add a subclass for each one.

In its current state, the new MR should allow for search_api modules and potentially core search to add items to the queue runner themselves so they get processed end of request.

No test coverage yet, since this depends on kernel destruct it needs at least a kernel test or maybe functional test, would be easiest with an existing queue (test or real).

catch changed the visibility of the branch 1189464-add-an-instant to hidden.

catch’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests

Added a kernel test.

Hiding the other branch for now.

heddn’s picture

Discussed actually moving core search to use this but Nat mentioned that would be a bit more work to switch over to a queue than we generally do in a task like this. Left a few comments on the MR. This looks mostly good.

catch’s picture

Adding the search issues.

I tried to think of any existing core queues that could use this, but couldn't really yet. However search_api and image cache prewarming are very obvious use cases in contrib.

smustgrave’s picture

Issue tags: +Needs change record

Could we write a CR for this? Know I work on a project that may be interested in this as our queue fills up too fast sometimes and running instantly sounds intriguing.

catch’s picture

#3316209: Leakage usernames (timing attack) from password reset form is a candidate for using this API and also a security improvement.