This feature existed in D7 but got removed from D8 to ensure a smaller API / initial commit.
Can be revisited post-beta.

CommentFileSizeAuthor
#131 interdiff_125_131.txt630 byteskhiminrm
#131 2918866-131.patch27.2 KBkhiminrm
#125 2918866-125.patch28.37 KBtbkot
#124 2918866-124.patch28.32 KBtbkot
#118 advancedqueue-2918866-118-variable-fix.patch777 bytesadamfranco
#107 2918866-107.patch88.14 KBemil stoianov
#88 2918866-88.patch39.79 KBemil stoianov
#84 2918866-84.patch29.62 KBsmovs
#81 2918866-80.patch30.65 KBsmovs
#79 2918866-79.patch30.35 KBcrzdev
#76 2918866-76.patch30.27 KBzengenuity
#69 reroll_diff_2918866_58-69.txt7.02 KBankithashetty
#69 2918866-69.patch30.65 KBankithashetty
#58 interdiff-2918866-56-58.txt684 bytesgordon
#58 2918866-58.patch31.69 KBgordon
#57 interdiff-2918866-49-56.txt2.83 KBgordon
#56 2918866-56.patch29.92 KBdaniel korte
#53 2918866-53.patch29.92 KBdaniel korte
#52 2918866-52.patch29.92 KBdaniel korte
#51 2918866-50.patch29.89 KBdaniel korte
#49 interdiff-2918866-42-49.txt16.48 KBgordon
#49 2918866-49.patch33.18 KBgordon
#48 interdiff-2918866-42-48.txt16.48 KBgordon
#48 2918866-48.patch33.29 KBgordon
#42 interdiff_41-42.txt3.53 KBbrunodbo
#42 2918866-42.advancedqueue.Add-support-for-unique-jobs.patch23.16 KBbrunodbo
#41 interdiff.2918866.38-41.txt8.14 KBjoachim
#41 2918866-41.advancedqueue.Add-support-for-unique-jobs.patch23.19 KBjoachim
#38 2918866-38.advancedqueue.Add-support-for-unique-jobs.patch21.46 KBbrunodbo
#31 2918866-31.advancedqueue.Add-support-for-unique-jobs.patch21.24 KBbrunodbo
#29 2918866-29.advancedqueue.Add-support-for-unique-jobs.patch21.29 KBbrunodbo
#27 2918866-27.advancedqueue.Add-support-for-unique-jobs.patch15.33 KBbrunodbo
#20 2918866-20.advancedqueue.Add-support-for-unique-jobs.patch20.62 KBjoachim
#17 advancedqueue-unique-jobs-2918866-17.patch6.89 KBevaldas.uzkuras
#15 advancedqueue-unique-jobs-2918866-15.patch6.89 KBevaldas.uzkuras
#13 advancedqueue-unique-jobs-2918866-11.patch6.46 KBevaldas.uzkuras
#11 advancedqueue-unique-jobs-2918866-10.patch6.42 KBevaldas.uzkuras
#9 advancedqueue-unique-jobs-2918866-9.patch6.41 KBevaldas.uzkuras
Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

bojanz created an issue. See original summary.

killes@www.drop.org’s picture

What would be the best way to implement this? Seems like an important feature.

joachim’s picture

Off the top of my head, I'd do this:

- add a hash field to the advancedqueue table
- add a method to the job plugin interface to let each job type plugin define how a job payload gets turned into a hash (as there may be some data in the payload that should be ignored when considering what makes two jobs 'the same')
- enqueueJob() queries for the hash

killes@www.drop.org’s picture

Question is: would the maintainers be inclined to accept such a patch or should I try to implement my own backend?

bojanz’s picture

Yes, I definitely want this.
Here's the issue that added the D7 version of the feature: #2846958: Support unique item creation.

Open questions:
1) The hashing approach VS forcing the caller to manually specify a uniqueness key.
I like the idea of a hash, though I'm not sure sure the job type comparison override is needed.
After all, the main problem we're solving is the same code being run twice, thus the jobs are usually identical.

Resque plugins always do uniqueness by constructing an ID out of the payload, which is equivalent to our hash idea.

2) Storage class implementation:
a. Add a $job->isUnique() and expand the SQL logic inside enqueueJobs() to perform the DB merge when needed. Have an empty SupportsUniqueJobsInterface that tells the outside world whether it's supported, or silently ignored.
b. Add separate enqueueUniqueJob() and enqueueUniqueJobs() methods via SupportsUniqueJobsInterface

Leaning towards a) at the moment.

What do you think?

killes@www.drop.org’s picture

My usecase is as follows:

I need to update 50k products to keep prices up to date.

These products come from different suppliers, each using their own unique product ID (which however may clash with another supplier's).

So, I would like to use a key composed of supplier and product id, to keep it 1) unique and 2) be able to see on a glance which supplier it is from. ie I do not want to have to use a hash.

joachim’s picture

> Resque plugins always do uniqueness by constructing an ID out of the payload, which is equivalent to our hash idea.

That would probably mean better performance, as you'd get unique IDs which are much shorter than a hash string and therefore, I assume, quicker to query for.

You'd still need a method on the plugin to turn the payload into an ID.

bojanz’s picture

Not necessarily. Our hash would be 32 chars, which is pretty short:

            // crc32b is the fastest but has collisions due to its short length.
            // sha1 and md5 are forbidden by many projects and organizations.
            // This is the next fastest option.
            $hash = hash('tiger128,3', $data);
evaldas.uzkuras’s picture

Status: Active » Needs review
StatusFileSize
new6.41 KB

Made a unique jobs support. Quite similar to that was made in D7.

Used key instead of the hash, because I think, that job creator should decide how to handle uniqueness.
In my use case, I have different job types for the same object, so it was essential to have own key to override other jobs.

Status: Needs review » Needs work

The last submitted patch, 9: advancedqueue-unique-jobs-2918866-9.patch, failed testing. View results

evaldas.uzkuras’s picture

Status: Needs work » Needs review
StatusFileSize
new6.42 KB

Trying to fix MySQL error: the key is too long.

Status: Needs review » Needs work

The last submitted patch, 11: advancedqueue-unique-jobs-2918866-10.patch, failed testing. View results

evaldas.uzkuras’s picture

Status: Needs work » Needs review
StatusFileSize
new6.46 KB

Fixed - Undefined index: unique_key

Status: Needs review » Needs work

The last submitted patch, 13: advancedqueue-unique-jobs-2918866-11.patch, failed testing. View results

evaldas.uzkuras’s picture

StatusFileSize
new6.89 KB

Fixed last failed test

evaldas.uzkuras’s picture

Status: Needs work » Needs review
evaldas.uzkuras’s picture

StatusFileSize
new6.89 KB

FIxed coding standard.

evaldas.uzkuras’s picture

joachim’s picture

> Have an empty SupportsUniqueJobsInterface that tells the outside world whether it's supported, or silently ignored.

My first thought was to define whether a backend plugin supports uniqueness or not on the plugin annotation.

The advantage of using the plugin definition is that a child class can change the support from TRUE to FALSE, whereas you can't remove an interface from a child class in PHP.

joachim’s picture

The patch at #17 doesn't work in the way that previous comments were suggesting.

It's requiring the caller to handle passing in a unique key of their own devising, which is fiddly & also open to errors. (Case in point: while writing my patch, my test was failing because the code I wrote to create a unique key from a hash wasn't considering the queue name in the hash!)

So here is an updated patch with an aproach in like with what @bojanz was suggesting.
(Though @Evaldas Užkuras should be credited as I've kept all the .install file code, which was much appreciated as I hate writing DB update code!)

@bojanz concurred with this in #commerce on slack, which I'm taking the liberty of quoting:

> my intention was to allow callers to skip defining a unique key
> every time they enqueue a job
> I found that a bit annoying in D7

in #5:

> I like the idea of a hash, though I'm not sure sure the job type comparison override is needed.

I agree. Let's leave that off for now, as I don't know of any use cases where the job type plugin might want to customize how the hash is created. It's therefore most likely a YAGNI.

So for the hash, we need to ensure a job is unique only within its type and its queue. Otherwise, we could queue two different jobs for the same order, and the second one would be ignored! Therefore the hashing uses the job queue, job type, and the serialized payload:

 * This supports unique jobs. Two jobs of the same type and in the same queue
 * are considered unique if their payloads are identical. Jobs with the same
 * payload in different queues, or of different types, are not considered to be
 * identical.

> Have an empty SupportsUniqueJobsInterface that tells the outside world whether it's supported, or silently ignored.

Backends may or may not support unique jobs.

However, silently ignoring jobs that should be unique is dangerous, and as we've seen at #2931290: Multiple recurring orders get created if cron is badly configured can be extremely detrimental to a site (with Commerce Recurring, it leads to duplicate orders, which will in time lead to customers being overcharged, which is pretty bad!). Commerce Recurring's jobs absolutely MUST NOT be used with a backend that does not support unique jobs, and therefore, I think we should guard against this here.

So therefore, I want to introduce a concept of a job type *requiring* that its jobs be treated as unique. This can be declared on the job type plugin annotation.

I don't think we need finer granularity than this by allowing this on a job-by-job basis. The use cases I've seen, it's the job type as a whole that requires job payloads to be treated as unique. (If for some reason you need some jobs to be unique and some not, you can define a 2nd job type that inherits the class and has a different annotation, or you can cheat and chuck a timestamp into your payload ;)

However, what I don't massively like about the code I've written for this is that Database::enqueueJobs() has to call parent::enqueueJobs() to check that the backend is compatible with the jobs to queue. And obviously, any other backend implementations that exist in the wild will have to add that call to the parent too. Suggestions on how to make this more elegant welcome!

brunodbo’s picture

Tested the patch in #20: I added ensure_unique = TRUE to my job type plugin's annotation, and am seeing the hash being created when a job is enqueued.

I'm seeing the following behavior though, maybe I'm misunderstanding how this is supposed to work:

- When I run cron, an existing job record's 'available date' is being updated (I haven't specified a delay for this job when enqueuing it)
- During the same cron run, the previous 'version' (i.e., with the previous 'available date') of the same job gets processed, leading to the same job being processed over and over again in subsequent cron runs. In my testing, the job never reached completion.

Is this expected? I was expecting the job record wouldn't change at all once it's been created, if the job type requires uniqueness, to guard against multiple processing. Does the prevention of multiple processing has to be taken care of in the job type plugin somehow?

joachim’s picture

> - When I run cron, an existing job record's 'available date' is being updated (I haven't specified a delay for this job when enqueuing it)

That's an interesting question. IIRC the 'available_date' property on a job is the date it's available from, and by default, that's the moment it was placed in the queue. Arguably, attempting to re-queue it shouldn't change that date, because it's still the case that the job was available from the time it was queued the first time.

> - During the same cron run, the previous 'version' (i.e., with the previous 'available date') of the same job gets processed, leading to the same job being processed over and over again in subsequent cron runs. In my testing, the job never reached completion.

Ouch.

So the job got loaded and processed by one cron hook, and then requeued by another cron hook? But why didn't cron hook 1 mark the job as complete?

Can you explain the different cron hooks you had running, and their exact order?

brunodbo’s picture

> Arguably, attempting to re-queue it shouldn't change that date, because it's still the case that the job was available from the time it was queued the first time.

I would say that the attempt to re-queue a job shouldn't change the date it should be available. Otherwise, there might be cases that a job never becomes available? For example, if I queue a job that should send a message at 'now + 30 days', and the 'available date' for that job keeps getting changed with 'now + 30 days', the job would never become available.

Should we add a check in \Drupal\advancedqueue\Plugin\AdvancedQueue\Backend\Database::enqueueJobs to see if the job exists, and only call $job->setAvailableTime($this->time->getCurrentTime() + $delay); if it's a new job? Or simply bail if we have an existing job?

> So the job got loaded and processed by one cron hook, and then requeued by another cron hook? But why didn't cron hook 1 mark the job as complete?

Yep, that's exactly what it looks like.

> Can you explain the different cron hooks you had running, and their exact order?

I have the 'Advanced queue' cron hook running first, and then my cron hook that queues the job runs right after (using Ultimate cron to check the order, but didn't change the order). Is it possible that the job gets re-queued because it isn't completed yet by the time my cron hook runs? When I disable my cron hook that does the queuing, the existing job completes (state 'Success').

joachim’s picture

> Is it possible that the job gets re-queued because it isn't completed yet by the time my cron hook runs?

Do you have cron jobs running in parallel? My understanding is that with a normal setup they run in sequence, so AQ should finish processing the job and decide whether it succeeded before your own cron hook tries to queue things.

Can you add some logging statements to your code to get some finer detail on the order of operations?

Also, if you were able to reproduce this in a kernel test, that would be great!

joachim’s picture

> I would say that the attempt to re-queue a job shouldn't change the date it should be available. Otherwise, there might be cases that a job never becomes available? For example, if I queue a job that should send a message at 'now + 30 days', and the 'available date' for that job keeps getting changed with 'now + 30 days', the job would never become available.

Ok, I'm convinced :)

> Should we add a check in \Drupal\advancedqueue\Plugin\AdvancedQueue\Backend\Database::enqueueJobs to see if the job exists, and only call $job->setAvailableTime($this->time->getCurrentTime() + $delay); if it's a new job? Or simply bail if we have an existing job?

Yes, quite probably. Which will make it slower than the merge query, unless there's a way we can set the merge query to not bother making a change at all if the item already exists.

We basically shouldn't change an existing job at all when attempting to queue a duplicate.

brunodbo’s picture

> Do you have cron jobs running in parallel? My understanding is that with a normal setup they run in sequence, so AQ should finish processing the job and decide whether it succeeded before your own cron hook tries to queue things.

Apparently Ultimate Cron runs cron jobs in parallel, I just learned from its project page (I had only installed it yesterday to review the cron jobs' order) :) So with Ultimate Cron installed, my job gets queued and processed during the same cron run, since the cron jobs run in parallel.

I did some more poking around, added some logging code as you suggested, and realized that the 'unexpected' re-queuing I mentioned in #21 is my own doing :D. During my queuing cron hook, I look for users that are flagged, and queue a job to send a message to those users. But then in the queued job, I don't do anything that would prevent the same job from getting queued again (e.g., users aren't unflagged, since the job just sends a message)...

> We basically shouldn't change an existing job at all when attempting to queue a duplicate.

What if we add a $job->exists() method that checks for a job with a matching hash, so we can call that at the start of \Drupal\advancedqueue\Plugin\AdvancedQueue\Backend\Database::enqueueJobs? I can imagine such a method could be useful in other places as well.

brunodbo’s picture

This adds a hasDuplicateJob method to the Database backend to check whether the to-be-queued job already has a matching job. Then in enqueueJobs(), we check for a duplicate, and return if we have one. If we don't, we fill in the unique_key field in the insert query. Not sure if we still need the merge query?

The above also allows me to do something like this when queuing jobs, so enqueueJob() doesn't get called in the first place:

if (!$queue->getBackend()->hasDuplicateJob($job)) {
  $queue->enqueueJob($job);
}

What do you think?

I'm also wondering whether we should move the unique key creation to the Job class, though it makes sense that the backend is responsible for deciding how to implement uniqueness...

Status: Needs review » Needs work
brunodbo’s picture

Status: Needs work » Needs review
StatusFileSize
new21.29 KB

Better patch.

Status: Needs review » Needs work

The last submitted patch, 29: 2918866-29.advancedqueue.Add-support-for-unique-jobs.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

brunodbo’s picture

Not sure yet why DatabaseBackendTest is failing, but this fixes an issue in the foreach in enqueueJobs() (continue instead of return), and should fix the coding standards test fail.

brunodbo’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work
joachim’s picture

> I did some more poking around, added some logging code as you suggested, and realized that the 'unexpected' re-queuing I mentioned in #21 is my own doing :D. During my queuing cron hook, I look for users that are flagged, and queue a job to send a message to those users. But then in the queued job, I don't do anything that would prevent the same job from getting queued again (e.g., users aren't unflagged, since the job just sends a message)...

Thanks for investigating that one!

So should it be the case that we don't queue a duplicate if the original is currently claimed?

We definitely need to consider a completed job as no longer unique -- Commerce Recurring requires this, for instance, as every billing period (eg every month) it will queue a job for the same order.

A few points about the latest patch (some of which are to do with my previous work, feel free to ignore those and I will get round to them later!) --

  1. +++ b/src/Plugin/AdvancedQueue/Backend/BackendBase.php
    @@ -133,6 +145,28 @@ abstract class BackendBase extends PluginBase implements BackendInterface, Conta
    +  /**
    +   * {@inheritdoc}
    +   */
    +  public function enqueueJobs(array $jobs, $delay = 0) {
    

    On reflection, I think I want to move this code to its own method.

    It doesn't fix the problem that plugin implementations need to call it, but it's cleaner and better DX than having to call parent::enqueueJobs().

  2. +++ b/src/Plugin/AdvancedQueue/Backend/Database.php
    @@ -10,12 +11,17 @@ use Symfony\Component\DependencyInjection\ContainerInterface;
    + * This supports unique jobs. Two jobs of the same type and in the same queue
    + * are considered unique if their payloads are identical. Jobs with the same
    + * payload in different queues, or of different types, are not considered to be
    + * identical.
    

    Needs to mention job status as well.

  3. +++ b/src/Plugin/AdvancedQueue/Backend/Database.php
    @@ -143,6 +165,56 @@ class Database extends BackendBase implements SupportsDeletingJobsInterface, Sup
    +      ->condition('unique_key', $unique_key)
    +      ->fields('a', ['job_id'])
    +      ->countQuery()
    +      ->execute()
    +      ->fetchField();
    

    This needs to consider the job status, as mentioned above. A completed job doesn't count as a duplicate.

  4. +  public function hasDuplicateJob(Job $job) {
    

    This should be added to SupportsUniqueJobsInterface.

brunodbo’s picture

> So should it be the case that we don't queue a duplicate if the original is currently claimed?

I think so. That would mean that a job is unique (and shouldn't be re-queued) until it reaches the completed state.

We'll also a need a way to reflect the job's state in the unique key handling, to avoid integrity constraint violations for the 'unique_key' field when queuing a next iteration of the job (i.e., when a job is completed, and we queue it again). Should we update/empty the 'unique_key' field when a job reaches completion?

joachim’s picture

> We'll also a need a way to reflect the job's state in the unique key handling,

That's not going to work. A job that's being queued is always going to be in the 'queued' state (or whatever, post-breakfast, don't have the code in front of me). We have a use case where the same order gets queued, say, once a month. So that's going to have the same hash each time.

> to avoid integrity constraint violations for the 'unique_key' field when queuing a next iteration of the job (i.e., when a job is completed, and we queue it again). Should we update/empty the 'unique_key' field when a job reaches completion?

... and because of this use case, we actually shouldn't have a database-level unique key constraint on the unique key column. That's actually a good catch, as up till now, our use case of 'same order should get queued a month later' would cause a DB crash!! We should add a test for this.

We're not making use of that constraint anyway, because we now do a DB load to check for a job before doing an insert.

joachim’s picture

Oh, another thing to do -- we should log a notice when a duplicate job is rejected. At the moment, the patch just swallows the error. But that could lead to real developer headaches if there's a bug that's incorrectly causing a job to be a duplicate.

BTW, I really appreciate all your help with this issue! I thought the patch was pretty much done, and you've brought loads of problems to light! Yay for peer review!

brunodbo’s picture

Happy to help!

#34:

1. Attempt at refactoring this in attached patch. Method name suggestions welcome :)

2. Done, though could be made easier to read. Maybe split it up in 2 paragraphs?

3. WIP: jobs will be re-queued if previous jobs with the same hash have the state 'success'. There is still an issue with the query in \Drupal\advancedqueue\Plugin\AdvancedQueue\Backend\Database::hasDuplicateJob (which is likely the reason the tests are failing): $unique_key has a different value than what is stored in the db, so it might be something with the way the hash field is being checked in the query?

4. Done

#36:

> we actually shouldn't have a database-level unique key constraint on the unique key column.

Unique key constraint removed in attached patch.

> We should add a test for this.

Where should this test go? Testing newbie here...

#37:

Should we add the logger in this patch (AQ doesn't have a logger channel yet AFAICT)? Or should that be its own issue?

brunodbo’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work
joachim’s picture

Status: Needs work » Needs review
StatusFileSize
new23.19 KB
new8.14 KB

Changes since the last patch:

- changed docs for hasDuplicateJob() not to mention key, as it's an implementation detail.
- fixed duplicate job check to consider queued or processing states only.
- add mention of states to hasDuplicateJob() docs.
- fixed duplicate job check not working due to no queue ID.
- changed method name.
- make protected, add to docs.
- removed pointless base method
- fixed non unique backend not calling helper.
- added logging.

Still todo, maybe:

- I'm not convinced hasDuplicateJob() should be public, on second thoughts. Queue users shouldn't need to worry about this, they should be able to let the queue handle this for them.
- I'm tempted to rename the unique_key DB column to something like hash, as it's not unique in the database sense of the word, so could confused developers.

brunodbo’s picture

I'm not convinced hasDuplicateJob() should be public, on second thoughts. Queue users shouldn't need to worry about this, they should be able to let the queue handle this for them.

Yep, think so too - seems like that should be an internal thing. Didn't change this in the attached patch yet - should we do away with the interface then (since interface methods should be public)?

I'm tempted to rename the unique_key DB column to something like hash, as it's not unique in the database sense of the word, so could confused developers.

Yeah, I got confused by that as well :) Changed that in the patch here, and rename createJobUniqueKey() to createJobHash()

Also rewrote the docs for hasDuplicateJob() a little, to (hopefully) make it easier to read.

jonathanshaw’s picture

should we do away with the interface then (since interface methods should be public)?

We still need a way for a backend to be able to declare that it can handle jobs that require uniqueness, so that we can throw an exception if a job that requires uniqueness is passed to a backend that can't provide that.

But this doesn't have to be done by an interface: we could add a new annotation to the backend plugin type instead.

thronedigital’s picture

Any updates?

gordon’s picture

Status: Needs review » Needs work

I have done some investigation into this as I required this for a client but it was not a good fit even though this would make the implementation very clean.

1. the method of using a hash for the uniqueness based upon the payload I think is not a good idea. Since it is based upon the entire payload you can't add in additional variable parameters. eg. I want the uniqueness based up on the node id, however I need to add in other data as well which I don't want to be a part of the uniqueness such as the a workflow status which I may want to change later.

2. So if you try to enqueue a new version of an existing job, if will simply discard the new job, in my case I don't care about the existing job but want to replace it with the new one, which may have a new available date, or other changes.

3. Also the enqueuing will crash if you put try to put a job which is unique onto a queue which backend doesn't support uniqueness. Since there is only the database backend this is not a huge problem but will create issues if/when alternate backends are created.

joachim’s picture

>1. the method of using a hash for the uniqueness based upon the payload I think is not a good idea. Since it is based upon the entire payload you can't add in additional variable parameters. eg. I want the uniqueness based up on the node id, however I need to add in other data as well which I don't want to be a part of the uniqueness such as the a workflow status which I may want to change later.

To address that, we can do this:

> - add a method to the job plugin interface to let each job type plugin define how a job payload gets turned into a hash (as there may be some data in the payload that should be ignored when considering what makes two jobs 'the same')

jonathanshaw’s picture

2. So if you try to enqueue a new version of an existing job, if will simply discard the new job, in my case I don't care about the existing job but want to replace it with the new one, which may have a new available date, or other changes.

This use case interests me too.

gordon’s picture

Status: Needs work » Needs review
StatusFileSize
new33.29 KB
new16.48 KB

I have made some changes so that it will now accept an unique id. So as a part of the definition you can when creating a job you can specify it and it will be used. However it will only use this if the job type is a unique type, otherwise

$job->getUniqueId();

Will alway return NULL.

If you do not specify the unique id, then it will fall back to generating a hash from the payload.

Also I have changed the unique flag on the job to be one of the following

UNIQUE_DUPLICATES
UNIQUE_DONT_OVERWRITE
UNIQUE_OVERWRITE

So now you can specify if you want to overwrite the unique items or ignore it.

Lastly so you can I have changed how Job::create() works so you can specify the unique id using this interface rather than using new Job() to do this.

$job = Job::create('unique_queue', ['nid' => 123], ['unique_id' => 'node:123']);

So in the 3rd parameter it is just a $definitions array and gets added to the payload, queue id and queue status.

I also added an index on the unique id field so that it will run faster.

I also found a couple of bugs in that the state and the available were getting updated on the job after it was being converted into an array.

Lastly I found a couple of bugs in that the

gordon’s picture

StatusFileSize
new33.18 KB
new16.48 KB

found some bugs that broke it. fixed now.

jonathanshaw’s picture

Status: Needs review » Needs work

We lack test coverage for the changes introduced in #48.

daniel korte’s picture

StatusFileSize
new29.89 KB

Reroll plus some minor clean up

daniel korte’s picture

StatusFileSize
new29.92 KB

Sorry, ignore #51

daniel korte’s picture

StatusFileSize
new29.92 KB

Third time is a charm.

jonathanshaw’s picture

Status: Needs work » Needs review

Let's see what the testbot says

Status: Needs review » Needs work

The last submitted patch, 53: 2918866-53.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

daniel korte’s picture

StatusFileSize
new29.92 KB

Whoops, wrong folder.

gordon’s picture

StatusFileSize
new2.83 KB

Just adding an interdiff for the changes between 49 and 56

gordon’s picture

StatusFileSize
new31.69 KB
new684 bytes

Since I am running this in live, I had a problem when updating because I would run the update on this patch again. I added in a check to make sure it does try to run this twice.

martijn de wit’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 58: 2918866-58.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

andypost’s picture

andypost’s picture

  1. +++ b/src/Entity/Queue.php
    @@ -127,6 +134,11 @@ class Queue extends ConfigEntityBase implements QueueInterface {
    +    if ($this->jobTypeManager()->jobTypeRequiresUniqueness($job->getType()) && !$this->getBackend() instanceof SupportsUniqueJobsInterface) {
    +      throw new \InvalidArgumentException($this->t("Backend :type doesn't support unique jobs", [':type' => $this->getBackend()->getLabel()]));
    

    maybe it should just skip queuing when item already queued or other reaction where same item in processing

  2. +++ b/src/Job.php
    @@ -398,6 +408,52 @@ class Job {
    +  public function getUniqueId() {
    +    if (!\Drupal::service('plugin.manager.advancedqueue_job_type')->jobTypeRequiresUniqueness($this->type)) {
    +      return NULL;
    

    needs injection of service

  3. +++ b/src/Job.php
    @@ -398,6 +408,52 @@ class Job {
    +    // crc32b is the fastest but has collisions due to its short length.
    +    // sha1 and md5 are forbidden by many projects and organizations.
    +    // This is the next fastest option.
    +    $hash = hash('tiger128,3', $data);
    

    this hashing better to keep configurable via some container parameter

  4. +++ b/src/Plugin/AdvancedQueue/Backend/BackendBase.php
    @@ -133,6 +145,36 @@ abstract class BackendBase extends PluginBase implements BackendInterface, Conta
    +          throw new InvalidBackendException(sprintf(
    +            "Job of type %s with payload %s cannot be queued in queue %s because the queue's backend does not support unique jobs.",
    +            $job->getType(),
    +            serialize($job->getPayload()),
    

    serialization could be dangerous/impossible

jonathanshaw’s picture

serialization could be dangerous/impossible

Really? I thought the payload got stored in a database field by the the database backend, so that probably does some serlializing.

andypost’s picture

This exception is thrown when nothing did check if the job serializable ... so you'll get serialization exception instead of expected

+++ b/src/Plugin/AdvancedQueue/Backend/BackendBase.php
@@ -133,6 +145,36 @@ abstract class BackendBase extends PluginBase implements BackendInterface, Conta
+        if ($this->jobTypeManager->jobTypeRequiresUniqueness($job->getType())) {
+          throw new InvalidBackendException(sprintf(
+            "Job of type %s with payload %s cannot be queued in queue %s because the queue's backend does not support unique jobs.",
+            $job->getType(),
+            serialize($job->getPayload()),
+            $this->queueId

+++ b/src/Plugin/AdvancedQueue/Backend/Database.php
@@ -127,21 +150,70 @@ class Database extends BackendBase implements SupportsDeletingJobsInterface, Sup
+        $this->loggerFactory->get('advanced_queue')->notice('Duplicate job of type %type was not queued in queue %queue. Payload was %payload.', [
...
+          '%payload' => json_encode($job->getPayload()),
...
+        $this->loggerFactory->get('advanced_queue')->notice('Overwriting job %id of type %type in queue %queue. Payload was %payload.', [
...
+          '%payload' => json_encode($job->getPayload()),

moreover payload serialized as json other places so it more confusing

andypost’s picture

johnpitcairn’s picture

Patch at #58 no longer applies to latest dev (or rc2).

jcnventura’s picture

Issue tags: +Needs reroll
jonathanshaw’s picture

  /**
   * Determines whether the give job type requires support for unique jobs.
   *
   * @param string $job_type_id
   *   The job type ID.
   *
   * @return bool
   *   TRUE if the job type requires a queue backend that supports unique jobs;
   *   FALSE if support for unique jobs is not necessary.
   */
  public function jobTypeRequiresUniqueness($job_type_id) {
    $job_type_definition = $this->getDefinitions()[$job_type_id];
    return $job_type_definition['ensure_unique'] != AdvancedQueueJobType::UNIQUE_DONT_OVERWRITE;
  }

This should be return $job_type_definition['ensure_unique'] !== AdvancedQueueJobType::UNIQUE_DUPLICATES;
Otherwise the UNIQUE_DONT_OVERWRITE option gets ignored and duplicates are allowed.

ankithashetty’s picture

Issue tags: -Needs reroll
StatusFileSize
new30.65 KB
new7.02 KB

Re-rolled the patch in #58, made changes requested in #68 and fixed indentation issues in couple of files.

Retaining status as "Needs work" to address #62, thank you!

freality’s picture

This current solution has the uniqueness hash being created within \Drupal\advancedqueue\Job. Which means a JobType has no influence on its hash criteria. It would be great if hash creation was delegated to JobType; like JobTypeInterface::createHash(Job $job): string;.

Furthermore, all job definitions should have a hash value; with JobTypeBase defining a default implementation [which could simply returns a uuid, db serial, some random string val]. There'd be no need for any logic to determine if a job requires uniqueness.

chi’s picture

A couple of hypotetical issues related to creating hash from the payload.

1. What if payload format is changed during deployment. The payload may still have same meaning for consumers but the hash will be different.

2. What if payload contains some volatile data, i.e. timestamps. The consumers could ingore redundant data but to ensure jub uniqueness they must be removed manually.

joachim’s picture

I think both those cases can be dealt with by overriding createJobHash():

1. Write compatibility code in createJobHash() which ensures the hashes are made in the same way. Admittedly, that does mean you're forever using the OLD format. But changing payload format is a deployment event which arguably requires the queue to be emptied completely and repopulated, because the data is changing fundamentally.

2. Filter out the volatile data in createJobHash()

chi’s picture

Another relevant issue is ability to remove existing jobs. The typicall use case is like follows. A a job has been scheduled to notify users about some event, i.e. a new post on site. The related entity has been removed and you need to cancel the job before emails go out.

jsacksick’s picture

Issue tags: +Release blocker, +Prague2022
jonathanshaw’s picture

Another relevant issue is ability to remove existing jobs. The typicall use case is like follows. A a job has been scheduled to notify users about some event, i.e. a new post on site. The related entity has been removed and you need to cancel the job before emails go out.

I think this has to be out of scope here. Jobs should check they are still relevant at the point when they are executed, if there is a chance they are not. It's true you could try to (ab)use the job hash mechanism to do something like this, but I don't think we should burden ourselves with that as a consideration here. This feature is proving difficult to land even as it is.

zengenuity’s picture

StatusFileSize
new30.27 KB

Re-roll of the patch from #69.

rinasek’s picture

jcnventura’s picture

Issue tags: -Release blocker

This is a really nice to have, but the module is already in use in 4K sites, and doesn't have this feature yet. We're in Release Candidate status for 2 years already. Removing the 'Release blocker' tag, but would be good if this gets to RTBC soon(ish).

It can always be added to 1.1-beta1.

crzdev’s picture

StatusFileSize
new30.35 KB

Latest patch #76 unable to apply into 1.0.0-rc6 (or 8.x-1.x), adding reroll

SmovS made their first commit to this issue’s fork.

smovs’s picture

StatusFileSize
new30.65 KB

Updated patch #79 for applying to module ver. 8.x-1.0-rc7

smovs’s picture

StatusFileSize
new29.62 KB

Updated patch #79 for applying to module ver. 8.x-1.0-rc7 and removed logging in src/Plugin/AdvancedQueue/Backend/Database.php.
This logging adds a lot of logs while deploying and that is hard for maintenance.

jonathanshaw’s picture

@SmovS please post an interdiff when posting patches, or please make a commit to an existing MR. If you do this, we can see the proposed change to the patch; in this case it would make it easier to see the logging code you have removed.

+++ b/src/Plugin/AdvancedQueue/Backend/Database.php
@@ -2,22 +2,32 @@
+use Drupal\Core\Logger\LoggerChannelFactoryInterface;

@@ -26,6 +36,13 @@ class Database extends BackendBase implements SupportsDeletingJobsInterface, Sup
+  /**
+   * The logger factory.
+   *
+   * @var \Drupal\Core\Logger\LoggerChannelFactoryInterface
+   */
+  protected $loggerFactory;
+

@@ -37,13 +54,18 @@ class Database extends BackendBase implements SupportsDeletingJobsInterface, Sup
+   * @param \Drupal\Core\Logger\LoggerChannelFactoryInterface $logger_factory
+   *   The logger factory.
...
+    $this->loggerFactory = $logger_factory;

@@ -55,7 +77,9 @@ class Database extends BackendBase implements SupportsDeletingJobsInterface, Sup
+      $container->get('logger.factory')

All this logger code can be removed too as nothing is using $this->logger AFAICS.

borisson_’s picture

Added a review to the MR

emil stoianov’s picture

StatusFileSize
new39.79 KB

Latest changes by @smovs but as a patch file for those who need it.

joachim’s picture

> public $ensure_unique = self::UNIQUE_OVERWRITE;

I think the default value for this should be UNIQUE_DUPLICATES, otherwise behaviour will be changed for existing job type plugins.

EDIT: also, the constant names are confusing!

I thought UNIQUE_DONT_OVERWRITE = 'leave' meant 'make a new job'. But the comment for it says 'Do not enqueue the new job'.

joachim’s picture

Unit tests are currently broken because the MR introduces a dependency on the kernel to the Job class:

  public function toArray() {
SNIP
      'unique_id' => $this->getUniqueId(),
  }

// and then:

  public function getUniqueId() {
    if (!\Drupal::service('plugin.manager.advancedqueue_job_type')->jobTypeRequiresUniqueness($this->type)) {
SNIP
  }

Not sure how best to resolve this. Though I am not really keen on toArray() calling getUniqueId() to get the value rather than just returning what is set in the class. The order of precedence in getUniqueId / setUniqueId / constructor for handling $this->uniqueId doesn't seem right to me.

joachim’s picture

Another issue: we're using the phrase 'unique ID' in comments and method names and var names.

But there is *already* a unique ID -- the Job ID $job->id which is a unique ID. And it's fairly common in core to casually prefix the word 'ID' with 'unique'.

So I think it's going to be really confusing to differentiate between 'the job ID, which is a unique ID', and the 'unique ID which is a hash to check for uniqueness'.

joachim’s picture

I'd really like to rename the constants for these, as they're confusing:

  /**
   * Allow duplicate jobs.
   */
  const UNIQUE_DUPLICATES = 'allow'; // Doesn't say what happens to duplicates!

  /**
   * Do not enqueue the new job.
   */
  const UNIQUE_DONT_OVERWRITE = 'leave'; // 'don't overwrite' could mean the duplicate is written to a new job, which is the 'allow' case!

  /**
   * Overwrite existing job.
   */
  const UNIQUE_OVERWRITE = 'overwrite';

Something like:

  const UNIQUE_ALLOW_DUPLICATES = 'allow';
  const UNIQUE_SKIP_DUPLICATES = 'leave';
  const UNIQUE_OVERWRITE_DUPLICATES = 'overwrite';

Furthermore, what about making them an Enum now that we have those in PHP?

joachim’s picture

I'm also a bit confused about the need for BackendBase::ensureUniqueJobsSupport(). This says:

   * Backends that do not support unique jobs should call this from
   * enqueueJobs() to ensure that jobs that require uniqueness are not queued.

But the Queue class's enqueueJob/s() methods already have a check.

And the module's README has this code example showing that jobs should be queued by calling these:

$queue = Queue::load('default');
$queue->enqueueJob($job);

However! Kernel tests, both in this MR and the main branch, repeatedly do this:

    $this->firstQueue->getBackend()->enqueueJobs([$first_job, $third_job]);

Again, not sure how best to resolve it as it's a bit of a confusing situation.

jcnventura’s picture

Status: Needs work » Needs review
Issue tags: -KickstartPrague2022 +Needs issue summary update

Everyone.. I truly dislike issues going over 100 comments, and this one is really in danger of that.. Can I ask those that are interested in this to review and test this, please??

@joachim Can you please update the issue summary to explain what is the current status of this, and if there's still any blocker / caveat that would prevent this to be marked RTBC by someone else?

Setting this to "Needs review" as I'm not sure the "Needs work" that was added back in #69 is still applicable.

stmh’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: +HamburgContributionWeekend2024, +ContributionWeekend2024

I reviewed the code and have no objections in merging it back. I also tested the new functionality in a project where I need the functionality and it works flawlessly.

Mark as RTBC

joachim’s picture

I think this needs work, as per my various comments -- see #93 and #94.

stmh’s picture

Regarding your comments:

I think its safe to remove the comment, as enqueueJob and enqueueJobs will throw an exception if you try to add a JobItem that needs uniqueness and the queue does not support it.

Regarding the second comment. The Queue itself provides two function enqueueJob and enqueueJobs to add jobs, which looks fine to me.

joachim’s picture

Oh and #92 as well.

stmh’s picture

@joachim Oh yes, let's do this. I had my problems understanding them as well.

jcnventura’s picture

Status: Reviewed & tested by the community » Needs work

OK. Then we need to work on #92 and removing the first comment of #93?

stmh changed the visibility of the branch advancedqueue-2918866-2918866-add-support-for to hidden.

stmh’s picture

Worked on #92 and #93, see my latest commits.

joachim’s picture

Pushed a bunch of fixes. Tests now pass.

Still to do:

1. My comment in #93. I'm not sure how best to resolve this and it would be useful to have input from the original author of that code. But it looks like the checking in Queue was added in #48 back in 2019, and the checking in BackendBase even before that, so I don't know if those devs are even still around. Given that, I'd be inclined to remove the checks in Queue and let backends use the ensureUniqueJobsSupport() helper.

2. getDuplicateJobId could probably be made protected. That would make SupportsUniqueJobsInterface an empty interface, but there's already precedent for that in the backend interfaces.

3. I REALLY want to rename the DB column we're adding from unique_id to something like job_hash. It's going to be a source of confusion and bugs with the current name. I appreciate that sites are already using this patch.... but that's a risk you take when using a patch.

4. The MR makes changes to existing hook_update_N() implementations, which is out of scope AND generally not a good idea:

-  $schema = Database::getConnection()->schema();
-  $schema->addIndex('advancedqueue', 'queue_state', ['state'], $spec['advancedqueue']);
+  $database_schema = \Drupal::database()->schema();
+  $database_schema->addIndex('advancedqueue', 'queue_state', ['state'], $spec['advancedqueue']);

These were first introduced in #51 -- and later patches have added the same changes to newer implementations that were added to the module in the meantime!

bojanz’s picture

3. I REALLY want to rename the DB column we're adding from unique_id to something like job_hash. It's going to be a source of confusion and bugs with the current name. I appreciate that sites are already using this patch.... but that's a risk you take when using a patch.

FWIW I've been using "fingerprint" as the column name for this purpose in a few other places. It is fairly common in various libraries.

I do agree that "job_hash" sounds better than "unique_id" at least. Every ID is unique so the column name sounds confusing.

jcnventura’s picture

Just wanted to comment that I think this should not be committed to the module before we release 1.0, as adding new features is somewhat against what we should be doing in "rc" status.

Then again, we should probably release 1.0 ASAP unless someone has any objections, and if there is any, please comment on #3220292: Roadmap to 1.0.

jcnventura’s picture

1.0 has now been released. This can easily go into 1.1, we just need to get this in RTBC.

emil stoianov’s picture

StatusFileSize
new88.14 KB

The patch with latest updates for those who are not comfortable downloading merge requests.

jonathanshaw’s picture

I think part of the problem we're having with naming things and figuring out how this should be organised is that the semantics of 'unique' are unhelpful.

Let's discard the whole term "unique" and focus on the language of duplicates and how we handle duplicates: we need to detect duplicates (by hash), and then manage them.

Concretely this leads me to the following suggestions:
instead of a single Drupal\advancedqueue\Plugin\AdvancedQueue\Backend\SupportsUniqueJobsInterface we have:
Drupal\advancedqueue\Plugin\AdvancedQueue\Backend\SupportsDetectingDuplicateJobsInterface

Our annotations become:

/**
   * Allow duplicate jobs works with any backend.
   */
  public const DUPLICATES_ALLOW = 'allow';

  /**
   * Do not enqueue duplicate jobs. Requires a backend implementing SupportsDetectingDuplicateJobsInterface.
   */
  public const DUPLICATES_SKIP = 'skip';

  /**
   * Merge with existing job. Requires a backend implementing SupportsDetectingDuplicateJobsInterface and SupportsDeletingJobsInterface.
   */
  public const DUPLICATES_MERGE = 'merge';

Notice that DUPLICATES_MERGE I've said requires SupportsDeletingJobsInterface instead of some new SupportsUpdatingJobsInterface - this is because the simplest possible way of updating a job is to delete the old one and insert a new one.

I'm saying 'merge' here instead of the previous 'overwrite' because the basic idea in order to avoid committing us to what the payload should be of the final updated job; in this first implementation we should just overwrite the old job with the new one, but in future we might allow the job type plugin to have an opinion about how the payload of old and new job are combined.

jonathanshaw’s picture

I agree that BackendBase::ensureUniqueJobsSupport() does not seem necessary. It's unused in the Database backend which is the only one we offer.

I agree SupportsUniqueJobsInterface doesn't need a public getDuplicateJobId() method.

I dislike the jobTypeRequiresUniqueness() and getJobTypeUniquenessType() methods on the JobTypeManager. I understand the logic for wanting to check this without instantiating a plugin, but the semantics seems very ugly. I think it would be nicer to have this as just a single method on the Job class: Job::getDuplicatesPolicy() which itself calls out to JobTypeManager::getDefinition().

I agree $this->getBackend()->supportsJobType($job_type); is a nicer semantic than trying to figure this out on the Queue entity or in the JobTypeManager. Actually if we do this, then strictly speaking we don't even need the SupportsUnique/Duplicates interface at all.

jonathanshaw’s picture

I've opened a new MR as I'm making a significant number of changes here:
- using 'duplicates' as the key term instead of 'unique'
- introducing the term 'duplicates policy' to describe the duplicates handling strategy specified on a job type.
- using 'fingerprint' instead of 'unique id' or 'job hash'
- removing jobTypeRequiresUniqueness() and getJobTypeUniquenessType() methods from the JobTypeManager, in favour of new method Job::getDuplicatesPolicy()
- removing SupportsUniqueJobsInterface altogether, adding no new interface
- adding a verifyJobSupport() method to BackendInterface and BackendBase.

However, doing this I've realised that a lot of the reason why this issues is getting tricky goes back to #20:

Backends may or may not support unique jobs.

However, silently ignoring jobs that should be unique is dangerous, ... Commerce Recurring's jobs absolutely MUST NOT be used with a backend that does not support unique jobs, and therefore, I think we should guard against this here.

So therefore, I want to introduce a concept of a job type *requiring* that its jobs be treated as unique. This can be declared on the job type plugin annotation.

This goal makes sense. However, it is not really acheivable for the reasons pointed to in #93. It is perfectly possible and reasonable for custom code using this module to enqueue jobs with:
$queue->getBackend()->enqueueJobs([$jobs]);
This means that calling code has direct access to the backend plugins. We can't stop calling code from passing jobs to backend plugins that can't properly handle them, and we can't force custom or contrib backend plugins to verify they can handle a job properly.

Everything we are doing with SupportsUniqueJobsInterface, Queue::verifyQueueSatisfiesJob(), BackendInterface::verifyJobSupport(), JobTypeManager::jobTypeRequiresUniqueness() etc is basically wishful thinking. We are trying to create protective mechanisms but we can't ensure that anyone uses them.

I think at this point we should:
- rip out these mechanisms (which in MR11 are basically simplified to BackendInterface::verifyJobSupport())
- create a new issue for discussing and working on this rather tricky problem, which may apply to other future features not just this duplicates feature
- document on the annotation type that duplicates handling may not be safe when using backend other than the Database backend.

joachim’s picture

> Let's discard the whole term "unique" and focus on the language of duplicates and how we handle duplicates: we need to detect duplicates (by hash), and then manage them.

I sort of see what you're getting at, but the word 'duplicate' carries the same connotations as 'unique' -- ensuring there's only one of something.

> I think it would be nicer to have this as just a single method on the Job class: Job::getDuplicatesPolicy() which itself calls out to JobTypeManager::getDefinition().
> - removing jobTypeRequiresUniqueness() and getJobTypeUniquenessType() methods from the JobTypeManager, in favour of new method Job::getDuplicatesPolicy()

The Job class can't depend on the service container!!!! I *removed* that in my latest commits. It crashes the unit tests otherwise.

> Notice that DUPLICATES_MERGE I've said requires SupportsDeletingJobsInterface instead of some new SupportsUpdatingJobsInterface - this is because the simplest possible way of updating a job is to delete the old one and insert a new one.

Then it's not 'merge', it's 'replace'. Saying 'merge' when it's actually deleting the old one is misleading.

bojanz’s picture

Is there enough value in actually support the "merge" strategy? Or the concept of strategies at all?

In all other implementations I've used, a job/task is either unique/deduplicated (meaning that a job with the same fingerprint is rejected on enqueue) or it isn't.

My assumption without reading the code would be that I do:

// If I don't provide a fingerprint, it is autogenerated for me.
$job = Job::createUnique("my_type", $payload, ['fingerprint' => 'my-unique-value']);

enqueueJobs() can check if $job->getFingerprint() is non-empty, and if the backend does not implement SupportsUniqueJobsInterface, throw an exception.

jcnventura’s picture

I tend to support #113 as well. From the point of view of maintenance, it makes the code a lot simpler, and I think also from the point of view of an API user. As an API user, I'd try to create the job, and if I get the exception, I either propagate the error up, or I do something to make the job unique and try again. Normally, the API user is the one that would better understand the consequences of trying to create duplicate jobs.

jonathanshaw’s picture

Status: Needs work » Needs review

Thanks to everyone for the feedback.

What I've done is:
- simplified the job type annotation to a boolean
- given the job type plugin responsibility for deciding how duplicates should be handled
- made the base job type plugin's implementation simply throw an exception if there's a duplicate
- given the Queue entity, not the Database backend, responsibility for looking for duplicates.

This means that all of the duplicates functionality added in this issue is completely bypassed if you call $queue->getBackend()->enqueue(). But if you call $queue->enqueue() you get safe behavior whatever backend you use.

I believe in a follow-up issue we should deprecate Queue::getBackend(). This function creates a leaky abstraction of sorts; by putting API users in direct contact with the backend, we lose any ability to intermediate. Instead API users should interact with the queue entity, and we can use that to provide safety and backwards compatability as we add new features like this in future.

jonathanshaw’s picture

I like where this has got too. It seems clean and simple, but flexible enough to accomodate any use case.

jonathanshaw’s picture

Green at last.

adamfranco’s picture

StatusFileSize
new777 bytes

@jonathanshaw, there's a typo in Merge request 11 on lines 132-133 of advancedqueue.install. The variable initialized is $schema but undefined $database_schema name is used. See attached patch.

jonathanshaw’s picture

Thanks @adamfranco, I fixed the variable name in the update hook.

astonvictor’s picture

Status: Needs review » Needs work

Hi @jonathanjfshaw,

thanks for your work and your MR.

I tested your changes by:
- creating a new AdvancedQueueJobType plugin with allow_duplicates = false configuration;
- added a code to create a new job after submitting a custom form.

as a result after creating the second job with the same value, I got a fatal error - Drupal\advancedqueue\Exception\DuplicateJobException: Job rejected because it duplicates existing jobs.

I guess, we should simply not add the second job to the queue and add a log with a notice.

FYI tested with drupal 10.1.5, PHP 8.1

What do you think?

jonathanshaw’s picture

Status: Needs work » Needs review

I got a fatal error - Drupal\advancedqueue\Exception\DuplicateJobException: Job rejected because it duplicates existing jobs.

I guess, we should simply not add the second job to the queue and add a log with a notice.

Yes, that behavior is by design. This strategy was chosen by the maintainer in #114:

As an API user, I'd try to create the job, and if I get the exception, I either propagate the error up, or I do something to make the job unique and try again. Normally, the API user is the one that would better understand the consequences of trying to create duplicate jobs.

astonvictor’s picture

Status: Needs review » Reviewed & tested by the community

yeah, it works for me with try-catch.

I guess it should be somehow documented on the module page / readme.

tBKoT made their first commit to this issue’s fork.

tbkot’s picture

StatusFileSize
new28.32 KB

Patch with the latest changes from MR to be used in #2931290: Multiple recurring orders get created if cron is badly configured

tbkot’s picture

StatusFileSize
new28.37 KB

Updated patch for 1.0 version instead of 1.x-dev branch

  • jcnventura committed 20d1bb3e on 8.x-1.x
    Issue #2918866 by jonathanshaw, joachim, SmovS, brunodbo, Evaldas...

jcnventura’s picture

Priority: Normal » Major
Status: Reviewed & tested by the community » Fixed
Issue tags: -Needs issue summary update +Needs change record

Thanks everyone! And there were a lot of you. As to how to document, this.. I guess it will be part of the release notes when the next release is created. If someone would be kind enough to create a change record, that would be amazing.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

khiminrm’s picture

StatusFileSize
new27.2 KB
new630 bytes

Hi!

When testing the patch, I've noticed that it's not always working properly.

I will try to explain. I've tested it with commerce_recurring.
The job payload is simple: ['order_id' => $order->id()]
So I've run the 'Commerce Recurring' cron job to process a recurring order. The job has created two queue items:
with types commerce_recurring_order_close and commerce_recurring_order_renew with payload {"order_id":"230185"}.
Then I've run the 'Advanced Queue' cron job. It has processed the queue items. One of them was failed due to another issue not related to this one. And another one was successfully processed. So I had two queue items with states failure and

success</code at those moment.
But after running the  'Commerce Recurring' cron job again, two new queue items with the same payload and fingerprint were created. So the items were duplicated.
And if ran the the  'Commerce Recurring' cron job again without running the the 'Advanced Queue' cron job before it, the duplicates are not created, because the items had state <code>queued

and they were found as duplicated.

So, I think we should remove condition where we check only two states in the getDuplicateJobs() method:
->condition('state', [Job::STATE_QUEUED, Job::STATE_PROCESSING, Job::STATE_FAILURE], 'IN');

Please, review and re-test if possible. I'm attaching the updated patch from #125 for 1.0 and interdiff.

Thanks in advance!

jsacksick’s picture

Hm... Perhaps in some cases we still want to allow queuing jobs with the same fingerprint?
One usecase I could think of is a job that notifies a 3rd party system after an entity was updated, you probably wouldn't to notify twice for the same update, but would still want to notify about subsequent updates but I'm guessing in this case the payload should contain what's supposedly unique about the update i.e. the changed timestamp for example in addition to the entity ID?

Not sure, but in the case of these recurring jobs, we never want to queue duplicate jobs...

khiminrm’s picture

bojanz’s picture

By definition a fingerprint must be unique. Therefore it should be impossible to queue two jobs with the same fingerprint. Producers who wish to run a job after all can influence the fingerprint by including an always-unique component such as a timestamp.

jsacksick’s picture

@bojanz: Agreed, wanted to come back and edit my message, but Roman posted a patch in that sense, a fingerprint should be unique and it'd be the responsability of the job type to make sure it is for usecases I mentioned in #132.

zaporylie’s picture

Draft Change Record was created: https://www.drupal.org/node/3468315. Feel free to adjust and publish before making a tagged release that includes this feature. Thanks, everyone for a great job on this, hope to see it as part of the tagged release in a very short time 😉

jonathanshaw’s picture

We overlooked something: we didn't harden the logic in the queue to handle the possibility that the job plugin's handleDuplicateJobs() method might return null to cancel the job.

I opened a follow-up: #3476814: Handle nulled duplicate jobs