Add support for unique jobs

Lithuanian

Kaunas, LT

commented 29 May 2018 at 12:20

Status:

Active

» Needs review

Status	File	Size
new	advancedqueue-unique-jobs-2918866-9.patch	6.41 KB

Made a unique jobs support. Quite similar to that was made in D7.

Used key instead of the hash, because I think, that job creator should decide how to handle uniqueness.
In my use case, I have different job types for the same object, so it was essential to have own key to override other jobs.

Log in or register to post comments

Comment #10

29 May 2018 at 12:23

Status:

Needs review

» Needs work

The last submitted patch, 9: advancedqueue-unique-jobs-2918866-9.patch, failed testing. View results

Log in or register to post comments

Comment #11

Lithuanian

Kaunas, LT

commented 29 May 2018 at 12:51

Status:

Needs work

» Needs review

Status	File	Size
new	advancedqueue-unique-jobs-2918866-10.patch	6.42 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	advancedqueue-unique-jobs-2918866-9.patch	6.41 KB

Trying to fix MySQL error: the key is too long.

Log in or register to post comments

Comment #12

29 May 2018 at 12:59

Status:

Needs review

» Needs work

The last submitted patch, 11: advancedqueue-unique-jobs-2918866-10.patch, failed testing. View results

Log in or register to post comments

Comment #13

Lithuanian

Kaunas, LT

commented 29 May 2018 at 13:05

Status:

Needs work

» Needs review

Status	File	Size
new	advancedqueue-unique-jobs-2918866-11.patch	6.46 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	advancedqueue-unique-jobs-2918866-10.patch	6.42 KB

Fixed - Undefined index: unique_key

Log in or register to post comments

Comment #14

29 May 2018 at 13:08

Status:

Needs review

» Needs work

The last submitted patch, 13: advancedqueue-unique-jobs-2918866-11.patch, failed testing. View results

Log in or register to post comments

Comment #15

Lithuanian

Kaunas, LT

commented 29 May 2018 at 16:48

Status	File	Size
new	advancedqueue-unique-jobs-2918866-15.patch	6.89 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	advancedqueue-unique-jobs-2918866-11.patch	6.46 KB

Fixed last failed test

Log in or register to post comments

Comment #16

Lithuanian

Kaunas, LT

commented 29 May 2018 at 16:57

Status:

Needs work

» Needs review

Log in or register to post comments

Comment #17

Lithuanian

Kaunas, LT

commented 29 May 2018 at 17:00

Status	File	Size
new	advancedqueue-unique-jobs-2918866-17.patch	6.89 KB

FIxed coding standard.

Log in or register to post comments

Comment #18

Lithuanian

Kaunas, LT

commented 30 May 2018 at 13:18

1 file was hidden/shown/deleted

Status	File	Size
hidden	advancedqueue-unique-jobs-2918866-15.patch	6.89 KB

Log in or register to post comments

Comment #19

joachim commented 30 November 2018 at 13:16

> Have an empty SupportsUniqueJobsInterface that tells the outside world whether it's supported, or silently ignored.

My first thought was to define whether a backend plugin supports uniqueness or not on the plugin annotation.

The advantage of using the plugin definition is that a child class can change the support from TRUE to FALSE, whereas you can't remove an interface from a child class in PHP.

Log in or register to post comments

Comment #20

joachim commented 30 November 2018 at 14:47

Status	File	Size
new	2918866-20.advancedqueue.Add-support-for-unique-jobs.patch	20.62 KB

The patch at #17 doesn't work in the way that previous comments were suggesting.

It's requiring the caller to handle passing in a unique key of their own devising, which is fiddly & also open to errors. (Case in point: while writing my patch, my test was failing because the code I wrote to create a unique key from a hash wasn't considering the queue name in the hash!)

So here is an updated patch with an aproach in like with what @bojanz was suggesting.
(Though @Evaldas Užkuras should be credited as I've kept all the .install file code, which was much appreciated as I hate writing DB update code!)

@bojanz concurred with this in #commerce on slack, which I'm taking the liberty of quoting:

> my intention was to allow callers to skip defining a unique key
> every time they enqueue a job
> I found that a bit annoying in D7

in #5:

> I like the idea of a hash, though I'm not sure sure the job type comparison override is needed.

I agree. Let's leave that off for now, as I don't know of any use cases where the job type plugin might want to customize how the hash is created. It's therefore most likely a YAGNI.

So for the hash, we need to ensure a job is unique only within its type and its queue. Otherwise, we could queue two different jobs for the same order, and the second one would be ignored! Therefore the hashing uses the job queue, job type, and the serialized payload:

 * This supports unique jobs. Two jobs of the same type and in the same queue
 * are considered unique if their payloads are identical. Jobs with the same
 * payload in different queues, or of different types, are not considered to be
 * identical.

> Have an empty SupportsUniqueJobsInterface that tells the outside world whether it's supported, or silently ignored.

Backends may or may not support unique jobs.

However, silently ignoring jobs that should be unique is dangerous, and as we've seen at #2931290: Multiple recurring orders get created if cron is badly configured can be extremely detrimental to a site (with Commerce Recurring, it leads to duplicate orders, which will in time lead to customers being overcharged, which is pretty bad!). Commerce Recurring's jobs absolutely MUST NOT be used with a backend that does not support unique jobs, and therefore, I think we should guard against this here.

So therefore, I want to introduce a concept of a job type *requiring* that its jobs be treated as unique. This can be declared on the job type plugin annotation.

I don't think we need finer granularity than this by allowing this on a job-by-job basis. The use cases I've seen, it's the job type as a whole that requires job payloads to be treated as unique. (If for some reason you need some jobs to be unique and some not, you can define a 2nd job type that inherits the class and has a different annotation, or you can cheat and chuck a timestamp into your payload ;)

However, what I don't massively like about the code I've written for this is that Database::enqueueJobs() has to call parent::enqueueJobs() to check that the backend is compatible with the jobs to queue. And obviously, any other backend implementations that exist in the wild will have to add that call to the parent too. Suggestions on how to make this more elegant welcome!

Log in or register to post comments

Comment #21

brunodbo

he/him

commented 12 December 2018 at 19:00

Tested the patch in #20: I added ensure_unique = TRUE to my job type plugin's annotation, and am seeing the hash being created when a job is enqueued.

I'm seeing the following behavior though, maybe I'm misunderstanding how this is supposed to work:

- When I run cron, an existing job record's 'available date' is being updated (I haven't specified a delay for this job when enqueuing it)
- During the same cron run, the previous 'version' (i.e., with the previous 'available date') of the same job gets processed, leading to the same job being processed over and over again in subsequent cron runs. In my testing, the job never reached completion.

Is this expected? I was expecting the job record wouldn't change at all once it's been created, if the job type requires uniqueness, to guard against multiple processing. Does the prevention of multiple processing has to be taken care of in the job type plugin somehow?

Log in or register to post comments

Comment #22

joachim commented 12 December 2018 at 20:35

> - When I run cron, an existing job record's 'available date' is being updated (I haven't specified a delay for this job when enqueuing it)

That's an interesting question. IIRC the 'available_date' property on a job is the date it's available from, and by default, that's the moment it was placed in the queue. Arguably, attempting to re-queue it shouldn't change that date, because it's still the case that the job was available from the time it was queued the first time.

> - During the same cron run, the previous 'version' (i.e., with the previous 'available date') of the same job gets processed, leading to the same job being processed over and over again in subsequent cron runs. In my testing, the job never reached completion.

Ouch.

So the job got loaded and processed by one cron hook, and then requeued by another cron hook? But why didn't cron hook 1 mark the job as complete?

Can you explain the different cron hooks you had running, and their exact order?

Log in or register to post comments

Comment #23

brunodbo

he/him

commented 12 December 2018 at 21:55

> Arguably, attempting to re-queue it shouldn't change that date, because it's still the case that the job was available from the time it was queued the first time.

I would say that the attempt to re-queue a job shouldn't change the date it should be available. Otherwise, there might be cases that a job never becomes available? For example, if I queue a job that should send a message at 'now + 30 days', and the 'available date' for that job keeps getting changed with 'now + 30 days', the job would never become available.

Should we add a check in \Drupal\advancedqueue\Plugin\AdvancedQueue\Backend\Database::enqueueJobs to see if the job exists, and only call $job->setAvailableTime($this->time->getCurrentTime() + $delay); if it's a new job? Or simply bail if we have an existing job?

> So the job got loaded and processed by one cron hook, and then requeued by another cron hook? But why didn't cron hook 1 mark the job as complete?

Yep, that's exactly what it looks like.

> Can you explain the different cron hooks you had running, and their exact order?

I have the 'Advanced queue' cron hook running first, and then my cron hook that queues the job runs right after (using Ultimate cron to check the order, but didn't change the order). Is it possible that the job gets re-queued because it isn't completed yet by the time my cron hook runs? When I disable my cron hook that does the queuing, the existing job completes (state 'Success').

Log in or register to post comments

Comment #24

joachim commented 13 December 2018 at 15:49

> Is it possible that the job gets re-queued because it isn't completed yet by the time my cron hook runs?

Do you have cron jobs running in parallel? My understanding is that with a normal setup they run in sequence, so AQ should finish processing the job and decide whether it succeeded before your own cron hook tries to queue things.

Can you add some logging statements to your code to get some finer detail on the order of operations?

Also, if you were able to reproduce this in a kernel test, that would be great!

Log in or register to post comments

Comment #25

joachim commented 13 December 2018 at 15:50

> I would say that the attempt to re-queue a job shouldn't change the date it should be available. Otherwise, there might be cases that a job never becomes available? For example, if I queue a job that should send a message at 'now + 30 days', and the 'available date' for that job keeps getting changed with 'now + 30 days', the job would never become available.

Ok, I'm convinced :)

> Should we add a check in \Drupal\advancedqueue\Plugin\AdvancedQueue\Backend\Database::enqueueJobs to see if the job exists, and only call $job->setAvailableTime($this->time->getCurrentTime() + $delay); if it's a new job? Or simply bail if we have an existing job?

Yes, quite probably. Which will make it slower than the merge query, unless there's a way we can set the merge query to not bother making a change at all if the item already exists.

We basically shouldn't change an existing job at all when attempting to queue a duplicate.

Log in or register to post comments

Comment #26

brunodbo

he/him

commented 13 December 2018 at 20:48

> Do you have cron jobs running in parallel? My understanding is that with a normal setup they run in sequence, so AQ should finish processing the job and decide whether it succeeded before your own cron hook tries to queue things.

Apparently Ultimate Cron runs cron jobs in parallel, I just learned from its project page (I had only installed it yesterday to review the cron jobs' order) :) So with Ultimate Cron installed, my job gets queued and processed during the same cron run, since the cron jobs run in parallel.

I did some more poking around, added some logging code as you suggested, and realized that the 'unexpected' re-queuing I mentioned in #21 is my own doing :D. During my queuing cron hook, I look for users that are flagged, and queue a job to send a message to those users. But then in the queued job, I don't do anything that would prevent the same job from getting queued again (e.g., users aren't unflagged, since the job just sends a message)...

> We basically shouldn't change an existing job at all when attempting to queue a duplicate.

What if we add a $job->exists() method that checks for a job with a matching hash, so we can call that at the start of \Drupal\advancedqueue\Plugin\AdvancedQueue\Backend\Database::enqueueJobs? I can imagine such a method could be useful in other places as well.

Log in or register to post comments

Comment #27

brunodbo

he/him

commented 14 December 2018 at 00:13

Status	File	Size
new	2918866-27.advancedqueue.Add-support-for-unique-jobs.patch	15.33 KB

This adds a hasDuplicateJob method to the Database backend to check whether the to-be-queued job already has a matching job. Then in enqueueJobs(), we check for a duplicate, and return if we have one. If we don't, we fill in the unique_key field in the insert query. Not sure if we still need the merge query?

The above also allows me to do something like this when queuing jobs, so enqueueJob() doesn't get called in the first place:

if (!$queue->getBackend()->hasDuplicateJob($job)) {
  $queue->enqueueJob($job);
}

What do you think?

I'm also wondering whether we should move the unique key creation to the Job class, though it makes sense that the backend is responsible for deciding how to implement uniqueness...

Log in or register to post comments

Comment #28

14 December 2018 at 00:18

Status:

Needs review

» Needs work

The last submitted patch, 27: 2918866-27.advancedqueue.Add-support-for-unique-jobs.patch, failed testing. View results

Log in or register to post comments

Comment #29

brunodbo

he/him

commented 14 December 2018 at 01:15

Status:

Needs work

» Needs review

Status	File	Size
new	2918866-29.advancedqueue.Add-support-for-unique-jobs.patch	21.29 KB

Better patch.

Log in or register to post comments

Comment #30

14 December 2018 at 01:17

Status:

Needs review

» Needs work

The last submitted patch, 29: 2918866-29.advancedqueue.Add-support-for-unique-jobs.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

Log in or register to post comments

Comment #31

brunodbo

he/him

commented 14 December 2018 at 01:53

Status	File	Size
new	2918866-31.advancedqueue.Add-support-for-unique-jobs.patch	21.24 KB

2 files were hidden/shown/deleted

Status	File	Size
hidden	2918866-27.advancedqueue.Add-support-for-unique-jobs.patch	15.33 KB
hidden	2918866-29.advancedqueue.Add-support-for-unique-jobs.patch	21.29 KB

Not sure yet why DatabaseBackendTest is failing, but this fixes an issue in the foreach in enqueueJobs() (continue instead of return), and should fix the coding standards test fail.

Log in or register to post comments

Comment #32

brunodbo

he/him

commented 14 December 2018 at 01:52

Status:

Needs work

» Needs review

Log in or register to post comments

Comment #33

14 December 2018 at 02:04

Status:

Needs review

» Needs work

The last submitted patch, 31: 2918866-31.advancedqueue.Add-support-for-unique-jobs.patch, failed testing. View results

Log in or register to post comments

Comment #34

joachim commented 14 December 2018 at 10:38

> I did some more poking around, added some logging code as you suggested, and realized that the 'unexpected' re-queuing I mentioned in #21 is my own doing :D. During my queuing cron hook, I look for users that are flagged, and queue a job to send a message to those users. But then in the queued job, I don't do anything that would prevent the same job from getting queued again (e.g., users aren't unflagged, since the job just sends a message)...

Thanks for investigating that one!

So should it be the case that we don't queue a duplicate if the original is currently claimed?

We definitely need to consider a completed job as no longer unique -- Commerce Recurring requires this, for instance, as every billing period (eg every month) it will queue a job for the same order.

A few points about the latest patch (some of which are to do with my previous work, feel free to ignore those and I will get round to them later!) --

+++ b/src/Plugin/AdvancedQueue/Backend/BackendBase.php
@@ -133,6 +145,28 @@ abstract class BackendBase extends PluginBase implements BackendInterface, Conta
+  /**
+   * {@inheritdoc}
+   */
+  public function enqueueJobs(array $jobs, $delay = 0) {

On reflection, I think I want to move this code to its own method.

It doesn't fix the problem that plugin implementations need to call it, but it's cleaner and better DX than having to call parent::enqueueJobs().

+++ b/src/Plugin/AdvancedQueue/Backend/Database.php
@@ -10,12 +11,17 @@ use Symfony\Component\DependencyInjection\ContainerInterface;
+ * This supports unique jobs. Two jobs of the same type and in the same queue
+ * are considered unique if their payloads are identical. Jobs with the same
+ * payload in different queues, or of different types, are not considered to be
+ * identical.

Needs to mention job status as well.

+++ b/src/Plugin/AdvancedQueue/Backend/Database.php
@@ -143,6 +165,56 @@ class Database extends BackendBase implements SupportsDeletingJobsInterface, Sup
+      ->condition('unique_key', $unique_key)
+      ->fields('a', ['job_id'])
+      ->countQuery()
+      ->execute()
+      ->fetchField();

This needs to consider the job status, as mentioned above. A completed job doesn't count as a duplicate.

```
+  public function hasDuplicateJob(Job $job) {
```
This should be added to SupportsUniqueJobsInterface.

Log in or register to post comments

Comment #35

brunodbo

he/him

commented 15 December 2018 at 01:19

> So should it be the case that we don't queue a duplicate if the original is currently claimed?

I think so. That would mean that a job is unique (and shouldn't be re-queued) until it reaches the completed state.

We'll also a need a way to reflect the job's state in the unique key handling, to avoid integrity constraint violations for the 'unique_key' field when queuing a next iteration of the job (i.e., when a job is completed, and we queue it again). Should we update/empty the 'unique_key' field when a job reaches completion?

Log in or register to post comments

Comment #36

joachim commented 15 December 2018 at 09:07

> We'll also a need a way to reflect the job's state in the unique key handling,

That's not going to work. A job that's being queued is always going to be in the 'queued' state (or whatever, post-breakfast, don't have the code in front of me). We have a use case where the same order gets queued, say, once a month. So that's going to have the same hash each time.

> to avoid integrity constraint violations for the 'unique_key' field when queuing a next iteration of the job (i.e., when a job is completed, and we queue it again). Should we update/empty the 'unique_key' field when a job reaches completion?

... and because of this use case, we actually shouldn't have a database-level unique key constraint on the unique key column. That's actually a good catch, as up till now, our use case of 'same order should get queued a month later' would cause a DB crash!! We should add a test for this.

We're not making use of that constraint anyway, because we now do a DB load to check for a job before doing an insert.

Log in or register to post comments

Comment #37

joachim commented 15 December 2018 at 09:42

Oh, another thing to do -- we should log a notice when a duplicate job is rejected. At the moment, the patch just swallows the error. But that could lead to real developer headaches if there's a bug that's incorrectly causing a job to be a duplicate.

BTW, I really appreciate all your help with this issue! I thought the patch was pretty much done, and you've brought loads of problems to light! Yay for peer review!

Log in or register to post comments

Comment #38

brunodbo

he/him

commented 18 December 2018 at 03:02

Status	File	Size
new	2918866-38.advancedqueue.Add-support-for-unique-jobs.patch	21.46 KB

Happy to help!

#34:

1. Attempt at refactoring this in attached patch. Method name suggestions welcome :)

2. Done, though could be made easier to read. Maybe split it up in 2 paragraphs?

3. WIP: jobs will be re-queued if previous jobs with the same hash have the state 'success'. There is still an issue with the query in \Drupal\advancedqueue\Plugin\AdvancedQueue\Backend\Database::hasDuplicateJob (which is likely the reason the tests are failing): $unique_key has a different value than what is stored in the db, so it might be something with the way the hash field is being checked in the query?

4. Done

#36:

> we actually shouldn't have a database-level unique key constraint on the unique key column.

Unique key constraint removed in attached patch.

> We should add a test for this.

Where should this test go? Testing newbie here...

#37:

Should we add the logger in this patch (AQ doesn't have a logger channel yet AFAICT)? Or should that be its own issue?

Log in or register to post comments

Comment #39

brunodbo

he/him

commented 18 December 2018 at 03:03

Status:

Needs work

» Needs review

Log in or register to post comments

Comment #40

18 December 2018 at 03:08

Status:

Needs review

» Needs work

The last submitted patch, 38: 2918866-38.advancedqueue.Add-support-for-unique-jobs.patch, failed testing. View results

Log in or register to post comments

Comment #41

joachim commented 21 December 2018 at 14:48

Status:

Needs work

» Needs review

Status	File	Size
new	2918866-41.advancedqueue.Add-support-for-unique-jobs.patch	23.19 KB
new	interdiff.2918866.38-41.txt	8.14 KB

Changes since the last patch:

- changed docs for hasDuplicateJob() not to mention key, as it's an implementation detail.
- fixed duplicate job check to consider queued or processing states only.
- add mention of states to hasDuplicateJob() docs.
- fixed duplicate job check not working due to no queue ID.
- changed method name.
- make protected, add to docs.
- removed pointless base method
- fixed non unique backend not calling helper.
- added logging.

Still todo, maybe:

- I'm not convinced hasDuplicateJob() should be public, on second thoughts. Queue users shouldn't need to worry about this, they should be able to let the queue handle this for them.
- I'm tempted to rename the unique_key DB column to something like hash, as it's not unique in the database sense of the word, so could confused developers.

Log in or register to post comments

Comment #42

brunodbo

he/him

commented 26 January 2019 at 21:05

Status	File	Size
new	2918866-42.advancedqueue.Add-support-for-unique-jobs.patch	23.16 KB
new	interdiff_41-42.txt	3.53 KB

I'm not convinced hasDuplicateJob() should be public, on second thoughts. Queue users shouldn't need to worry about this, they should be able to let the queue handle this for them.

Yep, think so too - seems like that should be an internal thing. Didn't change this in the attached patch yet - should we do away with the interface then (since interface methods should be public)?

I'm tempted to rename the unique_key DB column to something like hash, as it's not unique in the database sense of the word, so could confused developers.

Yeah, I got confused by that as well :) Changed that in the patch here, and rename createJobUniqueKey() to createJobHash()

Also rewrote the docs for hasDuplicateJob() a little, to (hopefully) make it easier to read.

Log in or register to post comments

Comment #43

English

Stroud, UK

commented 3 August 2019 at 11:31

should we do away with the interface then (since interface methods should be public)?

We still need a way for a backend to be able to declare that it can handle jobs that require uniqueness, so that we can throw an exception if a job that requires uniqueness is passed to a backend that can't provide that.

But this doesn't have to be done by an interface: we could add a new annotation to the backend plugin type instead.

Log in or register to post comments

Comment #44

thronedigital commented 15 October 2019 at 23:52

Any updates?

Log in or register to post comments

Comment #45

gordon commented 19 December 2019 at 23:03

Status:

Needs review

» Needs work

I have done some investigation into this as I required this for a client but it was not a good fit even though this would make the implementation very clean.

1. the method of using a hash for the uniqueness based upon the payload I think is not a good idea. Since it is based upon the entire payload you can't add in additional variable parameters. eg. I want the uniqueness based up on the node id, however I need to add in other data as well which I don't want to be a part of the uniqueness such as the a workflow status which I may want to change later.

2. So if you try to enqueue a new version of an existing job, if will simply discard the new job, in my case I don't care about the existing job but want to replace it with the new one, which may have a new available date, or other changes.

3. Also the enqueuing will crash if you put try to put a job which is unique onto a queue which backend doesn't support uniqueness. Since there is only the database backend this is not a huge problem but will create issues if/when alternate backends are created.

Log in or register to post comments

Comment #46

joachim commented 20 December 2019 at 06:55

>1. the method of using a hash for the uniqueness based upon the payload I think is not a good idea. Since it is based upon the entire payload you can't add in additional variable parameters. eg. I want the uniqueness based up on the node id, however I need to add in other data as well which I don't want to be a part of the uniqueness such as the a workflow status which I may want to change later.

To address that, we can do this:

> - add a method to the job plugin interface to let each job type plugin define how a job payload gets turned into a hash (as there may be some data in the payload that should be ignored when considering what makes two jobs 'the same')

Log in or register to post comments

Comment #47

English

Stroud, UK

commented 20 December 2019 at 10:31

2. So if you try to enqueue a new version of an existing job, if will simply discard the new job, in my case I don't care about the existing job but want to replace it with the new one, which may have a new available date, or other changes.

This use case interests me too.

Log in or register to post comments

Comment #48

gordon commented 23 December 2019 at 05:44

Status:

Needs work

» Needs review

Status	File	Size
new	2918866-48.patch	33.29 KB
new	interdiff-2918866-42-48.txt	16.48 KB

I have made some changes so that it will now accept an unique id. So as a part of the definition you can when creating a job you can specify it and it will be used. However it will only use this if the job type is a unique type, otherwise

$job->getUniqueId();

Will alway return NULL.

If you do not specify the unique id, then it will fall back to generating a hash from the payload.

Also I have changed the unique flag on the job to be one of the following

UNIQUE_DUPLICATES
UNIQUE_DONT_OVERWRITE
UNIQUE_OVERWRITE

So now you can specify if you want to overwrite the unique items or ignore it.

Lastly so you can I have changed how Job::create() works so you can specify the unique id using this interface rather than using new Job() to do this.

$job = Job::create('unique_queue', ['nid' => 123], ['unique_id' => 'node:123']);

So in the 3rd parameter it is just a $definitions array and gets added to the payload, queue id and queue status.

I also added an index on the unique id field so that it will run faster.

I also found a couple of bugs in that the state and the available were getting updated on the job after it was being converted into an array.

Lastly I found a couple of bugs in that the

Log in or register to post comments

Comment #49

gordon commented 24 December 2019 at 00:55

Status	File	Size
new	2918866-49.patch	33.18 KB
new	interdiff-2918866-42-49.txt	16.48 KB

2 files were hidden/shown/deleted

Status	File	Size
hidden	2918866-48.patch	33.29 KB
hidden	interdiff-2918866-42-48.txt	16.48 KB

found some bugs that broke it. fixed now.

Log in or register to post comments

Comment #50

English

Stroud, UK

commented 4 February 2020 at 10:45

Status:

Needs review

» Needs work

We lack test coverage for the changes introduced in #48.

Log in or register to post comments

Comment #51

he/him/his

English

Brooklyn, NY

commented 1 March 2020 at 17:14

Status	File	Size
new	2918866-50.patch	29.89 KB

Reroll plus some minor clean up

Log in or register to post comments

Comment #52

he/him/his

English

Brooklyn, NY

commented 1 March 2020 at 17:31

Status	File	Size
new	2918866-52.patch	29.92 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	2918866-50.patch	29.89 KB

Sorry, ignore #51

Log in or register to post comments

Comment #53

he/him/his

English

Brooklyn, NY

commented 1 March 2020 at 17:39

Status	File	Size
new	2918866-53.patch	29.92 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	2918866-52.patch	29.92 KB

Third time is a charm.

Log in or register to post comments

Comment #54

English

Stroud, UK

commented 2 March 2020 at 00:27

Status:

Needs work

» Needs review

Let's see what the testbot says

Log in or register to post comments

Comment #55

2 March 2020 at 00:30

Status:

Needs review

» Needs work

The last submitted patch, 53: 2918866-53.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

Log in or register to post comments

Comment #56

he/him/his

English

Brooklyn, NY

commented 2 March 2020 at 01:01

Status	File	Size
new	2918866-56.patch	29.92 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	2918866-53.patch	29.92 KB

Whoops, wrong folder.

Log in or register to post comments

Comment #57

gordon commented 18 March 2020 at 23:50

Status	File	Size
new	interdiff-2918866-49-56.txt	2.83 KB

Just adding an interdiff for the changes between 49 and 56

Log in or register to post comments

Comment #58

gordon commented 19 March 2020 at 00:03

Status	File	Size
new	2918866-58.patch	31.69 KB
new	interdiff-2918866-56-58.txt	684 bytes

Since I am running this in live, I had a problem when updating because I would run the update on this patch again. I added in a check to make sure it does try to run this twice.

Log in or register to post comments

Comment #59

martijn de wit

Dutch

🇳🇱 The Netherlands

commented 12 May 2020 at 15:11

Status:

Needs work

» Needs review

Log in or register to post comments

Comment #60

12 May 2020 at 15:15

Status:

Needs review

» Needs work

The last submitted patch, 58: 2918866-58.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

Log in or register to post comments

Comment #61

andypost

he/him

Russian

commented 14 August 2020 at 00:10

12 files were hidden/shown/deleted

Status	File	Size
hidden	advancedqueue-unique-jobs-2918866-17.patch	6.89 KB
hidden	2918866-20.advancedqueue.Add-support-for-unique-jobs.patch	20.62 KB
hidden	2918866-31.advancedqueue.Add-support-for-unique-jobs.patch	21.24 KB
hidden	2918866-38.advancedqueue.Add-support-for-unique-jobs.patch	21.46 KB
hidden	2918866-41.advancedqueue.Add-support-for-unique-jobs.patch	23.19 KB
hidden	interdiff.2918866.38-41.txt	8.14 KB
hidden	2918866-42.advancedqueue.Add-support-for-unique-jobs.patch	23.16 KB
hidden	interdiff_41-42.txt	3.53 KB
hidden	2918866-49.patch	33.18 KB
hidden	interdiff-2918866-42-49.txt	16.48 KB
hidden	2918866-56.patch	29.92 KB
hidden	interdiff-2918866-49-56.txt	2.83 KB

There's already implementation in https://www.drupal.org/project/queue_unique (it using md5 hash, that's only thing to improve)

Log in or register to post comments

Comment #62

andypost

he/him

Russian

commented 16 August 2020 at 19:46

+++ b/src/Entity/Queue.php
@@ -127,6 +134,11 @@ class Queue extends ConfigEntityBase implements QueueInterface {
+    if ($this->jobTypeManager()->jobTypeRequiresUniqueness($job->getType()) && !$this->getBackend() instanceof SupportsUniqueJobsInterface) {
+      throw new \InvalidArgumentException($this->t("Backend :type doesn't support unique jobs", [':type' => $this->getBackend()->getLabel()]));

maybe it should just skip queuing when item already queued or other reaction where same item in processing

+++ b/src/Job.php
@@ -398,6 +408,52 @@ class Job {
+  public function getUniqueId() {
+    if (!\Drupal::service('plugin.manager.advancedqueue_job_type')->jobTypeRequiresUniqueness($this->type)) {
+      return NULL;

needs injection of service

+++ b/src/Job.php
@@ -398,6 +408,52 @@ class Job {
+    // crc32b is the fastest but has collisions due to its short length.
+    // sha1 and md5 are forbidden by many projects and organizations.
+    // This is the next fastest option.
+    $hash = hash('tiger128,3', $data);

this hashing better to keep configurable via some container parameter

+++ b/src/Plugin/AdvancedQueue/Backend/BackendBase.php
@@ -133,6 +145,36 @@ abstract class BackendBase extends PluginBase implements BackendInterface, Conta
+          throw new InvalidBackendException(sprintf(
+            "Job of type %s with payload %s cannot be queued in queue %s because the queue's backend does not support unique jobs.",
+            $job->getType(),
+            serialize($job->getPayload()),

serialization could be dangerous/impossible

Log in or register to post comments

Comment #63

English

Stroud, UK

commented 16 August 2020 at 20:12

serialization could be dangerous/impossible

Really? I thought the payload got stored in a database field by the the database backend, so that probably does some serlializing.

Log in or register to post comments

Comment #64

andypost

he/him

Russian

commented 16 August 2020 at 21:29

This exception is thrown when nothing did check if the job serializable ... so you'll get serialization exception instead of expected

+++ b/src/Plugin/AdvancedQueue/Backend/BackendBase.php
@@ -133,6 +145,36 @@ abstract class BackendBase extends PluginBase implements BackendInterface, Conta
+        if ($this->jobTypeManager->jobTypeRequiresUniqueness($job->getType())) {
+          throw new InvalidBackendException(sprintf(
+            "Job of type %s with payload %s cannot be queued in queue %s because the queue's backend does not support unique jobs.",
+            $job->getType(),
+            serialize($job->getPayload()),
+            $this->queueId

+++ b/src/Plugin/AdvancedQueue/Backend/Database.php
@@ -127,21 +150,70 @@ class Database extends BackendBase implements SupportsDeletingJobsInterface, Sup
+        $this->loggerFactory->get('advanced_queue')->notice('Duplicate job of type %type was not queued in queue %queue. Payload was %payload.', [
...
+          '%payload' => json_encode($job->getPayload()),
...
+        $this->loggerFactory->get('advanced_queue')->notice('Overwriting job %id of type %type in queue %queue. Payload was %payload.', [
...
+          '%payload' => json_encode($job->getPayload()),

moreover payload serialized as json other places so it more confusing

Log in or register to post comments

Comment #65

andypost

he/him

Russian

commented 16 August 2020 at 21:51

Log in or register to post comments

Comment #66

johnpitcairn commented 25 June 2021 at 06:10

Patch at #58 no longer applies to latest dev (or rc2).

Log in or register to post comments

Comment #67

English

commented 25 June 2021 at 07:54

Issue tags:

+Needs reroll

Log in or register to post comments

Comment #68

English

Stroud, UK

commented 3 August 2021 at 20:58

  /**
   * Determines whether the give job type requires support for unique jobs.
   *
   * @param string $job_type_id
   *   The job type ID.
   *
   * @return bool
   *   TRUE if the job type requires a queue backend that supports unique jobs;
   *   FALSE if support for unique jobs is not necessary.
   */
  public function jobTypeRequiresUniqueness($job_type_id) {
    $job_type_definition = $this->getDefinitions()[$job_type_id];
    return $job_type_definition['ensure_unique'] != AdvancedQueueJobType::UNIQUE_DONT_OVERWRITE;
  }

This should be return $job_type_definition['ensure_unique'] !== AdvancedQueueJobType::UNIQUE_DUPLICATES;
Otherwise the UNIQUE_DONT_OVERWRITE option gets ignored and duplicates are allowed.

Log in or register to post comments

Comment #69

ankithashetty

Karnataka, India

commented 4 August 2021 at 13:57

Issue tags:

-Needs reroll

Status	File	Size
new	2918866-69.patch	30.65 KB
new	reroll_diff_2918866_58-69.txt	7.02 KB

Re-rolled the patch in #58, made changes requested in #68 and fixed indentation issues in couple of files.

Retaining status as "Needs work" to address #62, thank you!

Log in or register to post comments

Comment #70

freality commented 9 August 2021 at 14:20

This current solution has the uniqueness hash being created within \Drupal\advancedqueue\Job. Which means a JobType has no influence on its hash criteria. It would be great if hash creation was delegated to JobType; like JobTypeInterface::createHash(Job $job): string;.

Furthermore, all job definitions should have a hash value; with JobTypeBase defining a default implementation [which could simply returns a uuid, db serial, some random string val]. There'd be no need for any logic to determine if a job requires uniqueness.

Log in or register to post comments

Comment #71

chi commented 12 May 2022 at 14:26

A couple of hypotetical issues related to creating hash from the payload.

1. What if payload format is changed during deployment. The payload may still have same meaning for consumers but the hash will be different.

2. What if payload contains some volatile data, i.e. timestamps. The consumers could ingore redundant data but to ensure jub uniqueness they must be removed manually.

Log in or register to post comments

Comment #72

joachim commented 12 May 2022 at 15:14

I think both those cases can be dealt with by overriding createJobHash():

1. Write compatibility code in createJobHash() which ensures the hashes are made in the same way. Admittedly, that does mean you're forever using the OLD format. But changing payload format is a deployment event which arguably requires the queue to be emptied completely and repopulated, because the data is changing fundamentally.

2. Filter out the volatile data in createJobHash()

Log in or register to post comments

Comment #73

chi commented 12 May 2022 at 16:26

Another relevant issue is ability to remove existing jobs. The typicall use case is like follows. A a job has been scheduled to notify users about some event, i.e. a new post on site. The related entity has been removed and you need to cancel the job before emails go out.

Log in or register to post comments

Comment #74

jsacksick commented 7 September 2022 at 07:11

Issue tags:

+Release blocker, +Prague2022

Log in or register to post comments

Comment #75

English

Stroud, UK

commented 7 September 2022 at 09:59

Another relevant issue is ability to remove existing jobs. The typicall use case is like follows. A a job has been scheduled to notify users about some event, i.e. a new post on site. The related entity has been removed and you need to cancel the job before emails go out.

I think this has to be out of scope here. Jobs should check they are still relevant at the point when they are executed, if there is a chance they are not. It's true you could try to (ab)use the job hash mechanism to do something like this, but I don't think we should burden ourselves with that as a consideration here. This feature is proving difficult to land even as it is.

Log in or register to post comments

Comment #76

zengenuity commented 9 September 2022 at 14:51

Status	File	Size
new	2918866-76.patch	30.27 KB

Re-roll of the patch from #69.

Log in or register to post comments

Comment #77

rinasek commented 12 September 2022 at 10:04

Issue tags:

-Prague2022

+KickstartPrague2022

Log in or register to post comments

Comment #78

English

commented 4 November 2022 at 10:56

Issue tags:

-Release blocker

This is a really nice to have, but the module is already in use in 4K sites, and doesn't have this feature yet. We're in Release Candidate status for 2 years already. Removing the 'Release blocker' tag, but would be good if this gets to RTBC soon(ish).

It can always be added to 1.1-beta1.

Log in or register to post comments

Comment #79

crzdev commented 1 December 2022 at 14:21

Status	File	Size
new	2918866-79.patch	30.35 KB

Latest patch #76 unable to apply into 1.0.0-rc6 (or 8.x-1.x), adding reroll

Log in or register to post comments

Comment #80

20 January 2023 at 13:17

SmovS made their first commit to this issue’s fork.

Log in or register to post comments

Comment #81

smovs commented 20 January 2023 at 14:54

Status	File	Size
new	2918866-80.patch	30.65 KB

Updated patch #79 for applying to module ver. 8.x-1.0-rc7

Log in or register to post comments

Comment #82

20 January 2023 at 15:03

SmovS opened merge request !8

Log in or register to post comments

Comment #83

20 January 2023 at 15:16

SmovS closed merge request !8

Log in or register to post comments

Comment #84

smovs commented 20 January 2023 at 15:21

Status	File	Size
new	2918866-84.patch	29.62 KB

Updated patch #79 for applying to module ver. 8.x-1.0-rc7 and removed logging in src/Plugin/AdvancedQueue/Backend/Database.php.
This logging adds a lot of logs while deploying and that is hard for maintenance.

Log in or register to post comments

Comment #85

20 January 2023 at 15:54

SmovS opened merge request !9

Log in or register to post comments

Comment #86

English

Stroud, UK

commented 22 January 2023 at 11:40

@SmovS please post an interdiff when posting patches, or please make a commit to an existing MR. If you do this, we can see the proposed change to the patch; in this case it would make it easier to see the logging code you have removed.

+++ b/src/Plugin/AdvancedQueue/Backend/Database.php
@@ -2,22 +2,32 @@
+use Drupal\Core\Logger\LoggerChannelFactoryInterface;

@@ -26,6 +36,13 @@ class Database extends BackendBase implements SupportsDeletingJobsInterface, Sup
+  /**
+   * The logger factory.
+   *
+   * @var \Drupal\Core\Logger\LoggerChannelFactoryInterface
+   */
+  protected $loggerFactory;
+

@@ -37,13 +54,18 @@ class Database extends BackendBase implements SupportsDeletingJobsInterface, Sup
+   * @param \Drupal\Core\Logger\LoggerChannelFactoryInterface $logger_factory
+   *   The logger factory.
...
+    $this->loggerFactory = $logger_factory;

@@ -55,7 +77,9 @@ class Database extends BackendBase implements SupportsDeletingJobsInterface, Sup
+      $container->get('logger.factory')

All this logger code can be removed too as nothing is using $this->logger AFAICS.

Log in or register to post comments

Comment #87

borisson_

Dutch

Mechelen, 🇧🇪

commented 8 November 2023 at 10:14

Added a review to the MR

Log in or register to post comments

Comment #88

emil stoianov commented 3 January 2024 at 14:13

Status	File	Size
new	2918866-88.patch	39.79 KB

Latest changes by @smovs but as a patch file for those who need it.

Log in or register to post comments

Comment #89

joachim commented 8 January 2024 at 15:26

> public $ensure_unique = self::UNIQUE_OVERWRITE;

I think the default value for this should be UNIQUE_DUPLICATES, otherwise behaviour will be changed for existing job type plugins.

EDIT: also, the constant names are confusing!

I thought UNIQUE_DONT_OVERWRITE = 'leave' meant 'make a new job'. But the comment for it says 'Do not enqueue the new job'.

Log in or register to post comments

Comment #90

joachim commented 8 January 2024 at 16:44

Unit tests are currently broken because the MR introduces a dependency on the kernel to the Job class:

  public function toArray() {
SNIP
      'unique_id' => $this->getUniqueId(),
  }

// and then:

  public function getUniqueId() {
    if (!\Drupal::service('plugin.manager.advancedqueue_job_type')->jobTypeRequiresUniqueness($this->type)) {
SNIP
  }

Not sure how best to resolve this. Though I am not really keen on toArray() calling getUniqueId() to get the value rather than just returning what is set in the class. The order of precedence in getUniqueId / setUniqueId / constructor for handling $this->uniqueId doesn't seem right to me.

Log in or register to post comments

Comment #91

joachim commented 8 January 2024 at 16:51

Another issue: we're using the phrase 'unique ID' in comments and method names and var names.

But there is *already* a unique ID -- the Job ID $job->id which is a unique ID. And it's fairly common in core to casually prefix the word 'ID' with 'unique'.

So I think it's going to be really confusing to differentiate between 'the job ID, which is a unique ID', and the 'unique ID which is a hash to check for uniqueness'.

Log in or register to post comments

Comment #92

joachim commented 8 January 2024 at 17:13

I'd really like to rename the constants for these, as they're confusing:

  /**
   * Allow duplicate jobs.
   */
  const UNIQUE_DUPLICATES = 'allow'; // Doesn't say what happens to duplicates!

  /**
   * Do not enqueue the new job.
   */
  const UNIQUE_DONT_OVERWRITE = 'leave'; // 'don't overwrite' could mean the duplicate is written to a new job, which is the 'allow' case!

  /**
   * Overwrite existing job.
   */
  const UNIQUE_OVERWRITE = 'overwrite';

Something like:

  const UNIQUE_ALLOW_DUPLICATES = 'allow';
  const UNIQUE_SKIP_DUPLICATES = 'leave';
  const UNIQUE_OVERWRITE_DUPLICATES = 'overwrite';

Furthermore, what about making them an Enum now that we have those in PHP?

Log in or register to post comments

Comment #93

joachim commented 8 January 2024 at 17:22

I'm also a bit confused about the need for BackendBase::ensureUniqueJobsSupport(). This says:

   * Backends that do not support unique jobs should call this from
   * enqueueJobs() to ensure that jobs that require uniqueness are not queued.

But the Queue class's enqueueJob/s() methods already have a check.

And the module's README has this code example showing that jobs should be queued by calling these:

$queue = Queue::load('default');
$queue->enqueueJob($job);

However! Kernel tests, both in this MR and the main branch, repeatedly do this:

    $this->firstQueue->getBackend()->enqueueJobs([$first_job, $third_job]);

Again, not sure how best to resolve it as it's a bit of a confusing situation.

Log in or register to post comments

Comment #94

English

commented 23 January 2024 at 17:01

Status:	Needs work	» Needs review
Issue tags:	-KickstartPrague2022	+Needs issue summary update

Everyone.. I truly dislike issues going over 100 comments, and this one is really in danger of that.. Can I ask those that are interested in this to review and test this, please??

@joachim Can you please update the issue summary to explain what is the current status of this, and if there's still any blocker / caveat that would prevent this to be marked RTBC by someone else?

Setting this to "Needs review" as I'm not sure the "Needs work" that was added back in #69 is still applicable.

Log in or register to post comments

Comment #95

stmh commented 27 January 2024 at 11:48

Status:	Needs review	» Reviewed & tested by the community
Issue tags:		+HamburgContributionWeekend2024, +ContributionWeekend2024

I reviewed the code and have no objections in merging it back. I also tested the new functionality in a project where I need the functionality and it works flawlessly.

Mark as RTBC

Log in or register to post comments

Comment #96

joachim commented 27 January 2024 at 11:55

I think this needs work, as per my various comments -- see #93 and #94.

Log in or register to post comments

Comment #97

stmh commented 27 January 2024 at 12:17

Regarding your comments:

I think its safe to remove the comment, as enqueueJob and enqueueJobs will throw an exception if you try to add a JobItem that needs uniqueness and the queue does not support it.

Regarding the second comment. The Queue itself provides two function enqueueJob and enqueueJobs to add jobs, which looks fine to me.

Log in or register to post comments

Comment #98

joachim commented 27 January 2024 at 15:02

Oh and #92 as well.

Log in or register to post comments

Comment #99

stmh commented 27 January 2024 at 15:12

@joachim Oh yes, let's do this. I had my problems understanding them as well.

Log in or register to post comments

Comment #100

English

commented 29 January 2024 at 09:24

Status:

Reviewed & tested by the community

» Needs work

OK. Then we need to work on #92 and removing the first comment of #93?

Log in or register to post comments

Comment #101

29 January 2024 at 10:52

stmh changed the visibility of the branch advancedqueue-2918866-2918866-add-support-for to hidden.

Log in or register to post comments

Comment #102

stmh commented 29 January 2024 at 11:03

Worked on #92 and #93, see my latest commits.

Log in or register to post comments

Comment #103

joachim commented 6 February 2024 at 17:05

Pushed a bunch of fixes. Tests now pass.

Still to do:

1. My comment in #93. I'm not sure how best to resolve this and it would be useful to have input from the original author of that code. But it looks like the checking in Queue was added in #48 back in 2019, and the checking in BackendBase even before that, so I don't know if those devs are even still around. Given that, I'd be inclined to remove the checks in Queue and let backends use the ensureUniqueJobsSupport() helper.

2. getDuplicateJobId could probably be made protected. That would make SupportsUniqueJobsInterface an empty interface, but there's already precedent for that in the backend interfaces.

3. I REALLY want to rename the DB column we're adding from unique_id to something like job_hash. It's going to be a source of confusion and bugs with the current name. I appreciate that sites are already using this patch.... but that's a risk you take when using a patch.

4. The MR makes changes to existing hook_update_N() implementations, which is out of scope AND generally not a good idea:

-  $schema = Database::getConnection()->schema();
-  $schema->addIndex('advancedqueue', 'queue_state', ['state'], $spec['advancedqueue']);
+  $database_schema = \Drupal::database()->schema();
+  $database_schema->addIndex('advancedqueue', 'queue_state', ['state'], $spec['advancedqueue']);

These were first introduced in #51 -- and later patches have added the same changes to newer implementations that were added to the module in the meantime!

Log in or register to post comments

Comment #104

bojanz commented 26 February 2024 at 12:16

3. I REALLY want to rename the DB column we're adding from unique_id to something like job_hash. It's going to be a source of confusion and bugs with the current name. I appreciate that sites are already using this patch.... but that's a risk you take when using a patch.

FWIW I've been using "fingerprint" as the column name for this purpose in a few other places. It is fairly common in various libraries.

I do agree that "job_hash" sounds better than "unique_id" at least. Every ID is unique so the column name sounds confusing.

Log in or register to post comments

Comment #105

English

commented 26 February 2024 at 14:11

Just wanted to comment that I think this should not be committed to the module before we release 1.0, as adding new features is somewhat against what we should be doing in "rc" status.

Then again, we should probably release 1.0 ASAP unless someone has any objections, and if there is any, please comment on #3220292: Roadmap to 1.0.

Log in or register to post comments

Comment #106

English

commented 26 February 2024 at 22:00

9 files were hidden/shown/deleted

Status	File	Size
hidden	2918866-58.patch	31.69 KB
hidden	interdiff-2918866-56-58.txt	684 bytes
hidden	2918866-69.patch	30.65 KB
hidden	reroll_diff_2918866_58-69.txt	7.02 KB
hidden	2918866-76.patch	30.27 KB
hidden	2918866-79.patch	30.35 KB
hidden	2918866-80.patch	30.65 KB
hidden	2918866-84.patch	29.62 KB
hidden	2918866-88.patch	39.79 KB

1.0 has now been released. This can easily go into 1.1, we just need to get this in RTBC.

Log in or register to post comments

Comment #107

emil stoianov commented 5 March 2024 at 14:27

Status	File	Size
new	2918866-107.patch	88.14 KB

The patch with latest updates for those who are not comfortable downloading merge requests.

Log in or register to post comments

Comment #108

English

Stroud, UK

commented 5 March 2024 at 18:53

I think part of the problem we're having with naming things and figuring out how this should be organised is that the semantics of 'unique' are unhelpful.

Let's discard the whole term "unique" and focus on the language of duplicates and how we handle duplicates: we need to detect duplicates (by hash), and then manage them.

Concretely this leads me to the following suggestions:
instead of a single Drupal\advancedqueue\Plugin\AdvancedQueue\Backend\SupportsUniqueJobsInterface we have:
Drupal\advancedqueue\Plugin\AdvancedQueue\Backend\SupportsDetectingDuplicateJobsInterface

Our annotations become:

/**
   * Allow duplicate jobs works with any backend.
   */
  public const DUPLICATES_ALLOW = 'allow';

  /**
   * Do not enqueue duplicate jobs. Requires a backend implementing SupportsDetectingDuplicateJobsInterface.
   */
  public const DUPLICATES_SKIP = 'skip';

  /**
   * Merge with existing job. Requires a backend implementing SupportsDetectingDuplicateJobsInterface and SupportsDeletingJobsInterface.
   */
  public const DUPLICATES_MERGE = 'merge';

Notice that DUPLICATES_MERGE I've said requires SupportsDeletingJobsInterface instead of some new SupportsUpdatingJobsInterface - this is because the simplest possible way of updating a job is to delete the old one and insert a new one.

I'm saying 'merge' here instead of the previous 'overwrite' because the basic idea in order to avoid committing us to what the payload should be of the final updated job; in this first implementation we should just overwrite the old job with the new one, but in future we might allow the job type plugin to have an opinion about how the payload of old and new job are combined.

Log in or register to post comments

Comment #109

English

Stroud, UK

commented 5 March 2024 at 19:24

I agree that BackendBase::ensureUniqueJobsSupport() does not seem necessary. It's unused in the Database backend which is the only one we offer.

I agree SupportsUniqueJobsInterface doesn't need a public getDuplicateJobId() method.

I dislike the jobTypeRequiresUniqueness() and getJobTypeUniquenessType() methods on the JobTypeManager. I understand the logic for wanting to check this without instantiating a plugin, but the semantics seems very ugly. I think it would be nicer to have this as just a single method on the Job class: Job::getDuplicatesPolicy() which itself calls out to JobTypeManager::getDefinition().

I agree $this->getBackend()->supportsJobType($job_type); is a nicer semantic than trying to figure this out on the Queue entity or in the JobTypeManager. Actually if we do this, then strictly speaking we don't even need the SupportsUnique/Duplicates interface at all.

Log in or register to post comments

Comment #110

5 March 2024 at 22:13

jonathanshaw opened merge request !11

Log in or register to post comments

Comment #111

English

Stroud, UK

commented 5 March 2024 at 22:29

I've opened a new MR as I'm making a significant number of changes here:
- using 'duplicates' as the key term instead of 'unique'
- introducing the term 'duplicates policy' to describe the duplicates handling strategy specified on a job type.
- using 'fingerprint' instead of 'unique id' or 'job hash'
- removing jobTypeRequiresUniqueness() and getJobTypeUniquenessType() methods from the JobTypeManager, in favour of new method Job::getDuplicatesPolicy()
- removing SupportsUniqueJobsInterface altogether, adding no new interface
- adding a verifyJobSupport() method to BackendInterface and BackendBase.

However, doing this I've realised that a lot of the reason why this issues is getting tricky goes back to #20:

Backends may or may not support unique jobs.

However, silently ignoring jobs that should be unique is dangerous, ... Commerce Recurring's jobs absolutely MUST NOT be used with a backend that does not support unique jobs, and therefore, I think we should guard against this here.

So therefore, I want to introduce a concept of a job type *requiring* that its jobs be treated as unique. This can be declared on the job type plugin annotation.

This goal makes sense. However, it is not really acheivable for the reasons pointed to in #93. It is perfectly possible and reasonable for custom code using this module to enqueue jobs with:
$queue->getBackend()->enqueueJobs([$jobs]);
This means that calling code has direct access to the backend plugins. We can't stop calling code from passing jobs to backend plugins that can't properly handle them, and we can't force custom or contrib backend plugins to verify they can handle a job properly.

Everything we are doing with SupportsUniqueJobsInterface, Queue::verifyQueueSatisfiesJob(), BackendInterface::verifyJobSupport(), JobTypeManager::jobTypeRequiresUniqueness() etc is basically wishful thinking. We are trying to create protective mechanisms but we can't ensure that anyone uses them.

I think at this point we should:
- rip out these mechanisms (which in MR11 are basically simplified to BackendInterface::verifyJobSupport())
- create a new issue for discussing and working on this rather tricky problem, which may apply to other future features not just this duplicates feature
- document on the annotation type that duplicates handling may not be safe when using backend other than the Database backend.

Log in or register to post comments

Comment #112

joachim commented 5 March 2024 at 22:40

> Let's discard the whole term "unique" and focus on the language of duplicates and how we handle duplicates: we need to detect duplicates (by hash), and then manage them.

I sort of see what you're getting at, but the word 'duplicate' carries the same connotations as 'unique' -- ensuring there's only one of something.

> I think it would be nicer to have this as just a single method on the Job class: Job::getDuplicatesPolicy() which itself calls out to JobTypeManager::getDefinition().
> - removing jobTypeRequiresUniqueness() and getJobTypeUniquenessType() methods from the JobTypeManager, in favour of new method Job::getDuplicatesPolicy()

The Job class can't depend on the service container!!!! I *removed* that in my latest commits. It crashes the unit tests otherwise.

> Notice that DUPLICATES_MERGE I've said requires SupportsDeletingJobsInterface instead of some new SupportsUpdatingJobsInterface - this is because the simplest possible way of updating a job is to delete the old one and insert a new one.

Then it's not 'merge', it's 'replace'. Saying 'merge' when it's actually deleting the old one is misleading.

Log in or register to post comments

Comment #113

bojanz commented 6 March 2024 at 09:10

Is there enough value in actually support the "merge" strategy? Or the concept of strategies at all?

In all other implementations I've used, a job/task is either unique/deduplicated (meaning that a job with the same fingerprint is rejected on enqueue) or it isn't.

My assumption without reading the code would be that I do:

// If I don't provide a fingerprint, it is autogenerated for me.
$job = Job::createUnique("my_type", $payload, ['fingerprint' => 'my-unique-value']);

enqueueJobs() can check if $job->getFingerprint() is non-empty, and if the backend does not implement SupportsUniqueJobsInterface, throw an exception.

Log in or register to post comments

Comment #114

English

commented 6 March 2024 at 09:16

I tend to support #113 as well. From the point of view of maintenance, it makes the code a lot simpler, and I think also from the point of view of an API user. As an API user, I'd try to create the job, and if I get the exception, I either propagate the error up, or I do something to make the job unique and try again. Normally, the API user is the one that would better understand the consequences of trying to create duplicate jobs.

Log in or register to post comments

Comment #115

English

Stroud, UK

commented 6 March 2024 at 18:41

Status:

Needs work

» Needs review

Thanks to everyone for the feedback.

What I've done is:
- simplified the job type annotation to a boolean
- given the job type plugin responsibility for deciding how duplicates should be handled
- made the base job type plugin's implementation simply throw an exception if there's a duplicate
- given the Queue entity, not the Database backend, responsibility for looking for duplicates.

This means that all of the duplicates functionality added in this issue is completely bypassed if you call $queue->getBackend()->enqueue(). But if you call $queue->enqueue() you get safe behavior whatever backend you use.

I believe in a follow-up issue we should deprecate Queue::getBackend(). This function creates a leaky abstraction of sorts; by putting API users in direct contact with the backend, we lose any ability to intermediate. Instead API users should interact with the queue entity, and we can use that to provide safety and backwards compatability as we add new features like this in future.

Log in or register to post comments

Comment #116

English

Stroud, UK

commented 7 March 2024 at 11:23

I like where this has got too. It seems clean and simple, but flexible enough to accomodate any use case.

Log in or register to post comments

Comment #117

English

Stroud, UK

commented 28 March 2024 at 12:13

Green at last.

Log in or register to post comments

Comment #118

adamfranco commented 5 April 2024 at 19:30

Status	File	Size
new	advancedqueue-2918866-118-variable-fix.patch	777 bytes

@jonathanshaw, there's a typo in Merge request 11 on lines 132-133 of advancedqueue.install. The variable initialized is $schema but undefined $database_schema name is used. See attached patch.

Log in or register to post comments

Comment #119

English

Stroud, UK

commented 6 April 2024 at 21:10

Thanks @adamfranco, I fixed the variable name in the update hook.

Log in or register to post comments

Comment #120

astonvictor commented 8 May 2024 at 10:27

Status:

Needs review

» Needs work

Hi @jonathanjfshaw,

thanks for your work and your MR.

I tested your changes by:
- creating a new AdvancedQueueJobType plugin with allow_duplicates = false configuration;
- added a code to create a new job after submitting a custom form.

as a result after creating the second job with the same value, I got a fatal error - Drupal\advancedqueue\Exception\DuplicateJobException: Job rejected because it duplicates existing jobs.

I guess, we should simply not add the second job to the queue and add a log with a notice.

FYI tested with drupal 10.1.5, PHP 8.1

What do you think?

Log in or register to post comments

Comment #121

English

Stroud, UK

commented 9 May 2024 at 18:02

Status:

Needs work

» Needs review

I got a fatal error - Drupal\advancedqueue\Exception\DuplicateJobException: Job rejected because it duplicates existing jobs.

I guess, we should simply not add the second job to the queue and add a log with a notice.

Yes, that behavior is by design. This strategy was chosen by the maintainer in #114:

As an API user, I'd try to create the job, and if I get the exception, I either propagate the error up, or I do something to make the job unique and try again. Normally, the API user is the one that would better understand the consequences of trying to create duplicate jobs.

Log in or register to post comments

Comment #122

astonvictor commented 10 May 2024 at 05:10

Status:

Needs review

» Reviewed & tested by the community

yeah, it works for me with try-catch.

I guess it should be somehow documented on the module page / readme.

Log in or register to post comments

Comment #123

28 May 2024 at 14:00

tBKoT made their first commit to this issue’s fork.

Log in or register to post comments

Comment #124

tbkot commented 28 May 2024 at 15:59

Status	File	Size
new	2918866-124.patch	28.32 KB

Patch with the latest changes from MR to be used in #2931290: Multiple recurring orders get created if cron is badly configured

Log in or register to post comments

Comment #125

tbkot commented 28 May 2024 at 20:06

Status	File	Size
new	2918866-125.patch	28.37 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	2918866-124.patch	28.32 KB

Updated patch for 1.0 version instead of 1.x-dev branch

Log in or register to post comments

Comment #126

2 June 2024 at 16:33

jcnventura closed merge request !9

Log in or register to post comments

Comment #127

2 June 2024 at 16:55

jcnventura committed 20d1bb3e on 8.x-1.x

Issue #2918866 by jonathanshaw, joachim, SmovS, brunodbo, Evaldas...

Log in or register to post comments

Comment #128

2 June 2024 at 16:56

jcnventura closed merge request !11

Log in or register to post comments

Comment #129

English

commented 2 June 2024 at 16:58

Priority:	Normal	» Major
Status:	Reviewed & tested by the community	» Fixed
Issue tags:	-Needs issue summary update	+Needs change record

Thanks everyone! And there were a lot of you. As to how to document, this.. I guess it will be part of the release notes when the next release is created. If someone would be kind enough to create a change record, that would be amazing.

Log in or register to post comments

Comment #130

16 June 2024 at 16:59

Status:

Fixed

» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Log in or register to post comments

Comment #131

khiminrm commented 9 July 2024 at 09:11

Status	File	Size
new	2918866-131.patch	27.2 KB
new	interdiff_125_131.txt	630 bytes

Hi!

When testing the patch, I've noticed that it's not always working properly.

I will try to explain. I've tested it with commerce_recurring.
The job payload is simple: ['order_id' => $order->id()]
So I've run the 'Commerce Recurring' cron job to process a recurring order. The job has created two queue items:
with types commerce_recurring_order_close and commerce_recurring_order_renew with payload {"order_id":"230185"}.
Then I've run the 'Advanced Queue' cron job. It has processed the queue items. One of them was failed due to another issue not related to this one. And another one was successfully processed. So I had two queue items with states failure and

success</code at those moment.
But after running the  'Commerce Recurring' cron job again, two new queue items with the same payload and fingerprint were created. So the items were duplicated.
And if ran the the  'Commerce Recurring' cron job again without running the the 'Advanced Queue' cron job before it, the duplicates are not created, because the items had state <code>queued

and they were found as duplicated.

So, I think we should remove condition where we check only two states in the getDuplicateJobs() method:
->condition('state', [Job::STATE_QUEUED, Job::STATE_PROCESSING, Job::STATE_FAILURE], 'IN');

Please, review and re-test if possible. I'm attaching the updated patch from #125 for 1.0 and interdiff.

Thanks in advance!

Log in or register to post comments

Comment #132

jsacksick commented 9 July 2024 at 09:26

Hm... Perhaps in some cases we still want to allow queuing jobs with the same fingerprint?
One usecase I could think of is a job that notifies a 3rd party system after an entity was updated, you probably wouldn't to notify twice for the same update, but would still want to notify about subsequent updates but I'm guessing in this case the payload should contain what's supposedly unique about the update i.e. the changed timestamp for example in addition to the entity ID?

Not sure, but in the case of these recurring jobs, we never want to queue duplicate jobs...

Log in or register to post comments

Comment #133

khiminrm commented 9 July 2024 at 10:30

Created new issue https://www.drupal.org/project/advancedqueue/issues/3460188

Log in or register to post comments

Comment #134

bojanz commented 9 July 2024 at 14:43

By definition a fingerprint must be unique. Therefore it should be impossible to queue two jobs with the same fingerprint. Producers who wish to run a job after all can influence the fingerprint by including an always-unique component such as a timestamp.

Log in or register to post comments

Comment #135

jsacksick commented 9 July 2024 at 14:48

@bojanz: Agreed, wanted to come back and edit my message, but Roman posted a patch in that sense, a fingerprint should be unique and it'd be the responsability of the job type to make sure it is for usecases I mentioned in #132.

Log in or register to post comments

Comment #136

zaporylie

he/him

Polish

commented 15 August 2024 at 12:06

Draft Change Record was created: https://www.drupal.org/node/3468315. Feel free to adjust and publish before making a tagged release that includes this feature. Thanks, everyone for a great job on this, hope to see it as part of the tagged release in a very short time 😉

Log in or register to post comments

Comment #137