Queue system createItem() should return ID, and $item properties should be documented [#1928056]

Comment	File	Size	Author
#36	core-1928056-D7-36.patch	3.52 KB	ntsekov

#32	core-1928056-D7-32.patch	3.37 KB	ShaxA

#31	1928056-D7-31.patch	3.36 KB	mgifford

#29	1928056-D7-29.patch	2.61 KB	mgifford

#24	1928056-24-D7-do-not-test.patch	2.6 KB	pwolanin

#23	1928056-23-D8.patch	2.95 KB	pwolanin

#4	1928056-4-D7-do-not-test.patch	1.86 KB	pwolanin

#3	1928056-3-D8.patch	2.2 KB	pwolanin

#2	1928056-2-D8.patch	2.2 KB	pwolanin

Comment #1

msonnabaum CreditAttribution: msonnabaum commented 26 February 2013 at 15:36

This seems pretty reasonable to me. Interested to hear what others think.

Log in or register to post comments

Comment #2

pwolanin CreditAttribution: pwolanin commented 26 February 2013 at 16:12

Status:

Active

» Needs review

File	Size
1928056-2-D8.patch	2.2 KB

turns out the memory queue didn't even respect the existing interface definition.

Minimal changes (2 LOC) plus code comments.

Log in or register to post comments

Comment #3

pwolanin CreditAttribution: pwolanin commented 26 February 2013 at 16:14

File	Size
1928056-3-D8.patch	2.2 KB

typo fix.

Log in or register to post comments

Comment #4

pwolanin CreditAttribution: pwolanin commented 26 February 2013 at 16:24

File	Size
1928056-4-D7-do-not-test.patch	1.86 KB

D7 backport.

Log in or register to post comments

Comment #5

neclimdul

He/Him

CreditAttribution: neclimdul commented 26 February 2013 at 16:48

Looking at the original issue the return value wasn't really discussed past not returning the item so you couldn't try to do work on the item in the queue. I don't see a problem with returning the ID because it still matches the intent of the interface.

As far as backporting, I assume no queue system would return 0 or an empty string or something as a valid identifier so it shouldn't break any existing code creating items, however there are existing implementations outside of core and this would be an API change. As long as d7 maintainers are aware and OK with this seems fine.

Log in or register to post comments

Comment #6

chx CreditAttribution: chx commented 26 February 2013 at 17:08

Status:

Needs review

» Reviewed & tested by the community

Go for it.

Log in or register to post comments

Comment #7

tsphethean CreditAttribution: tsphethean commented 26 February 2013 at 17:39

Status:

Reviewed & tested by the community

» Needs review

Not sure I understand why this would be done as a modification to Queue API. Queues with a STOMP interface (i.e. AcitveMQ, RabbitMQ etc) will not return an ID AFAIK. Do we also propose a means of retrieving an item by its ID (not a queue)?

Feels like something which would live in contrib as a wrapper or extension to the createItem() method in case it is supported. If something is being put on a queue, then the ID of the queue item is irrelevant. The ID of the "thing" going on to the queue is important (I often use the id of the data item to form a queue specific lock to cover race conditions etc)

+1 on more documentation though :-)

Log in or register to post comments

Comment #8

pwolanin CreditAttribution: pwolanin commented 26 February 2013 at 18:05

I didn't propose retrieving an item by ID, since that clearly breaks the queue paradigm.

Is there a Drupal implementation of this interface on top of a STOMP interface? Looking at the ActiveMQ docs quicks, I see reference to a message ID, but looking at the Drupal RabbitMQ queue implementation, I don't see that it could get an ID back.

In any case, my goal here was to make it easy to use the unique ID that's typically generated by back-ends instead of generating and adding an additional ID to the data itself. An interface to a back-end that doesn't return an ID could create an ID via uniqid() or a uuid call, though I'd agree that's not ideal either.

Log in or register to post comments

Comment #9

tsphethean CreditAttribution: tsphethean commented 26 February 2013 at 18:54

I created a Stomp module, www.drupal.org/project/stomp but have only implemented on activemq and it's only a youngish module so not perfect but so far as I can see the php stomp library has no way to return an id.

My point regarding retrieve by item id was really to illustrate that it seemed off for queue API to return an id when the API has no use for it. That was why it felt more like a contrib use case

The point of the queue should be fire an forget in most cases shouldn't it?

Log in or register to post comments

Comment #10

pwolanin CreditAttribution: pwolanin commented 26 February 2013 at 19:32

Well, the APi is a little wonky in that e.g. fore releaseItem() or deleteItem() you have to pass the object back with the IS property unmolested or the API doesn't work.

As above, if you need to insure that a piece of work is eventually done, it's not necessarily "fire and forget". As I said, I was looking at this as a way to make efficient use of the IDs being generated. Plus some queues (mysql, redis, etc) do potentially offer a way to retrieve by ID (not part of the official API), so one can imagine a use case.

Log in or register to post comments

Comment #11

neclimdul

He/Him

CreditAttribution: neclimdul commented 26 February 2013 at 19:40

Yeah, I was wondering about this too. The point would be in cases where the queue allows introspection to allow a UI track the state of completion. So it wouldn't be fetching specific items to work on but to check that jobs had been completed or they are still waiting to be run. This is the sort of thing you'd want when you're batching a process and have a user that would like to see its state of completion.

If we can't support this, there is the possibility that such a workflow could be handled with a database and having the worker code report back things like "I've been claimed and am running" "i finished" to a shared database or something.

Going to continue thinking on this, thanks for catching the STOMP issue tsphethean.

Log in or register to post comments

Comment #12

msonnabaum CreditAttribution: msonnabaum commented 26 February 2013 at 19:42

Compatibility with STOMP is worth discussing, since that's more geared towards message queues than job queues.

I'm of the opinion that the core queue api should focus more on being compatible with popular job queues, beanstalkd being the least common denominator in terms of introspect-ability.

That said, even though we dont have any methods that deal with ids, we do have methods that need something like an id to function (deleteItem, releaseItem), they just take the entire queue item and its expected to have an id.

Maybe instead of returning the id we should return the whole queue item?

Log in or register to post comments

Comment #13

pwolanin CreditAttribution: pwolanin commented 26 February 2013 at 19:53

in #5 neclimdul indicates that not returning the whole item was a deliberate design decision.

re: methods that taking an item expecting an ID, I think think that's why documenting the presence of that property is useful, though clearly a back-end might not use the ID?

Log in or register to post comments

Comment #14

tsphethean CreditAttribution: tsphethean commented 26 February 2013 at 20:32

Agh wrote a long post on the train then my mobile browser lost it :(

What I think I was saying:

#11 - neclimdul - "to check that jobs had been completed or they are still waiting to be run" isnt this replacing the job of whatever backend was selected? i.e. if we're using ActiveMQ, then AMQ provides all the interfaces for inspecting queue progress, jobs left to run, etc. I imagine Redis and Beanstalkd provide the same? The Queue UI module aims to do this with Drupal DB queues, and future versions are planned to allow deeper introspection into the queue. I'd be nervous of making provision for functionality which should lie outside of Drupal.

I can see the value re: batching, but still not sure this is Drupal's responsibility.

#12 - msonnabaum - "Maybe instead of returning the id we should return the whole queue item?" I like this idea but thinking with a STOMP hat on, I dont think this is feasible, as again the interface just returns success/failure, so to return the item you have to "claim" it and then immediately release it. Would work, but not sure on the performance impact...

In general, I think contructing the queue item object within the Queue interface with some defined required properties such as id, data etc would help with implementing backends. The STOMP module implements (for example):

public function deleteItem($item) {
  $this->stomp->ack($item->headers['message-id']);
}

because on claimItem() the whole object returned by the STOMP lib is assigned to $item (clunky but...). Better object construction would make objects more consistent and easier to work with. Unfortunately you can only get at this message-id when reading a frame from the stomp queue.

Log in or register to post comments

Comment #15

tsphethean CreditAttribution: tsphethean commented 26 February 2013 at 20:35

Thinking about the comment in #12 of

more geared towards message queues than job queues

makes me wonder whether we need an alternative (extended?) messagequeue interface which might get us round some of these objections? Job queues will be more geared around doing things to specific items, whilst message queues just see another message. Might give us a reason to work in #1913504: Queue API should support item prioritisation since prioritisation is more message-queue-esque too :-)

Log in or register to post comments

Comment #16

neclimdul

He/Him

CreditAttribution: neclimdul commented 26 February 2013 at 20:38

Yes it was a deliberate decision. I believe it was dww that insisted. reference: http://drupal.org/node/391340#comment-1552618 As he put it in the linked comment,

we don't return the $item in createItem() since you shouldn't do anything once it's in the queue unless you claim it first.

I believe you're right pwolanin. The metadata of the item object could be dependent on the implementation so it wasn't documented further. Its expected that what ever claimItem returned would be enough to satisfy the needs of delete and release what ever those needs might be.

Programming for the least common denominator has always been a bit of a hurdle for the QueueAPI though and made it sometimes under featured so I'm by no means suggesting we stop the discussion. I'm still kinda thinking about this.

Log in or register to post comments

Comment #17

neclimdul

He/Him

CreditAttribution: neclimdul commented 26 February 2013 at 20:41

@tsphethean yeah, not the entire queue. Say I'm provisioning a site and that has 5 tasks, each of which are fired into a queue to be handled by an external process. I've got a graphic that shows the state of those tasks so I can see when my site will be ready. Inspecting the entire queue will not help with this but some queues give the ability to inspect the status of the individual job so would could support this interface. I hope that's a more clear explanation.

Log in or register to post comments

Comment #18

tsphethean CreditAttribution: tsphethean commented 26 February 2013 at 22:07

@neclimdul - I see the value of the use case, just not convinced that's a task for queue api.

It feels more like an extension, or improvement, on batch api. Alternatively you're looking at populating a table (or other store) of tasks, each of which is a task Id, task status etc, and then each item in the queue contains the task id so that as items are processed the status of the task can be updated. Thats not uncommon, and is close to a scenario I've built, but is still a bespoke use of a queue rather than something a generic api should be built around IMHO.

Log in or register to post comments

Comment #19

Nick_vh

he/him

Ghent

CreditAttribution: Nick_vh commented 28 February 2013 at 18:58

So I stumbled upon this exact issue when trying out a queue implementation. There is no point in only returning a bool if there is more information that could be used. Perhaps it could be documented what the pro's and con's are when programming against the Queue API while using this return value.

For me this can go in sooner rather than later.

Log in or register to post comments

Comment #20

tsphethean CreditAttribution: tsphethean commented 28 February 2013 at 20:16

if there is more information that could be used
I still don't think the queue item ID is the thing to be used though....

As I said above, I can understand the scenario where the data in the queue item is something you might want to keep track of (maybe you're processing an order and you want to know when that order is done, or you're publishing a series of nodes, and you want to know what the state of a particular node is) but that all sits outside Queue API.

Absolutely agree that if there is a need for a system of keeping tracking things that are put on to a queue then we can build that and return an id to whatever calls it, but when we can't guarantee that the thing we're going to queue into is going to give us back any kind of ID, let alone a unique one, I think this has potential to be really dangerous.

Log in or register to post comments

Comment #21

msonnabaum CreditAttribution: msonnabaum commented 1 March 2013 at 04:02

Absolutely agree that if there is a need for a system of keeping tracking things that are put on to a queue then we can build that and return an id to whatever calls it, but when we can't guarantee that the thing we're going to queue into is going to give us back any kind of ID, let alone a unique one, I think this has potential to be really dangerous.

That argument doesn't really hold, because we already have deleteItem and releaseItem, and they need *some* kind of id within the queue object. We just don't say where it has to go.

I'm not sure I agree with dww's reasoning for killing that return value in the original patch. You can argue that you shouldn't do anything with the queue item after you create it…but that's kind of obvious? You just passed in the data, the only thing you're getting back is the data you put in plus an id. The only thing we're doing there is explicitly hiding the id from you, even though both beanstalkd and SQS return it. That hardly seems valuable.

Log in or register to post comments

Comment #22

tsphethean CreditAttribution: tsphethean commented 1 March 2013 at 11:01

The difference there though is that before you release the item or delete it, you must have claimed it.

Maybe an alternative would be Drupal generating some kind of unique id which is passed as part of the queue item data to the queue backend, then we can return that? That would then be backend agnostic and still allow what we're trying to do?

Log in or register to post comments

Comment #23

pwolanin CreditAttribution: pwolanin commented 1 March 2013 at 20:19

File	Size
1928056-23-D8.patch	2.95 KB

@tsphethean - my thought was that if the back-end doesnt' generate a unique ID, createItem could still do so. i.e. we make it a clear responsibility of the implementation.

We also noticed that created is not loaded when the item is claimed, so adding that to make the object match the proposed docs.

Log in or register to post comments

Comment #24

pwolanin CreditAttribution: pwolanin commented 1 March 2013 at 20:23

File	Size
1928056-24-D7-do-not-test.patch	2.6 KB

here's a D7 version

Log in or register to post comments

Comment #25

tsphethean CreditAttribution: tsphethean commented 2 March 2013 at 12:17

Ok, so my last thought on this is that we need a decision from @msonnabaum and @chx as queue maintainers about the direction of queue Api.

We need to know if the objective of queue api is to be designed to provide a minimum set of functionality that can then be extended upon In custom class implementations, or whether we're aiming to provide as much functionality as possible in the base classes whilst giving individual implementations the ability to not fully implement functionality if not available to the particular backend or use case.

If the latter, then this issue should go ahead, and I think #1913504 should be reconsidered, as this can be easily ignored by backends which don't support prioritisation, in the same way we're saying we would cover backends that can't return a unique Id.

Otherwise this is a job for contrib...

Log in or register to post comments

Comment #26

chx CreditAttribution: chx commented 2 March 2013 at 15:22

Status:

Needs review

» Reviewed & tested by the community

It's fine for D8, I can't imagine how can you call this backportable but as I said on IRC, we know that certain changes like the hash function change tend to disregard process completely so I will leave the tag on it to avoid a tug-of-war and hope that the committers will be more sane this time.

Log in or register to post comments

Comment #27

Dries CreditAttribution: Dries commented 8 March 2013 at 15:39

Status:

Reviewed & tested by the community

» Fixed

Committed to 8.x.

Log in or register to post comments

Comment #28

pwolanin CreditAttribution: pwolanin commented 8 March 2013 at 19:55

Version:	8.x-dev	» 7.x-dev
Status:	Fixed	» Needs review

for consideration for D7. The changes would not break any existing consumers of the queue, but would suggest non-SQL implementors (like redis, etc) need to be updated to match, and also has the wrinkle that modules that need the (minor) new capabilities would have to check the core or contrib version.

Log in or register to post comments

Comment #29

mgifford

he/him

English

CreditAttribution: mgifford commented 29 June 2014 at 05:21

Issue summary:

View changes

File	Size
1928056-D7-29.patch	2.61 KB

Here's a port to get this moving.

Log in or register to post comments

Comment #30

29 June 2014 at 05:46

Status:

Needs review

» Needs work

The last submitted patch, 29: 1928056-D7-29.patch, failed testing.

Log in or register to post comments

Comment #31

mgifford

he/him

English

CreditAttribution: mgifford commented 29 June 2014 at 12:36

Status:

Needs work

» Needs review

File	Size
1928056-D7-31.patch	3.36 KB

Fixing Fatal error: Call to a member function queryRange().

Log in or register to post comments

Comment #32

ShaxA CreditAttribution: ShaxA at FFW commented 17 June 2016 at 08:38

File	Size
core-1928056-D7-32.patch	3.37 KB

Guys i have refactored the patch to work with 7.44 version.

Log in or register to post comments

Comment #33

mgifford

he/him

English

CreditAttribution: mgifford at OpenConcept Consulting Inc. commented 22 June 2016 at 17:02

@ShaxA what was the change? Can you provide an interdiff?

Log in or register to post comments

Comment #34

neclimdul

He/Him

CreditAttribution: neclimdul commented 23 June 2016 at 00:35

3 files were hidden/shown/deleted

File	Size
1928056-23-D8.patch	2.95 KB

1928056-24-D7-do-not-test.patch	2.6 KB

1928056-D7-29.patch	2.61 KB

Looks like just a straight reroll around conflicts from the query here #2356983: Fix SystemQueue::claimItem() returns items in the wrong order (resolves locale test failures on PostgreSQL) and a default.settings.php change here
#1930960: Block caching disable hardcoded on sites with hook_node_grant() causes serious performance troubles when not necessary. Basically the difference in the patches is just the sort change from #2346983: Path of module settings page.

+++ b/sites/default/default.settings.php
@@ -435,6 +435,16 @@ ini_set('session.cookie_lifetime', 2000000);
 /**
+ * HTTP Request Buffer
+ *
+ * When drupal requests a file from another webserver, it periodically checks
+ * for end-of-file and server timeouts.  The default is to read a maximum of
+ * 1024 bytes between each check.  Increasing this limit may help performance
+ * when reading large files from the network.
+ */
+# $conf['http_request_buffer'] = 1024;
+
+/**

What's this?

Log in or register to post comments

Comment #35

mgifford

he/him

English

CreditAttribution: mgifford at OpenConcept Consulting Inc. commented 26 June 2016 at 22:29

Status:

Needs review

» Needs work

Log in or register to post comments

Comment #36

ntsekov CreditAttribution: ntsekov commented 13 July 2016 at 08:31

File	Size
core-1928056-D7-36.patch	3.52 KB

@ShaxA I have updated your patch to work with 7.50 version of Drupal core

Log in or register to post comments

Comment #37

3 August 2016 at 00:04

Dries committed 7cfb115 on 8.3.x

Issue #1928056 by pwolanin: Queue system createItem() should return ID,...

Log in or register to post comments

Comment #38

3 August 2016 at 00:05

Dries committed 7cfb115 on 8.3.x

Issue #1928056 by pwolanin: Queue system createItem() should return ID,...

Log in or register to post comments

Comment #39

27 January 2017 at 17:53

Dries committed 7cfb115 on 8.4.x

Issue #1928056 by pwolanin: Queue system createItem() should return ID,...

Log in or register to post comments

Comment #40

27 January 2017 at 17:52

Dries committed 7cfb115 on 8.4.x

Issue #1928056 by pwolanin: Queue system createItem() should return ID,...

Log in or register to post comments

Queue system createItem() should return ID, and $item properties should be documented

Comments