Problem/Motivation

Right now there are three separate queues for tasks related to importing, namely:

  • feeds_feed_import
  • feeds_feed_parse
  • feeds_feed_process

This is problematic because that results into cron processing less of the import task as could be. Let me explain that. When queue tasks are performed during cron runs, cron loops through all queues. The order of the queues is determined based on the alphabetical order that the class names appear in. So first 'feeds_feed_parse' (FeedParse), then 'feeds_feed_process' (FeedProcess) and finally 'feeds_feed_import' (FeedRefresh). Each queue is only run once during each cron run. So when an import is queued (put in feeds_feed_import), the following happens during cron:

  1. Queue 'feeds_feed_parse' runs. No items found.
  2. Queue 'feeds_feed_process' runs. No items found.
  3. Queue 'feeds_feed_import' runs. One item found. Performs a fetch operation. Queues a task of type 'feeds_feed_parse'.

And then, on the next cron, finally some items start to get imported, but only a limited number of items if the parser supports importing items in chunks, like the XML parser from Feeds extensible parser does, for example. So if the parser limit is 50, only 50 items will get imported on the second cron run. After parsing and processing a new task of type 'feeds_feed_import' is queued, which on their turn will queue a new task of type 'feeds_feed_parse'.

This way a huge number of items will take many cron runs to complete.

Proposed resolution

Instead of having three queues, I think it is better to have just one queue. This way multiple cycles can be run in one cron run. With local testing I could import about 450 items with one cron run.

To determine what needs to happen when the queue task is run an operation must be specified when queuing the task. With the changes happened in #2803795: Cron import does not work with parser that uses progress we already have two: 'begin' and 'resume'. This are all the operations that I identified:

  • begin
  • resume
  • parse
  • process
  • finish

The 'finish' operation just checks if the import is finished and if not, it queues a new task with operation 'resume'. This is what the 'final' process queue task does now. See the code in \Drupal\feeds\Plugin\QueueWorker\FeedsParse:

// Add a final process queue item that finalizes the import.
$queue->createItem([$feed, $fetcher_result]);

Patch will follow.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

MegaChriz created an issue. See original summary.

MegaChriz’s picture

Assigned: MegaChriz » Unassigned
Status: Active » Needs review
FileSize
9.85 KB
MegaChriz’s picture

The tests will fail, as I removed FeedParse and FeedProcess for which still a test exist.

Status: Needs review » Needs work

The last submitted patch, 2: feeds-one-import-queue-2869350-2.patch, failed testing.

Erik Frèrejean’s picture

I'll try to review/test this tomorrow.

Erik Frèrejean’s picture

I've just installed this patch and I noticed some references to the old queues, so I've updated the patch to also remove those.
Besides that, the "FeedRefresh" queue has an id "feeds_feed_import" which is kinda confusing, so the updated patch also changes the ID of that queue.

MegaChriz’s picture

Thanks for testing! Renaming the queue to match the class name sounds like a good idea to me.

If we find no more issues, then only the automated tests need to be updated before this can get committed.

MegaChriz’s picture

Working on fixing the tests. First a reroll, because the patch no longer applies cleanly.

MegaChriz’s picture

Status: Needs work » Needs review
FileSize
22.34 KB
10.81 KB

This should update all the QueueWorker tests. FeedParseTest and FeedProcessTest are removed. The test methods from this class are moved and to the FeedRefreshTest and renamed.

Status: Needs review » Needs work

The last submitted patch, 9: feeds-one-import-queue-2869350-9.patch, failed testing.

MegaChriz’s picture

Status: Needs work » Needs review

Back to "Needs review", because the second time the tests do pass.

The last submitted patch, 6: provide_one_queue_for-2869350-6.patch, failed testing.

The last submitted patch, 8: feeds-one-import-queue-2869350-8.patch, failed testing.

  • MegaChriz committed 3536124 on 8.x-3.x
    Issue #2869350 by MegaChriz, Erik Frèrejean: Provide one queue for all...
MegaChriz’s picture

Status: Needs review » Fixed

Committed #9.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.