Problem/Motivation

Imports work for the first node. But any subsequent nodes don't trigger an import.

I have a content type called 'staff member' and am using feeds to import the staff members' publications from a URL.

I add a URL to the feed node and then run cron. Everything imports as expected. I add a second URL to a different feed node and run cron - the import is not triggered for this new source, even after running cron many times. The import frequency is set to 'as often as possible', so this is not the issue.

In feeds_cron(), an empty array is always returned when calling feeds_schedule, so no importers are being scheduled:

  // Find importers that need to be rescheduled.
  if (!$importers = feeds_reschedule()) {
    return;
  }

Proposed Resolution

I noticed that if I change the importer frequency, save, and run cron then the import runs as expected. Looking at the code in FeedsImporter::configFormSubmit(), it runs because the importer is rescheduled:

/**
   * Reschedule if import period changes.
   */
  public function configFormSubmit(&$values) {
    if ($this->config['import_period'] != $values['import_period']) {
      feeds_reschedule($this->id);
    }
    parent::configFormSubmit($values);
  }

My proposal is to also reschedule the importer when a feed node is saved in FeedSource::save():

if (db_query_range("SELECT 1 FROM {feeds_source} WHERE id = :id AND feed_nid = :nid", 0, 1, array(':id' => $this->id, ':nid' => $this->feed_nid))->fetchField()) {
      drupal_write_record('feeds_source', $object, array('id', 'feed_nid'));
      feeds_reschedule($this->importer->id);
    }

Unless someone with a more thorough understanding of the scheduling side of feeds has a better way of fixing this?

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

littledynamo’s picture

Issue summary: View changes
littledynamo’s picture

Patch to follow

littledynamo’s picture

Issue summary: View changes
littledynamo’s picture

Status: Active » Postponed (maintainer needs more info)

I may be off track with my proposed resolution - still not working properly. I need to do more digging.

littledynamo’s picture

Ok, so I think my original feeling was correct. When the 2nd, 3rd or any subsequent feed node is saved, the importer is not rescheduled and the import doesn't run. When I add feeds_reschedule() to the source::save() function, my importer runs:

if (db_query_range("SELECT 1 FROM {feeds_source} WHERE id = :id AND feed_nid = :nid", 0, 1, array(':id' => $this->id, ':nid' => $this->feed_nid))->fetchField()) {
      drupal_write_record('feeds_source', $object, array('id', 'feed_nid'));
      feeds_reschedule($this->importer->id);
    }

P.s. sorry for the confusion from #4, it wasn't working because the staff member had masses of publications and the request was timing out (need to look into batching).

littledynamo’s picture

Status: Postponed (maintainer needs more info) » Active
littledynamo’s picture

littledynamo’s picture

Status: Active » Needs review

Status: Needs review » Needs work
littledynamo’s picture

Issue summary: View changes
littledynamo’s picture

Issue summary: View changes
littledynamo’s picture

Issue summary: View changes
twistor’s picture

Version: 7.x-2.0-alpha8 » 7.x-2.x-dev
Status: Needs work » Needs review
FileSize
957 bytes

I think I understand what you're trying to do here.

Allow me to write up an example.
Let's say we have a site with a bunch of page nodes and no Feeds configured.
We configure an importer, and attach it to the page.
If we go back to those existing nodes, and enter in URLs to import, they are not scheduled.

I can replicate this scenario.

twistor’s picture

This changes things to use JobScheduler::check() instead of re-scheduling a job on every node save.

twistor’s picture

Status: Needs review » Needs work

The last submitted patch, 14: feeds-schedule-on-update-2450365-14.patch, failed testing.

twistor’s picture

Status: Needs work » Needs review
twistor’s picture

The last submitted patch, 14: feeds-schedule-on-update-2450365-14.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 18: feeds-schedule-on-update-2450365-18.patch, failed testing.

twistor’s picture

twistor’s picture

Status: Needs review » Needs work

The last submitted patch, 22: feeds-schedule-on-update-2450365-22.patch, failed testing.

The last submitted patch, 18: feeds-schedule-on-update-2450365-18.patch, failed testing.

The last submitted patch, 22: feeds-schedule-on-update-2450365-22.patch, failed testing.

MegaChriz’s picture

The patch is hanging on the same error as the test posted in #2624344: Import via pushImport() keeps looping / never completes, which uncovered an endless loop. Now that I committed a fix for that issue, let's see if a retest helps...

Status: Needs work » Needs review
MegaChriz’s picture

The last submitted patch, 28: feeds-schedule-on-update-2450365-28-tests-only.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 28: feeds-schedule-on-update-2450365-28.patch, failed testing.

MegaChriz’s picture

The tests ran forever because of a while() in FeedsSchedulerTestCase::testBatching(). Let's fix that. Tests will fail however.

Status: Needs review » Needs work

The last submitted patch, 31: feeds-schedule-on-update-2450365-31.patch, failed testing.

MegaChriz’s picture

Alright, figured it out.

When a feeds job gets dispatched by Job Scheduler, the job is set on the queue and then the job is 'reserved', meaning it is put on hold by setting the $job['scheduled'] value. This wil make Job Scheduler ignore the next import time until the job is done (which Feeds never officially reports).

When checking for an existing job with JobScheduler::check(), the job is only rescheduled when the period changed.

So what we need to do:
If the job does exist and it is currently put on hold, reschedule it. If the job does not exist, set it.

Attached a new patch.

Status: Needs review » Needs work

The last submitted patch, 33: feeds-schedule-on-update-2450365-33.patch, failed testing.

MegaChriz’s picture

Status: Needs work » Needs review
FileSize
9.49 KB
4.52 KB

The test failed because a 'feeds_reschedule' should force a feed to be rescheduled, even if done so before.

As suggested in #2630694-20: Run background jobs directly in queue., this adds a method ensureSchedule() to the FeedsSource class. This way a 'feeds_reschedule' will force the feed to be rescheduled. When saving a feed node or when hitting "Import" for the standalone form, the feed should only be scheduled if currently not scheduled.

  • MegaChriz committed 3f06cb6 on 7.x-2.x
    Issue #2450365 by MegaChriz, twistor, littledynamo: Fixed importer is...
MegaChriz’s picture

Status: Needs review » Fixed

Great, tests are passing!

Committed #35.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.