Problem/Motivation

Under certain circumstances content gets imported via a manual import, but not when running cron.

Possible causes:

Proposed resolution

Some causes are known (see above), but likely not all. The problem has been reported in multiple issues, so those issues should be sorted out to gather clues. Once the other causes are known, it would be best to replicate the bug(s) with an automated test. This makes it easier to find out whether a proposed fix resolves the issue completely or only partly.

Remaining tasks

  1. Sort out the issues that reported this problem (and close duplicates).
  2. Find the cause of the issue.
  3. Replicate the issue with an automated test, if possible.
  4. Propose a fix for the issue.

Temporary Solution

The temporary solution is to manually kick off the feed import with a custom hook_cron(). For feeds not attached to nodes, the code will look like this:

function MODULE_NAME_cron() {
  $name = 'FEED_NAME';
  $source = feeds_source($name);
  $source->import();
}

For feeds that are attached to nodes, you'll need to load a list of node IDs and then run the import on each one:

function MODULE_NAME_cron() {
  $node_type = 'CONTENT_TYPE';
  $feed_type = 'FEED_NAME';
  // Find all nodes of the content type the importer is attached to.
  $query = new EntityFieldQuery();
  $query->entityCondition('entity_type', 'node');
  $query->entityCondition('bundle', $node_type);
  $result = $query->execute();
  // execute() returns an empty array when no nodes match.
  if (empty($result['node'])) {
    return;
  }
  foreach (array_keys($result['node']) as $feed_nid) {
    $source = feeds_source($feed_type, $feed_nid);
    $source->import();
  }
}
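
When several importers need the workaround, the two snippets above can be folded into one hook. The following is a minimal sketch, not a tested fix. It assumes the {feeds_source} table holds one row per saved source (with feed_nid 0 for standalone importers); FEED_NAME_1 and FEED_NAME_2 are placeholder importer IDs, and the repeat-until-FEEDS_BATCH_COMPLETE loop is borrowed from a later comment in this issue.

```php
<?php
/**
 * Implements hook_cron().
 *
 * Sketch only: 'MODULE_NAME' and the importer IDs are placeholders.
 */
function MODULE_NAME_cron() {
  // Hypothetical importer machine names; replace with your own.
  $importer_ids = array('FEED_NAME_1', 'FEED_NAME_2');

  foreach ($importer_ids as $importer_id) {
    // Assumption: every saved source for this importer has a row here.
    $feed_nids = db_query(
      'SELECT feed_nid FROM {feeds_source} WHERE id = :id',
      array(':id' => $importer_id)
    )->fetchCol();

    foreach ($feed_nids as $feed_nid) {
      try {
        $source = feeds_source($importer_id, $feed_nid);
        // import() handles one batch per call; repeat until it reports done.
        while (FEEDS_BATCH_COMPLETE != $source->import());
      }
      catch (Exception $e) {
        // Log and continue so one failing feed does not block the others.
        watchdog_exception('MODULE_NAME', $e);
      }
    }
  }
}
```

Reading the node IDs from {feeds_source} instead of loading every node of a content type should skip nodes that have no source saved for that importer. Note that looping until completion inside cron may hit the cron time limit for large feeds.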

Related issues

These issues may be reports of the exact same problem:

These issues look like a different problem, but are related to cron too:

Closed cron issues:

Original report by tamarackmedia

I am importing a CSV via a URL (HTTP Fetcher), set to periodic import "as often as possible."

I had successfully used the exact same setup to import via cron on another site running Feeds 7.x-2.0-alpha7.

Manual import works fine on the new site (running latest release, 7.x-2.0-alpha8), but nothing happens on cron. I rolled back to Feeds 7.x-2.0-alpha7 and it works fine.

Comment  File                       Size      Author
#13      feeds_periodic_import.png  20.72 KB  dineshw

Comments

twistor’s picture

Status: Active » Postponed (maintainer needs more info)

I just set up an importer to run as often as possible with alpha7, upgraded to alpha8, and it works fine.

Can you provide your configuration?

essenceofginger’s picture

I'm seeing the same thing. Other than the 'periodic import' setting, I can't figure out how else I should be configuring the import to run on cron. Driving me nuts!

twistor’s picture

Issue summary: View changes
Status: Postponed (maintainer needs more info) » Closed (cannot reproduce)
ajayg’s picture

I am facing this issue and it is driving me nuts. I have a Drupal multisite; it used to work for both sites and has now stopped for only one of them.
Manual import works fine on both sites, but import through cron works on only one. On the site where it does not work, the logs say cron ran and Job Scheduler processed n records, but the feed log does not show any new items created. There are no other errors.

ajayg’s picture

Status: Closed (cannot reproduce) » Active
metabrown’s picture

+1 upgrade to Feeds 7.x-2.0-alpha8 broke cron updating, rolling back to alpha7 fixed the issue.

ajayg’s picture

Confirming: on the multisite that was upgraded, cron does not work. But on a new site created from scratch with the same multisite code, it works fine. So the same code is behaving differently, which points to some database or variable discrepancy.

MhueD’s picture

Feeds 7.x-2.0-alpha8 AND 7.x-2.0-beta1 -- Cron hook has completely stopped working (on our sites) and is rather easy to detect. For instance, this code:

function bupe_utility_cron() {
        run_buprenorphine_physician_feed();
}

function run_buprenorphine_physician_feed () {
        $importer_id = 'bup_info_graphic';  // Machine name of the importer
        $source1 = feeds_source($importer_id); // Load the Feeds Source object
        $source1->startImport();
        watchdog('bupe_utility', "Buprenorphine Info-graphic Feed has been initiated.", null, WATCHDOG_INFO);
}

In this case the watchdog record DOES get written, and yet the Feed itself never runs. This last part is also easy to establish, because one can put a hook like:

function bupe_utility_feeds_after_import (FeedsSource $source) {
        switch ($source->id) {
                case 'bup_info_graphic':
                        watchdog('bupe_utility', "Buprenorphine Info-graphic Feed has been completed.", null, WATCHDOG_INFO);
                        break;
        } // end switch
}

Also, if we try:

$source1->startImport() or die( "Could not start bupe import!" );

...the 'die' statement runs...so I guess in the case above the startImport is failing but control continues getting passed on to the Watchdog call anyway.

FYI: This feed only takes a few seconds to run, and it DOES run completely successfully as a standalone form from the /import UI. Perhaps even stranger...it doesn't seem to fix the Cron call if the Feed is reset to work off of a node (a common fix listed in some threads on this topic). The site itself is 7.37 Core. Also, we have tried sandboxing with Ultimate Cron with no effect. If this is a setting/db-change, I would love to have some idea of what direction to take that research, because chasing down every conceivable 'cross-talk' interaction with every other module is beyond our scope at the moment!

Below is the export code of the Feed Definition. Please note, however, that we have changed quite a few of the settings, especially those affecting timing, and the behavior stays the same. For instance, having the import run 'as often as possible' has no effect on the success of each attempt, though it does initiate the attempts as requested.

$feeds_importer = new stdClass();
$feeds_importer->disabled = FALSE; /* Edit this to true to make a default feeds_importer disabled initially */
$feeds_importer->api_version = 1;
$feeds_importer->id = 'bup_info_graphic';
$feeds_importer->config = array(
  'name' => 'Bup Info Graphic',
  'description' => 'Imports data for Buprenorphine Info Graphic',
  'fetcher' => array(
    'plugin_key' => 'FeedsHTTPFetcher',
    'config' => array(
      'auto_detect_feeds' => FALSE,
      'use_pubsubhubbub' => FALSE,
      'designated_hub' => '',
      'request_timeout' => NULL,
    ),
  ),
  'parser' => array(
    'plugin_key' => 'FeedsExJsonPath',
    'config' => array(
      'sources' => array(
        'certified_30_total' => array(
          'name' => 'Certified-30 Total',
          'value' => 'certified_30_total',
          'debug' => 0,
          'weight' => '1',
        ),
        'certified_100_total' => array(
          'name' => 'Certified-100 Total',
          'value' => 'certified_100_total',
          'debug' => 0,
          'weight' => '2',
        ),
        'certified_30_past_year' => array(
          'name' => 'Certified-30 Past Year',
          'value' => 'certified_30_past_year',
          'debug' => 0,
          'weight' => '3',
        ),
        'certified_100_past_year' => array(
          'name' => 'Certified-100 Past Year',
          'value' => 'certified_100_past_year',
          'debug' => 0,
          'weight' => '4',
        ),
        'certified_30_past_6_months' => array(
          'name' => 'Certified-30 Past 6 Months',
          'value' => 'certified_30_past_six_months',
          'debug' => 0,
          'weight' => '5',
        ),
        'certified_100_past_6_months' => array(
          'name' => 'Certified-100 Past 6 Months',
          'value' => 'certified_100_past_six_months',
          'debug' => 0,
          'weight' => '6',
        ),
        'certified_30_past_90_days' => array(
          'name' => 'Certified-30 Past 90 Days',
          'value' => 'certified_30_past_90',
          'debug' => 0,
          'weight' => '7',
        ),
        'certified_100_past_90_days' => array(
          'name' => 'Certified-100 Past 90 Days',
          'value' => 'certified_100_past_90',
          'debug' => 0,
          'weight' => '8',
        ),
        'certified_30_past_60_days' => array(
          'name' => 'Certified-30 Past 60 Days',
          'value' => 'certified_30_past_60',
          'debug' => 0,
          'weight' => '9',
        ),
        'certified_100_past_60_days' => array(
          'name' => 'Certified-100 Past 60 Days',
          'value' => 'certified_100_past_60',
          'debug' => 0,
          'weight' => '10',
        ),
        'certified_30_past_30_days' => array(
          'name' => 'Certified-30 Past 30 Days',
          'value' => 'certified_30_past_30',
          'debug' => 0,
          'weight' => '11',
        ),
        'certified_100_past_30_days' => array(
          'name' => 'Certified-100 Past 30 Days',
          'value' => 'certified_100_past_30',
          'debug' => 0,
          'weight' => '12',
        ),
      ),
      'context' => array(
        'value' => '$.[]',
      ),
      'display_errors' => 0,
      'source_encoding' => array(
        0 => 'auto',
      ),
      'debug_mode' => 0,
    ),
  ),
  'processor' => array(
    'plugin_key' => 'FeedsNodeProcessor',
    'config' => array(
      'expire' => '-1',
      'author' => 0,
      'authorize' => 1,
      'mappings' => array(
        0 => array(
          'source' => 'certified_30_total',
          'target' => 'field_cert_30_total',
          'unique' => FALSE,
        ),
        1 => array(
          'source' => 'certified_100_total',
          'target' => 'field_cert_100_total',
          'unique' => FALSE,
        ),
        2 => array(
          'source' => 'certified_30_past_year',
          'target' => 'field_cert_30_past_year',
          'unique' => FALSE,
        ),
        3 => array(
          'source' => 'certified_100_past_year',
          'target' => 'field_cert_100_past_year',
          'unique' => FALSE,
        ),
        4 => array(
          'source' => 'certified_30_past_6_months',
          'target' => 'field_cert_30_past_6_months',
          'unique' => FALSE,
        ),
        5 => array(
          'source' => 'certified_100_past_6_months',
          'target' => 'field_cert_100_past_6_months',
          'unique' => FALSE,
        ),
        6 => array(
          'source' => 'certified_30_past_90_days',
          'target' => 'field_cert_30_past_90_days',
          'unique' => FALSE,
        ),
        7 => array(
          'source' => 'certified_100_past_90_days',
          'target' => 'field_cert_100_past_90_days',
          'unique' => FALSE,
        ),
        8 => array(
          'source' => 'certified_30_past_60_days',
          'target' => 'field_cert_30_past_60_days',
          'unique' => FALSE,
        ),
        9 => array(
          'source' => 'certified_100_past_60_days',
          'target' => 'field_cert_100_past_60_days',
          'unique' => FALSE,
        ),
        10 => array(
          'source' => 'certified_30_past_30_days',
          'target' => 'field_cert_30_past_30_days',
          'unique' => FALSE,
        ),
        11 => array(
          'source' => 'certified_100_past_30_days',
          'target' => 'field_cert_100_past_30_days',
          'unique' => FALSE,
        ),
      ),
      'update_existing' => '0',
      'input_format' => 'plain_text',
      'skip_hash_check' => 0,
      'bundle' => 'bup_info_graphic',
    ),
  ),
  'content_type' => '',
  'update' => 0,
  'import_period' => '-1',
  'expire_period' => 3600,
  'import_on_create' => 1,
  'process_in_background' => 0,
);

MegaChriz’s picture

Priority: Normal » Major
Issue summary: View changes

I've created a summary of the issue. While I do know possible causes have been reported in this issue (and perhaps in other issues too), I've stated that the cause of the issue is still unknown. Because there are so many issues open about this problem, I have no overview of all possible causes, and as such I cannot judge whether any one of them is a real cause or a misunderstanding of how something in Feeds is supposed to work.

The first step is to consolidate the issues: gather clues from each of them and then close duplicates. Then with the clues gathered, we can look for the cause of the issue. Perhaps there is more than one cause. When the cause is known, it would be best if the bug could be replicated with an automated test. This makes it easier to find out if a proposed fix really fixes the issue or just partly.

dineshw’s picture

This should be straightforward, but somehow it's not working with Feeds.
Can someone summarise the steps for testing periodic import?

A feed importer configured to import JSON fields works perfectly when run via the import link.

The same importer does not run on cron, even with the periodic import option enabled.

MegaChriz’s picture

Steps for using periodic import:

  1. On the importer basic settings, set "Periodic import" to a value other than "Off", for example "As often as possible".
  2. Go to the import page for the feeds importer (for example: /import/my_importer).
  3. Specify a source and click the "Import" button. If your importer is attached to a content type, hit "Save". If "Import on submission" is turned off, then you also need to go to the import tab (for example: node/3/import) and hit the "Import" button there.
  4. Run cron.
dineshw’s picture

Attachment: feeds_periodic_import.png (20.72 KB)

Hi Chriz,
See the attached screenshot. My feeds are not attached to a content type and are set to a periodic import of 1 hour, and I have crontab configured to run every 15 minutes.

But whether I run cron manually or via crontab, I don't see the feeds getting imported as content.

However, if I simply use the hook below in a custom module to execute the feeds import, it does work.

<?php
function MODULE_NAME_cron() {
  $name = 'FEED_NAME';
  $source = feeds_source($name);
  $source->import();
}
?>
pianomansam’s picture

Title: Import on cron doesn't work in Feeds 7.x-2.0-alpha8 » [META] Cron import not working on 7.x-2.0-alpha8 and later
Version: 7.x-2.0-alpha8 » 7.x-2.x-dev
Priority: Major » Critical

Updating title to better reflect this as a meta issue.

pianomansam’s picture

Issue summary: View changes

@dineshw's hook_cron() example is helpful, but it only works on feeds not attached to nodes. Here's an example of how to handle node-attached feeds:

function MODULE_NAME_cron() {
  $node_type = 'CONTENT_TYPE';
  $feed_type = 'FEED_NAME';
  // Find all nodes of the content type the importer is attached to.
  $query = new EntityFieldQuery();
  $query->entityCondition('entity_type', 'node');
  $query->entityCondition('bundle', $node_type);
  $result = $query->execute();
  // execute() returns an empty array when no nodes match.
  if (empty($result['node'])) {
    return;
  }
  foreach (array_keys($result['node']) as $feed_nid) {
    $source = feeds_source($feed_type, $feed_nid);
    $source->import();
  }
}

I've also updated the issue summary with these details.

daboo’s picture

I'm having this exact same problem in Drupal 8. Feeds were working as expected, and then as I added content and functionality to the site they stopped working. I attributed it to another module, but have no idea exactly what the root cause is at this point.
Does anyone have a manual code change like the one listed above that may work for Drupal 8 as a workaround?

twistor’s picture

@daboo, your problem is a separate issue. This is for D7. The cron code in D8 is very different.

dineshw’s picture

@all: use the drush feeds-import command and schedule it as a cron job; it works in a very straightforward way!
I'd advise looking at it as an alternative to Drupal cron.
To configure it on the command line, refer to https://docs.acquia.com/cloud/manage/cron#direct

daboo’s picture

Thanks for clearing that up twistor

ervit’s picture

I'm on 7.x-2.0-beta3 and cron execution broke when I changed the importer's expiry setting from 1 year to Never (it executed just a few or just one Feed out of over 20 that were based on the same importer). Manual cron launches weren't successful either. As soon as I changed the expiry back to 1 year, Feeds imported all feeds on cron run. I don't know if this is related or not, but it's not good.

ajayg’s picture

I had faced these issues in the past but for some reason they have disappeared for now for several months. Is anyone using latest Drupal build and latest feeds beta3 facing these? Perhaps the issue was outside Feeds.

MegaChriz’s picture

Issue summary: View changes

@ajayg
This is still an issue. Today I found out two things:

  • When periodic import and "Import on submission" are both turned off and the importer is using the standalone form, the result is essentially that nothing gets imported. I propose to "fix" this by warning the user: #2445477: Process in background not working with certain combination of settings.
  • As soon as periodic import gets turned off, Feeds aborts any scheduled imports on the next cron run. This may not be wrong, however.

Earlier I became aware that because cron runs as the anonymous user, imports may not run due to insufficient permissions. This is being handled in #2541944: Switch to feed author or user 1 during imports (taxonomy mapping does not work with cron).

ajayg’s picture

Wow @MegaChriz, kudos to your persistence for chasing this.

Just FYI, I used to get this issue with Periodic import "on" and Process in background "on". It would appear suddenly on a running site: imports that used to work would stop without any change to those settings. Fortunately it has not happened for a while. So this issue may look like a single problem but have multiple causes underneath.

MegaChriz’s picture

@ajayg
If you have many importers, maybe you want to try out the patch from #2868134: Show next time that the source will be imported to see if it gives any false information? I hope that the changes provided in that patch will give more insight for people running into this issue. The patch makes Feeds check the queue and job_schedule tables to see if any imports are scheduled. If people running into this issue see 'Next import: not scheduled.' on the import page, then at least they know the import didn't suddenly stop because of permission issues. What exactly caused the importers to suddenly stop in this case is a whole other story.
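
For those who cannot apply the patch, a rough manual check of the job_schedule table might look like the sketch below. It assumes Feeds registers its scheduled imports under the job name 'feeds_source_import', with the importer ID in the type column and the feed node ID in the id column. MODULE_NAME_scheduled_feeds_imports() is a hypothetical helper name, and the column names should be verified against the installed Job Scheduler schema.

```php
<?php
/**
 * Lists scheduled Feeds import jobs, one human-readable line per job.
 *
 * Sketch only: the job name and column names are assumptions.
 */
function MODULE_NAME_scheduled_feeds_imports() {
  $lines = array();
  $jobs = db_query(
    'SELECT * FROM {job_schedule} WHERE name = :name',
    array(':name' => 'feeds_source_import')
  )->fetchAll();

  foreach ($jobs as $job) {
    $lines[] = sprintf(
      'importer %s, feed nid %d: next run %s, queued: %s',
      $job->type,
      $job->id,
      date('Y-m-d H:i:s', $job->next),
      // Assumed: "scheduled" is non-zero while the job sits in the queue.
      $job->scheduled ? 'yes' : 'no'
    );
  }

  // No rows at all suggests that nothing is scheduled for import.
  return $lines ? $lines : array('No Feeds import jobs found.');
}
```

The returned lines could be printed from a Drush script or written to the log with watchdog(). An empty result would suggest a scheduling problem rather than a permissions problem.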

Barnettech’s picture

Making the module and running it from cron works for feeds attached to nodes, but how do I run more than one feed? This should work, but it doesn't. What am I missing? Only the first feed runs on cron; the second one never runs.

  $node_type = 'news';
  $feed_type = 'barneytech_gazette_news_and_announcements_with_pics';
  $query = new EntityFieldQuery();
  $query->entityCondition('entity_type', 'node');
  $query->entityCondition('bundle', $node_type);
  $result = $query->execute();
  $feed_nids = array_keys($result['node']);
  foreach ($feed_nids as $feed_nid) {
    $source = feeds_source($feed_type, $feed_nid);
    $source->import();
  }

  $node_type2 = 'news';
  $feed_type2 = 'barneytech_gazette_science_and_technology_with_pics';
  $query2 = new EntityFieldQuery();
  $query2->entityCondition('entity_type', 'node');
  $query2->entityCondition('bundle', $node_type2);
  $result2 = $query2->execute();
  $feed_nids2 = array_keys($result2['node']);
  foreach ($feed_nids2 as $feed_nid2) {
    $source2 = feeds_source($feed_type2, $feed_nid2);
    $source2->import();
  }

I'm seeing the error: FeedsHTTPRequestException: Download of failed because its scheme could not be determined. The URL is expected to start with something like 'http

Barnettech’s picture

fyi, this worked and solved the FeedsHTTPRequestException error

function feeds_import_workaround_cron() {
  watchdog('news_import', 'news import1');
  $node_type = 'news';
  $feed_type = 'barneytech_gazette_news_and_announcements_with_pics';
  $query = new EntityFieldQuery();
  $query->entityCondition('entity_type', 'node');
  $query->entityCondition('bundle', $node_type);
  $result = $query->execute();
  $feed_nids = array_keys($result['node']);
  $source = feeds_source($feed_type);
  watchdog('news_import', 'got to line :19 <pre>' . print_r($source, TRUE) . '</pre>');
  $source->save();
  while (FEEDS_BATCH_COMPLETE != $source->import());
  watchdog('news_import', 'news import2');
}

and it works with multiple feeds.

joelpittet’s picture

Category: Bug report » Plan
Priority: Critical » Normal