I’ve got a site with a lot of feeds (25+) and feed items (200+) set to expire hourly and reimport.

The issue I’m stuck on is incomplete execution of these processes on cron runs. I’ve tweaked a number of cron settings to rule out the possibility that this problem is related to PHP execution limits on cron jobs. Cron-specific changes include:

  • php_value max_execution_time 360 added to .htaccess, and an equivalent max_execution_time override added to settings.php, expecting to bypass the Rackspace Cloud Sites 30-second timeout
  • Rackspace cron set up through cron_curl.sh as a Perl script in their hosting admin section. (For anyone following along in other articles: Configuring Cron on Rackspace Cloud Sites suggests that an HTTP request pointing to the site’s cron.php file will work, but that HTTP request will time out at 30 seconds.)
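For reference, the two overrides described above would look something like the sketch below. This assumes the host honors `php_value` directives in .htaccess and that the settings.php override is done via `ini_set()`; note that a front-end proxy timeout (like the 30-second Rackspace limit) is enforced outside PHP and is not affected by `max_execution_time` at all.

```php
<?php
// settings.php -- raise PHP's execution limit for this site.
// The .htaccess equivalent (mod_php only) would be:
//   php_value max_execution_time 360
// Hosts running PHP as CGI/FastCGI typically ignore php_value
// directives in .htaccess, which may explain why the limit
// appears not to take effect.
ini_set('max_execution_time', 360);
```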

I consistently get “Cron run completed” messages in /admin/reports/dblog (regardless of whether cron is fired locally or by the server). Job Scheduler (used by Feeds) also claims to complete in its log entry (e.g. “Finished processing scheduled jobs (2 secs, 99 total, 0 failed)”), and likewise the Feeds log (/admin/reports/feeds) shows successful import, update, and created entries for some Feeds and their Feed Item nodes.

But these imports are breaking my site because they never finish; I need all Feeds and Feed Items to import at once. They still seem to be timing out at 30 seconds: adding up the “Imported in x seconds” figures for each cron run always totals about 30 seconds.

What am I missing? Is anything inside Feeds setting this timeout? Any other ideas?

Comments

surf12’s picture

I have the same problem.

sydneyshan’s picture

I've got the same issue with Feeds 6.x-1.0-beta11 and Job Scheduler 6.x-1.0-beta3 - did you figure out how to resolve this? I've got 1000 items to import/update via feeds using cron. Job Scheduler says it's finished after processing about 60 of them...!

If I run the feed import via the website frontend all items import successfully.

I've tried extending the timeout of wget and my php environment (max execution time) but it's not an execution timeout issue...
/usr/bin/wget --timeout=5000 -O - -q -t 1 http://www.example.com/cron.php

It seems Job Scheduler is giving up early for some unknown reason...

franz’s picture

Status: Active » Postponed (maintainer needs more info)

I never had such a timeout issue. No, Feeds doesn't have a timeout; I'm guessing this is an external issue, either with another module or with the PHP/Apache configuration. Maybe you should check whether the PHP setting is being respected (using phpinfo()?). You can also try running cron from Drush.
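Running cron from Drush sidesteps the web server's 30-second HTTP timeout entirely, because it executes under the CLI PHP configuration (where `max_execution_time` is usually 0, i.e. unlimited). A minimal sketch, assuming Drush is installed and run from the site root:

```shell
# Run Drupal cron from the command line instead of over HTTP.
drush cron

# Optionally verify what execution limit the CLI PHP actually uses.
php -r 'echo ini_get("max_execution_time"), PHP_EOL;'
```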

balazs.hegedus’s picture

@faunt:
I've had a similar problem and managed to sort it out by setting http_request_timeout to a higher value as suggested here. Please note, this only works if you have curl installed. Also see my notes and a possible solution for the curl limitation at the issue I've just created.

PatchRanger’s picture

Version: 7.x-2.0-alpha4 » 7.x-2.x-dev
Component: Documentation » Code
Category: support » feature
Status: Postponed (maintainer needs more info) » Needs review
FileSize
2.31 KB

Are there timeouts on Feed importers?

@faunt There are. As you can see in README.TXT, there is a hidden setting that you can define by adding it to the $conf array in your settings.php file:

Name: http_request_timeout
Default: 15
Description: Timeout in seconds to wait for an HTTP get request to finish.
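In practice, the README setting quoted above is a one-line override in settings.php; a sketch (the value 120 is just an example, not a recommendation):

```php
<?php
// settings.php -- raise the Feeds HTTP fetch timeout (in seconds).
// 120 is an arbitrary example value; the module default is 15.
$conf['http_request_timeout'] = 120;
```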

I confirm that this is not a comfortable way to do it.
I need two different values for this setting on localhost and in production, and that leaves my settings.php permanently overridden.
Let's do it this way: here is a patch that adds this setting to the UI.

How to

  1. Back up your database and code.
    As usual before any change.
  2. Apply the attached patch.
    Note: you can use Drush for this (recommended). If you can't, use Patch Manager. (But note that that module needs the patched files to be writable by the server, which is insecure. You could change the permissions, apply the patch, and then change them back to a secure configuration.)
  3. Run update.php.
    It will set the http_request_timeout setting to its default value (30).
  4. Flush caches.
    Once all these actions are done, you will see a textfield on the admin/structure/feeds page.
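The steps above can be scripted; a sketch using standard `patch` and Drush 7.x-era commands (the patch filename is hypothetical, and the commands are run from the Feeds module directory):

```shell
# Apply the attached patch (filename is illustrative).
patch -p1 < feeds-http_request_timeout-ui.patch

# Run pending database updates and flush all caches.
drush updatedb -y
drush cc all
```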

Also please note that as soon as this patch gets committed, this setting will be available by default for every fresh install of the feeds_ui module.
It automatically fixes some other issues related to the http_request_timeout setting.

twistor’s picture

Status: Needs review » Needs work

If we are going to add a UI option for timeouts it should be a per-importer setting that overrides http_request_timeout.

PatchRanger’s picture

Status: Needs work » Needs review
FileSize
8.32 KB

Per-importer setting that overrides http_request_timeout is implemented.
Please review.

twistor’s picture

Status: Needs review » Needs work

The setting on the overview page should be removed. Having multiple places for the same thing confuses people. Timeout should be the last setting, not the first.

PatchRanger’s picture

Status: Needs work » Needs review
FileSize
6.56 KB

Done.
Please review once more.

PatchRanger’s picture

FileSize
3.94 KB

Here is interdiff of these 2 patches.

twistor’s picture

Status: Needs review » Needs work
+++ b/feeds_ui/feeds_ui.install
@@ -6,8 +6,31 @@
 /**
+  * Implement hook_install()
+  */
+function feeds_ui_install() {
+  variable_set('http_request_timeout', 30);

This is not needed. That's what the variable default is for.

+++ b/feeds_ui/feeds_ui.install
@@ -6,8 +6,31 @@
+/**
+  * Implement hook_uninstall()
+  */
+function feeds_ui_uninstall() {
+  variable_del('http_request_timeout');

This should have been there all along :) But it should be in feeds.install.

+++ b/feeds_ui/feeds_ui.install
@@ -6,8 +6,31 @@
+
+/**
+ * Create http_request_timeout variable if it is not set yet.
+ */
+function feeds_ui_update_7200() {
+  // If it was already set - use it.
+  // If not - set to default value.
+  variable_set('http_request_timeout', variable_get('http_request_timeout', 30));

This is not needed.

+++ b/plugins/FeedsHTTPFetcher.inc
@@ -124,6 +128,18 @@ class FeedsHTTPFetcher extends FeedsFetcher {
+                         'When left empty, global value is used.'),

"When left empty, the global value is used."

twistor’s picture

Also, let's use a setter/getter rather than adding $timeout to the constructor. I believe that will throw a strict error.

PatchRanger’s picture

Status: Needs work » Needs review
FileSize
2.74 KB
6.13 KB

Done. Please review.
Interdiff is also attached to save your time.

franz’s picture

Status: Needs review » Reviewed & tested by the community

Looks good to me; it fixes the issues raised in #11.

twistor’s picture

Version: 7.x-2.x-dev » 6.x-1.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

Awesome!

Thanks for sticking this out.

7.x
http://drupalcode.org/project/feeds.git/commit/253fbea

charlie-s’s picture

You may also need to increase the default_socket_timeout. You can do both in settings.php:

ini_set('default_socket_timeout', 120);
$conf['http_request_timeout'] = 120;

lwalley’s picture

I'm still experiencing timeouts with some custom target alters, because FeedsEnclosure::getContent() calls http_request_get() without specifying a timeout value, so it falls back to the http_request_timeout variable or to 30 seconds. I could increase the http_request_timeout variable, but I think it would be good if the individual importer's fetcher timeout setting could also be used here.

I've outlined the issue in more detail in #2076065: Timeout when using FeedsEnclosure::getContent in target alter. I just wanted to add a link here in case anyone was experiencing the same issue or perhaps has some ideas on how best to resolve.

kenorb’s picture

Status: Patch (to be ported) » Closed (outdated)

kenorb’s picture

Version: 6.x-1.x-dev » 7.x-2.x-dev
Status: Closed (outdated) » Fixed

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.