Problem: FeedsHTTPFetcherResult should cache the result of the HTTP request so that it does not issue duplicate HTTP requests during the parsing phase. I'm importing 1500 items, and during different Queue API runs the result will be requested again even though the feed has not been completely imported yet. In my case FeedsXPathParserXML needs to get the raw result, and that causes a new HTTP request.

Solution: cache the result of the HTTP request in getRaw().
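A minimal, self-contained sketch of the proposed memoization. The class and method names below (CachedFetcherResult, doFetch()) are illustrative stand-ins, not the actual Feeds code; in Feeds the download would go through http_request_get(), and the cached body would live on FeedsHTTPFetcherResult itself so it survives between batch runs.

```php
<?php

// Sketch of caching the raw result on the fetcher-result object so that
// repeated getRaw() calls during one import trigger only one download.
class CachedFetcherResult {
  protected $url;
  protected $raw;          // NULL until the first download.
  public $fetchCount = 0;  // For illustration only: counts real downloads.

  public function __construct($url) {
    $this->url = $url;
  }

  // Pretend download; stands in for the real HTTP request.
  protected function doFetch() {
    $this->fetchCount++;
    return '';  // An empty body is still a valid, complete result.
  }

  public function getRaw() {
    // !isset() rather than empty(): an empty-string body counts as
    // fetched, so it is not downloaded a second time.
    if (!isset($this->raw)) {
      $this->raw = $this->doFetch();
    }
    return $this->raw;
  }
}

$result = new CachedFetcherResult('http://example.com/feed.xml');
$result->getRaw();
$result->getRaw();
// Only one real download happened, even with an empty response body.
assert($result->fetchCount === 1);
```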


Comments

klausi’s picture

Status: Active » Needs review
FileSize
1.23 KB

Patch attached.

twistor’s picture

Title: FeedsHTTPFetcherResult should cache the result » FeedsHTTPFetcherResult should store the result between batches

Sorry, the title threw me a bit, since we are caching the result. I guess what's happening is that your feed source doesn't use etags or last-modified headers. I wonder if this will allow us to get rid of the static cache in http_request_get()?

twistor’s picture

Category: Task » Bug report
FileSize
1.66 KB

Keeping that in memory isn't really ideal, but we already do it in multiple places, and all of that needs to be seriously fixed.

I'm going to say this is a bug, because other weird things can happen if you're re-downloading something between cron runs.

Only real change is to use !isset() instead of empty() so that an empty response body isn't fetched more than once.
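A quick illustration of why the distinction matters, using a plain variable rather than the actual object property: empty() treats an empty string as "not there yet", so a cached empty body would be downloaded again, while isset() correctly reports that the value was already stored.

```php
<?php

// A 200 response with an empty body, stored in the cache.
$raw = '';

// empty() is TRUE for '' -> the guard would re-fetch on every call.
assert(empty($raw) === true);

// isset() is TRUE for '' -> the guard knows the body was already fetched.
assert(isset($raw) === true);
```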

Status: Needs review » Needs work

The last submitted patch, 3: feeds-http-cache-2192819-3.patch, failed testing.

twistor’s picture

Status: Needs work » Needs review
FileSize
5.88 KB

That makes sense.

twistor’s picture

Entirely wrong patch.

The last submitted patch, 5: feeds-http-fetcher-auto-scheme-2046335-5.patch, failed testing.

twistor’s picture

There's no reason to call parent::__construct() at all.

twistor’s picture

Version: 7.x-2.x-dev » 6.x-1.x-dev
Status: Needs review » Patch (to be ported)
twistor’s picture

Status: Patch (to be ported) » Closed (outdated)