Problem: FeedsHTTPFetcherResult should cache the result of the HTTP request so that it does not issue duplicate HTTP requests during the parsing phase. I'm importing 1500 items, and during different Queue API runs the result will be requested again even though the feed has not been completely imported yet. In my case FeedsXPathParserXML needs to get the raw result, and that causes a new HTTP request.

Solution: cache the result of the HTTP request in getRaw().
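A minimal, self-contained sketch of the proposed memoization. The class and method names below (CachedFetcherResult, doFetch()) are illustrative stand-ins, not the actual Feeds code; in Feeds the download would go through http_request_get(), and the cached body would live on FeedsHTTPFetcherResult itself so it survives between batch runs.

```php
<?php

// Sketch of caching the raw result on the fetcher-result object so that
// repeated getRaw() calls during one import trigger only one download.
class CachedFetcherResult {
  protected $url;
  protected $raw;          // NULL until the first download.
  public $fetchCount = 0;  // For illustration only: counts real downloads.

  public function __construct($url) {
    $this->url = $url;
  }

  // Pretend download; stands in for the real HTTP request.
  protected function doFetch() {
    $this->fetchCount++;
    return '';  // An empty body is still a valid, complete result.
  }

  public function getRaw() {
    // !isset() rather than empty(): an empty-string body counts as
    // fetched, so it is not downloaded a second time.
    if (!isset($this->raw)) {
      $this->raw = $this->doFetch();
    }
    return $this->raw;
  }
}

$result = new CachedFetcherResult('http://example.com/feed.xml');
$result->getRaw();
$result->getRaw();
// Only one real download happened, even with an empty response body.
assert($result->fetchCount === 1);
```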


Comments

klausi’s picture

Status: Active » Needs review
FileSize
1.23 KB

Patch attached.

twistor’s picture

Title: FeedsHTTPFetcherResult should cache the result » FeedsHTTPFetcherResult should store the result between batches

Sorry, the title threw me a bit, since we are caching the result. I guess what's happening is that your feed source doesn't use etags or last-modified headers. I wonder if this will allow us to get rid of the static cache in http_request_get()?

twistor’s picture

Category: Task » Bug report
FileSize
1.66 KB

Keeping that in memory isn't really ideal, but we already do it in multiple places, and all of that needs to be seriously fixed.

I'm going to say this is a bug, because other weird things can happen if you're re-downloading something between cron runs.

Only real change is to use !isset() instead of empty() so that an empty response body isn't fetched more than once.
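A quick illustration of why the distinction matters, using a plain variable rather than the actual object property: empty() treats an empty string as "not there yet", so a cached empty body would be downloaded again, while isset() correctly reports that the value was already stored.

```php
<?php

// A 200 response with an empty body, stored in the cache.
$raw = '';

// empty() is TRUE for '' -> the guard would re-fetch on every call.
assert(empty($raw) === true);

// isset() is TRUE for '' -> the guard knows the body was already fetched.
assert(isset($raw) === true);
```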

Status: Needs review » Needs work

The last submitted patch, 3: feeds-http-cache-2192819-3.patch, failed testing.

twistor’s picture

Status: Needs work » Needs review
FileSize
5.88 KB

That makes sense.

twistor’s picture

Entirely wrong patch.

The last submitted patch, 5: feeds-http-fetcher-auto-scheme-2046335-5.patch, failed testing.

twistor’s picture

There's no reason to call parent::__construct() at all.

twistor’s picture

Version: 7.x-2.x-dev » 6.x-1.x-dev
Status: Needs review » Patch (to be ported)
twistor’s picture

Status: Patch (to be ported) » Closed (outdated)