I am currently importing simple XML files hosted on Amazon S3 and they work fine.

I am now trying to import XML files that reference other files (images, PDFs, etc.) that are hosted in the same bucket/folder as the XML file I'm importing.

For example, using the Standalone import form, if I import the following file hosted on S3:

s3://upload/metadata-file.xml

It will use XPath to parse the mapped fields and everything will work fine.

However, if I attempt to import an image or file that is referenced in that XML file:

<Files>
  <ContentPath>image-file.jpg</ContentPath>
</Files>

I get the following error:

Download of image-file.jpg failed with code -1002.

I've also tried using feeds_tamper to prepend "s3://upload/" to the XPath variable, but that also fails with the following result:

Download of s3://upload/image-file.jpg failed with code -1003.
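To make the prepend step concrete outside of Feeds, here is a minimal Python sketch of resolving a bare file reference against the URI of the XML file that contains it. The function name `resolve_reference` is hypothetical and is not part of Feeds or feeds_tamper. It only builds the full URI; as the -1003 result above shows, the downloader still has to understand the s3:// scheme.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def resolve_reference(base_uri: str, reference: str) -> str:
    """Resolve a file reference against the URI of the document containing it.

    Hypothetical helper for illustration; not part of any Drupal module.
    """
    # A reference that already carries a scheme (http://, s3://, ...) is absolute.
    if urlsplit(reference).scheme:
        return reference

    base = urlsplit(base_uri)
    # Drop the document's file name, keep its directory, then append the reference.
    directory = posixpath.dirname(base.path) or "/"
    path = posixpath.normpath(posixpath.join(directory, reference))
    return urlunsplit((base.scheme, base.netloc, path, "", ""))


print(resolve_reference("s3://upload/metadata-file.xml", "image-file.jpg"))
```

The path is joined by hand because, as far as I know, `urllib.parse.urljoin` only resolves relative references for schemes it recognises, and s3 is not one of them.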

How can I upload/attach/import files that my original XML file references?

Thanks in advance!

Comment | File | Size | Author
#1 | feeds-stream-wrapper-1793806-1.patch | 949 bytes | twistor

Comments

twistor commented:

Title: Unable to Import Images/Files with Amazon S3 » Add support for downloading from stream wrappers.
Version: 7.x-2.0-alpha5 » 7.x-2.x-dev
Component: Feeds Import » Code
Assigned: Unassigned » twistor
Category: support » feature
Priority: Major » Normal
Status: Active » Needs review
File: feeds-stream-wrapper-1793806-1.patch (949 bytes)

Can you give this a try?

cmarcera commented:

patching file http_request.inc
Hunk #1 FAILED at 77.
Hunk #2 succeeded at 121 (offset -10 lines).
1 out of 2 hunks FAILED -- saving rejects to file http_request.inc.rej

Should that have worked on 7.x-2.0-alpha5? Hunk #1 looked like just comments so I tried it anyway. The new error is:

Download of s3://upload/image-file.jpg failed with code 403.

I also tried putting the image on a publicly accessible web server (http://mysite.com/images-file.jpg) and attempted to have Feeds Import grab it from there and upload it to S3 (I am also using Amazon S3 for file storage - http://drupal.org/project/amazons3). That also resulted in an error:

The specified file temporary://fileS76bmf could not be copied, because the destination directory is not properly configured. This may be caused by a problem with file or directory permissions. More information is available in the system log.

Invalid enclosure http://mysite.com/images-file.jpg

The syslog had no information, and neither did my Apache2 access or error logs.

cmarcera commented:

Sorry for the double update...

Fixed the 'Invalid enclosure' error via #1 here: http://drupal.org/node/1612246

That led to a cURL timeout error due to the large file size, which was fixed with #6 here: http://drupal.org/node/1048810#comment-5209786

It will eventually be (re)fixed once the timeout setting makes it into the UI in a release.

This still does not address the initial issue of importing files from the same S3 directory where the XML files reside. I will continue to test patches if needed, but I will be proceeding with the alternate hosting scenario above (http -> S3).

twistor commented:

Status: Needs review » Needs work
Issue tags: +Needs tests

We should still really do this.

Honza Pobořil commented:

Maybe this module will be helpful: https://www.drupal.org/project/feeds_url_fetcher. It is a Feeds fetcher that downloads from any stream wrapper registered in PHP.
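As a rough illustration of what "download from any registered stream wrapper" means, here is a conceptual Python sketch of dispatching a fetch by URI scheme. This is not the module's code or Drupal's API: `register_reader`, `fetch`, and the in-memory `FAKE_BUCKET` are all hypothetical names invented for the example, and a real s3 reader would talk to Amazon S3 rather than to a dictionary.

```python
from typing import Callable, Dict
from urllib.parse import urlsplit

# A reader takes a full URI and returns the file contents.
Reader = Callable[[str], bytes]

# Registry mapping a scheme ("s3", "http", ...) to the reader that handles it.
_readers: Dict[str, Reader] = {}


def register_reader(scheme: str, reader: Reader) -> None:
    """Register the reader responsible for one URI scheme (hypothetical API)."""
    _readers[scheme.lower()] = reader


def fetch(uri: str) -> bytes:
    """Fetch a URI by handing it to whichever reader owns its scheme."""
    scheme = urlsplit(uri).scheme.lower()
    if not scheme:
        raise ValueError(f"missing scheme in {uri!r}")
    if scheme not in _readers:
        raise ValueError(f"no reader registered for scheme {scheme!r}")
    return _readers[scheme](uri)


# Stand-in for a bucket so the example runs without network access.
FAKE_BUCKET = {"s3://upload/image-file.jpg": b"fake image bytes"}
register_reader("s3", lambda uri: FAKE_BUCKET[uri])

print(len(fetch("s3://upload/image-file.jpg")))
```

With this shape, a bare "image-file.jpg" fails because it has no scheme, and an s3:// URI only works once something has registered a reader for s3.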