Current issue

The Drupal 7 version of Feeds supported importing local files. The Drupal 8+ version of Feeds currently only supports importing files by URL. Let's add support for local files to the Drupal 8+ version, too.

Original issue:

Hi,

For a file field, the only mapping options are fields from the file entity. Is it possible to provide a file URL instead? I have a use case where I need to import content with a file URL.

How can I do it?

Comments

MegaChriz’s picture

Category: Feature request » Support request
Status: Active » Fixed

Yes, you can just specify a file URL in your source and map that to "File ID". By default the target is configured to reference by filename.
In the beginning I also found it confusing that you need to map a URL to a file ID. I had considered changing that until I noticed that "File ID" made more sense, as that is what actually gets saved. See #2867838: Change label of file target's property 'target_id' to 'File URL' as the input is expected to be an URL.

Example CSV for importing nodes (or other entities):

guid,title,file
1,Lorem Ipsum,http://www.example.com/my-file.txt

saranya ashokkumar’s picture

@MegaChriz,

Thanks for your support. I tried what you suggested, but it is not working for me. I gave the path in the CSV as follows:
http://examplesite.com/sites/default/files/2018-05/Sample.pdf. Can you tell me where I made a mistake?

I have attached my error image also.

saranya ashokkumar’s picture

Working fine now. Thank you.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Toki’s picture

Just to add some information here: I had the exact same error as saranya.
My images were in /sites/default/files and the import failed (with a mapping between the image URL and the File ID).
As soon as I moved them to a custom folder called "import" in the root folder, it worked (with an absolute URL like http://mysite.com/import/image.jpg, not with a relative URL like /import/image.jpg).
I did not check, but maybe the "files" folder has specific access restrictions and Feeds cannot read from it?
Just food for thought ;)

MegaChriz’s picture

Title: File url field in feeds mapping. » File target: add support for local file path as source
Category: Support request » Feature request
Status: Closed (fixed) » Active

The D8 version of Feeds seems to only support downloading from URLs, not local paths. It tries to get a file with the method \GuzzleHttp\ClientInterface::request():

protected function getContent($url) {
  $response = $this->client->request('GET', $url);

  if ($response->getStatusCode() >= 400) {
    $args = [
      '%url' => $url,
      '@code' => $response->getStatusCode(),
    ];
    throw new TargetValidationException($this->t('Download of %url failed with code @code.', $args));
  }

  return (string) $response->getBody();
}

The D7 version however did also support local paths. See FeedsEnclosure:

// Copy or save file depending on whether it is remote or local.
if (drupal_realpath($this->getSanitizedUri())) {

So let's turn this into a feature request to add support for local file paths.
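
As a rough illustration of how the two behaviours could be combined, here is a minimal sketch in plain PHP, outside of Feeds: sources with an http or https scheme are treated as remote, anything else is read from disk. The function names isRemoteSource() and readSource() are hypothetical and not part of Feeds. A real patch would presumably keep using the injected Guzzle client for the remote case and Drupal's file system services for the local one, so treat this only as a sketch of the branching.

```php
<?php

/**
 * Returns TRUE when the source should be fetched over HTTP(S).
 *
 * Hypothetical helper for illustration; not part of Feeds.
 */
function isRemoteSource(string $source): bool {
  $scheme = parse_url($source, PHP_URL_SCHEME);
  return is_string($scheme) && in_array(strtolower($scheme), ['http', 'https'], TRUE);
}

/**
 * Reads a source either over HTTP(S) or from the local file system.
 *
 * Assumption: plain file_get_contents() is used for both cases. Remote
 * reads therefore require allow_url_fopen; Feeds itself uses Guzzle.
 */
function readSource(string $source): string {
  if (!isRemoteSource($source) && !is_file($source)) {
    throw new RuntimeException("Local file not found: $source");
  }
  $contents = @file_get_contents($source);
  if ($contents === FALSE) {
    throw new RuntimeException("Could not read: $source");
  }
  return $contents;
}
```

With this split, a value such as private://image.jpg would never be handed to the HTTP client, which is where the libcurl "protocol not supported" type of error comes from.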

mitchems’s picture

I have an urgent need to be able to import from local files. I was doing this on D7, where I used a feed to import into a File field from the private:// scheme. I would place the files on the server and have Feeds read a feed file and translate its items into nodes with a file attachment, the source being a local file in the private:// scheme. I really need to be able to do this in D8.

Edit: there are 1790 files in this feed. I just wanted to note that because downloading each from a URL is not workable.

gkaas’s picture

I have the need to import local files too (around 50000 files, public:// scheme). Any update on this feature request?

gkaas’s picture

I highly doubt that this patch is a proper solution, but it works for me and I can import my local files (public:// scheme) with this patch. Use with caution and at your own risk.

MegaChriz’s picture

@gkaas
Thanks for your contribution! So you basically want to preserve the file path provided by the source (when it starts with 'public://' or 'private://') and ignore the field's settings of where to save the file?
Otherwise, I think the method getContent() should return the file contents. But preserving the file path would indeed be a lot faster, as in that case the file does not need to be loaded into memory. Hm...

I see some other interesting things in the patch, like mime_content_type(). Did not know that PHP function existed. Could possibly fix issues like #2611014: Feeds doesn't import remote images which have no proper extension (such as ASP) or #2977565: Unable to download image from RSS feed when it contains a query string.
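
To make the mime_content_type() idea a bit more concrete, here is a small sketch in plain PHP of guessing a file extension from a file's contents instead of from its name. The helper names extensionForMime() and guessExtension() and the short MIME map are assumptions for illustration only; they do not come from the patch or from Feeds. mime_content_type() itself requires PHP's fileinfo extension.

```php
<?php

/**
 * Maps a few MIME types to file extensions.
 *
 * The list is illustrative only and far from complete.
 */
function extensionForMime(string $mime): ?string {
  $map = [
    'image/jpeg' => 'jpg',
    'image/png' => 'png',
    'image/gif' => 'gif',
    'application/pdf' => 'pdf',
    'text/plain' => 'txt',
  ];
  return $map[$mime] ?? NULL;
}

/**
 * Guesses an extension from a file's contents rather than its name.
 *
 * Returns NULL when the type cannot be detected or is not in the map.
 */
function guessExtension(string $path): ?string {
  $mime = mime_content_type($path);
  return is_string($mime) ? extensionForMime($mime) : NULL;
}
```

Something along these lines might help when a downloaded file has a name without a usable extension, though whether it fits the file target's validation is a separate question.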

gkaas’s picture

Thank you @MegaChriz,
Yes, I indeed want to preserve the file path provided by the source. I'm migrating ~50000 images from D6 to D8. I have a separate script that reads the image data from the D6 database, saves some metadata to the images, copies the images to the new D8 public:// folder (keeping the original directory structure) and generates the CSV file I use as the import file. Because I want to preserve the original directory structure of our image archive, I want to ignore the field's setting of where the file should be saved (the images are already where they should be). Not sure if this is a common use case though.

I didn't look into the other issues you mention yet, but hopefully mime_content_type() helps fixing these issues.

MegaChriz’s picture

Issue tags: +Feeds file target

There are more file target related issues. Tagging this with “Feeds file target”.

liquidcms’s picture

from #2 above, @MegaChriz says:

you can just specify a file URL in your source and map that to "File ID".

Is this still available?

If i read this right, when i set Target to a File field, i have an option under Configure for File ID, but for this (and any other option) i get the same Target fields:
File (base64) (field_file): Base 64 string
File (base64) (field_file): Filename
File (base64) (field_file): Description

perhaps just confusing UI and i can set my path to file as any of the 3?

liquidcms’s picture

i found the solution somewhere; which was to remove the 2 file plugins from the feedsdev module.

I can now import files via absolute path to that file being used on another site. Guessing i am just lucky that those files were (incorrectly) stored as Public or Feeds likely wouldn't have access to them. But that's an issue for another day.

MegaChriz’s picture

@liquidcms
Yes, in Feeds DEV I added indeed alternative plugins for file fields. And discovered you could only have one plugin per field type. There's a patch included with Feeds DEV to allow multiple Target plugins per field type (I noted this on its project page). You would need to apply it if you want to use both the default file target plugin and the alternative one. But you can indeed also remove the alternative file target plugins from Feeds DEV. :)
The reason the feature is not part of Feeds yet is because I'm not happy yet with the current implementation.

liquidcms’s picture

i tried patch, didn't work (i think patch failed).. so tried removing the 2 plugin files.. which did work.

vrajak@gmail.com’s picture

@MegaChriz @liquidcms - Thanks for this discussion. I have some questions, hoping someone can help me out as I'm confused.

I have a large number of files (thousands) I'm going to be importing via Feeds and having this create new nodes and attach said files to the node via the File field. This works as expected in current D8 Feeds with this dev version.

My problem is that it copies the files from my directories and puts them all in the basic default/files folder instead of just using the existing files in their (complicated) directory structure. Is there some way to do this with a patch or code hack? I did try using the "File ID" Method mentioned above but it still just copied my files to the default/files folder. It seems Liquidcms got it to work by removing something in the module? I'm not sure where/what.

Would appreciate any tips or tricks, many thanks!

liquidcms’s picture

My comment was regarding how to get the Feeds import to accept a full path to a file on a different site (the D7 site i am migrating from).

In my case i am creating a different structure for where these files belong. In D7 site they were attached to a node bundle inside a Field Collection and stored in Public file space (improper design). In the D8 site we have this type of doc files attached to a different bundle and we are not planning to try to map the old D7 files to this bundle. Instead i have created a Paragraph to hold the file with some other metadata. That file field uses Filefield Paths module to locate the file in Private in a folder based on meta data imported from the D7 along with the file (basically based on the user's id number).

The Feeds import creates the "file" Paragraph.

I then wrote a small Action plugin to run as a VBO action on the new Paragraphs to attach them to the new User field.

DrupalDope’s picture

Vrajak, #18

please have a look at file (field) paths module and see if that solves your issue?
https://www.drupal.org/project/filefield_paths

DrupalDope’s picture

I have some questions about importing images using feeds.

My situation:
I have a ton of images that sit in a structure, such as /galleries/UNIQUEID/**filename**.JPG
They have no metadata and are attached to "parent content" identified by UNIQUEID.
I will have separately imported the "parent content", in which the UNIQUEID is available as feeds item GUID.

Pictures are currently on another server.

How should I proceed to import them with feeds?

Should I make a CSV file like this:

"UNIQUEID";"Filenames"
UNIQUEID;PATH/UNIQUEID/file-1.jpg,PATH/UNIQUEID/file-2.jpg,PATH/UNIQUEID/file-3.png,PATH/UNIQUEID/file-4.jpg

Will something like this work? (please note the use of semicolon as a separator and comma as a separator for filenames)
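
Leaving aside whether Feeds itself accepts this layout (that depends on the parser and mapping configuration), the splitting described above can be sketched in plain PHP as follows. The function name parseGalleryRow() is hypothetical and only illustrates the two separators: a semicolon between columns and a comma between file names.

```php
<?php

/**
 * Splits one row of the proposed layout.
 *
 * Columns are separated by ';' and the file names inside the second
 * column by ','. Hypothetical helper; not part of Feeds.
 */
function parseGalleryRow(string $line): array {
  $columns = str_getcsv($line, ';', '"', '\\');
  $files = isset($columns[1]) ? explode(',', $columns[1]) : [];
  return [
    'guid' => $columns[0] ?? '',
    // Trim whitespace and drop empty entries, e.g. from a trailing comma.
    'files' => array_values(array_filter(array_map('trim', $files), 'strlen')),
  ];
}
```

Note that a file name containing a comma would break this scheme, so the inner separator has to be a character that never occurs in the paths.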

PATH:
are only URLS accepted?

then the processing:
will each file wander through the same filters and processes as uploaded files? i.e. being resized, getting default alt and title values ?

storage: will the import use the settings specified for the image field?
(I will use this module to use a similar file structure as before: https://www.drupal.org/project/filefield_paths)

do you have any other advice?

MegaChriz’s picture

@manarak
Importing files in Feeds could use improvement. Right now source files need to come from an url and only certain protocols work. See #2969401-5: cURL error 1 on CSV import with image field mapped from field with url.

Filefield Paths

I believe there were some issues as well when using file import in combination with the Filefield paths module, though it's possible these issues only exist in certain cases in the D7 version of Feeds.

File target issue series

For all file target related issues in Feeds, see https://www.drupal.org/project/issues/search?issue_tags=Feeds%20file%20t...
I hope I can address this series of issues sometime in the future. It's not that high on my priority list at the moment, sorry.

nattyweb’s picture

#6 @Toki - was struggling with this until I read your suggestion. Worked perfectly. Thank you.

Fristaila’s picture

Hi all!
#6 confirmed working - ty Toki , nattyweb !

mitchems’s picture

#20 - I have used that module to great effect on my system. I am currently SFTPing the files to my public scheme and using Feeds to grab them and put them in a directory structure in my private scheme. But what I really want to do is read the files during the Feeds import from the local file system, rather than from a URL.

I am going to look through this thread carefully, but at first glance, I don't think anyone has solved this?

MegaChriz’s picture

Status: Active » Needs review
Issue tags: +Needs tests

@mitchems
I see that there is a patch available in #10. I haven't tried that one, but perhaps it would work in your situation?
Setting the issue status to "Needs review". Perhaps the status should be "Needs work", as we would also need to have an automated test for this. I think a Kernel test would be good.

MegaChriz’s picture

Status: Needs review » Needs work
Issue tags: +Needs reroll

The patch in #10 apparently no longer applies, so it at least needs to be rerolled. Since I'm focussing on other Feeds issues, I haven't checked the original patch closely yet, so I'm not sure whether the implementation should also be different.

andileco made their first commit to this issue’s fork.

andileco’s picture

Status: Needs work » Needs review

#10, which I re-rolled in merge request !28, worked for me. I needed to use the full URL as the filename (I was pointing to files stored in AWS S3FS).

MegaChriz’s picture

Issue tags: -Needs reroll

Cool, @andileco
I guess the next step would be to have tests for this feature :).

ptmkenny’s picture

Reroll for the current version of feeds.

ptmkenny’s picture

One issue with the current patch (#32) is that local files are deleted upon import, which may not be what the user wants. I'm trying to figure out why the files are getting deleted, but I haven't identified the cause yet.

Steps to reproduce:

1. Create a CSV feed type and use an image file ID.
2. In the CSV file, use the private:// path.
3. Import the feed. The files on the local file system will be deleted.

ptmkenny’s picture

Another issue with the current approach: if you set "Update existing media items" to "Do not update existing media items", it works fine. However if you select "Update existing media items", you will get a warning: "cURL error 1: Protocol "private" not supported or disabled in libcurl (see https://curl.haxx.se/libcurl/c/libcurl-errors.html)"

sriharsha.uppuluri’s picture

Hi,

I have used the patch with the private files folder, but on save I was getting permission issues for the image upload. Adding the code below on top of the #32 patch made it work for me.

sriharsha.uppuluri’s picture

Status: Needs review » Needs work

The last submitted patch, 36: file_save_author.patch, failed testing.
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

sriharsha.uppuluri’s picture

Status: Needs work » Needs review

This combines the #32 patch and my changes in a single patch.

Status: Needs review » Needs work

The last submitted patch, 38: file_save_author.patch, failed testing.
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

yang_yi_cn made their first commit to this issue’s fork.

yang_yi_cn’s picture

Rebased again to bring in upstream changes, so the patch can apply.

The patch is available at https://git.drupalcode.org/project/feeds/-/merge_requests/28.patch, but I'm saving a static copy here as well.

blogchef12’s picture

Still seems to be an issue with Drupal 10. How do I use the feeds module to import feeds with images from a ../private/ directory? I have tried private://image.jpg and I get:

cURL error 1: Protocol "private" not supported or disabled in libcurl (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for private://image.jpg

niki v’s picture

Still an issue with D10 and commerce imports
I'm working locally and views data export only gives local file paths, so I downloaded all the images and placed them in the folders that matched the csv structure. The import fails with cURL error 3 but succeeds if image field is not required and I leave it out of the import.

I guess I'll be editing my csv files or is there another way to add the local file path?

rohit.rawat619’s picture

Assigned: Unassigned » rohit.rawat619
Status: Needs work » Active

Fixing some coding standards issues.

rohit.rawat619’s picture

Assigned: rohit.rawat619 » Unassigned
Status: Active » Needs review