The documentation for drush feeds-import says I can use --file to supply the file to import.

I have created a Feeds importer. When I go to /import in the Feeds UI and enter the directory in the box there, the importer imports all the files in the directory.

However, when I try to do the same using
drush feeds-import [machine name of this feed config] --file=[same directory]
I get an error saying:

The file [my directory] does not exist.

This could be because I am using a directory and not a file... maybe the drush command needs a --directory option? Anyway I will investigate. We need this to work for #2757921: Import AsciiDoc HTML output into nodes.

Comments

jhodgdon created an issue.

jhodgdon commented:

Title: The drush feeds-import command does not work with a directory » The drush feeds-import with --file assumes the default Fetcher

Actually, the problem is worse than that: drush feeds-import is explicitly assuming that if you provide --file, the configured Fetcher for the feed is FeedsFileFetcher, which it is bypassing by just creating a FeedsFileFetcherResult object.

That is not necessarily the case. It should be running the configured fetcher instead.

jhodgdon commented:

Hm. This may not be a blocker for our project... It turns out that I can run the command without specifying --file at all. It just uses the source that I set up the last time I ran the import at /import.

MegaChriz commented:

Title: The drush feeds-import with --file assumes the default Fetcher » Add a --directory option for drush feeds-import
Category: Bug report » Feature request

I don't think this is a bug. File != directory.

You can indeed run an import using just drush feeds-import. Supplying a file or URL is optional and bypasses the configured fetcher. These options were almost left out because of the extra complexity, but I decided to keep them in so that it is possible to import a source for an importer without first going to the import form in the UI. This way you could enable a module with a Feeds importer and import a source all in one (drush) script.

Maybe the help text should mention that using the file, url or stdin option temporarily bypasses the configured fetcher? I think supplying a file or URL doesn't overwrite the configured source.

jhodgdon commented:

Title: Add a --directory option for drush feeds-import » Add a --source option for drush feeds-import

So it turns out that Feeds doesn't export the FeedsSource configuration as an exportable (to Features for instance). This means that the configuration on the /import page is not storable in Features, which is kind of a problem, because when you run the import command with Drush, you are just getting whatever someone ran it with last, and it is not in code anywhere. Whereas if you run from /import and put in a temporary directory or URL, you will be silently overriding that stored configuration.

I think it is quite important for the Drush help to explain that --file, --stdin, and --url bypass the configured Fetcher, especially for people who may have written their own Fetcher and Parser. It should also explain that if you omit --file and --url, you are using the last stored source.

Anyway, on this issue, probably the right thing to do is to add a --source option to the Drush command. This would pass its value to the configured Fetcher as the "source" configuration option (a name that I think most fetchers share) and use the configured Fetcher, Parser, etc. that are part of the Importer.

MegaChriz commented:

Providing a --source option for drush is an interesting idea and that would probably work for the HTTP and File fetcher, but not for contrib fetchers. Using the source configuration option is not enforced by fetchers and there are some fetchers that indeed do not use that; Feeds FTP Fetcher, for example. A fetcher may also need more information than just a single string value. A fetcher that would fetch data via SOAP for example would need the method to call on the SOAP server and parameters belonging to that method. It looks like Web Services Client for Feeds works like this. A fetcher could require login details to fetch the source (though I would save such information on the importer instead or elsewhere, like settings.php).
Anyway, such a source option will not work for all fetchers (unless maybe if there is an easy way to input arrays on the command line?).
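On the question of entering arrays on the command line, here is a minimal sketch, not an existing Feeds or Drush feature. It assumes a hypothetical option (called --source-config here purely for illustration) whose value uses query-string bracket syntax, and it relies on PHP's built-in parse_str() to expand that value into a nested array keyed by fetcher class. ExampleSoapFetcher and its keys are made-up names.

```php
<?php
// Hypothetical value for an imagined --source-config option; Feeds does not
// provide such an option. ExampleSoapFetcher is a made-up fetcher class.
$raw = 'FeedsHTTPFetcher[source]=https://www.drupal.org/project/issues/rss/feeds'
  . '&ExampleSoapFetcher[method]=getItems'
  . '&ExampleSoapFetcher[params][limit]=10';

// parse_str() expands the bracket syntax into nested arrays keyed by
// fetcher class. All leaf values come back as strings.
parse_str($raw, $config);

// Show the resulting nested array.
print_r($config);
```

One caveat with this approach: parse_str() URL-decodes its input, so values containing characters such as + or % would have to be encoded before being passed in.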

Storing the FeedsSource in a feature is also an interesting idea, but this would need at least an API addition on the fetcher. For example, for the file fetcher, the ID a file has in the current database will not be very useful to have in a feature. The same goes for the file URL, which would be something like "private://feeds/content.csv" (if not using the directory option). So the file fetcher should implement a method to export its source configuration in order to provide something useful.
Another challenge arises when the importer is attached to a content type, because then the feeds source also depends on the existence of a node with a certain ID.
Also interesting to note is that the equivalent of the feeds source in the D8 version (there simply called "feed") is defined as a content entity type. So should we want to port this feature to D8 later, we would face similar challenges.

Another option for you might be to create the Feeds source programmatically in the module in which you include an export of a Feeds importer.
Something like this:

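// $importer_id is the machine name of the Feeds importer.
$source = feeds_source($importer_id);
// Source configuration is keyed by the fetcher's class name.
$source->addConfig(array(
  'FeedsHTTPFetcher' => array(
    'source' => 'https://www.drupal.org/project/issues/rss/feeds',
  ),
));
// Store the source so that later imports use it.
$source->save();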

The downside is that if you want to change the source config, you need to change it programmatically as well, so that would be a bit harder to maintain.
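For a fetcher that reads its location from a 'source' key, the same approach could be wrapped in a small helper. This is a minimal sketch, assuming a bootstrapped Drupal 7 site with Feeds enabled and an importer that is not attached to a content type. The function name example_feeds_set_source() and its parameters are hypothetical and not part of Feeds.

```php
<?php
/**
 * Stores a path as the source of a standalone Feeds importer.
 *
 * Hypothetical helper: assumes Drupal 7 with Feeds enabled, and a fetcher
 * whose source configuration uses a 'source' key.
 *
 * @param string $importer_id
 *   Machine name of the importer.
 * @param string $fetcher_class
 *   Class name of the importer's configured fetcher.
 * @param string $path
 *   File or directory location the fetcher should read from.
 */
function example_feeds_set_source($importer_id, $fetcher_class, $path) {
  // Load the source object for an importer that uses the standalone form.
  $source = feeds_source($importer_id);
  // Source configuration is keyed by the fetcher's class name.
  $source->addConfig(array(
    $fetcher_class => array('source' => $path),
  ));
  // Persist the source so that later imports pick it up.
  $source->save();
}
```

A helper like this could be called from a module's hook_install(). Running drush feeds-import afterwards without --file would then use the stored source rather than whatever was last entered at /import.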

jhodgdon commented:

The Fetcher that I am using is a relatively uncomplicated extension of the default File fetcher, which fetches files in a defined order from a directory, so it uses 'source' as its one configuration option. When writing my custom fetcher and parser, I chose to put all of the other configuration that the Fetcher and Parser needed into the Config rather than SourceConfig, so that all that stuff is stored on the Importer. For my use case, there is only one source for the importer, so using nodes and/or content types made no sense to me. The configuration for the Mappers can only be stored on the Importer, and all the other configuration was pretty much tied to that, so it made sense to store it all on the Importer (again, for this use case).

The code for saving FeedsSource items in the database (D7) also preferentially treats 'source' as a column, so it is something a bit special.... But I see your point that adding --source would not cover all the use cases. Perhaps something like --config=configfile.txt where configfile.txt would use a simple format similar to what we use in .info files (D7) or YAML (D8) would be good -- that would allow someone to provide the full configuration, no matter what it is.

Regarding exportables... I see what you mean about the difficulty of export.

Regarding D8... To me, having Source items be Content entities seems wrong -- they seem more semantically Configuration. See https://www.drupal.org/node/2120523 -- Content entities should be things displayed on web sites, not things used to gather/format data or define how it is displayed.

MegaChriz commented:

Providing a config file as a drush option sounds like a wonderful idea. My only concern is that the structure of the config file will not be obvious. You would have to inspect the code of the fetcher and parser in question for that. And for the file fetcher you would need to specify a file ID.

Regarding D8 - "Source items are content entities" - I probably should discuss that fact with twistor when I get a chance. I assume it is defined as content because there is a use case in which it is useful to attach fields to the feed entity. Attaching fields isn't possible with config entities.

MegaChriz commented:

@jhodgdon

"I think it is quite important for the Drush help to explain that --file, --stdin, and --url bypass the configured Fetcher, especially for people who may have written their own Fetcher and Parser. It should also explain that if you omit --file and --url, you are using the last stored source."

I've opened #2822679: Improve documentation for options --file, --url and --stdin for the Drush command 'feeds-import'. to address that issue. It would be great if you could give that a review!