Last updated September 11, 2015. Created on November 4, 2009.
Edited by MegaChriz, clfer, dbolser, twistor.

Introduction

Feeds is designed to address import and aggregation use cases. It provides a UI for creating and managing multiple configurations for importing and aggregating simultaneously.

A single configuration for importing is called an Importer. You can create as many importers as you need. Each importer contains a Fetcher for downloading a feed, a Parser for parsing it and a Processor for "doing stuff" with the parsed result, usually storing it.
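
The division of labour between the three plugin types can be pictured with a small stand-alone sketch. This is not Feeds code: it is written in Python rather than Drupal's PHP, and the names fetch, parse, process and run_importer are made up for illustration. It only shows how each stage hands its result to the next.

```python
import csv
import io


def fetch(source):
    """Fetcher stage: return the raw feed text (here, a string already in memory)."""
    return source


def parse(raw):
    """Parser stage: turn raw CSV text into a list of items (one dict per row)."""
    return list(csv.DictReader(io.StringIO(raw)))


def process(items, storage):
    """Processor stage: 'do stuff' with the items, here simply storing them."""
    storage.extend(items)
    return len(items)


def run_importer(source, storage):
    """Run one import: fetch, then parse, then process."""
    return process(parse(fetch(source)), storage)


stored = []
count = run_importer("title,body\nHello,First post\nBye,Second post\n", stored)
print(count, "items stored; first title:", stored[0]["title"])
```

Swapping one stage (for example, a parser for RSS instead of CSV) leaves the other two untouched, which is the point of configuring fetcher, parser and processor separately.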

Default configurations

Don't forget to enable the "Feeds Admin UI" module!

After installing Feeds and its submodules, go to Administer > Site building > Feed importers (D7: admin/structure/feeds). You will find 4 default importer configurations (5 if the Data module is installed):

  • Feed
    Provided by the submodule "Feeds News".
    Aggregation importer. Aggregates RSS/Atom feeds to nodes. Provides a node type Feed and a node type Feed item. Create one or more "Feed" nodes to add RSS/Atom feeds to your site. On cron, these feeds will continuously produce "Feed item" nodes.
  • OPML import
    Provided by the submodule "Feeds News".
    Import an OPML file and create Feed nodes from its entries. This configuration should be used together with the "Feed" configuration. To use this importer go to http://www.example.com/import.
  • Node import
    Provided by the submodule "Feeds Import".
    Import nodes from a CSV file (http://drupal.org/node/622710#csv). To use this importer go to http://www.example.com/import.
  • User import
    Provided by the submodule "Feeds Import".
    Import users from a CSV file. To use this importer go to http://www.example.com/import.
  • Fast feed
    Only available on the D6 version and if the Data module is installed. Provided by the submodule "Feeds Fast News".
    Similar to the "Feed" configuration, except that "Fast feed" creates simple database records from feed items instead of nodes.

Creating an importer configuration

Of course if the default importers don't fit your use case, you can modify them (click "override"), copy them (click "clone") or you can start from scratch (click "New importer").

Here is a short rundown on how to create your own importer. Copying or modifying an existing one is very similar.

  1. Go to admin/build/feeds (D7: admin/structure/feeds) and click "New importer".
  2. Add a name and a description.
  3. Click "create". You will be taken to the importer's configuration page. From here on, modifying or copying an existing importer and configuring your new importer work essentially the same way.
  4. Go to "Basic settings". Decide whether the importer should be used on a standalone form or by creating a node ("Attached to content type"). Also decide whether the importer should periodically refresh the feed, and at what interval ("Minimum refresh period").
  5. Click "Change" next to "Fetcher" and pick a suitable fetcher for your job. Do the same for "Parser" and "Processor".
  6. Review the settings of each fetcher, parser and processor and adjust them to your job's requirements.
  7. On "Processor", click "Mapping" and define which elements of the feed ("Sources", e.g. the published date of a feed item) should be mapped to which elements of the Drupal entities ("Targets", e.g. a node type's fields). A legend at the bottom of the mapping page explains the available mapping sources and targets. This step is mandatory; if it is omitted, the import produces empty entities.
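
What the mapping step does can be illustrated with a short sketch. It is a conceptual model in Python, not the Feeds API: the source names (title, published, guid), the target names (node_title, created) and the function apply_mapping are hypothetical. It shows why an importer without mappings produces empty entities.

```python
def apply_mapping(item, mapping):
    """Copy each mapped source element of a feed item to its target field.

    `mapping` is a dict of source name -> target name. Sources that are
    not mapped are ignored, so an empty mapping yields an empty entity.
    """
    entity = {}
    for source, target in mapping.items():
        if source in item:
            entity[target] = item[source]
    return entity


# One parsed feed item, keyed by (hypothetical) source names.
item = {"title": "Release notes", "published": "2015-09-11", "guid": "abc-123"}

# Sources on the left, targets on the right.
mapping = {"title": "node_title", "published": "created"}

mapped = apply_mapping(item, mapping)
unmapped = apply_mapping(item, {})

print(mapped)    # the entity carries only the mapped elements
print(unmapped)  # no mappings defined: the entity stays empty
```

Note that the unmapped "guid" source is silently dropped, which is also what to expect from any feed element left out of the mapping.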

Read more in Creating/editing importers

Using your importer

If you have set the importer to run periodically under "Basic settings", cron and the fetcher will take care of running the importer.

If you are doing one-off imports, you need to run the importer by going to example.com/import.

Use the glossary

Confused by the terminology? Take a look at the Feeds glossary to get an overview of the terminology in Feeds.

Requirements and Installation

Install Feeds like any other Drupal module. If you install for the first time, make sure you install the Feeds, Feeds Admin UI and Feeds Defaults modules, all included in the download. Don't forget to configure cron! Feeds also requires PHP's cURL library (http://drupal.org/node/731918), often packaged as php5-curl.

Required modules:

Consult the README.txt file included in the module for details on requirements and installation.

Exportables and default hook

Every importer configuration can be exported. Go to admin/build/feeds (D7: admin/structure/feeds) and click "export". Copy the exported code and paste it into an implementation of hook_feeds_importer_default() in your module.

The export code will populate a variable called $feeds_importer. At the end of the hook, copy $feeds_importer into an export array and return it.

Here is an example:

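/**
 * Implements hook_feeds_importer_default().
 *
 * Default definition of 'myimporter'.
 */
function mymodule_feeds_importer_default() {
  $export = array();

  // Start of the code pasted from the export form.
  $feeds_importer = new stdClass();
  // TRUE makes this default importer disabled initially.
  $feeds_importer->disabled = TRUE;
  $feeds_importer->api_version = 1;
  $feeds_importer->id = 'myimporter';
  $feeds_importer->config = array(
    // ...
  );
  // End of the pasted code.

  // Key the export array by importer id and return it.
  $export['myimporter'] = $feeds_importer;
  return $export;
}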

Then, for this hook to be found, it must be declared by your module.

/**
 * Implements hook_ctools_plugin_api().
 */
function mymodule_ctools_plugin_api($module = '', $api = '') {
  if ($module == "feeds" && $api == "feeds_importer_default") {
    // The current API version is 1.
    return array("version" => 1);
  }
}

Alternatively, you can use Features to export Feeds configuration.

Performance

With the Feeds module, how many feeds can be downloaded, and how frequently?

Unfortunately, this question is impossible to answer globally. Overall, aggregation performance depends on:

  • Your server's CPU and storage I/O performance.
  • Your server's network connection.
  • The content type being created (complex CCK content type? simple Data record?).
  • The activity of your feeds being processed (many new items per run?).
  • The number of feeds being processed.
  • The parser being used (not as critical as other factors).

Usually, as performance degrades, feeds will appear to be stale (no new items are present although the original feed was updated a while ago).

The staleness will increase with the number of feeds you add. A good measure of overall aggregation performance is the time difference between the most recently updated feed and the least recently updated feed:

# my_importer_id is the id of the importer to be examined (can be looked up in feeds_importer table).
SELECT MAX(last) - MIN(last) FROM job_schedule WHERE id = 'my_importer_id';

The result of this query is a time span in seconds. For instance, a result of 3600 would mean that there is 1 hour between the feed that has just been updated and the feed that has not been updated for the longest time of all feeds.
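
To see what the number means, the query can be tried against a throwaway table. The sketch below uses Python with an in-memory SQLite database rather than the site's real database. The table is a cut-down stand-in for job_schedule with only the two columns the query touches, and all timestamps are invented.

```python
import sqlite3

# Cut-down stand-in for the job_schedule table (assumption: only the
# columns used by the query; the real table has more).
# "last" is quoted because some SQL dialects treat it as a keyword.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE job_schedule (id TEXT, "last" INTEGER)')

# Three feeds of the same importer, with invented 'last' timestamps.
rows = [
    ("my_importer_id", 1441965600),  # refreshed most recently
    ("my_importer_id", 1441963800),  # 30 minutes earlier
    ("my_importer_id", 1441962000),  # 60 minutes earlier
    ("other_importer", 1441900000),  # different importer, ignored by WHERE
]
conn.executemany("INSERT INTO job_schedule VALUES (?, ?)", rows)

span = conn.execute(
    'SELECT MAX("last") - MIN("last") FROM job_schedule WHERE id = ?',
    ("my_importer_id",),
).fetchone()[0]

print(f"{span} seconds = {span / 3600:.1f} hours between newest and oldest refresh")
```

With these sample rows the span is one hour, matching the 3600-second case described above.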

To make sure that results are sane, also compare against current time:

# Watch out: UNIX_TIMESTAMP() returns DB's time which may or may not be the same as in PHP. Use date_part('epoch',now()) if you're on pgsql.
SELECT UNIX_TIMESTAMP() - MIN(last) FROM job_schedule WHERE id = 'my_importer_id';
SELECT UNIX_TIMESTAMP() - MAX(last) FROM job_schedule WHERE id = 'my_importer_id';
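
The same sanity check can also be done outside the database, using the clock of the machine the script runs on instead of the database server's. A minimal sketch in Python, assuming the "last" values have already been fetched from job_schedule; the function name refresh_ages and the sample values are made up.

```python
import time


def refresh_ages(last_values, now=None):
    """Return (oldest_age, newest_age) in seconds for a list of 'last' timestamps.

    oldest_age corresponds to now - MIN(last), newest_age to now - MAX(last).
    """
    if not last_values:
        raise ValueError("no job_schedule rows for this importer")
    if now is None:
        now = int(time.time())
    return now - min(last_values), now - max(last_values)


# Invented sample: three feeds, checked 10 minutes after the newest refresh.
last_values = [1441962000, 1441963800, 1441965600]
oldest_age, newest_age = refresh_ages(last_values, now=1441966200)

print("oldest refresh was", oldest_age, "seconds ago")
print("newest refresh was", newest_age, "seconds ago")

# A negative age means a 'last' value lies in the future, which points to
# clock differences between the machines involved.
if newest_age < 0:
    print("warning: clocks disagree")
```

Passing now explicitly keeps the calculation reproducible; leaving it out falls back to the current time.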

Performance: tuning

I experience performance problems: feeds are not updating as often as they should

Here are some options if you experience performance issues with Feeds:

1. Make sure cron runs often enough, for example every 6 minutes.
2. Run cron with drush.
3. Download and install the Drupal Queue module. Be sure to follow its README file closely to set it up correctly *).
4. Alternatively, use Superfeedr (http://superfeedr.com) as a dedicated PubSubHubbub hub (see the Feeds README file).
5. Improve system resources: analyze bottlenecks. Chances are your storage I/O maxes out, as heavy aggregation involves a lot of writes. The exact remedies will depend on your findings but could be one or more of these: tune database settings, move the database to a separate server, add RAM to the database server, or rearchitect to use a lighter storage model such as Data.
6. If you are using MySQL, be aware that by nature most of what Feeds does is update data in the database, so these entries will be captured in your binary log. If you are importing large feeds, this means LOTS of log entries in the binary log file(s). Make sure that you have enough disk space for these logs and don't keep them for longer than you need. See the MySQL Binary Log page for more. If you run out of space on your logging drive, your Drupal site will stop working until you fix it.

*) Drupal Queue moves the actual aggregation work to a process separate from cron.php. Thus it is an ideal way to improve performance if other cron jobs like for instance search are already taxing the system. As queues can be worked off concurrently, aggregation speed can be improved considerably. The danger of concurrent aggregation though is that its resource consumption can peak more aggressively and thus lead to high loads that in turn result in a sluggish server.
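
How much a shorter cron interval helps can be estimated with back-of-the-envelope arithmetic. The sketch below is in Python and rests on a simplifying assumption that is not taken from Feeds itself: every cron run refreshes a fixed number of feeds (feeds_per_run, a hypothetical parameter), one batch after another.

```python
import math


def full_cycle_minutes(total_feeds, feeds_per_run, cron_interval_minutes):
    """Minutes until every feed has been refreshed once.

    Assumes each cron run refreshes exactly `feeds_per_run` feeds.
    """
    runs_needed = math.ceil(total_feeds / feeds_per_run)
    return runs_needed * cron_interval_minutes


# 500 feeds, 50 per run (both numbers invented for the example).
hourly = full_cycle_minutes(500, 50, cron_interval_minutes=60)
six_minutes = full_cycle_minutes(500, 50, cron_interval_minutes=6)

print("cron every 60 minutes:", hourly, "minutes per full cycle")
print("cron every 6 minutes:", six_minutes, "minutes per full cycle")
```

Under this assumption the time for a full cycle scales linearly with the cron interval, so measure the real behaviour with the job_schedule queries above before drawing conclusions.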


Comments

alex_drupal_dev

I have a question about mass importing with Feeds. Auto creation of all the products, prices and SKUs works normally, but I have 3 images for each product that need to be set up as well. I can get all the other information to populate without issue, but I don't get how to add the images in a CSV file. Is there a way I can add the images to a CSV and then have the images auto-populate in Commerce? What I have is 2 album views, side A and side B, and a front cover view. I need all 3 to be added without manually uploading each. I have tried importing a few different ways and I keep getting no images. Adding them manually would take hundreds of hours, so I need to find a shortcut for the images. Any suggestions?

padman

CSV is a plain-text file format, so you won't be able to embed images in it directly. What you can do is place the images in a folder structure, preferably at their final destination, and include the filenames with paths in the CSV file. If you apply a smart naming convention, you may be able to automate generating the CSV.

alex_drupal_dev

So essentially you're saying I should drop the files in the final location and use text-based code to locate the files?
Let me know if this is what you mean:

In the CSV file I put my headers, one of which is an image. In the cells in that column I put a
( img src="url" alt="some_text" ) for each image?

Maybe I am misunderstanding you, please let me know.

padman

Yes, this is what I mean: place the images in the folder you want them to be in and put the URLs in the respective "cells" of the CSV file.

Sorry for not responding sooner. I didn't get an email notification.

Padman

alex_drupal_dev

I have a question about how to use Commerce Feeds to do taxonomy categories. I already mapped it correctly to taxonomy, and I tried with and without auto create. I am trying to get 2 levels of taxonomy hierarchy to go through from my CSV, but I think I may be messing up the formatting.

I have almost 1000 comics, so I really need the feed to do the taxonomy sorting or I will be doing it for a month. They are sorted into taxonomy categories with multiple words; do you think this could be the problem? I didn't know if the category length could be buffing it up or not. Some of the taxonomy categories are short and they didn't go through either, so I am thinking I must be doing something wrong with the formatting. Everything else in the CSV goes through. I do it in 2 passes: in the first pass I do commerce products, in the second pass I do product references. Here is an example of how I set up and formatted my CSV. Everything goes fine except for the taxonomy. My categories are 2 levels. The first level is "Classics"; the second level, within Classics, is "Iron Man".

sku,title,price,description,specifications,image,categories
1,Iron Man 214,1020,Appearance of spider-woman,Near mint condition,images/195.jpeg,""Classics,Iron Man""

My raw csv sheet online reads like this.

5,Batman and Detective comics 620,800,Batman and Detective comics 620,In outstanding condition,images/5.jpeg,"“Classics,Detective Comics”"

I tried to remove the extra " " but it still does not want to go through. I even tried using "Classics/Iron Man", still no go. Any ideas on what I am doing wrong in my CSV format for taxonomy?

alex_drupal_dev

I figured out a way to do single-level taxonomy entries, and I can manually arrange the 2nd level of the hierarchy. That said, I would really like to find a way to do this with multi-level taxonomy in the future. If anyone knows a good way to do nested taxonomy terms in a CSV sheet, that would really help me out.