I've followed the instructions over and over, but I can't get this to work. Nothing is showing up the same way as the examples because all of the examples are for Drupal 6. Has anyone been able to get this to work in one of the 7.x versions?

One point of confusion is where do I put the feed processor node tags?
Example:
context: //item
title: title
body: description
url: src/@url
field_salary: salary/[@ccy="usd"]

I'm not seeing anywhere to put these tags.

If anyone can help, that would be great.

Comments

jasonglisson’s picture

Version: 7.x-1.0-alpha1 » 7.x-1.x-dev
F.G’s picture

Are you talking about this example:
http://drupal.org/node/919448

I'm in the same situation! This is my first install of this module, and my first time on D7!

Regards,
F.

aenw’s picture

Does this tutorial work in v. 7.x? Yup.

1. Very specifically: this tutorial *almost* works for me as written. I needed to change the XPath expression for the salary field a bit. The tutorial says to map the salary field to salary/[@ccy="usd"], but that throws an error for me ("There was an error with the XPath selector: Invalid expression"). If I take out the "/" so it reads salary[@ccy="usd"], then it works fine for me.

    Here's my configuration:
  • Drupal v. 7.7
  • Feeds 7.x-2.0-alpha4
  • Feeds XPath parser 7.x-1.0-beta2
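To see why the corrected predicate matches, here's a minimal sketch using Python's xml.etree.ElementTree against a hypothetical feed item (the module itself evaluates XPath in PHP, so this only illustrates the expression, not the module's code):

```python
# Hypothetical feed item modeled on the tutorial's field names; only the
# expression salary[@ccy="usd"] is taken from the tutorial itself.
import xml.etree.ElementTree as ET

item = ET.fromstring(
    '<item>'
    '  <salary ccy="usd">50000</salary>'
    '  <salary ccy="eur">45000</salary>'
    '</item>'
)

# salary[@ccy="usd"] selects <salary> children whose ccy attribute is "usd".
usd = item.findall('salary[@ccy="usd"]')
print([e.text for e in usd])  # only the USD salary is selected
```

The extra "/" in salary/[@ccy="usd"] makes the predicate a location step of its own, which XPath engines reject as an invalid expression.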

2. Much more generally...
There have also been some comments on the tutorial page showing that others are having trouble getting this to work, too. That's a pretty good indicator that we need to add information/documentation so things are clarified. I'm willing to help out with this; I think this issue is a great place to start. Plus I'm just getting into all of this, so it's a great way for me to make sure I really understand Feeds. (That's also why this reply is so delayed.)

@F.G., here's the short answer to

where do I put the feed processor node tags?

You created a content type named "xml job feed". You now need to create a node of that content type, either by going to Create Content > xml job feed, or by going to the list of Feeds Importers, finding the row for your "Jobs XML Feeds", and clicking on the "xml job feed" link in the "Attached to:" column.

When an "xml job feed" node is created manually (as opposed to being created automatically when a feed import is scheduled via cron, etc.), you can enter something like the date and time you're running this feed as the title. (You can ignore the body.) Scroll down and you'll see that Feeds has put in a fieldgroup. You'll see your Feed Fetcher (either an HTTP Fetcher or a File Uploader, depending on what you selected). Below that you should see a fieldgroup for XPath Parser Settings (the Parser you selected). When you expand that fieldgroup, you'll see fields for the context and for each of the fields that you mapped as an XPath.
Enter the XPath expressions (you may need to make the change I noted above to salary). Check all of the debug boxes so you can see any messages generated. Save your changes.
Then run Import on your "xml job feed". If all is well, you will import the feed and create two nodes.

As for why you put the XPath expressions into an "xml job feed" node... here's the much longer answer:

To review, here's an overview of the whole process:

A Feeds Importer is a factory: it takes in raw material (information from a feed) at one end, does three processing steps, and then spits out some shiny new nodes that are populated with the data from the feed at the other end.

(OK, sometimes it alters existing nodes. But not in this tutorial.)

Specifically, the three processing steps that a Feeds Importer does are:

  1. Fetch raw data from a source and hand it to the Parser for the next step:
    • (v 7.x:) [raw data in a feed] ==> is an input to ==> a Feeds Fetcher ==> which turns this into a [FeedsFetcherResult]
  2. Parse the data we now have into nicely organized pieces, then hand this to the Processor for the next step:
    • (v 7.x:) [feed data we sucked in and turned into a FeedsFetcherResult] ==> is an input to ==> a Feeds Parser ==> outputs a [FeedsParserResult]
  3. Process all of those nicely organized pieces and put them into shiny new nodes (or existing nodes), using a Map that tells us what piece of data goes where, and -- in this tutorial -- using XPath Query expressions to give us more mapping info and rules about how to pick through or change the data before we put it into our shiny new nodes:
    • (v 7.x:) [ FeedsParserResult] ==> is an input to ==> a Feeds Processor ==> uses mapping info and XPath expressions ==> outputs shiny new nodes!

So each time we read in a feed, we start with the raw feed input, have our Feed Importer factory run it through the three processing steps, and then produce (or alter) nodes.
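For readers who think in code, the three steps above can be sketched like this (hypothetical Python stand-ins for illustration only; the real module implements these stages as PHP plugins, and none of these names are the module's actual API):

```python
# A toy fetch -> parse -> process pipeline. Names and structure are
# illustrative assumptions, not the Feeds module's actual API.
import xml.etree.ElementTree as ET

def fetch(raw_xml):
    # Step 1: the Fetcher obtains the raw feed data
    # (stands in for a FeedsFetcherResult).
    return raw_xml

def parse(fetched, context, mappings):
    # Step 2: the Parser applies the context XPath, then each field XPath,
    # producing organized rows (stands in for a FeedsParserResult).
    root = ET.fromstring(fetched)
    return [
        {field: item.findtext(path) for field, path in mappings.items()}
        for item in root.findall('.' + context)  # '//item' -> './/item'
    ]

def process(rows):
    # Step 3: the Processor turns each parsed row into a new "node".
    return [dict(row, type='job') for row in rows]

raw = ('<feed><item><title>Job A</title><description>Desc A</description>'
       '</item></feed>')
nodes = process(parse(fetch(raw), '//item', {'title': 'title',
                                             'body': 'description'}))
print(nodes)  # one new "node" per <item> in the feed
```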

We need a way to keep track of things when we run this process. We need to know exactly when the process was run (what day & time), and what the results were. And of course we need to keep track of the details: Exactly what was the data source we used when we ran this process? Did we have any special processing rules or information? What was the result? Were there any errors? Did we successfully create (or alter) nodes? (Imagine that your site is reading in a newsfeed every hour and generating dozens of nodes with news each time. Or that you've used feeds to import thousands of nodes from various data sources. Ya gotta keep track of all of that stuff.)
This is where the Feed Processor Node comes into the picture. You need to create a content type so you can record the specific data source, any specific processing information, and the results each time you process a feed. That is: each time you press the 'Import' button, and/or each time a scheduled job runs to import a feed, you have to keep track of what data you used, what you did, and what happened.
(I'm not yet clear on why you have to create a content type to do this, but that's how it is.)
In this tutorial, we create a content type called xml job feed -- and this is where you enter the XPath expressions.

So finally, here's what we have for this tutorial:

(Note that when you see notation like "Fetcher > Change:" below, that's my shorthand for what you select on the user interface, e.g. for the Fetcher, click on Change.)

  • the content type (the type of nodes) we want to produce from this feed: job
    Fields to add to the job content type:
    • url: a text field
    • salary: an integer

    (You could also add a field for the 'publish date' but it's not covered in this tutorial.)

  • the Feed Importer we're going to create to import this feed and produce the nodes: Jobs XML Feeds
    The specific settings and choices for our Jobs XML Feeds (a Feed Importer):
    • tell it what to use to keep track of the whole process when it runs = Attached to: xml job feed
    • what kind of Fetcher can get to our feed (data) for us? = Fetcher > Change: select either the HTTP Fetcher (if you're going to get the example data from a real server) or a File Upload (save the .xml data file for this tutorial as a file, then use the File Upload Fetcher to read the file in)
    • how do we need to Parse this data? = Parse > Change: XPath XML Parser -- this Parser knows how to deal with a feed that is just XML data (it's not in RSS format, it's not in Atom format, it's not REST, etc.), and we can then use XPath expressions to pick and choose data and even transform it. (Plus XPath XML Parser is the point of this tutorial)
    • Once we have the data organized into nice pieces, what Processor can create (or alter) the items we are ultimately interested in creating (or altering)? = Processor > Change to: Node Processor (because we want to create nodes)
  • the content type that we'll create for our Feed Processor Node so we can keep track of the input info, any specific configuration or processing info, and results when we import data from this feed and process it: xml job feed


Whew.

Now we have the Feed Importer (Jobs XML Feeds) configured and set up, and we have a content type created (xml job feed) so we can keep track of things each time we import data from our feed.

But now we're right at the crux of @F.G's question: in the Node Processor Mappings we said we would handle fields with XPath. Now we have to specify exactly what to do with those XPath fields.

We're going to create a new "xml job feed" node and give it information so that we can manually import this feed. Create the new node either by going to Create Content > xml job feed, or by going to the list of Feeds Importers, finding the row for your "Jobs XML Feeds", and clicking on the "xml job feed" link in the "Attached to:" column.

When an "xml job feed" node is created manually (as opposed to being created automatically when a feed import is scheduled via cron, etc.), you can enter something like the date and time you're running this feed as the title. (You can ignore the body.) Scroll down and you'll see that Feeds has put in a fieldgroup. You'll see your Feed Fetcher (either an HTTP Fetcher or a File Uploader, depending on what you selected). Below that you should see a fieldgroup for XPath Parser Settings (the Parser you selected). When you expand that fieldgroup, you'll see fields for the context and for each of the fields that you mapped as an XPath.

Here are the XPath expressions to enter and a note about what's going on with each:

  • Context: //item
    Tells the processor that this (the <item> element) is the starting place for the XPath queries that follow.
  • title: title
    Maps the value in the <title></title> tags in the feed to the "title" field of a job node.
  • body: description
    Maps the value in the <description></description> tags in the feed to the "body" field of a job node.
  • url: src/@url
    Looks for the <src> tag and maps the value of its "url" attribute in the feed to the "url" field of a job node.
  • field_salary: salary[@ccy="usd"]
    Looks for the <salary> tag, and if the value of its "ccy" attribute equals "usd", maps the value in that <salary></salary> tag in the feed to the "field_salary" field of a job node. (Note that this is the change I had to make for it to work on my system: I removed the "/".)
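If it helps to see these mappings evaluated, here's a small check using Python's ElementTree against a hypothetical feed item. (ElementTree can't select an attribute with a path like src/@url, so the sketch reads the attribute with .get() instead; the module's PHP XPath engine handles src/@url directly.)

```python
# Hypothetical <item> built to match the tutorial's mappings; the element
# and attribute names mirror the XPath expressions in the table above.
import xml.etree.ElementTree as ET

item = ET.fromstring(
    '<item>'
    '<title>Widget wrangler</title>'
    '<description>Wrangle widgets daily.</description>'
    '<src url="http://example.com/jobs/1"/>'
    '<salary ccy="usd">50000</salary>'
    '</item>'
)

print(item.findtext('title'))               # -> node title
print(item.findtext('description'))         # -> node body
print(item.find('src').get('url'))          # src/@url -> node url field
print(item.findtext('salary[@ccy="usd"]'))  # -> node field_salary
```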



Check all of the debug boxes so you can see any messages generated. Save your changes.

Then run Import on your "xml job feed". If all is well, you will import the feed and create two nodes.


Yes, this did turn into quite the document. But I hope it's helpful. Pipe up if you think we should polish this and add it to the documentation.

Anything not clear? Questions? Something not working?

- aenw
(now out of words for the day)

_petja’s picture

Version: 7.x-1.0-beta2 » 7.x-1.x-dev
Component: Code » Documentation

I keep getting this error when importing:
An AJAX HTTP error occurred. HTTP Result Code: 500 Debugging information follows. Path: /batch?id=198&op=do StatusText: Internal Server Error ResponseText:

No matter what I try.

Any ideas?

dafreak’s picture

Version: 7.x-1.x-dev » 7.x-1.0-beta2
Component: Documentation » Code

When I try to use XPath HTML Parser or XPath XML parser and I go to XPath parser settings, there are no settings options. Something is missing here.

dafreak

aenw’s picture

Title: Does this work in Drupal 7?? » Does this tutorial work in Drupal 7??

#4 @_petja: You should probably file a new issue. (And perhaps provide some specific information about your configuration: Drupal and PHP versions, what browser, etc.)

#5 @dafreak: This issue is really about the example tutorial. If something is broken with the project code, then you'll probably want to file a new issue. And then in your new issue you can set the component to 'Code' and the version to the specific version you have; this issue is about the documentation.

But first I'll try to help you figure out what's going on.
What are the versions you're running for: Drupal, Feeds, and Feeds XPath Parser?
And at what step in the tutorial do you run into a problem?

twistor’s picture

Issue summary: View changes
Status: Active » Closed (fixed)

Closing out old issues. If you're still having this problem, feel free to re-open it.