Hi there,
I want to scrape a news site with Feeds XPath Parser. The site is very basic: it has 10 to 15 titles, each followed by a short block of text.
How can I loop through all the titles and blocks of text and have a node created for each? It also has to check every 30 minutes for new articles and create a new node for each one it finds.
Any help will be greatly appreciated.
//W
Comments
Comment #1
blogook commented
I can't believe that more people aren't having the same issue, or that nobody seems to know how to resolve it.
Settings for XPath HTML parser
context:
.//*[@id='content']
title:
.//*[@class='article-title']/table/tr/td/h1/div/a/text()
The above grabs all the titles from the HTML page, but it creates only one node, titled ARRAY. If I change the title query as follows:
.//*[@class='article-title'][2]/table/tr/td/h1/div/a/text()
It neatly grabs the 2nd title. A workaround would be to create 10 parsers that all do the same thing, each grabbing a different title, but that's not how I would like it to work.
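One possible explanation, offered as an assumption about how the parser works rather than confirmed behaviour: the context query decides how many items are produced, and each field query is then run relative to every context match. A context that matches only the single content element would then give one item whose title query returns many values. The Python sketch below imitates that idea with the standard library only. The markup, the parse_items helper and the shortened queries are all made up for illustration and are not the module's code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for the news page.
PAGE = """
<html><body><div id="content">
  <div class="article-title"><h1><a href="/a">First title</a></h1><p>First teaser</p></div>
  <div class="article-title"><h1><a href="/b">Second title</a></h1><p>Second teaser</p></div>
  <div class="article-title"><h1><a href="/c">Third title</a></h1><p>Third teaser</p></div>
</div></body></html>
"""


def parse_items(root, context, queries):
    """Build one item per context match; run each field query relative to it."""
    items = []
    for node in root.findall(context):
        item = {}
        for field, query in queries.items():
            values = [el.text for el in node.findall(query)]
            # A single match becomes a plain string; several matches stay a list.
            item[field] = values[0] if len(values) == 1 else values
        items.append(item)
    return items


root = ET.fromstring(PAGE)

# Context matches the one wrapper element: a single item with a list of titles.
broad = parse_items(
    root,
    ".//*[@id='content']",
    {"title": ".//*[@class='article-title']/h1/a"},
)

# Context matches each repeating element: one item per article.
narrow = parse_items(
    root,
    ".//*[@class='article-title']",
    {"title": "h1/a", "teaser": "p"},
)

print(len(broad), broad[0]["title"])
print(len(narrow), [item["title"] for item in narrow])
```

If that assumption holds, pointing the context at the repeating element (here .//*[@class='article-title']) and making the title query relative to it should give one node per article instead of one node holding an array.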
So please, if someone knows how to do it, don't be shy and let us know :-) I have searched the Drupal forum and the issues for this module extensively, and googled like a maniac hoping to find a solution, but to no avail :(
Thanks in advance,
W//