I've tried FeedAPI, but it is too complex for getting a simple job done. I used the Aggregator from core, which is almost perfect, except that it doesn't create nodes. Then I discovered SimpleFeed and it was just right. Nodes, Taxonomy and simple!
Just one strange problem: I get lots and lots of duplicates. I import a couple of feeds of stuff I publish elsewhere. But whether it is twitter.com, soup.io or flickr.com, all sources produce duplicate entries in the end.
I've looked at the sources and stored several versions of the feeds, and the items are identical between versions. Not even the whitespace differs, yet the duplicates keep coming.
Cron is set to run every 15 minutes; I use lynx to fetch cron.php.
I use 4 sources, and they are all processed before the next cron run starts.
All my imported feeds:
http://drupal.xiffy.nl/tumble (this starts with my last.fm feed, the only one without duplicates (yet?)).
My flickr feed, which has duplicates:
http://drupal.xiffy.nl/tumble/flickr
or twitter:
http://drupal.xiffy.nl/tumble/twitter [these URLs won't last, this is my test and build setup]

SimplePie version used: 1.1.1 of 15 March 2008.
Any suggestion on where to look for additional logging would be appreciated. I know my PHP; the cause should be somewhere inside simplefeed_item_feed_parse, where iid gets set to the md5 of the body and title. I'll do some debugging later on and keep you guys posted, but I wanted to share this problem with you as well.
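
To make clear what I expect to happen, here is a rough sketch of a hash-based duplicate check. It is written in Python rather than the module's PHP, and the names item_id, import_items and seen are made up for illustration; this is not SimpleFeed's actual code. It only assumes what is stated above, that the id is an md5 over the title and body.

```python
import hashlib


def item_id(title, body):
    """Hypothetical stand-in for the iid: an md5 hex digest of title + body."""
    return hashlib.md5((title + body).encode("utf-8")).hexdigest()


def import_items(items, seen):
    """Return only the items whose id is not in `seen`, and record their ids.

    `items` is a list of (title, body) tuples; `seen` is a set of ids that
    plays the role of the ids already stored in the database.
    """
    new_items = []
    for title, body in items:
        iid = item_id(title, body)
        if iid in seen:
            continue  # same title and body were imported on an earlier run
        seen.add(iid)
        new_items.append((title, body))
    return new_items
```

With a check like this, fetching a byte-identical feed a second time should import nothing, which is why the duplicates surprise me.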

Comments

xiffy:

Status: Active » Closed (duplicate)

Sorry, I did not look at the bug reports closely enough.
This is a duplicate of #240697: Duplicate Items. Good things are being said on that thread. In my db the iid field is empty, so every check of a feed will produce duplicates. Will try the patches now.
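
A small illustration of why an empty iid column has this effect, again as a Python sketch with hypothetical names (compute_iid, count_new) and made-up sample items, not the module's real code. It assumes each incoming item is looked up by its computed hash among the stored ids.

```python
import hashlib


def compute_iid(title, body):
    """Hypothetical id: md5 hex digest of title + body."""
    return hashlib.md5((title + body).encode("utf-8")).hexdigest()


def count_new(feed, stored_iids):
    """Count the feed items whose computed id is missing from the stored ids."""
    return sum(
        1 for title, body in feed
        if compute_iid(title, body) not in stored_iids
    )


# Made-up sample items, already imported once.
feed = [("photo 1", "first caption"), ("photo 2", "second caption")]

# Ids stored correctly: the second fetch finds every item.
healthy = {compute_iid(title, body) for title, body in feed}

# Ids stored as empty strings: no computed hash can ever match.
broken = {""}
```

With the healthy set, count_new reports no new items on the second fetch. With the broken set, every item counts as new on every cron run, which matches the duplicates I am seeing.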

mbx:

Heh... had to laugh at the status your bug report ended up being assigned! ;o)>