Hi,

Each item in my feed has an ID, which I store in a CCK text field. Because it is unique per item, it could be used to check for duplicates.

Which is the better place to implement this uniqueness check: the parser side, FeedAPI, or maybe FeedAPI node?

I also posted this as a feature request for FeedAPI, to be able to specify a primary key per feed to avoid duplicates.

Cheers,
Gyuri

Comment | File | Size | Author
#3 | csv_parser_unique.patch | 1.98 KB | burgs

Comments

parrottvision

subscribe

giorgio79

Here is an intermediary solution:
http://drupal.org/project/unique_field

burgs

Status: Active » Needs review
File: csv_parser_unique.patch (new, 1.98 KB)

Here is a patch to solve this issue.
It adds another textbox to the feed source node, in the same section as the timestamp, title and description fields. Enter the heading of the column that you want to check uniqueness on, and that column will be used to decide which item to update (rather than the md5 hash of the row, which was previously the rule). I'm not sure how this will apply to the CSV version that scraps those title/description textboxes in favour of mapper elements, but it should still work with those changes.

parrottvision

The patch from Burgs in #3 worked perfectly. I have tried it on 5 different CSV feeds, and each one updates existing nodes when there are changes rather than creating new ones. All you need is a unique field in your CSV file that remains constant. I have managed up to 600 nodes in a single file, and I'm not sure my server will let me do much more. This is awesome!
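For readers following along, here is a minimal sketch of the update-rather-than-create behaviour described above. It is not the code from the patch: the function name `example_import_rows()`, the in-memory `$store` array standing in for saved nodes, and the `sku` column are all hypothetical, chosen only to show how keying on one constant column avoids duplicates.

```php
<?php
// Hypothetical sketch, not the patch itself: import parsed CSV rows into an
// in-memory store keyed by the value of one unique column.
function example_import_rows(array $rows, $unique_column, array &$store) {
  $stats = array('created' => 0, 'updated' => 0, 'skipped' => 0);
  foreach ($rows as $row) {
    // Rows without a usable key cannot be matched, so they are skipped here.
    if (!isset($row[$unique_column]) || $row[$unique_column] === '') {
      $stats['skipped']++;
      continue;
    }
    $key = (string) $row[$unique_column];
    if (isset($store[$key])) {
      $stats['updated']++;   // Same key seen before: update in place.
    }
    else {
      $stats['created']++;   // New key: create a new entry.
    }
    $store[$key] = $row;
  }
  return $stats;
}

$store = array();

// First import: two new rows, so two entries are created.
$first = example_import_rows(array(
  array('sku' => 'A1', 'title' => 'Widget', 'price' => '9.99'),
  array('sku' => 'B2', 'title' => 'Gadget', 'price' => '4.50'),
), 'sku', $store);

// Second import: A1 has a new price (update), C3 is new (create).
$second = example_import_rows(array(
  array('sku' => 'A1', 'title' => 'Widget', 'price' => '10.49'),
  array('sku' => 'C3', 'title' => 'Gizmo', 'price' => '2.00'),
), 'sku', $store);
```

After both runs the store holds three entries, and the `A1` entry carries the new price instead of appearing twice.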

I would love to see this get added into the module. It works.

parrottvision

Status: Needs review » Reviewed & tested by the community

Tested patch. All works.

alex_b

Does the patch apply to the dev version?

burgs

Do you mean the CVS (HEAD) version? If so, I'm not sure, but the CVS version doesn't currently work for me.
The patch applies to the 6.x-1.0-alpha1 version.

alex_b

Status: Reviewed & tested by the community » Closed (won't fix)

#7: yes.

I know that the CVS version is different from the head version as it does not include built-in mapping, but I can only accept patches that apply to HEAD.

HEAD does not include built-in mapping, because mapping is supposed to happen on the Feed Element Mapper level. This is possible now, as title and body field mapping was recently added to Feed Element Mapper.

Thinking about this now, the GUID mapping should also happen on the Feed Element Mapper level.

Based on these assumptions, I am moving this issue to 'won't fix'. Please change that if I'm going wrong here.

If you'd like to create a GUID mapper, take a look at the how-to on Feed Element Mapper's project page.

burgs

Thanks Alex,

Basically, what I'm hoping to achieve is to change this line:

$item->options->guid = hash('md5', serialize($row));

so that $item->options->guid holds a unique ID instead of a hash value. The hash currently prevents duplicates of identical rows, but not of rows that have changed slightly.
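To illustrate the difference, here is a minimal sketch assuming a parsed row is an associative array keyed by column heading. The helper `example_row_guid()` and the `id` column are hypothetical, not part of FeedAPI or the CSV parser; only the `hash('md5', serialize($row))` fallback comes from the line quoted above.

```php
<?php
// Hypothetical helper: choose a GUID for a parsed CSV row.
// If a unique column is named and present, use its value; otherwise fall
// back to the md5 hash of the whole row, as in the line quoted above.
function example_row_guid(array $row, $unique_column = NULL) {
  if ($unique_column !== NULL && isset($row[$unique_column]) && $row[$unique_column] !== '') {
    return (string) $row[$unique_column];
  }
  return hash('md5', serialize($row));
}

$original = array('id' => '42', 'title' => 'Widget', 'price' => '9.99');
$changed  = array('id' => '42', 'title' => 'Widget', 'price' => '10.49');

// Hash-based GUIDs differ once any field changes, so the changed row
// would be treated as a new item.
$hash_matches = example_row_guid($original) === example_row_guid($changed);

// Column-based GUIDs stay the same, so the changed row can be matched
// to the existing item and updated.
$id_matches = example_row_guid($original, 'id') === example_row_guid($changed, 'id');

var_dump($hash_matches); // bool(false)
var_dump($id_matches);   // bool(true)
```

Falling back to the hash when no column is configured keeps the existing behaviour for feeds that have no stable ID.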

If I can bring this functionality in with a custom mapper via Feed Element Mapper, then great, I'll go for it. But I just want to check with you: can I override the hash value that is currently hard-coded?

Thanks,
Simon

parrottvision

After using Burgs' patch, it just seems fitting that the unique node reference is set in Parser CSV. I might be wrong, but there is simplicity in having the title field, body field and unique node field together in the parser settings on the feed source CCK node.

If you feel it fits better in the mapper, would it be as easy to implement there? The patch works, and I have been using it for a few weeks.

alex_b

"If i can bring this functionality in with a custom mapper via Feed Element Mapper then great, i'll go for it. But just checking with you that i can override the hash value that is hard-coded to go in?"

You can.

alex_b

#10: there is simplicity, but also duplicity :)

I'm trying to keep the ducklings that are the FeedAPI family in a row. While Feed Element Mapper could use a better UI and abstraction from nodeapi, it is the place to do mappings. In order to create solid and sustainable solutions, I'm trying to keep things where they belong...

alex_b

I posted a task on Feed Element Mapper for this issue: #368561: Mapper for FeedAPI URL/GUID

parrottvision

Thanks Alex, I appreciate what you are saying, and you are right: it is a bit short-sighted of me. I guess having a quick fix that worked was appealing, but I understand that ideally it should be in its 'proper' place. Great work.

burgs

Thanks Alex for #11,

I'll get onto it soon.