I am using Drupal 7.12 and the latest release of Feeds Tamper. I am using a standard setup for feeds fetcher, parser and processor using Feeds Import 7.x-2.0-alpha4. I've set up a mapping for a GUID field where I have configured the Feeds Tamper to calculate a hash on this field. My understanding is that the hash plugin will serialize the record in the feeds upload and save that as a unique identifier for this record. However, it is only updating a single record regardless of the content, not creating a new node. If I ADD my own unique identifier then it will create a new record, but I thought the point of the hash plugin was to create the unique identifier. This is a great idea but I think I must be doing something wrong or missing a key point/setting. Any help would be appreciated. Thanks.

Comments

sassafrass’s picture

I realize now that the hash is being created on only one field. I thought the hash was being created on the entire record. It would be great if you could select more than one field to hash, since more than one field is often required to uniquely identify a record in a database. Working on this...

sassafrass’s picture

I tried changing:

function feeds_tamper_hash_callback($result, $item_key, $element_key, &$field, $settings) {
  // Only write a hash when overwriting is enabled or the field is empty.
  if ($settings['overwrite'] || !trim($field)) {
    $field = md5(serialize($result->items[$item_key]));
  }
}

to

function feeds_tamper_hash_callback($result, $item_key, $element_key, &$field, $settings) {
  if ($settings['overwrite'] || !trim($field)) {
    // Changed line: hashes the whole items array instead of a single item.
    $field = md5(serialize($result->items));
  }
}

but that didn't seem to have any effect. Is $item_key the field that's being hashed? Is $field the variable that saves the hash in the GUID target field? If anyone can lead me in the right direction, I'd really appreciate it.

jeffschuler’s picture

Category: support » bug

The hash should be created over all fields in the source. Then you'd store that in a field you define as GUID... That's how my original patch worked and it sounds like that's how you're expecting things to work.

Glancing at what got committed, though, I wonder, too, why

    $field = md5(serialize($result->items[$item_key]));

isn't

    $field = md5(serialize($result->items));

Are you sure that didn't work?

Note also that there's a feature request for being able to select a subset of fields: #1297650: hash over only the chosen sources.
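For illustration, hashing over a chosen subset of sources could look roughly like the sketch below. It is not the code from #1297650 or from Feeds Tamper; the helper name example_hash_selected_fields() and the sample field names are made up for this example.

```php
<?php
// Hypothetical helper, not part of Feeds Tamper: hash only selected fields.
function example_hash_selected_fields(array $item, array $keys) {
  $subset = array();
  foreach ($keys as $key) {
    // A missing field is recorded as NULL so the hash input keeps a fixed shape.
    $subset[$key] = isset($item[$key]) ? $item[$key] : NULL;
  }
  return md5(serialize($subset));
}

// Made-up sample items: $a and $b differ only in a field that is not hashed.
$a = array('title' => 'Paper', 'year' => '2001', 'notes' => 'first import');
$b = array('title' => 'Paper', 'year' => '2001', 'notes' => 'edited later');
$c = array('title' => 'Paper', 'year' => '2002', 'notes' => 'first import');

$keys = array('title', 'year');
// Same title and year, so the hashes match.
var_dump(example_hash_selected_fields($a, $keys) === example_hash_selected_fields($b, $keys));
// Different year, so the hashes differ.
var_dump(example_hash_selected_fields($a, $keys) === example_hash_selected_fields($c, $keys));
```

With a hash like this mapped to a unique target, two rows would count as the same record only when all of the chosen fields match.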

sassafrass’s picture

I think what might be happening is that the Feeds module is overwriting the hash with the field that maps to the GUID. For example, I have the source field from my upload file, Full Reference Information, mapped to GUID. I select this as a unique identifier. Without Feeds Tamper, this field would be my GUID. With Feeds Tamper, I set this field to calculate a hash. But it only treats the record as unique if the Full Reference Information field is unique, not if the whole record is unique. It doesn't seem to matter whether I use $field = md5(serialize($result->items[$item_key])); or $field = md5(serialize($result->items));

Here are my settings:

Processor Settings: Do not update existing nodes
Mapping settings: Source: Full Reference Information, Target: GUID, Unique Target selected
Feeds Tamper: Enable hash plugin on Full Reference Information -> Full Reference Information, GUID

sassafrass’s picture

Another thought: maybe the Feeds module checks whether a field is unique before Feeds Tamper creates the hash for the node.

If so, I think a patch along the lines of this thread might be needed.
http://drupal.org/node/661606#comment-5648628

sassafrass’s picture

I confirmed that a hash of the GUID is NOT being saved in the database table feeds_item. Only the original unhashed single field mapping is being saved.

sassafrass’s picture

I now realize that NONE of the Feeds Tamper plugins are working!? In the feeds_tamper.module, line 54: if (isset($item[$element_key])) is always false and none of the plugins ever get called. Any idea why that might happen?
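As a general PHP note, isset() on an array element returns false when the key is absent (array keys are case-sensitive) and also when the key exists but holds NULL. The sketch below only demonstrates that behaviour. The item and its keys are made up, and the thread does not confirm that either case is what happens here.

```php
<?php
// Made-up item; the keys and values are assumptions for illustration only.
$item = array(
  'full reference information' => 'Smith 2001',
  'abstract' => NULL,
);

// Key differs in case from the one stored in the array: isset() is false.
var_dump(isset($item['Full Reference Information']));
// Key exists but its value is NULL: isset() is still false.
var_dump(isset($item['abstract']));
// array_key_exists() reports the key even when the value is NULL.
var_dump(array_key_exists('abstract', $item));
// Exact key with a non-NULL value: isset() is true.
var_dump(isset($item['full reference information']));
```

Dumping the keys of one parsed item next to the configured source name would show whether a mismatch of this kind is involved.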

emptyvoid’s picture

Well 3 months of silence is probably your answer. I am not the maintainer for this module but I'll take a crack at looking at it as I'm using it for a project I'm doing.

twistor’s picture

Well first off,
$field = md5(serialize($result->items[$item_key]));
is doing exactly what you would expect. It's making a hash of the entire feed item. $result->items is the whole feed.
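The difference between the two expressions can be seen in a small standalone sketch. The $result object here is a plain stdClass stand-in with made-up items, not a real Feeds parser result.

```php
<?php
// Stand-in for the parser result; a real Feeds result carries more than this.
$result = new stdClass();
$result->items = array(
  array('title' => 'First', 'body' => 'aaa'),
  array('title' => 'Second', 'body' => 'bbb'),
);

$per_item = array();
$whole_feed = array();
foreach ($result->items as $item_key => $item) {
  // Hash of one feed item: differs whenever the item's content differs.
  $per_item[$item_key] = md5(serialize($result->items[$item_key]));
  // Hash of the whole feed: identical for every item in the same import.
  $whole_feed[$item_key] = md5(serialize($result->items));
}

var_dump($per_item[0] === $per_item[1]);     // bool(false)
var_dump($whole_feed[0] === $whole_feed[1]); // bool(true)
```

Hashing $result->items would therefore give every item in an import the same value, which is of no use as a per-item unique ID.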

@sassafrass, Feeds Tamper doesn't save anything anywhere. If you want to use the hash as a unique id, then you have to map it to a unique field, GUID for instance, and configure it to be unique.

As for the plugins not working, it sounds like there's a problem with your feed. https://drupal.org/node/1515316#comment-7853155 will solve some problems when items don't exist in the feed.

twistor’s picture

Title: Feeds Tamper Hash Plugin not creating new nodes » Add option to hash plugin to hash multiple fields.
Version: 7.x-1.0-beta3 » 7.x-1.x-dev
Category: Bug report » Feature request
MegaChriz’s picture

Status: Active » Closed (duplicate)

This looks to be a duplicate of #1297650: hash over only the chosen sources, which has several patches.