I would like to use Feeds CSV Import to update node values after the initial node import.

Example:
If the initial Import looks something like the following..
Nid --> nid (also used as unique identifier)
Title --> title
Body --> body
Val1 --> 101010
Val2 --> ABC

I want to be able to import a feed later that looks like..
Nid --> nid (also used as unique identifier)
Val1 --> 202020
Val2 --> DEFG

I am currently using 'Update existing nodes'. In this case Feeds treats the title and body as NULL and erases the already defined Title, Body, etc. when updating the node.

I'm not sure if there are other cases/reasons for this as the desired method but for my own case I would like to change this. Other than this the module works great and has been great.

Can someone point me in the right direction on how to accomplish this way of updating node values, or where in the code I might look to make this change? It seems like there may just be a simple check that I could run while the information in the CSV is being cross-checked against the database, so that NULL values are treated as a match. Any ideas?

Thanks for the help in advance!!!

Comments

twistor’s picture

After the initial import, remove the mappings to the title and body fields.

chadmkidner’s picture

Thank you for the quick feedback!

If I am understanding correctly, the feed settings under mapping is where I would remove them? If that is the case, would it work to create a clone of the node importer and include only the mappings for the fields that are to be used in the node updates feed? So basically to have 2 importers, one for new items and one for updates?

Thanks again!

twistor’s picture

That should work. But depending on your workflow, you could overwrite the title and body fields when you run the first importer.

chadmkidner’s picture

Status: Active » Closed (fixed)

I have only run some test feeds through, but so far everything seems to be working great! Thanks for pointing me in the right direction with this!

Changing to fixed. I'll be finishing up the template for the actual feeds that will be used in the update importer sometime in the coming week - if I see any kinks afterward I'll post them here.

chadmkidner’s picture

Status: Closed (fixed) » Needs work

Almost there! Unless I missed something, everything updates great except that any taxonomy values are removed.

After running an initial update with the new "Updates" importer (cloned from the "New Item" importer, with all mappings removed except the ones I want to use in our update feeds), these are the results: Title and Body are intact, and other CCK fields, images, UC values, and node reference fields also all remain, leaving the only issue with taxonomy values.

Is there anything particular to the mapping of taxonomy values that may cause them to be cleared while all other values stay intact? (I might also note that array values versus non-array values do not seem to make any difference.)

coreycondardo’s picture

Today I deleted a field (thus dumping all data out of it), recreated the field, imported to specific nodes and my import result was "There are no new nodes." even though my import was only calling out data for that new (and supposedly empty) field.

I did this both with the importer set to "replace existing nodes" and "update existing nodes" neither with any change.

This should work but something must be going wrong. Any ideas?

--Corey

twistor’s picture

Feeds does not look at the fields to see if data has changed. It looks at the feed items. If they have changed, it will update the appropriate entity.

coreycondardo’s picture

So what should I do to get the change to take? Change the mapping name in both the CSV and the mapping screen?

--Corey

twistor’s picture

There is a patch somewhere around here that adds support for forcing updates.

A quick hack: You can add a mapping to something you don't care about, that will force an update for the next run. To force again, remove the mapping. Feeds considers the mappings when comparing if an item has changed.
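To illustrate why adding or removing a mapping forces an update, here is a minimal conceptual sketch in Python. It is not Feeds' actual code; it only assumes, as described above, that the change check is a fingerprint computed from the feed item together with the importer's mappings. The function names, field names, and mapping structure are all hypothetical.

```python
import hashlib
import json


def item_hash(item, mappings):
    """Fingerprint of a feed item combined with the importer's mappings."""
    payload = json.dumps([item, mappings], sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def needs_update(item, mappings, stored_hash):
    """An item is re-imported only when its fingerprint differs from the stored one."""
    return item_hash(item, mappings) != stored_hash


# Hypothetical CSV row and mappings.
item = {"nid": "12", "val1": "202020"}
mappings = [
    {"source": "nid", "target": "nid", "unique": True},
    {"source": "val1", "target": "field_val1"},
]

# Fingerprint saved after the first import.
stored = item_hash(item, mappings)

# Same row, same mappings: nothing to do ("There are no new nodes").
unchanged = needs_update(item, mappings, stored)

# Same row, but a dummy mapping was added: the fingerprint changes.
mappings_with_dummy = mappings + [{"source": "test", "target": "field_test"}]
forced = needs_update(item, mappings_with_dummy, stored)
```

In this model the node's current field values never enter the comparison, which is why deleting and recreating a field does not by itself trigger a re-import.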

coreycondardo’s picture

That makes total sense, I can even create a dummy field and delete it when I'm done with the upload :).

Thanks,
Corey

coreycondardo’s picture

Confirmed that this method worked. If you have imported something, decide you want to do a similar import with nothing changed other than the actual data...

1) Create a dummy field for the content type (ex: test)
2) Add the mapping
3) Run the import

RESULT: it will import and update

If you need to do this again, simply remove the mapping of test and do the import. If you need to do it again, add the field to the mapping.

Rinse - repeat.

Thanks for the help!!

--Corey

chadmkidner’s picture

Unless I'm missing something here, was any of that meant to deal with the original topic issue?? I read through it like 3 times and I'm convinced that there was no correlation?

Soo.. Any ideas on difference between why taxonomy values would become overwritten while all other values remain? (More info from #5)

twistor’s picture

No, I guess that wasn't directly relevant.

Per #5, Are your importers still using 'Update existing...'?

chadmkidner’s picture

Yeah I have both importers set to Update Existing.

It seems that on the secondary importer (the one with fewer values) the taxonomy fields are being replaced with 'null', whereas the other elements, such as Title, Body, UC fields, etc., are all keeping the already stored database value. Specifically, these are the database values stored in the nodes, NOT the values stored in the Feeds tables, as this would be a first run for this secondary importer.

I'd assume this would be the desired functionality, to not overwrite a value unless explicitly given a 'null' or empty value in the csv file.

grahamvalue’s picture

Hi twistor,

This doesn't work for me.

I have one importer that creates new nodes, each with about 40 fields.
I made a second importer by cloning the first but with mappings only for the GUID field and one other field. I need to update that field every day in all the nodes.

But when I run this second importer, it errors out saying that it can't enter '' into one of the other 39 decimal fields.
I haven't even mapped the other fields in this second importer.

How do I update just this one field in all the nodes every day?

Thanks!

PS: Is feeds capable of updating existing nodes with a different importer? In Drupal 6, I had to use feeds_node_multisource to accomplish this.

grahamvalue’s picture

OK, I just confirmed that the same feed can update values in the nodes it created (even if the nodes were created in Drupal 6 and are being updated in Drupal 7).
Just remove mappings for the fields you don't want to update.

But AFAIK a different feed cannot update nodes created by this feed.

cswan’s picture

Thank you very much for help! The #11 method worked for me also. :)

jelo’s picture

Component: Feeds Import » Code

This seems to work well except in combination with pathauto and the url field. I have a content type with existing content in it. Several nodes have custom URL path settings (not using a standard pattern in the pathauto settings). I imported data from an RSS feed into a field in that content type which works fine with the NID as unique target. However, the update triggers a pathauto update to the node URL which removes all my custom URL entries and resets the entries back to the default pattern.

How do I set it up that the import does not trigger pathauto updates? After all, the mapping does not include a map to the url field in the target content type. I believe $node->pathauto_perform_alias or in D7 $node->path['pathauto'] could be used to disable this effect, but I am not entirely sure how to best implement it? I can see 2 [+1] options:

1) Should this be an option on the feed importer settings, i.e. should pathauto be triggered during the import or not?
2) Does feeds provide a hook that I could use to overwrite the default setting in a custom module and add a configuration to not execute pathauto updates for specific importers?
3) A temporary makeshift solution for me was to disable pathauto on that content type as most of my content has been created once and will just keep getting updated. This would obviously not work if the import creates nodes and requires a pattern for the path (unless the pattern is mapped as part of the importer).

A related issue was entered at http://drupal.org/node/1895280

sja1’s picture

@chadmkidner... Did you find a solution to the problem with taxonomy values being set to null? I'm experiencing the exact same behavior. I'm importing a feed with mappings to some but not all of the vocabularies assigned to the node type. Everything works great, except that any terms already belonging to the vocabularies with no mappings get removed from the node upon import.

sja1’s picture

FYI - I filed the following bug report for the problem with taxonomy terms raised in comment #5.

http://drupal.org/node/1997984

jenyum’s picture

OK, I'm stumped.

What's driving me nuts is that I built a site last year using feeds that seems to be able to import new data just fine without overwriting missing fields.

Creating a cloned importer with only the fields I need to update and the guid just creates duplicate nodes with blank data except for the fields I just updated. (I'm importing data from .csv files.)

I don't want to remove all of the mappings because I have a very large number of fields with complicated names. (Blame the labels and lists voter data service.) If I have to add them back again it's another hour or two out of my life.

I have no taxonomy fields or otherwise complicated field types.

Help! I really need this to work. It used to work.

Edited to update:

OK, I calmed down. Thought of a way to test this with less pain, by creating a dummy content type with a couple of fields and testing it out. It does in fact still work if you delete all of the mappings that you don't want to update, and then re-import the data. This must have been what I did last year and I forgot this bit of weirdness.

However, this is still very awkward. In order to accomplish what I'm building this for, I have to import all of the original data, then remove all of the mappings except the field I want to update, the guid ID and the field that shows the guid, and then run the next round of imports. If I need to add another field later (which I will), I will need to delete the last field I mapped from the mappings and map the new field.

The expected behavior when you set "update existing nodes" would be to only update the fields that are actually present in the update file, not to overwrite them with blank data. Otherwise, "update existing nodes" does the same thing as "replace existing nodes" unless you change the field mappings.

Having the option to accomplish this through a cloned importer would be less awkward, except as noted it creates duplicate nodes rather than updating existing ones.

twistor’s picture

Sorry for your troubles.

The difference between Replace existing vs. Updating existing is how Feeds handles un-mapped fields. That's really the only difference. If you map a field, Feeds will control it.
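A rough Python model of that distinction may help. It is a simplified sketch, not the module's implementation: it assumes that "Update existing" starts from the fully loaded node, that "Replace existing" starts from a bare node carrying only its ID, and that every mapped target is written from the source row even when the column is missing. All names here are hypothetical.

```python
def import_item(existing, item, mappings, mode):
    """Apply one feed item to an existing node.

    mode "update": start from the loaded node, so un-mapped fields survive.
    mode "replace": start from a bare node, so un-mapped fields are lost.
    """
    if mode == "update":
        node = dict(existing)
    else:
        node = {"nid": existing["nid"]}
    for source, target in mappings.items():
        # A missing column is indistinguishable from an empty one.
        node[target] = item.get(source, "")
    return node


existing = {"nid": 5, "title": "Widget", "body": "Text", "field_val1": "101010"}
item = {"nid": 5, "val1": "202020"}  # the update CSV has no title/body columns

narrow = {"val1": "field_val1"}
wide = {"title": "title", "body": "body", "val1": "field_val1"}

update_narrow = import_item(existing, item, narrow, "update")
update_wide = import_item(existing, item, wide, "update")
replace_narrow = import_item(existing, item, narrow, "replace")
```

Under these assumptions, `update_narrow` keeps the title and body, `update_wide` blanks them because they are mapped but absent from the row, and `replace_narrow` drops them because they were never loaded.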

goodog’s picture

Issue summary: View changes

I'm looking for confirmation about the question in #15 & #16, about whether different feeds importers can or cannot update the same nodes in D7. Feeds Node Multisource doesn't appear to have been ported to D7.

MegaChriz’s picture

@goodog
If you use a target other than GUID or URL as unique target, you can update the same content using multiple importers. See also Use other field as an unique target in Feeds 7.x-2.x. Alternatively, you could try the patch from #1539224: Add support for unique fields to be unique site wide to use GUID as unique target across multiple importers.

goodog’s picture

Thanks MegaChriz. Do I understand correctly that the solution in Use other field as an unique target in Feeds 7.x-2.x doesn't directly address the fact (aka the problem) that FeedsProcessor's existingEntityId() function uses the importer id of an existing entity to determine whether the existing entity is to be considered? Seems like I'd have no alternative but to consider your second reference: #1539224: Add support for unique fields to be unique site wide.

MegaChriz’s picture

@goodog

Do I understand correctly that the solution in Use other field as an unique target in Feeds 7.x-2.x doesn't directly address the fact (aka the problem) that FeedsProcessor's existingEntityId() function uses the importer id of an existing entity to determine whether the existing entity is to be considered?

No. In option 1 of that solution, the Field validation module is asked to deliver the entity id, and the importer id is not used during that process, so the problem that FeedsProcessor's existingEntityId() function uses the importer id does not exist in that case. You only need to have a unique field (text or numeric) on your content type; you cannot use the targets GUID or URL as a unique target if you want to update items with multiple importers. If that is a problem (for example, you don't want an extra field on your content type that is only needed to be able to update the content), you can try the patch from #1539224: Add support for unique fields to be unique site wide.
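The two lookup strategies discussed here can be contrasted with a small Python sketch. It is only a model of the behaviour described in this thread (GUID lookups scoped to the importer that created the item, unique-field lookups not scoped to any importer); the data layout and function names are hypothetical and do not mirror the Feeds or Field validation code.

```python
# One node, created by a hypothetical "new_items" importer.
nodes = [
    {"nid": 1, "importer": "new_items", "guid": "A-1", "field_sku": "A-1"},
]


def find_by_guid(nodes, importer_id, guid):
    """GUID lookup: only items imported by the same importer are considered."""
    for node in nodes:
        if node["importer"] == importer_id and node["guid"] == guid:
            return node["nid"]
    return None


def find_by_unique_field(nodes, field, value):
    """Unique-field lookup: the importer id plays no part in the match."""
    for node in nodes:
        if node.get(field) == value:
            return node["nid"]
    return None


same_importer = find_by_guid(nodes, "new_items", "A-1")
other_importer = find_by_guid(nodes, "updates", "A-1")
via_field = find_by_unique_field(nodes, "field_sku", "A-1")
```

In this model a second importer matching on GUID finds nothing and would create a new node, which is consistent with the duplicate nodes reported earlier for cloned importers, while a match on a unique field finds the node regardless of which importer created it.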

pal4life’s picture

Version: 6.x-1.x-dev » 7.x-2.0-alpha8
Category: Support request » Bug report

Hello,
I am experiencing the same issue on Drupal 7. There is no difference in behavior between
Replace Existing Nodes
or
Update Existing Nodes

Expected Behavior: I was hoping that when the mappings are set up but if those columns are not provided in a CSV, then update existing nodes would leave those fields alone and not replace them with a null.

Actual Behavior: With the mappings set up and columns not included in CSV, the fields are being replaced with empty in both the Replace Existing Nodes as well as Update Existing Nodes. Thus there is no difference in their behavior for CSV parser for a Node Processor.

Thanks.

MegaChriz’s picture

Category: Bug report » Support request
Status: Needs work » Closed (works as designed)

@pal4life
This is the intended behaviour. When a field target is set up in the mappings, the feed controls that field. This means that if the field is empty in the source, the field will also be emptied on the target. Otherwise you wouldn't be able to empty previously imported values. Unfortunately, this also means that a field will be emptied if it was not provided by the source at all. The problem is that Feeds is unable to make a distinction between non-existent and empty. This behaviour is documented in the following change record:
https://www.drupal.org/node/2301993
See also #1107522-147: Framework for expected behavior when importing empty/blank values + text field fix.

If you don't want field targets to be overwritten, you have three options:

  • Remove the mappings for the fields you don't want to be emptied.
  • Try the experimental Feeds empty module.
  • Implement hook_feeds_presave() or hook_feeds_after_parse() in a custom module. In there, pull in the original field values for the fields you do not want to overwrite.
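
The third option above can be sketched as follows. This is a Python model of the logic only, not Drupal code; in a real site the equivalent would live in a custom module's hook implementation. The function name, the `protected` set, and the field names are assumptions made for illustration.

```python
def restore_missing(original, incoming, item, mappings, protected):
    """Keep the original value of protected targets when the source row
    provides nothing (missing column or empty string) for them."""
    result = dict(incoming)
    for source, target in mappings.items():
        if target in protected and item.get(source) in (None, ""):
            result[target] = original.get(target, "")
    return result


original = {"nid": 5, "title": "Widget", "body": "Text", "field_val1": "101010"}
mappings = {"title": "title", "body": "body", "val1": "field_val1"}
item = {"nid": 5, "val1": "202020"}  # no title/body columns in this CSV

# What the import would save on its own: mapped but absent fields are emptied.
incoming = {"nid": 5, "title": "", "body": "", "field_val1": "202020"}

saved = restore_missing(original, incoming, item, mappings, {"title", "body"})
```

The trade-off of this approach is that a protected field can no longer be emptied through the feed, since an empty source value is always treated as "leave alone".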
MegaChriz’s picture

To my knowledge, the only difference between "Replace existing nodes" and "Update existing nodes" is that with the "Replace" option the entity is not completely loaded when setting values (loading the entity's fields is skipped), while with the "Update" option the entity is fully loaded (including all the fields) first. Therefore, the "Replace" option provides better performance (more noticeable when the entity has a lot of fields and when importing a lot of items). I always use the "Update" option though, as that seems to be more predictable.

mbutelman’s picture

If mappings could be marked as "active" or "inactive" (just like importers or tamper settings), the process would be simpler, without having to recreate the mappings as described in #21.

crawcole’s picture

Agreed with @mbutelman. I realize that this behavior isn't the module's intended one, but it seems like a significant number of people want to use Feeds to update certain fields and not others - and actually expect that this is the default behavior. Being able to mark mappings as active / inactive would be a great compromise for a future release.

grahamvalue’s picture

If it helps, here's a simple workaround:

Save two exported versions of the feed - with and without the required mappings - on your local drive. Import whichever version of the feed you require just before running an update.

It has pretty much the same effect as (and takes less time than) enabling and disabling mappings.