You should be able to update existing nodes, not just create new ones. It should just be a matter of adding an existing-id field (the existing nid): if the import finds it, it updates; if it doesn't, it inserts (options could be added as well). Each destination would then need some logic to either insert or update the content at save time. It might be a little more complicated than that, but it would certainly be useful, especially if you wanted to add things like roles to users after importing them with a previous content set.
Let me know your thoughts, and I can code it. I'd probably start with a new destination called roles?
Comments
Comment #1
mikeryan
Yep, I've thought of this. I would visualize another toggle on the Process page, operating like Import except that the main processing query would include rather than exclude rows in the map table. The one problem is remembering where you are (which rows have been updated and which haven't).
Comment #2
frankcarey commented
Yes, true. The fields that make the most sense are either version (if saving as versions) or timestamp (updated).
Comment #3
moshe weitzman commented
Until we have this, I hope folks realize that you can just set nid (or whatever id you are working with) in your prepare hook, and then Drupal will do an update instead of an insert.
Comment #4
mikeryan
When it is implemented, the approach I'm considering is a needs_update boolean on the map tables. "Update" would be an operation that sets needs_update to TRUE for all rows in the corresponding map table. The uber-query for an import process would then include both missing map entries (as now) and map entries where needs_update is TRUE, and the flag would be cleared when the row is imported.
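A rough sketch of the needs_update idea described above, written in Python with an in-memory SQLite database rather than Drupal's own PHP. The table names (source, map), the column names other than needs_update, and the helper functions are all illustrative assumptions, not the module's actual schema or API.

```python
import sqlite3

# Hypothetical schema: one source table and one map table per content set.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (sourceid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE map (
        sourceid     INTEGER PRIMARY KEY,
        destid       INTEGER NOT NULL,
        needs_update INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO source VALUES (1, 'one'), (2, 'two'), (3, 'three');
    -- Rows 1 and 2 were imported earlier; row 3 is new.
    INSERT INTO map (sourceid, destid) VALUES (1, 101), (2, 102);
""")


def rows_to_process(conn):
    """Return source rows that are unmapped or flagged with needs_update."""
    return conn.execute("""
        SELECT s.sourceid, s.title, m.destid
        FROM source s
        LEFT JOIN map m ON m.sourceid = s.sourceid
        WHERE m.sourceid IS NULL OR m.needs_update = 1
        ORDER BY s.sourceid
    """).fetchall()


def mark_all_for_update(conn):
    """The 'Update' operation: flag every already-mapped row."""
    conn.execute("UPDATE map SET needs_update = 1")


def record_import(conn, sourceid, destid):
    """Store the mapping for a processed row and clear its flag."""
    conn.execute(
        "INSERT OR REPLACE INTO map (sourceid, destid, needs_update) "
        "VALUES (?, ?, 0)",
        (sourceid, destid),
    )
```

Because the flag is cleared row by row, an interrupted run can resume where it stopped: only rows still flagged (or still unmapped) come back from the query.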
Comment #5
scotjam commented
Regarding #3,
How can you set the nid in the prepare hook for an update rather than an insert?
Can you kindly provide outline instructions to make this change?
thank you
Comment #6
matason commented
Hello scotjam,
Something like this should work:
Hope that helps!
Comment #7
scotjam commented
Thanks matason,
What if you want to update many nodes? I was thinking there might be a way to automatically recognise which nodes exist and then update those nodes. Or could I make the code assume that all nodes need to be updated by default?
cheers
scotjam
Comment #8
matason commented
Hi scotjam,
I assumed you wanted to update many nodes :)
Currently, afaik, the only way to cause an update instead of an insert is to set $node->nid.
Let me ask, how are you going to maintain a primary key both in your data outside of Drupal and in your data inside Drupal?
One option could be to store the external primary key in a CCK field on initial import. Then, in your migrate_prepare_node function, perform a query to look up the nid of the node that has that external primary key; if the query returns a nid, set your $node->nid to it.
I'd need to know more background on what you're doing and trying to achieve in order to be more specific.
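To make the lookup idea concrete, here is a minimal sketch in Python rather than Drupal PHP. The dictionary stands in for a query against the CCK field that stores the external key; the names external_key_to_nid, find_existing_nid, prepare_node and the external_id column are all hypothetical. It only illustrates the lookup step and says nothing about whether the module will pass an already-imported row to a prepare hook.

```python
from typing import Optional

# Hypothetical stand-in for "SELECT nid ... WHERE <CCK field> = <external key>".
external_key_to_nid = {"SKU-1": 10, "SKU-2": 11}


def find_existing_nid(external_key: str) -> Optional[int]:
    """Return the nid previously imported for this external key, if any."""
    return external_key_to_nid.get(external_key)


def prepare_node(node: dict, row: dict) -> dict:
    """Set nid when the row's external key is already known, so that a
    later save would update the existing node instead of inserting."""
    nid = find_existing_nid(row["external_id"])
    if nid is not None:
        node["nid"] = nid
    return node
```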
Comment #9
mikeryan
Looking back at this issue, I realize there are really two different features we're talking about here:
Renaming this issue to reflect the first idea - I'll create a new issue for the other one.
Comment #10
scotjam commented
Hi matason,
I'd be happy to maintain a primary key and store the external primary key in a CCK field. My particular situation is that I have to create my content outside Drupal. Once the data is up on the site, I regularly need to update the nodes with more info as I find it. Does this help with regards to what I'm looking for?
And the change you've suggested, would that be a one-time change to the code? i.e. could I import new nodes and update existing nodes without switching PHP code?
best wishes
scotjam
Comment #11
mikeryan
See also #601656: Update migrated content in-place (@scotjam, I think this is the feature you're looking for).
Comment #12
mikeryan
Comment #13
scotjam commented
Hi mikeryan,
Where can I read more about #601656?
I've looked at issue #601656 but I can't find a description. I'm guessing these two issues are the same?
If not, which one do you think will eventually be implemented?
cheers
scotjam
Comment #14
mikeryan
The description is brief but it is there.
Both this issue and that one should be implemented - they are two distinct features having to do with updating content.
Comment #15
matason commented
My apologies scotjam, I thought what I suggested would work based on comment #3, but now that I've tested and looked at the code I realise it won't. This is because the id of each row of the incoming data is mapped to the nid created in Drupal on initial import. On subsequent import attempts, the query that gathers the data for import makes sure already imported rows are not returned.
As mikeryan correctly pointed out in #11, this is a new feature request, as the functionality you require doesn't yet exist.
Apologies once again and I'm really sorry if I wasted your time.
Comment #16
scotjam commented
Hi matason,
Totally appreciate your help with this, and thanks for your efforts.
Didn't mean to cause an issue (...pun intended!) sorry!
best wishes
scotjam
Comment #17
DrPhunk commented
Hi everyone,
Just wondering, has a separate request been created for option two in comment #9?
We are in need of a feature like this, namely re-running an import every cron cycle to update existing nodes (products) with updated CCK fields (pricing) from a server-side CSV file.
Has anybody made any more progress on this topic?
Comment #18
dww
@DrPhunk: Didn't you read the rest of the issue? mikeryan clearly posted in #11 that the issue for that is here: #601656: Update migrated content in-place ...
@moshe/mikeryan: Pointing out the hopefully obvious: not everything being imported is a node, so whatever solution is done here needs to not assume nodes. ;) I'm not sure if it can all just be handled with the existing import and prepare hooks -- maybe we just need to pass in a final optional argument for $update_id = NULL, and if it's set, the import and prepare hooks know they're in update mode? Do we want a new hook_migrate_update_[type] hook invoked, instead of reusing the import hook? Part of me thinks an "import" hook with an "update_id" argument is confusing. However, part of me thinks those two functions are going to be mostly the same, so it might be easier to reuse code if they're a single function with an arg... Hrm.
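For illustration, the "single function with an optional update_id argument" option could look roughly like the sketch below. It is Python rather than Drupal PHP, and the Destination class, its import_row method and its in-memory storage are hypothetical; it only shows how insert and update can share one code path without assuming the destination is a node.

```python
import itertools
from typing import Dict, Optional


class Destination:
    """Hypothetical destination type (node, user, ...) with in-memory storage."""

    def __init__(self) -> None:
        self.objects: Dict[int, dict] = {}
        self._ids = itertools.count(1)

    def import_row(self, values: dict, update_id: Optional[int] = None) -> int:
        """Insert a new object, or update an existing one when update_id is set."""
        if update_id is None:
            new_id = next(self._ids)
            self.objects[new_id] = dict(values)
            return new_id
        if update_id not in self.objects:
            raise KeyError(f"no existing object with id {update_id}")
        # Update mode: only the supplied values change.
        self.objects[update_id].update(values)
        return update_id
```

A usage example under the same assumptions: `users = Destination()`, then `uid = users.import_row({"name": "alice"})` inserts, and a later `users.import_row({"roles": ["editor"]}, update_id=uid)` adds roles to the same user.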
Comment #19
mikeryan
Yes, the solution will apply to all destination types.
I plan on implementing this and #601656: Update migrated content in-place in the next week or so.
Comment #20
mikeryan
Finally done. The core destination types (node, user, etc.) now allow using their keys (nid, uid, etc.) as destination fields. When these are mapped, the existing object is loaded and only fields mapped from the source are updated. Note that, since such a content set is intended for updating existing content, clearing it only clears the map/message tables and does not delete the content. Processing such a content set cannot be undone, because there is no record of the original values being updated.
As part of this work, an extension to the migrate_fields hook API was added - putting [square brackets] around a destination field index marks it as the primary key for the destination.
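The behaviour described here can be modelled with a small sketch, again in Python rather than the module's PHP. Only the square-bracket convention comes from the comment above; the function names, the shape of the field list, and the mapping format (destination field to source column) are assumptions made for illustration.

```python
from typing import Dict, Optional


def primary_key_of(dest_fields: Dict[str, str]) -> Optional[str]:
    """Return the field whose index is wrapped in [square brackets], if any."""
    for index in dest_fields:
        if index.startswith("[") and index.endswith("]"):
            return index[1:-1]
    return None


def update_in_place(objects: Dict[int, dict], dest_fields: Dict[str, str],
                    mappings: Dict[str, str], row: dict) -> dict:
    """Load the existing object named by the mapped key and overwrite
    only the destination fields that are mapped from the source row."""
    key = primary_key_of(dest_fields)
    if key is None or key not in mappings:
        raise ValueError("no primary key mapped; this sketch only models updates")
    obj = objects[row[mappings[key]]]
    for dest, src in mappings.items():
        if dest != key:
            obj[dest] = row[src]
    return obj
```

Note that the previous value of each overwritten field is simply lost, which matches the point above that processing such a content set cannot be undone.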