I'm facing a rather tricky challenge that I haven't been able to solve yet. This is for a client project where we need to import a lot of data (30,000+ rows) about hotels (e.g., location, description, price).
The initial import is not the problem. I can do that quite easily using the Node Import module.
The problem is that this data is updated several times a year by a data provider, who simply sends us a new .csv file with all of the latest information. There is no way of telling which rows have changed, which are new, or which have been deleted.
To compound things, the hotels do not have unique ID numbers (gasp, I know).
On the website, each hotel has ratings, reviews, and comments associated with it.
Does anyone know of a solution where I can do the initial import and then periodically run an update that changes existing rows and adds new ones, all while keeping peripheral info like comments intact?
So far I have tried Node Import, which works fine for the initial import. I've investigated Node Import Update (http://drupal.org/project/node_import_update) but in my tests it appears to do nothing.
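To make the requirement concrete, here is a rough sketch of the kind of diff I am after, written in Python rather than as Drupal code. It is only an illustration under assumptions: the column names `name` and `city` are hypothetical, and I have not verified that any combination of fields in the provider's file is actually unique. The idea is to build a stand-in key from normalized fields, then compare the previous file with the new one.

```python
import csv
import io

# Hypothetical columns used as a stand-in for the missing unique id.
KEY_FIELDS = ("name", "city")


def natural_key(row):
    """Build a key from the chosen fields, ignoring case and extra spaces."""
    return "|".join(" ".join(row[f].lower().split()) for f in KEY_FIELDS)


def load(csv_text):
    """Parse CSV text into a dict mapping natural key -> row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {natural_key(row): row for row in reader}


def diff_exports(old_csv, new_csv):
    """Classify rows of the new file relative to the previous one."""
    old, new = load(old_csv), load(new_csv)
    return {
        "added": sorted(k for k in new if k not in old),
        "removed": sorted(k for k in old if k not in new),
        "changed": sorted(k for k in new if k in old and new[k] != old[k]),
    }


if __name__ == "__main__":
    previous = "name,city,price\nGrand Hotel,Paris,100\nSea View,Nice,80\n"
    latest = "name,city,price\ngrand  hotel,Paris,120\nHill Inn,Lyon,60\n"
    print(diff_exports(previous, latest))
```

If something like this key were stored alongside each hotel node, "changed" rows could update the existing node in place (so its comments and ratings stay attached) and only "added" rows would create new nodes. Whether any existing module can do this is exactly what I am asking.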