I have an existing database and tables that I'd like to "drupalize" (add comments, search, include references in other nodes, create custom views, etc). From what I can tell, I need to create a custom node type to include my tables/fields. If I was starting from scratch, I can clearly see how to do this after reading through the module development handbooks.
However, I expect to be getting a refresh of this database on a regular basis and need to come up with an appropriate strategy for populating the "drupal" nodes and associating them with the rows of the imported database. For example, once a month, I may be getting a new drop of the database, most of the data being the same, albeit with some changes to some of the columns and perhaps with a few more rows in the tables.
Is there a common pattern to doing this in drupal or are there some ways I could implement my module to make this easier?
The obvious answer is to have an administrative "import" command that would simply read my new database tables, compare them to my custom nodes in the drupal DB, and then insert new nodes and update any that have changed - in essence, a big diff/merge function. While obviously doable, it will be slow. I'd have to use some internal APIs to create and update the drupal node table in addition to my own. Also, my tables don't have a node id (nid) column so I figure I can create a join table between the two to make it easier on the import of the records (just a copy).
However, I was wondering if there isn't a better way to "fault in" the drupal nodes as either the records in my database were loaded or queried. It would require some kind of facade since the nodes wouldn't exist until they were loaded. I've done this in the past in other programs (using private data in object references where I had a lot more control of the application architecture) but I'm not sure how I could do this in Drupal.
I also have to force drupal to reindex the search index for those nodes that had changes in the columns. Can this be a byproduct of the update procedures or do I need to make an explicit call into Drupal?
Any advice is greatly appreciated.
Comments
Drupalizing Existing Data
I'd say create a node in parallel with each row of your main table and mark your main table with the node ID. Then create node hooks for viewing/editing/deleting etc. It's a little hack-y, but seems to work fine.
I very recently had to do the same thing. I ended up doing the above and also doing a little hack so my new node type could be extended using CCK fields (I've detailed the hack here: http://www.mikecantelon.com/?q=node/46 ).
-Mike Cantelon
http://mikecantelon.com