Closed (fixed)
Project:
Drush extras
Version:
6.x-2.0-alpha1
Component:
Code
Priority:
Normal
Category:
Feature request
Assigned:
Unassigned
Reporter:
Created:
31 Mar 2010 at 21:57 UTC
Updated:
27 Jun 2011 at 21:02 UTC
Jump to comment: Most recent file
Comments
Comment #1
moshe weitzman commentedCode looks good.
So, this appears to dump each table to a file and then md5 the contents. Fair enough. Could you elaborate on how you use this during dev or deployment?
Comment #2
greg.1.anderson commentedSome co-workers of mine in another division have put up a new Drupal website for their Beta Labs program, and of course they want to make code changes on their dev site and content changes on their live site. The question posed to me was, "how do we know which tables to put into --structure-tables to insure that we get any database changes needed from our dev site changes without overwriting the blogs on the live site?" This led to a discussion on node tables, whether you run updatedb on the dev site or the live site, and other situations where the database tables change on the dev site (adding new modules, for example).
Now, to see what actually changed, you can take sql dumps and diff them, but these diffs are very noisy, and often, the only thing you really care about is whether the table changed or not. In the early days, I used a bash script to print our table names that had changed; sql-hash is a port and improvement of the script that split the database up into tables. The old script saved the whole database table, but since the specific changes of the table are not relevant, I switched to saving just the md5 hash.
The use case, then, goes something like this:
There are a couple of weaknesses to this technique.
variablestable of both sites (and thanks to the cron timestamps, it's about certain the both variable tables will have inconsequential changes.The variables table requires a little additional investigation. It's not currently an issue on any of my sites (I've got it easy -- I only need to preserve the
commentstable), but it has been in the back of my mind that a script to three-way-merge variable table changes might be useful in some circumstances.As for the fact that the output of sql-hash is not atomic, that is not a big issue either, since we are only looking for database tables that change, and we use an atomic operation (sql-sync --structure-tables) to do the actual transfer. So, the only risk is that some type of change that wasn't already made earlier might happen in between the time you sql-hash and the time you sql-sync. This can be avoided with a little caution; just be sure to take the live site offline while you are investigating. Over time, you will come to learn that there are only certain tables that can change on the live site based on the kinds of edits that are allowed there, and the sql-hash operation then becomes unnecessary.
Of course, there are other solutions to to data-merge problem, such as exporting the data nodes from the live site and re-importing them on the dev site. This puts an upward limit on how much effort one should spend on syncing via --structure-tables, and of course there is also the option of using vcs checkins of the sql-dump to merge db changes together. Both of these are more complicated solutions (esp. the later, which can hurt you a lot if you're not careful). Even if --structure-tables-based syncing is not used, though, sql-hash remains a valuable early investigation tool to determine what changes when certain high-level operations are performed on a Drupal website.
Comment #3
greg.1.anderson commentedHas a bug on mysql databases.
Comment #4
davidcm commentedmysql database bug fixed, thanks Greg
I'll write up a nice how to/blog about how sql-hash helped solve my sync problem once my project is finished
Comment #5
greg.1.anderson commented@moshe: I think this is ready to go; if you don't want it in drush core, I'll put it in drush_extras, but I think it would be good to have here.
Comment #6
moshe weitzman commentedwow, i replied here a day ago and it went poof. odd.
thanks for the thorough use case info.
i mulled this and i think its better suited for drush extras.
Comment #7
greg.1.anderson commentedForgot to make a note here a hile back; this still has a problem with mysql. Seems that mysql includes timestamps and other variable data in the dump, so hashes are always different, rendering this command useless. For it to be useful, it would need to correctly strip out variable information from the dump before generating the hash.
Comment #8
greg.1.anderson commentedCommitted a variant of the above (d42e733) that strips out comments (including the timestamp inserted by mysql) before computing the hash. Also added an sql-compare command to compare tables from two different sites.