In this previous thread we discussed how to configure a "shadow" settings.php file to allow "drush sql load" to operate on a remote database. While this is very slick, it introduced the problem that it isn't really desirable to have the remote database exposed to the local Drupal install. We discussed a couple of options for "hiding" shadow settings from the local system, but I realized that since these settings are only useful to "drush sql load" and "drush sync" and are potentially destructive when used in other contexts, what is really needed are remote site settings that are only visible to these two commands (and potentially other future commands that operate on remote systems).

Enclosed is a sample implementation that works pretty well. If you set up an alias called "production" and another called "dev" (see example in the example.drushrc.php file in the patch), then you can do useful things like this:

$ drush sql load production dev (copy sql db from production server to dev server)

$ drush sync production:. dev:. (rsync Drupal root from production server to dev server)

In actual usage you would probably be logged in to the dev server, and you would have a site alias for the production server but not for the dev server. If your dev install was located at sites/all/dev, then the drush sql load example above would still look the same (code reverts to original behavior when an alias is not found), but the drush sync would more typically be:

$ drush sync production:. . (rsync Drupal root from production server to local server)

I hope you'll like this concept. The amount of code is pretty minimal, and I think this beats trying to protect settings.php from accidental use in other contexts. Also, there is no convenient way to represent the remote Drupal root using settings.php alone, so it would be difficult to implement the "drush sync" shortcuts by other means. Let me know what you think; I'd be happy to adjust the implementation if you want any changes.

- Greg

CommentFileSizeAuthor
#103 drush-HEAD-2009_11_04.patch13.66 KBgreg.1.anderson
#100 drush-HEAD-2009_11_03.patch10.67 KBgreg.1.anderson
#95 drush-HEAD-2009_11_03.patch10.16 KBgreg.1.anderson
#92 drush-HEAD-2009_11_02.patch9.75 KBgreg.1.anderson
#85 drush-HEAD-2009_10_27.patch6.63 KBgreg.1.anderson
#82 sitaalias-new-files-2009-10-26.zip13.35 KBgreg.1.anderson
#82 drush-HEAD-2009_10_26.patch44.24 KBgreg.1.anderson
#80 drush-sitealias-new-files-2009-10-22.zip13.49 KBgreg.1.anderson
#80 drush-HEAD-2009_10_22.patch43.92 KBgreg.1.anderson
#78 sitealias.drush_.inc_.txt6.87 KBgreg.1.anderson
#69 drush-HEAD-2009_10_01.patch74.44 KBgreg.1.anderson
#69 backend.inc-2009-10-1.patch6.3 KBgreg.1.anderson
#69 sitealias.drush_.inc_.txt4.1 KBgreg.1.anderson
#66 backend-2009-09-23.patch6.48 KBgreg.1.anderson
#60 drush-HEAD-2009_09_21.patch71.23 KBgreg.1.anderson
#60 alias.drush_.inc_.txt3.79 KBgreg.1.anderson
#55 drush-HEAD-2009_08_23.patch69.98 KBgreg.1.anderson
#55 alias.drush_.txt3.79 KBgreg.1.anderson
#49 drush-HEAD-2009_08_07.patch62.06 KBgreg.1.anderson
#43 drush-backend.inc-2009_08_01.patch4.65 KBgreg.1.anderson
#41 drush-HEAD-2009_07_30.patch62.92 KBgreg.1.anderson
#38 alias.drush_.inc_.txt4.47 KBgreg.1.anderson
#37 drush-HEAD-2009_07_26.patch64.58 KBgreg.1.anderson
#35 drush-HEAD-2009_06_18.patch40.23 KBgreg.1.anderson
#31 drush-HEAD-2009_05_29.patch34.95 KBgreg.1.anderson
#28 drush-HEAD-2009_05_27.patch33.64 KBgreg.1.anderson
#9 drush-HEAD-2009_05_14.patch11.99 KBgreg.1.anderson
#5 drush-HEAD-2009_05_13.patch8.64 KBgreg.1.anderson
drush-HEAD-2009_05_12.patch5.51 KBgreg.1.anderson

Comments

greg.1.anderson’s picture

Title: Proposal and preliminary implementation for remote site aliases for Drush » Remote site aliases for Drush (Proposal and preliminary implementation)
moshe weitzman’s picture

I like this proposal.

My one hesitation about the rsync is that you can do this aliases already with .ssh/config file. The example for the sync command even recommends this sort of syntax: `sync dev:/var/www/files/ stage:/var/www/files/`. Granted that there is some utility for a team of people to use one .drushrc instead of one config file. Perhaps the .ssh/config file should be described in the command help ... Does it make sense to have aliases to the path part as well. It might be nice just say dev:files and have that expand to /sites/default/files. Just thinking aloud.

The db_url stuff looks good. We need to mention this feature in the command help. Just point people to example.drushrc.php

+  }
+  else
+  {

that should be 2 lines per our coding standards.

greg.1.anderson’s picture

Cool.

I think path aliases might be a useful addition; this is something you can't do easily with .ssh/config. It might be complicated to allow arbitrary sets of file aliases, though; if someone made a path alias called "misc", then dev:misc would be ambiguous. Aliases could take priority over folders of the same name in the drupal root; I think that would be okay.

The path alias could be everything up to the first "/" in the path. There are two ways to implement it:

  1. $options['site-aliases']['dev'] = array(
        'db_url' => 'pgsql://username:password@dbhost.com:port/databasename',
        'root' => '/path/to/remote/drupal/root',
        'fqdn' => 'mydevsite.localdomain.com',
        'files' => 'sites/default/files',
        'data' => '/my/arbitrary/path',
      );
  2. $options['site-aliases']['dev'] = array(
        'db_url' => 'pgsql://username:password@dbhost.com:port/databasename',
        'root' => '/path/to/remote/drupal/root',
        'fqdn' => 'mydevsite.localdomain.com',
        'path-aliases' => array(
          'files' => 'sites/default/files',
          'data' => '/my/arbitrary/path',    )
      );

Do you have a preference? I think I'm partial to #2 (extra nesting over namespace collisions). A third option would be to have a common path-aliases section, but I discount that because 'files' might exist in different locations on different servers.

Do you want me to rename anything? I'm partial to slightly verbose names, but I could change 'root' to 'r' and 'fqdn' to 'l' for better alignment with $options of the same purpose.

You caught me on that "else" formatting; that's how I format my code. I was trying to follow the Drupal convention, but missed one. I'll fix it and submit a new patch with help and path aliases sometime in the next day or two.

- Greg

moshe weitzman’s picture

sounds good. lets keep your verbose names and do the nesting within path-aliases.

greg.1.anderson’s picture

StatusFileSize
new8.64 KB

Implemented as described above.

moshe weitzman’s picture

$PWD will get expanded by bash when the command is executed? What's the advantage over . ?

What about ssh username and pass? we are expecting admins to setup an entry in .ssh/config? if so, that partially eliminates the benefit.

greg.1.anderson’s picture

The way I implemented it, relative paths are always relative to the Drupal root. This makes sense most of the time, but is inconsistent with standard shell command-line usage. What do you think? We could require the use of "root" as an alias (and make the Drupal root just another path alias...), but that would be a bit awkward, since the common usage will be to copy things inside the Drupal root. I have to admit, I was a little ambivalent about the handling of relative paths...

For my part, when I ssh / rsync to a remote system frequently I set up a public/private key pair, sometimes with a password and sometimes without. I use a script ("pushkey") to set it up automatically. "ssh-agent" helps out a lot if you have a script that does a lot of ssh / rsync commands (wrap it in ssh-agent, and you only need to enter your password once). I could also add ssh username and password to the site-alias settings for those who prefer to do it that way.

I'll submit another version with ssh username and password, plus any changes you want for relative paths; just let me know your opinion.

Thanks,

- Greg

greg.1.anderson’s picture

I was shying away from requiring 'root' as a path alias because I figured that there were no path aliases most of the time for the local machine, which was a common case; however, it would be easy enough to make an implicit path aliases structure containing just 'root' and 'files' for local paths. So then you would have:

drush sync root stage:root

... if you want to sync the Drupal root of the local installation with the root of 'stage'. An then this:

drush sync ./ stage:files

... would copy everything in the current directory to the 'files' directory on 'stage'.

Now I think that's better. If you agree, I will re-implement it like that.

- Greg

greg.1.anderson’s picture

StatusFileSize
new11.99 KB

I've updated the code, and I think it works like it should now.

I use 'dt()' to substitute the path aliases, so aliases are now "!files" instead of "files", and they can appear in multiple places in the path if you wish (e.g. "dev:{\!alias1,\!alias2,\!alias3}"). Relative paths in path aliases always start from the Drupal root. Relative paths in the command-line args are relative to the Drupal root when a machine alias is specified, and they are relative to the current working directory if there is no machine alias. "drush sync" tries to determine if a machine is local by checking to see if it is "localhost", or if it has the same name as a folder in "sites" (so it can be wrong if you don't set things up just-so).

"ssh-user" and "ssh-password" settings are now supported. There are two ways to pass the user: one is via 'rsync -e "ssh -l user" machine:/path .' and the other is via 'rsync user@machine:/path .'. The former does not seem to have any facility for specifying the password on the command line, so I only support the later.

I have tested this only in simulated mode--it's good enough for review, but should be run through its paces before commit. I'll do that if you have no further change requests.

- Greg

owen barton’s picture

I have some thoughts on this - we have a quite established home grown perl system called pushdb/sync for updating Drupal sandboxes, and I think there are quite a few ideas we could steal from that. While we probably need quite a lot of configurability, I think we also need to make it possible for the aliases to be computed using some algorithm and a site "key" of some kind (with manual overrides where needed), and also to allow some of the settings to be determined after connecting to the remote host either through configuration on the host or through autodetection.

Finally, I think we should consider a more ambitious eventual goal. For example, pushdb/sync will create a "secure" dump of the database (with e-mails and passwords cleaned out), and then rsync it to your local host (by far the most efficient way of updating the database). It will then import this dump into the appropriate database, fix a bunch of paths and clear the caches. It will also rsync all uploaded files under a certain size from the remote site to your sandbox. There is almost no configuration locally (and no "per-site" configuration at all), if you have a fairly normal sandbox setup - the only command is "sync.sh username".

Anyway - I don't think this is necessarily incompatible with the current approach, but I think we should give it some thought so that we make sure we are able to limit manual configuration (and increase "intelligence") as much as possible in the long run :)

greg.1.anderson’s picture

Very cool. Is pushdb/sync open source / sharable or otherwise available for inspection and porting? (I didn't find it via Google.)

The patch on this particular branch started out as a simple way to keep db_url configuration settings for remote sites out of settings.php and in a drush-specific location. drush sync was supported "because it was there" and because it made sense for the settings to be symmetric for db and rsync.

My thought on configuration was to make a tool (drush command? script? web service?) that would dump a site alias record for the current Drupal installation; if you could hit these remotely, you could easily build a configuration table for whatever set of remote Drupal sites you needed to address. Extra intelligence, be it home-grown or borrowed from existing scripts, is always welcome.

My long-term goal is to build a three-way-merge tool for syncing dev / staging / production servers. I have a working prototype, but it is young and needs work. My thought was to build strong db-independent accessors into drush, and layer the merge function on top of that (either within drush or outside drush, as appropriate). Being able to borrow / steal code from pushdb/sync would be a helpful boon, as the kinds of things that you describe above have crossed my mind from time to time.

Thank you for your feedback,

- Greg

owen barton’s picture

Hi Greg,

Yes - the scripts are all GPL and available at https://svn.civicactions.net/repos/system/sandbox_tools/ - we haven't publicised them much because they are still somewhat specific to our vhost and sandbox layouts, and quite frankly a drush based approach would be much cleaner in many areas (ever tried to unserialize php from perl?).

Also, I should mention that pushing databases (and fixing post-push) from live to qa/dev sites (and anywhere, really) was actually the initial task pushdb was written for, and it does it very well.

When I get a chance I will post some example commands and explain how it works cross server, although the docs on the code (especially on pushdb) are pretty good.

Thanks for your work on this!

moshe weitzman’s picture

Good stuff.

@Greg - Kathleen has some scripts that might be of interest to you - http://drupal.org/project/DBscripts

re: pushdb - Post-fixing paths in system and files tables should be rarely needed in D7 with sites.php. progress!

greg.1.anderson’s picture

Status: Needs review » Needs work

All of the scripts look great, thanks; I'll peruse those over time. I'm setting the status of this incident to "needs work" until I have time to test the mods, probably early next week; then I'll set it back to "needs review" to indicate it's tested. Everything looks good in simulated mode, so I expect I won't need to make any changes or bug fixes. We'll see!

Thanks,

- Greg

moshe weitzman’s picture

That syncdb command that Grugnog2 shared really got my attention. It's gonna be much faster than sql load. I think this patch is a perfect compliment to that. Here is a quick proposal for such a command:

sql sync --sourcessh --sourcemysql --targetssh --targetsql --cache --filename

the first four params are site aliases as proposed here. this is enough to specify where to ssh to, where to mysqldump to, where to rsync to, and finally where to import into. --sourcemysql would be optional and if omitted, we just assume a dump is waiting there for us (i.e. some other process does the dump).

cache works just like syncdb and would default to 24 hours or so. this encourages reuse of dumps among team members.

filename is the name of the file that gets rsynched. we need to come up with some naming convention if omitted. probably a concat of the first 4 params.

greg.1.anderson’s picture

I agree; that looks like it would work well. My first thought is that I'd like to see the first four parameters simplified down to two, a-la sql load:

sql sync from to --cache --filename

Here, "from" and "to" are either a site name in the "sites" folder or a site alias. In either case, you can figure out where the machine is and what the db_url is. There are two problems / limitations with this simplification, though.

  1. The 'port' of the remote machine might be different depending on whether you are using the site alias with sql load (tunneled) or sql sync (executed locally)
  2. It might be inconvenient if you were forced to make a site alias to talk to a remote system

I think that the first problem could be overcome by adding another parameter to the site alias that specified what the db_url port mapped to on the remote system; sql sync could substitute that value when doing a remote operation.

The second problem could be overcome in the same way that other drush sql commands get another optional parameter, e.g.

drush_get_option($prefix . '_database')

For example, it might work like this: presume that "dev" is an actual folder in "sites" or a site alias, and "r" is a fake name that is neither in the sites folder nor in the site aliases. Then we would have:

drush sql sync dev r --r_db --r_db_user --r_db_password --r_ssh_target --cache --filename

sql sync would see that "r" did not exist, but it wouldn't care if it could get the parameters it needed from --r_db, etc. Clearly that's a lot of parameters; I don't think you'd ever want to actually call sql sync like that unless you were calling it from another script.

I caught a bad cold that's kept me down for a few days, so I still have a bit more testing to do on the existing code. In any event, if you think the above proposal is good, I'll work on sql sync when I'm feeling a bit better.

moshe weitzman’s picture

I like the db_port proposal. The second proposal works but won't win any beauty points. Needs more thought.

adrian’s picture

Drush can already make calls to drush on remote systems (ie: drush on this server speaking to drush on another server)
It uses encrypted bi-directional unix pipes to communicate.

Look at drush_backend_invoke , it has parameters for hostname, user and remote drush path.

Using those parameters you could call a command on a remote server from within drush.

One of the things I want to add to it is a mechanism to transport files via ssh (or whatever). Theoretically we can create additional pipes with proc_open and pass data (along with control codes) through it.

In Aegir, we use this for doing upgrades. When you do a provision migrate command, it makes a backup of the Drupal site hosted on the current directory, and then it runs a provision deploy command on the target drupal directory, which hits update.php if applicable.

For our next release we want to make this work across servers too.

An example of this would be : drush.php migrate username@server.com:/path.to/drupal:site.com /path/to/local/drupal:site.com , which would then cause drush to connect to the remote site, make a backup, download the backup file and redeploy it on the local site.

Anyway, I digress, if all you want is the database credentials of the remote site, you should just be able to do a drush.php status --pipe on the remote site , but you will need to know (a) where drush.php is on that server and (b) the drush root path.

Other than that it should just work.

adrian’s picture

Oh, and btw. the proliference of parameters is one of the reasons i very much dislike the --root and --uri options.

I think we'd be far better suited by making these things be configurable via an optional parseable argument,
and dropping the fscking requirement to add 'http://' to the url.

adrian’s picture

adrian’s picture

We'd have to pick a text format that could map everything we want, and then add the ability to give arguments meaning (such as attach a parser to it).

So, even locally, instead of doing

drush.php --root="/full/path/to/drupal" --uri="http://sometime.com" status

it would be :

drush.php status /path/to/drupal#site.com

or if you are in the directory :

drush.php status site.com

or if you want a remote drupal site :

drush.php status username@server/path/to/drupal#site.com

Drush also already has a mechanism to write it's own config files, although that is only used by provision.
It uses the drupal root drushrc.php and the site drushrc.php to store relevant configuration settings.

It would be pretty simple to implement a drush command to save aliases, ie : drush.php alias mysite
and then you could use that alias in the future. ie: drush.php migrate @mysite /path/to/drupal#site.com (for instance).

greg.1.anderson’s picture

Okay, I agree with everything adrian says, and I will certainly look into using drush_backend_invoke instead of ssh to invoke the remote drush command. I don't think I'm bold enough to jump in and add file-passing as described above, so for the time being I suggest we use rsync to transfer the sql dump. It might be a wise idea to use rsync in any case; if we at some later date employed Kathleen's technique to insure that the sql dump was made in a stable order, rsync might have a big advantage over a homegrown data pipe (being that it is optimized for that sort of thing).

I also think it is a grand idea to be able to easily collect site aliases in your drushrc.php file, although 'drush sql conf' does not contain quite enough information to do that. Another similar command could be added later if desired, but even with that facility, there is still some value in having "exploded" parameters, not for by-hand usage, but for instances where "drush sql sync" is being called by some other script that has all of the connection info it needs (but no site alias).

If it is generally agreed that exploded parameters are a good thing, then I propose the following.

Use case 1: use site aliases

drush sql sync from to

Use case 2: use exploded parameters

drush sql sync --from_db --from_db_user --to_db ... etc.

You can also mix these two forms, in which case sql sync determines whether the argument is for "from" or "to" based on whether there are --from or --to style optional parameters. drush sql sync can determine whether the "from" and "to" args are passed via isset, as missing arguments are always passed as null.

The only shortcoming of this solution is that drush does not support optional parameters; you get a warning every time an argument is omitted. This could be rectified by enhancing drush_parse_command to fill in the extra missing args with explicit 'null's if the command had a (new) indicator that the arguments are optional.

The other option I see is to just abandon the long-form parameters and require site aliases, just like 'sql load'.

greg.1.anderson’s picture

Some quick notes. I looked at a couple of things and did a couple of experiments at lunch.

I'm not opposed to the recommended syntax changes to drush, but they are beyond the scope of what I'm prepared to discuss on this thread (or tackle as a minor contributor). Integrating site aliases with drush_backend_invoke probably wouldn't be too hard, though.

  1. It seems that the http on --uri is optional (drush --uri=othersite.com status works for me)
  2. --uri and -l seem to be equivalent
  3. It would be easy to make _drush_bootstrap_site_validate accept a site alias in addition to a named folder with a settings.php file
  4. It would be easy enough to make drush_dispatch call drush_backend_invoke if a remote site alias was used
  5. This would break places where drush parsed settings.php, but that seems to be mostly limited to environment.inc

Even with this addition, I still favor my proposed syntax above for sql load, sql sync and sync; in other words, for commands that operate on two drush instances at once, I prefer drush sql load from to to drush -l from load to. I think these changes would be low-impact (compatible with existing drush usage) and easy to implement.

moshe weitzman’s picture

I don't think drush_backend_invoke() makes sense for the sql sync command. We have no need for drush on the remote server. We can just call mysqldump there and then rsync. I'd rather not deploy drush to production servers just so I can get their DB back to dev. Same for the reverse case.

owen barton’s picture

Yeah - I agree that mysqldump (or some other backup method) and rsync are hard requirements for db syncs for sites of any size - even gzipped databases can get big enough that doing a couple of syncs a day becomes painful.

I also really like Adrian's suggestion adopting the scp style notation (with a "#" or other modification somewhere for setting the appropriate URL) for identifying sites and (where possible) making it transparent to work on remote sites. I think that is mostly a separate issue from this one though.

I think we have 3 issues:
* Adding a basic sql only sync command that can pull from a remote DB - ideally via a local rsync. Probably using some kind of site aliases set up via drushrc - perhaps using a basic URL style syntax for server/site identification, but perhaps with other parameters for alternate db connection details. This is really the issue on the table now, and should be pretty easy to achieve.

* Generalizing the URL style syntax and alias system, so it can potentially replace the --url and --root options and allow local and remote sites to be identified seamlessly, with optional command specific details that can be added to each site/alias. This also implies a lot of work and is probably another issue.

* Extending the "sql sync" (probably as a more general "sync" command), to do more and more of the stuff that pushdb does, such as anonymizing dumps, fixing common path and caching issues, as well as synchronizing uploaded files (with rules for size/date thresholds) and working copies. This also implies a lot of work and is probably another issue.

greg.1.anderson’s picture

Agreed. I'm working on issue #1 using ssh / rsync and should have a patch ready shortly.

moshe weitzman’s picture

Priority: Normal » Critical

Excellent. Lets mark this as critical and focus our efforts to get it committed before 2.0 lands (any day now).

greg.1.anderson’s picture

StatusFileSize
new33.64 KB

Work in progress.

The difficulty in 'sql sync' is not in the actual execution of the operation, but in getting the many parameters to the function in a flexible and usable way. For this reason I took a fairly methodical approach to the implementation and started with parameter management first. As a result, I ended up implementing issue #2 first (Adrian' syntax for remote commands); as a bonus, it also works with drush_backend_invoke, so you can do things like:

drush alias status
drush user@host/path/to/drupal#site.com sql query 'select * from users;'

... or whatever.

I've been over this code pretty well and feel good about it. The implementation of 'sql sync', on the other hand, is still green. I only recommend running it in simulation mode (I have done no more than that). There currently is no provision for doing the rsync without the sql dump, and there is no provision for the "cache" flag. If you are using "Adrian syntax" with 'sql sync', you must specify the db_url with command-line arguments (--source-db-url and --target-db-url). I was going to add a cache flag to 'drush sql dump' to take care of that, but since 'sql sync' is not using drush on the remote machine, I am at a moment a bit unsure how to do the 'cache' test. (bash commands before the sql dump in the ssh command?) I think I have "_" where I mean "-" in some of the command line arguments. Documentation is needed.

I cannot work on this Thursday, but I'll probably have time on Friday. Next week is JavaOne, so I am a bit concerned about "any day now" given my pending short-term lack of availability. I have taken this code pretty far... I'll check here before doing any further work. I recommend playing it by ear until the end of the day Friday vis-a-vis the Drush 2.0 schedule, but if anyone feels the urge to throw in some code or suggestions before then, I'm all ears.

- Greg

moshe weitzman’s picture

A few quick notes:

  1. this is a pretty large change. Lets release 2.0 and then commit this to 2.1 right afterwards.
  2. sql sync needs an entry in core_drush_help()
  3. we need a semi-permanent place to store the dump so that rsync is effective. That goes for both remote and local. It would be nice if the command did that for you without requiring thought (and overridable of course).

more to come later.

greg.1.anderson’s picture

Agreed; I will do what I can on Friday, and bring this to completion after JavaOne.

greg.1.anderson’s picture

StatusFileSize
new34.95 KB

This version of 'sql sync' appears to work fine under light testing. There is still at least one bug; the script uses 'user:pass@machine' with rsync, but rsync does not support this syntax. 'drush sync' correctly uses the rsync password file option.

I invented a new 'alias' context so that the options for the site aliases could be inserted with a lower priority than the command-line options. Before they were being inserted into the 'process' context, which meant that default options from the alias record in drushrc.php could not be overridden with an explicit option on the command line. Now they can.

I added a '!dump' path alias that is used as the sql dump file location if available. This can be overridden by --source-dump and --target-dump options on the command line. If nothing is specified, a temporary file is created in /tmp instead. The default behavior could be more elegant; suggestions on how it should work are welcome.

More documentation and more testing are still needed, but it's getting close.

- Greg

adrian’s picture

Priority: Critical » Normal

I'm not happy about this being marked as critical for 2.0, so I am changing the priority to normal.

I'm the maintainer of both the backend.inc and context.inc files, and I have thousands upon thousands of lines of code that leverage this functionality, and I will need time to properly evaluate these changes, so please don't commit this until I've had a chance to properly review this patch, which will only occur mid next week, when I arrive in washington DC.

Drush is incredibly close to being released, and this has the possibility of introducing subtle issues that could be a nightmare to track down (just about anything that is distributed does as you need to debug multiple threads across multiple servers).

I personally feel that the scope of the 2.0 release is already large enough, and severe enough that this feature, although very desirable, should have a proper development cycle instead of just being tacked on as an added extra right before release.

moshe weitzman’s picture

@adrian - both greg and i agree with you - see most recent follow-ups. this is not going to be in 2.0. thanks for changing priority.

moshe weitzman’s picture

Issue tags: +3.0

hi greg. when you get a chance, lets tackle this again.

greg.1.anderson’s picture

StatusFileSize
new40.23 KB

By coincidence, I was just logging in to post an update. I had some critical customer issues come up that have kept me busy the past couple of weeks, but I have done a bit more testing and have included an updated patch with this post.

This version has an sql sync that works well locally, but I have not done remote testing yet. Site specifications are working well, locally and remotely, and I have added an "alias print" command that produces output suitable for dumping into a drushrc file. I have also contemplated "alias list" (show all aliases and site folders); further suggestions are welcome.

I will continue to work on this as time permits, debugging remote syncs and adding documentation. Please let me know if you have any comments; now, before I write the docs, would be a good time to make any adjustments to syntax that might be desired.

Thanks,

- Greg

greg.1.anderson’s picture

Update: not forgotten, still fixing bugs and finishing implementation. Haven't had too much time to work on this, but have been making progress. Hope to finish it up soon.

- Greg

greg.1.anderson’s picture

StatusFileSize
new64.58 KB

This basically works, but there are still a few TODOs and probably still a few bugs. The code is commented; see help for sync, sql sync, alias print and alias list.

Not tested:

mysql (but probably works)
d7 (especially database target specifications)

Features implemented vs requests from various people (see documentation for actual syntax / features):

Owen (14 May) sql sync will fix up d6 paths, but will not create 'secure' databases (I would call these "stripped"). Paths are fixed up in the dump file directly; stripping email addresses should be done at the database level (not supported in current sql sync).

Kathleen (Moshe's 15 May post): mysql will produce sorted output from an sql dump, but postgres will not; this makes committing to svn/cvs in a cross-db way a little difficult, so this is not supported at this time.

Moshe (19 May): sql sync supports the cache flag; you need to specify the path to the sql dump, but you can put this into the site alias (avoids putting it on the command line every time). If the dump file is not specified, then a temporary file is used -- and deleted when the sync is finished.

Adrian (20 May 18:03): Backend invoke is used to execute remote commands such as "drush status". Backend invoke is not used by sql sync, though. 'drush alias print' will produce an alias record you can put in an rc file, but drush will not write these directly. It's easy enough to use "drush alias print > $HOME/.drushrc.php".

Adrian (20 May 18:32) Site specifications as described by Adrian are implemented and work. it is also possible to define other variables (e.g. db_url) by including them in the URI query syntax (e.g. username@server.com:/path.to/drupal?db_url=xyz#site.com).

Adrian (30 May) I've done my best to make the drush code that is not affected by these changes work the same way that it did prior to my modifications; this includes the changes I made to backend.inc, which were for the most part just factoring / parameter changes that should not affect other callers. Please do review, check and adopt carefully; there may be bugs. I'll fix bugs that are reported.

Factoring question: I followed the "drush_x_functionname" convention where it was followed in the existing code -- which is to say that I did not follow this convention in code I added to environment.inc, which did not follow this convention. There is also alias code spread across sql, core, environment, etc., and no new include/alias.inc file or anything like that (command/alias/alias.drush.inc only contains code for alias print and alias list, and is not used by other functions that use aliases). If you'd like me to move or refactor any of this stuff, I'd be happy to do so; I wasn't sure what would be best in some cases. My main goal was to minimize changes in code paths I was not directly affecting.

- Greg

greg.1.anderson’s picture

StatusFileSize
new4.47 KB
greg.1.anderson’s picture

Here is commands/alias/alias.drush.inc; I didn't put it in the patch file this time. This command file requires the patch above (of course), but the main patch does not require this command file; it is optional but useful.

- Greg

greg.1.anderson’s picture

Status: Needs work » Needs review
greg.1.anderson’s picture

StatusFileSize
new62.92 KB

Minor bug fix to make remote commands work without spurious messages and errors.

adrian’s picture

fwiw. I am going to be looking at this issue again soon.

aegir 0.3 will be out and once we start work on 0.4 this is one of our first requirements.

greg.1.anderson’s picture

StatusFileSize
new4.65 KB

Sounds good, Adrian. The changes I made to backend.inc are independent to the rest of my changes; you might want to patch just backend.inc and test it with your released system to confirm there are no problems. Here is a patch for just that one file.

Thanks,

- Greg

greg.1.anderson’s picture

Important note: The "--fix-paths" feature in the patch in #38 and #41 is flawed. This feature tries to fix up paths embedded in a D6 database dump. Unfortunately, paths often occur inside encoded strings like this:

s:34:"sites/mydrupal.org/files/imported/"

In order to fix up "sites/mydrupal.org" to "sites/dev.mydrupal.org" (for example), it is also necessary to fix up the string length included in the encoded string.

Unless I come up with a really good idea, i'll probably just have to back this feature out. There are a few workarounds that can be used in place of this feature:

1) What I usually do with D6 is make my site "dev.mydrupal.org" located at a different drupal root, and store it in "sites/mydrupal.org" just like the live site. This works due to the way Drupal searches for sites folders; the site dev.mydrupal.org will look for "sites/mydrupal.org" if it does not find "sites/dev.mydrupal.org" when looking for settings.php. The other advantage of storing the dev site at a different drupal root is that you can upgrade the dev site and test it out without impacting the live site.

2) Put the dev site and the live site on different machines -- perhaps even put the dev site in a virtual machine. Then both can have the same drupal root and the same sites folder, and you don't need to fix anything up.

3) Use D7, which, as mentioned in #13 above, does not embed the site url in the database.

- Greg

moshe weitzman’s picture

Yeah, I don't think we need to fix that problem in these commands.

moshe weitzman’s picture

this is great work. i hope to try it out soon. first, a few code observations. also, well want to remove some debug logging before final commit.

+++ example.drushrc.php	31 Jul 2009 00:13:21 -0000
@@ -78,3 +78,54 @@ $options['structure-tables'] = array(
+ * mya also be used in arguments to certain commands such as

typo

+++ example.drushrc.php	31 Jul 2009 00:13:21 -0000
@@ -78,3 +78,54 @@ $options['structure-tables'] = array(
+ * are stored in the site alias record:

Perhaps append - "Most aliases use only a few of these keys."

+++ commands/core/core.drush.inc	31 Jul 2009 00:13:21 -0000
@@ -82,13 +82,23 @@ function core_drush_command() {
+      '--mode' => 'The unary flags to pass to rsync; --mode=rultz implies rsync -rultz.  Default is -az.',

generally the sql commands call this option 'extra'. or they used to before i did some refactoring and removed some. mode is OK if you think thats best.

+++ commands/core/core.drush.inc	31 Jul 2009 00:13:21 -0000
@@ -82,13 +82,23 @@ function core_drush_command() {
+      '--exclude-sites' => 'Exclude all directories in "sites/" except for "sites/all".',

this looks pretty esoteric. why not rsync them all?

+++ commands/core/core.drush.inc	31 Jul 2009 00:13:21 -0000
@@ -349,15 +369,306 @@ function drush_core_sync($source, $desti
+  // then call sync with the original parameters nere

typo - nere

+++ commands/sql/sql.drush.inc	31 Jul 2009 00:13:21 -0000
@@ -22,6 +22,8 @@ function sql_drush_help($section) {
+      return dt("Usage: drush [options] sql sync <source_alias> <target_alias>.\n <source_alias> and <target_alias> are site aliases, or names of directories under \'sites\'. These determine from where and to where you want your database copied.");

perhaps omit the description of params here. duplicates the arguments help.

+++ commands/sql/sql.drush.inc	31 Jul 2009 00:13:21 -0000
@@ -83,6 +86,7 @@ function sql_drush_command() {
+    'optional-arguments' => TRUE,

i still have to wrap my mind around optional-arguments. i think this needs documenting in drush_get_command(). by documenting, i mean it needs to be listed in the array of supported elements.

+++ commands/sql/sql.drush.inc	31 Jul 2009 00:13:21 -0000
@@ -90,6 +94,40 @@ function sql_drush_command() {
+  $items['sql sync'] = array(

do you think it is confusing to offer both sql load and sql sync commands?

+++ includes/command.inc	31 Jul 2009 00:13:21 -0000
@@ -147,7 +147,18 @@ function drush_get_commands() {
+  // Test to see if the first arg is a site specification

do we really want this? seems like site spec should be a global option, no?

This review is powered by Dreditor.

owen barton’s picture

I agree that fixing paths or sanitizing is another patch for another day.

I would love to read through this in more detail before it goes in - we are flying to the UK tomorrow for a trip, and I think I will have some time next week to really look at this (and other drush issues in the queue).

greg.1.anderson’s picture

I agree that this shouldn't be rushed in; let's give Owen and Adrian some more time to review it, and if you give me another week or two, I'll test mysql and D7. I expect the former will work without changes, but I think I'll need to make some slight adjustments for D7 database handling.

@moshe: Thanks for the comments; here are my answers (no comment == I agree and will change as suggested).

mode vs. extra: I have no strong opinion; I wasn't really sure how "extra" worked elsewhere in Drush. I saw some references, but didn't see how it was supposed to function. For extra options other than "mode", see what I did in drush sync; it's a little verbose in the implementation, but it allows you to specify any rsync option right on the command line, which has turned out to be very handy. I consider these to be "extra" args, ergo I named "mode" something else -- but I'll name it however you want.

--exclude-sites: Sometimes I have a multi-site installation, and I want to push out just one site somewhere (e.g. maybe take one site from a live server and pull it into a Virtual Machine for testing). If you have your 'files' inside of your Drupal root (maybe not best practice), then rsync'ing the files for all of the sites you don't want takes a lot more time and disk space than necessary. (Real disk space is perhaps free, but keeping a virtual disk small is still nice.) I find this option very handy.

Optional arguments: These are very handy, as they allow "drush alias print " instead of "drush alias print --what=". They are also useful in sql sync if you want to specify one of the targets with a site alias, and the other with --args, as you suggested in #15 (well, you only used --args; I wanted to merge #15 with site aliases, so that either form could be used). I'll document optional args. Basically this flag just causes drush to fill in 'null' for missing arg values so that you don't get a PHP error on the dispatch. The command that supports optional args then has to check for null to see if the arg was provided or not.

sql load vs sql sync: These commands both do about the same thing, and they work almost exactly the same way when run on local databases. They use substantially different mechanisms for remote database access, though. I used sql load a lot when I was in development of sql sync, but now that I've been using sql sync for a while, I personally have no use for sql load. I'd be happy to remove sql load if you want; if so, sql sync could be named either sql sync or sql load, at your option. The only downside to this is you might tweak out people who are using sql load to grab remote databases -- but it might be a net benefit to them, as they'd probably enjoy a performance increase once they switched over to sql sync. Your choice on which way to go; I was keeping sql load around to be conservative.

"seems like site spec should be a global option, no?": I strongly prefer it the way I implemented it. I don't have time to evangelize it right now, but will later if you need convincing. Since I'm short on time, I'll just point to comment #21 above for now (although I moved the order of the spec vs. the command).

I'll post another patch with the suggested modifications as soon as I have time to roll them in and test. I'm also going to roll out --fix-paths; I can't do it with sed, as sed is no good at math, and I can't do it with awk, as awk can't do math on regex matches either. The only option I see is to do stream-processing on the dump file with a php function (or do operations on the database as Owen does); that's another patch for another time.

- Greg

greg.1.anderson’s picture

StatusFileSize
new62.06 KB

Latest patch. Tested with mysql; works now. Needed to make some bug fixes to --create-db for mysql, and some fixes were needed for situations where the primary Drupal site used a different database than the site being acted on by drush sql * (for people running both postgres and mysql).

Fixed typos identified by moshe. Updated documentation. Removed most calls to drush_log (still wrap call to backend_inc in drush.php with log statements, deliberately).

To do:

Test on D7
Yank out --fix-paths (or buckle down and fix it)
Make --table work more like --skip-tables

- Greg

greg.1.anderson’s picture

I noticed that "drush updatecode" takes a variable arguments array. It is implemented like this:

function drush_pm_updatecode() {
...
// Get specific requests
$command = drush_get_command();
$requests = $command['arguments'];
...

I could use this pattern and remove the 'optional-arguments' flag from command.inc if you thought that was better. Then, drush sql sync would take no parameters, and $source and $destination would be loaded from $command['arguments'][0] and $command['arguments'][1], respectively.

- Greg

moshe weitzman’s picture

I do think thats better than optional-arguments flag. Lets just be clear in the help text for the command.

greg.1.anderson’s picture

I just installed and tried Drupal 7 with the patches above. While the basic operations work without modification, there are some caevets.

* D7 uses an array to specify db parameters rather than a string. While it works just fine to use a D6-style db url to point to a D7 database, I intend to change the site alias structure to allow either a db-url string or a D7 db array. I may rename the field from 'db-url' to 'db'.

* There is no way to specify D7 slave database specifications in an alias record. However, since it is possible to make an alias record that points directly at a slave db rather than a main site db, and since it is possible to specify the db-url with a command-line parameter to sql sync, I do not have any specific idea to improve database targeting for D7; it's probably fine as it is.

* One slave-database support feature I am considering is single-argument sql syncing, as in "drush sql sync mydrupal.org". This form would sync from the main db to all of the slave databases defined in settings.php. This could be useful when first setting up slave databases or for testing the feature before replication is set up in the database itself.

* drush alias print does not print D7-style db records; I plan to fix this too.

I'll post a new set of patches when I have time to put them together (probably within a week); let me know if you have any D7-related comments or suggestions.

- Greg

greg.1.anderson’s picture

I started implementing the items above yesterday, and found that it was most natural to put the entire 'databases' structure into the alias record. drush will create a 'databases' record with 'default' / 'default' from a db-url if necessary. Contrary to what I said above, I went ahead and supported the database and target flags to select any database in an alas record. It is still an option to make separate alias records for every database, if desired. This is still a work-in-progress, but I should have an updated patch ready shortly.

A question on documentation, though. It is difficult to fully document the changes in this patch within the contect of the drush help system. I think what is needed is a rewrite of the documentation at http://drupal.org/node/477684. I could rewrite this in some external location (where?) and promote it to the main documentation page once drush 3.0 is marked stable. Sound good?

- Greg

moshe weitzman’s picture

Perhaps you can just use that same doc and add (3.x) in various places where the feature is only available for drush3.

greg.1.anderson’s picture

StatusFileSize
new3.79 KB
new69.98 KB

Here is the latest patch, which includes all of the code changes mentioned above. To whit:

  • D7 support
    • --source-database, --source-target, etc. supported
    • Single-parameter 'drush sql sync' will copy default database to all slave databases
    • D7-style 'database' structures may be used in alias records. If db-url is used instead, it is converted to a database structure for uniformity.
  • 'optional-arguments' feature removed
  • Unification of skip-tables, structure-tables and tables, all of which can be specified either via a key or list (--skip-tables-key is key of element in tables list, -skip-tables-list is a list of tables on the command line). All tables reference the same table in options, but 'skip-tables' and 'structure-tables' are still supported for backwards compatability.
  • The previous implementation of the D6 database repair has been removed in favour of a mechanism that allows the caller to specify a clean-up script via the --db-repair-script option. The --simple-repair option will provide a very simple 'sed' script that works in instances where the root is the same and the uri does not appear in serialised data structures.

The documentation still needs a little touching up. I'll enhance the drush help structures in the near future, but will wait to modify the online drush documentation page until after this patch is committed (to avoid confusion that may be caused if trying to follow documentation on the code at drush HEAD).

alias.drush_.txt should be renamed to alias.drush.inc and placed in "commands/alias" (the --new-file option to cvs diff isn't working as it should for me).

@adrian: No changes to backend.inc; you can still do an isolated test with the patch from #43 before testing the whole patch if you wish.

- Greg

moshe weitzman’s picture

Thanks much, greg. With imminent Drupalcon and code freeze, I might be delayed in reviewing this. But I am very much excited and interested in it hitting drush soon.

moshe weitzman’s picture

Status: Needs review » Needs work

Am seeing a largish hunk fail in commands/sql/sql.drush.inc. Any chance you can reroll? adrian has been on vacation since drupalcon so that explains the delay here. I'm going to review this right after the reroll.

adrian’s picture

i'll be back on monday, so i'll try to look at it next week.

greg.1.anderson’s picture

I'll patch against head again later today.

greg.1.anderson’s picture

StatusFileSize
new3.79 KB
new71.23 KB

Okay, here is the latest, re-patched against head as of right now. This also includes moshe's fix to the drush sql dump bug that I introduced that broke skip tables when sql dump was called directly, rather than from sql sync (sorry).

There are also a couple of changes that I added.

1. --exclude-conf is now the default for drush sync; you must use --include-conf to copy settings.php, robots.txt and .htaccess files. With site aliases, it became really easy and convenient to just type "drush sync live dev"; after I overwrote my settings.php file by accident a few times, I got annoyed and changed the behaviour. I am generally reluctant to make changes that would break existing functionality, but in this case, the original drush sync added so little on top of rsync that I am assuming that not many people are using it right now. Please let me know if you think this change is too high impact.

2. Previously, drush sql query returned results in psql's default format. I modified the default flags so that it would instead return a table in the same format that mysql does, so that scripts can call drush sql query and process results the same way regardless of the database used. These args to psql were provided as default values to 'extra', so if you provide your own extra, it will replace the default values completely (no need to worry about 'dueling command-line arg preferences').

3. I unified the way that skip tables, structure tables, and tables (list only the tables you want to include) work; it is now possible to either specify a 'key' (lookup to a setting in some .drushrc.php file) or as a direct list. It still works the way it used to, but it also will look in a common table for all three table modifiers, so if you have a table list including the various user tables (users, roles, permissions, etc.), you can either skip them or include them w/out having to maintain both lists in two or three separate lists.

I was thinking about taking the "!" out of path aliases, so that the root would be 'root' instead of '!root', but I didn't do that yet. Let me know if you have an opinion on this; I don't think I'll want to make the change after commit.

The latest 'alias.drush.inc' is also attached; put this in "commands/alias" if you want "drush alias list" and "drush alias print" (very useful, and I recommend taking it, but it is optional).

- Greg

moshe weitzman’s picture

thanks for the reroll.

adrian is digging himself out but he gave me these comments in person and again in a chat. he is frowning at the 'db_url' part of the aliases array. he thinks we should connect to a foreign drush and bootstrap to the configuration phase in order to grab the foreign db_url.

if you are connecting to a site on the other end it should have drush
drush2drush communication
it's the same way that git works for instance
so remote = server details, paths and drush path
where drush path defaults to just 'drush'
which will work for apt-get installs
once that is there, the remote drush can just bootstrap_configuration
and return the db details
which can also be cached locally if necessary
defining the db details by hand is unecessary and prone to human error
you have no guarantee that you are connecting to the db that drupal is actually using

moshe weitzman’s picture

I like the aliases commands. I have a slight twitch about a naming collision with #549494: Support for command aliases. And a collision with url aliases in drupal core and the new sites.php aliases in drupal core. So lets rename these commands to sitealias and then use 'sa' as the short version once #549494: Support for command aliases lands

greg.1.anderson’s picture

I have no objection to renaming to site alias. Do you want "site alias list" and "site alias print", or something else?

Regarding the db_url parameter in site aliases, I would kind of like to keep it there as an option, as that allows your feature request from #24 above (drush sql sync to remote servers without drush) possible. For my part, I do not put in the db_url parameter by hand; I use "drush alias print >> $HOME/.drushrc.php" and then edit the result. I do admit that under this scheme, all of your site aliases will become stale if you ever change the db_url of your drupal site, so I am amenable to the idea that the db_url parameter could be optional, and would be filled in if needed via a call to the remote drush to look it up if it is not hand-coded. At the end of the day, though, I leave it up to you and Adrian to decide if you want to require drush on the remote site or allow the local copy of db_url in the site alias. I'll write code to fetch db_url remotely in either case.

- Greg

dman’s picture

subscribe.
Like everyone else here, I've also got my own old scripts that did this, and have tried working them into drush. It works for me, but I don't want to muddy the water with my approach too much.
What I've not seen mentioned here is the remapping of database prefixes, or even explicit prefix support which is a handy extra. I'm doing prefix-remapping on the fly when I push and pull databases.

My approach (just FYI, I'm not suggesting any derailment) - I add a local config file (a drushrc.php in the site dir) that specifies a preferred 'peer' assuming that the deployment pipeline is usually linear. This can be overridden via parameters.
I then call commands like:
drush upload files
drush download database

I was building this for my own uses before (I became aware of?) the drush sync development. Had not released/talked about it as I knew there were major security things to look into first.

drush help
...
 backup                Save the named part of the site to a backup file                                      
 restore               Restore the named part of the site from backup                                        
 upload                Push part of the site to the remote peer and deploy it there.                         
 download              Download and install from the remote peer                                             
 peer info             Verify the remote peer settings for upload and download       
drush help upload
drush help upload
Push part of the site to the remote peer and deploy it there.

Examples:
 drush upload database                     take a local database snapshot, upload it, and install it into the     
                                           remote host                                                            
 drush upload files                        Synchonize the remote site files folder with those from here.          
 drush upload site                         Synchonize the remote site custom code directories (site/{sitename}/*) 
                                           with those from here.                                                  
 drush upload file                         Synchonize the named directory to the equivalent path on the remote    
 ../all/modules/pathauto                   site.                                                                  

Arguments:
 database                                  The site data, both content and configs, excluding cache               
 site                                      The current site custom folder (themes, modules, settings) - excluding 
                                           the 'files'                                                            
 files                                     The current site 'files' - excluding the 'backups'                     
 file                                      Path to a specific file or folder, relative to the drupal root or the  
                                           site instance folder.                                                  
 common                                    The shared site resources - contents of the 'sites/all' directory.     
 all                                       The main Drupal core code, excluding the contents of /sites.           
 required                                  The modules and themes that are detected to be currently in use, a     
                                           sparse version of 'upload common' '.                                   

Options:
 --delete                                  Delete files from the remote site that do not exist on this server. 
 --simulate                                Display a list of commands or files to be transferred, but don't    
                                           actually do it yet       

I've so far extended this to handle FTP-only hosts as well (via lftp) when rsync/ssh is unavailable. Handy for small clients.

I'll have to get onto the train of thought that you guys have been following here, and see if I've got anything to bring to the party...

greg.1.anderson’s picture

drush sync + site aliases can do your named section pushes (site, files, common, all, but not required). Prefix fixup would be a helpful addition to drush sql sync. I don't use that feature myself, so it is a lower priority for me than the other kinds of fixups discussed above. At the moment I am carefully configuring my sites so they do not need fixup (see above); what I'm working on right now is a higher-level set of commands (a separate patch) to detect database differences and facilitate three-way merges (see some of the references above).

Thank you for the link to your scripts; I think the code here might start moving a bit for a while as Adrian and Moshe submit their comments. If you were interested, perhaps you could make a prefix-fixing patch after this thread is committed.

- Greg

greg.1.anderson’s picture

StatusFileSize
new6.48 KB

@adrian,

I know you're probably still catching up post DrupalCon, but please review this for me when you get a chance.

Regarding your comments that Moshe forwarded in #61, I looked into implementing your suggestions, but found backend.inc to be lacking in two important ways.

1. 'drush' is addressed via the full path to drush.php, which defaults to $GLOBALS['argv'][0]. Your suggestion was to provide a path to 'drush', defaulting to just 'drush'. I actually like your suggestion better, but to do so would require a modification to backend.inc -- and of course we'd have to be sensitive to existing code that was calling it with a path to drush.php and not just accepting the default value.

2. backend.inc nicely packages up a JSON-encoded data structure to return to caller, where it is nicely unpackaged again; however, the command output is always returned as a simple string. This is fine for most D6 implementations, where $db_url is a string, but it leaves much to be desired in D7, where $databases is a nested array. (Even D6 has some support for multiple databases, but this may be vestigial; I've never tried to use it, anyway.) I could always package up a JSON-encoded string and write it to stdout; the result would then be a JSON-encoded string embedded inside a JSON-encoded string. If I closed my eyes and tried not to visualize it, that might be okay, but I wanted something cleaner.

I solved these two problems with two enhancements to backend.inc.

1. The path to 'drush' can now be either a path to a drush script, or a path to drush.php itself. If the provided path ends in ".php", then backend.inc will use 'php' to execute it. If it does not end in ".php", it is presumed to be a shell script and is executed directly. When the target machine is local, the default path to drush is $GLOBALS['argv'][0] (which must always be correct on the local machine), and on remote machines the default is just 'drush' (which will be correct whenever 'drush' is in the path, which seems like the most likely choice given no explicit information about the remote machine).

2. I added two convenience routines, like so:

function drush_backend_set_result($value) {
  if (drush_get_context('DRUSH_BACKEND')) {
    drush_set_context('BACKEND_RESULT', $value);
  }
}

function drush_backend_get_result() {
  return drush_get_context('BACKEND_RESULT');
}

Any function can call 'drush_backend_set_result', and the object passed in will be stored in the BACKEND_RESULT context if drush was called with the --backend flag. drush_backend_output then uses drush_backend_get_result like so:

  $result_object = drush_backend_get_result();
  if (isset($result_object)) {
    $data['object'] = $result_object;
  }

This can be used as in the following example:

function drush_sql_conf() {
  $db_spec = _drush_sql_get_db_spec();
  drush_backend_set_result($db_spec);
  drush_print_r($db_spec);
}

In other words, stdout is the human-readable output for the command, and the 'object' entry in the returned array is the machine-readable php object that is equivalent to the human-readable output. drush_backend_set_result would of course be optional, and would be used only by drush commands designed to provide information to a remote drush instance.

The full patch containing my modifications to backend.inc are attached. Please let me know if you think these extensions are acceptable; if they are, I will go ahead and use them to extend the functionality of site aliases to fetch db settings from the remote drush instance.

Thanks,

- Greg

adrian’s picture

2. backend.inc nicely packages up a JSON-encoded data structure to return to caller, where it is nicely unpackaged again; however, the command output is always returned as a simple string. This is fine for most D6 implementations, where $db_url is a string, but it leaves much to be desired in D7, where $databases is a nested array. (Even D6 has some support for multiple databases, but this may be vestigial; I've never tried to use it, anyway.) I could always package up a JSON-encoded string and write it to stdout; the result would then be a JSON-encoded string embedded inside a JSON-encoded string. If I closed my eyes and tried not to visualize it, that might be okay, but I wanted something cleaner.

umm. you'll find that json_encode is recursive. i send entire filesystem trees in aegir with it. it does however turn all objects into associative arrays (which is actually very useful)

patch also no longer applies.

greg.1.anderson’s picture

Yes, I do know that it is technically possible to put a JSON-encoded string inside of the results in backend.inc; my objection is merely one of style and complexity. backend.inc currently takes all of the output from stdout and puts it into the result variable. To put a JSON-encoded string into the output, you would either need to surround it in start / end marks (as backend.inc already does for its result string) and strip it off on the other end, or just put it in 'naked' and hope for the best (which would work fine unless there was stray output on stdout, in which case it breaks). Putting in a naked JSON string seemed to fragile, and duplicating the output wrapping and unwrapping seemed inelegant. In either case, you would need a new version of "sql conf" that output JSON-encoded output, or a separate arg to that controlled the output style. Overloading --backend to control both (that is, make "sql conf" output JSON when --backend is specified) interferes with code that really does want the human-readable version returned remotely. Overall, I think I like it the way I did it better, but I could re-implement as a wrapped JSON-encoded string; there's just so much more duplicated infrastructure that way.

I re-rolled the backend.inc patch and the entire patch as well, but I want to test them before posting.

greg.1.anderson’s picture

Okay, here we go. Since I was in the middle of implementing the db spec fetch, re-rolling and testing the patch turned into finishing the implementation; it now works pretty well. If you do not specify 'db-url' or 'databases' in an alias, it will be fetched via backend_invoke (remote aliases) or read from settings.php (local aliases).

I also renamed a couple of functions

  • 'site alias', aliased to 'sa' (formerly was 'alias print')
  • 'list sites', aliased to 'ls' (formerly was 'alias list')

Also, you can now do the following:
drush sa `drush ls`
This quick little construct will print out a site alias record for every alias and every site in the 'sites' folder.

drush-HEAD is the whole patch, naturally. The backend.inc patch is just for Adrian, for isolation testing; it is included in drush-HEAD. The sitealias text file should be renamed to sitealias.drush.inc and placed in "commands/sitealias".

- Greg

moshe weitzman’s picture

Outstanding work ... I just found trivial issues after a code review. I am ready to commit this after one more patch from Greg. If Adrian cannot review this soon, he will just propose changes as needed. Meanwhile, I'll do some testing.

+++ example.drushrc.php	1 Oct 2009 22:16:25 -0000
@@ -81,3 +81,62 @@ $options['structure-tables'] = array(
+ * using the "drush alias print" command.

should be 'drush site alias print'?

+++ drush.php	1 Oct 2009 22:16:25 -0000
@@ -70,6 +70,31 @@ function drush_main() {
+      // Process a remote command if 'remote-host' option is set.

Perhaps move this hunk into a new drush_remote_host(). Lets keep special case code out of drush_main().

+++ example.drushrc.php	1 Oct 2009 22:16:25 -0000
@@ -81,3 +81,62 @@ $options['structure-tables'] = array(
+ *   in the 'sites' folder.  Behaves like the "-l" command-line flag.

FYI, -l should have an http:// at the beginning. Perhaps we should tone down the comparison as it isn't a perfect match.

+++ example.drushrc.php	1 Oct 2009 22:16:25 -0000
@@ -81,3 +81,62 @@ $options['structure-tables'] = array(
+#      '!data' => '/my/arbitrary/path',    )

This one is not documented above. Is it a 'custom' path-alias? I guess thats OK.

+++ commands/core/core.drush.inc	1 Oct 2009 22:16:25 -0000
@@ -89,13 +89,24 @@ function core_drush_command() {
+      '--all-paths' => 'If <source> and <destination> are site aliases, and neither <source> nor <destinatiln> specifies a path component, then rsync all path aliases that are defined in both the source and destination alias.',

typo - destinatiln

+++ commands/core/core.drush.inc	1 Oct 2009 22:16:25 -0000
@@ -348,16 +359,26 @@ function drush_core_cache_clear() {
+  // One of $souruce or $destination (at most) will be remote;

typo - $souruce

+++ commands/core/core.drush.inc	1 Oct 2009 22:16:25 -0000
@@ -366,15 +387,307 @@ function drush_core_sync($source, $desti
+  // merge the options specified by the caller with the options specified on the command line

Comments should start with uppercase and end in a period as per coding standards.

+++ commands/core/core.drush.inc	1 Oct 2009 22:16:25 -0000
@@ -366,15 +387,307 @@ function drush_core_sync($source, $desti
+  foreach ($rsync_available_options as $test_option) {

Bail if user specified unrecognized option? Catches typos? Just a thought.

+++ commands/core/core.drush.inc	1 Oct 2009 22:16:25 -0000
@@ -366,15 +387,307 @@ function drush_core_sync($source, $desti
+  $path = dt($path, $path_aliases);

I think we want strtr() instead of dt() here. dt() is for translation texts.

+++ commands/core/core.drush.inc	1 Oct 2009 22:16:25 -0000
@@ -366,15 +387,307 @@ function drush_core_sync($source, $desti
+function _drush_core_is_local_machine($alias) {

Some new functions in this patch feel more like API functions that belong in drush.inc or a new sitealias.inc. If you disagree, then perhaps lets break sync out into its own file like updatecode.inc

+++ commands/sql/sql.drush.inc	1 Oct 2009 22:16:26 -0000
@@ -91,6 +93,37 @@ function sql_drush_command() {
+      '--db-repair-script' => 'Path to script to execute to repair the source database dump prior to import at the destinarion.  Repair is necessary if the source and destination have different URIs or different Drupal root paths.  The following replacements are done to the repair script arguments: !input, !output, !source-db, !target-db, !source-uri, !target-uri, !source-root, !target-root.  The script receives input on stdin, unless !input is used, and stdout is written to the target file unless !output is used.',

Perhaps db-repair-script should be db-prepare-script since one could do non repair work there. Lets eventually document an example script or even ship with one.

typo - destinarion.

+++ commands/sql/sql.drush.inc	1 Oct 2009 22:16:26 -0000
@@ -150,75 +198,93 @@ function drush_sql_dump_execute() {
+  // Dump only the specified tables.  Takes precidence over skip-tables and structure-tables.

precedence

+++ commands/sql/sql.drush.inc	1 Oct 2009 22:16:26 -0000
@@ -227,26 +293,64 @@ function drush_sql_dump($db_spec = NULL)
+// look up the appropriate list of tables in the given catagory.

category

+++ commands/sql/sql.drush.inc	1 Oct 2009 22:16:26 -0000
@@ -256,6 +360,377 @@ function drush_sql_query($query) {
+  $source_database = drush_get_option('source-database') ? drush_get_option('source-database') : 'default';

Just do drush_get_option('source-database', 'default'). Adrian thinks of everything.

+++ commands/sql/sql.drush.inc	1 Oct 2009 22:16:26 -0000
@@ -256,6 +360,377 @@ function drush_sql_query($query) {
+    // sql sync once as ussual to sync from the default to

typo ussual

+++ commands/sql/sql.drush.inc	1 Oct 2009 22:16:26 -0000
@@ -256,6 +360,377 @@ function drush_sql_query($query) {
+  // all of the slave databases from the source.

the term 'slave databases' confused me here. Perhaps 'destination databases'?

+++ commands/sql/sql.drush.inc	1 Oct 2009 22:16:26 -0000
@@ -256,6 +360,377 @@ function drush_sql_query($query) {
+        $source_dump = tempnam(sys_get_temp_dir(), $source_database . '.sql.');

again, drush_get_option('source-dump', tempnam(sys_get_temp_dir(), $source_database . '.sql.'))

+++ commands/sql/sql.drush.inc	1 Oct 2009 22:16:26 -0000
@@ -256,6 +360,377 @@ function drush_sql_query($query) {
+        drush_log(dt('modification time is less than !cache hours old.', array('!cache' => $cache)));

Lets capitalize first letter in log messages too.

+++ commands/sql/sql.drush.inc	1 Oct 2009 22:16:26 -0000
@@ -256,6 +360,377 @@ function drush_sql_query($query) {
+    $repair_exec = $input_from_stdin . dt($database_repair_script, $script_replacement_array) . $redirect_to_output_file;

again, use strtr() or str_replace().

+++ commands/sql/sql.drush.inc	1 Oct 2009 22:16:26 -0000
@@ -369,6 +827,20 @@ function _drush_sql_get_db_spec() {
+    $database = drush_get_option($prefix . 'database') ? drush_get_option($prefix . 'database') : 'default';

use $default param to drush_get_option()

+++ includes/environment.inc	1 Oct 2009 22:16:26 -0000
@@ -436,6 +439,242 @@ function _drush_bootstrap_drush() {
+function _drush_process_site_alias() {

I'm not clear on what we are doing here. Please add doxygen.

+++ includes/environment.inc	1 Oct 2009 22:16:26 -0000
@@ -436,6 +439,242 @@ function _drush_bootstrap_drush() {
+function drush_get_alias_record($alias, $db_settings_needed = false) {

This function is a bit multi-purpose. Perhaps even more docs explaining why we do this, not just what we do.

+++ includes/environment.inc	1 Oct 2009 22:16:26 -0000
@@ -436,6 +439,242 @@ function _drush_bootstrap_drush() {
+  // BUG WORKAROUND:  if we use 'drush_locate_root' we will set 'uri' after 
+  // drush_bootstrap_drush, which works.  If we use drush_get_context('DRUSH_DRUPAL_ROOT'),
+  // then we set 'uri' after drush_bootstrap_drupal_root, which seems like it should work
+  // but doesn't.

Did you ever figure this out?

+++ includes/environment.inc	1 Oct 2009 22:16:26 -0000
@@ -436,6 +439,242 @@ function _drush_bootstrap_drush() {
+function drush_set_alias_record_element(&$alias_record, $key, $value) {

Please doc this functon and the ones below it.

This review is powered by Dreditor.

greg.1.anderson’s picture

Thank you for the thorough review. I'm away for the weekend, but I will be able to quickly take care of these issues sometime early next week.

- Greg

owen barton’s picture

Got a chance to review this from a functional point of view (I'll leave the backend change review to Adrian), and I think it looks really awesome.

Some thoughts and questions:

- I think --db-repair-script and --simple-repair should be dropped for now. I think regexing a dump is simply the wrong place to do these kind of fixes (experience with query rewriting has shown us that regexing SQL at all is a PITA). I think that we should load the database and then apply fixes. Drush should handle all fixes for Drupal core, and we can leave it to contrib to implement fixes for contrib modules. If we can get it working reliably, a table/field walker scattergun approach may be worth looking at - at least we can properly handle serialized data from this position.

- Can you describe how hard it would be to extend the aliases a little further, so that settings can (optionally) be parsed and loaded from a remote site. This would reduce the amount of configuration a fair amount, and make it less sensitive to (for example) monthly MySQL password changes. I am thinking that these could be overridden or added to by local configuration - so for example you could have an $options['site-aliases']['stage']['remote-config'] element that contains a command to run via ssh and return a configuration that will fill in any empty local elements. Done right, this could allow settings to be picked up from a static file (if you don't want to run drush on a server), or from the output of a remote drush sa command, or from some server wide script that is able to deduce a standard vhost/db layout and config based on the username. I don't think this should be part of the patch, but I think it would be good to identify where this could live, and put a TODO in there :)

- Another future use case to consider is having a step on the remote server to remove personal or confidential data before returning the database dump. The easiest way of doing this (for the reasons explained in my first comment) is to copy the requested database into a temporary schema and make the modifications there before dumping the data. Having this done by a drush command on the remote server (with a pluggable framework, so that modules can add their own cleansing routines, based on their knowledge of the data) is one approach, but there may be others. A simple way to approach this would be to have an option for specifying the remote database dump (e.g. mysqldump) command. It should then be straightforward enough for someone to write a script that parses the basic mysqldump parameters, does the copy, clean and then calls mysqldump for the temporary schema to the appropriate file name. Obviously more advanced patches could also talk to a remote drush instance directly. Does this approach sound feasible? If it is trivial enough it may be worth adding the option for the database dump command now.

- Yet another one (and these are all things that we actually need and do now with the pushdb script) to consider is syncing files directories. On multimedia sites these can end up stupidly big, and it is problematic and of little value to sync the entire thing. Hence, we have a script that lets you specify a maximum size of file you want to sync (date would likely be another parameter of interest), pulls down the list and passes it to rsync:

ssh w_${CLIENT}@${SITENAME}.${CADOTNET} "cd sites/${SITENAME}.${CADOTNET}/files; find -size -${FILESIZE} | grep -v civicrm/templates_c | grep -v ./js/ | grep -v ./css/" > files_lt_${FILESIZE}
rsync -avz --files-from=files_lt_${FILESIZE} "w_${CLIENT}@${SITENAME}.${CADOTNET}:sites/${SITENAME}.${CADOTNET}/files" files
rm -rf files_lt_${FILESIZE} files/js/* files/css/*

Anyway, apart from the first one I think we should NOT try and implement these now (in fact, I think some of them are probably separate contrib projects) - however I would be interested in folks thoughts on if there is anything in the current patch that would make these directions harder (it doesn't look like it from my brief read through the code), or anything simple we could change (e.g. adding a hook) to make them easier?

moshe weitzman’s picture

Update: removed a couple of items. Was editing wrong file.

Testing out the sitealias commands now ...

  1. Can we combine the two commands such that we do a listing when no arguments are supplied? Perhaps we can find a use for --long in sa.
  2. The help still refers to the 'alias' command. Use sitealias

Useful command IMO.

moshe weitzman’s picture

Testing the patch now. FYI to all testers - you need this patched version of drush on both sides if you are using sql sync and a remote site. Fortunately, you can use sql sync command to get it up there. Handy!

  1. Need to document the 'drush' element of a site alias. The example should point to the drush shell script, not to drush.php
  2. By default, it seems like the mysqldump on the remote site tries to save to the temp dir of the local site.
  3. Using sql sync in verbose mode, I am seeing harmless Notices "ob_flush(): failed to flush buffer. No buffer to flush". Coming from _drush_proc_open().
  4. I am seeing the whole sql dump echo to the terminal when it is finally imported. Does not depend on debug level. Kinda obscures log messages and such.
moshe weitzman’s picture

Responding to Owen's points.

1. I'm OK with dropping support for repair. You guys can decide it.
2. I think we already support 'remote configuration' as you describe. If you omit db-url in site spec, then we ssh to the remote server and run drush sql conf to get the db details. I don't really see the need to fetch from a plain text file. Also, lets remember that a drushrc.php can be in to source control and shipped to all members of your team. Thats a form of centralized configuration as well.
3. Yes, pre-dump cleanup is needed. I'm wondering though if we should support it in sql sync. That use case might be better handled with a cron job on the server that cleanses and saves the dump somewhere. Then this script can rsync down the dump and import using the --no-dump option. Just a thought.
4. That filesize restriction looks really handy. Too bad rsync can't do that natively.

moshe weitzman’s picture

I'm looking at the drush_main() change, and it looks like it is a way to run any command on a remote system. Is that right? But it does not accept a site alias? I'm missing something.

greg.1.anderson’s picture

Yes, you can run any command remotely with this mod (e.g. drush dev status will run drush status on the remote machine specified by the "dev" alias. Actually, "dev" does not have to be a remote machine; it can be another site on a multi-site install. (Aside: this effectively makes the "-l" optional, per Adrian's comments in #18 and #19.) The code to process the alias does not happen right there in drush_main, though; just like the other portions of the drush bootstrap, it happens elsewhere as the drush state is initialized.

The site alias is processed in the function _drush_process_site_alias(), which is called from _drush_bootstrap_drush() in environment.inc. This is one of the functions you already requested more comments for; I will provide them once I have a chance to sit down and address the issues raised since the last time I rolled the patch.

- Greg

greg.1.anderson’s picture

StatusFileSize
new6.87 KB

The task is longer, and my available time scarcer than I supposed; I'm still working on comments and factoring. In the meantime, here is the new sitealias command. With no parameters, it lists the known sites. With --long, it lists them as site specifications. With --full, it prints full alias records. Unlike the old 'alias print' command, sa with no parameters lists all known sites.

For factoring, I was considering making a new 'sync.drush.inc' file for 'sync' and 'db sync' (renamed from 'sql sync') to live. Let me know if you'd like to see that happen, or if you'd prefer that I keep 'sync' and 'sql sync' as they are.

moshe weitzman’s picture

File sync and db sync are sufficiently different in my mind that they should not be in same file. If each of those commands gets sufficiently complex, they can go into their own file like updatecode.inc ... Thanks for working on this.

greg.1.anderson’s picture

This one might be ready to go; please give it a try and let me know if you find any blocking issues. I still want to polish the site alias command, and I never did get rid of the SQL command echoing during sql sync, but "the perfect is the enemy of the good". I will continue to refine this code as time goes on.

IMPORTANT NOTE to those who have tried this code before: If you have any existing site aliases, please rename and !drush entries in your path aliases to !drush-script. I renamed this alias so that I could use !drush as the alias to the folder that drush lives in. If !drush is not defined, it defaults to the same path that drush is installed to on the current machine. Therefore, without doing much of anything, you can update drush remotely via:

drush sync \!drush stage:\!drush

I renamed sync to rsync and aliased it back to sync again. This prevented Drush's "magic dispatcher" from running both sync and sql sync when I factored them into separate files. There are probably other solutions that would have worked as well.

See my comment on ob_flush in the patch. I found the source of this, but was reluctant to change the file in case there was some side effect. The warning is innocuous.

Owen made some good points, but I didn't have time to address them. I agree with Moshe's comments. I didn't even take the time to remove the repair script. It would be okay if that disappeared; I'm not sure that I'm going to use it long-term. Short-term it is vaguely useful to me. The comments on these features should be shorter. Actually, I just noticed that I broke this feature. I switched to strstr as suggested, but this should have been str_replace (fixed in drush sync, but not here -- no time to remove this now).

I just backed out the feature that let you put an alias in the -l / -uri parameter.

All other issues commented above, I believe, are fixed.

- Greg

greg.1.anderson’s picture

Status: Needs work » Needs review
greg.1.anderson’s picture

StatusFileSize
new44.24 KB
new13.35 KB

Minor update:

1. re-rolled against latest commits
2. sql sync now quieter on import
3. used string_replace in prepare script

Also, some answers to a couple of questions that I missed above:

By default, it seems like the mysqldump on the remote site tries to save to the temp dir of the local site.

Yes, this was by design. If the temp dir of the local site is not in sync with the temp dir of the remote site, then you can set the dump file location in the site alias to avoid using the temp file feature. As a side benefit, this also allows you to use the cache feature, which is not available when using temp files. I could ask the remote machine to create the temp file, but that would require one more call to ssh (and one more password entry, if you don't have your credentials set up yet). I felt it wasn't really worth the effort to put this code in, considering the overhead.

the term 'slave databases' confused me here. Perhaps 'destination databases'?

A d7 'slave database' is a mirror of the main (default) database of the named database set. Usually, the slaves are set up to mirror at the database level, but if you haven't configured your database to do this yet, this drush feature will sync up all of the slave databases to the main by dumping the main and copying them over all of the slaves. This feature isn't super-critical, IMO.

Some new functions in this patch feel more like API functions that belong in drush.inc or a new sitealias.inc. If you disagree, then perhaps lets break sync out into its own file like updatecode.inc

So, I split out the larger functions into sitealias.inc, and the two commands went into rsync.inc and sync.sql.inc. The problem with my factoring is that it requires sql/sync.sql.inc to make a cross-sectional include to core/rsync.inc, which is inelegant (if not prohibited by policy). Moving the drush_core_call_rsync function to includes/drush.inc would take most of the contents of the rsync file with it. Thoughts?

Bail if user specified unrecognized option? Catches typos? Just a thought.

The reason I have individual options here instead of just using "extra" is that it allows site aliases to specify default options to rsync that can also be overridden on the command line. In order to detect unrecognized options, I would also have to be able to filter our drush global options, which means that the sync function would break every time more of these were added. I didn't think that was ideal.

Overall, though, I implemented these as I thought would probably be best. It would be nice to commit this so that my differences and conflict lists could be smaller. Please let me know if there's anything in these items, or anything else you need prior to doing that.

Thanks,

- Greg

moshe weitzman’s picture

Status: Needs review » Active

Committed with minor tweaks. Very nice work, Greg.

Here are my notes. I'd like to see folks tackle these in the coming days and weeks.

  1. Respond to Owen's comments in #72 and my replies in #74. Remove both repair script features.
  2. drush sa --full gives Fatal error: Call to undefined function drush_get_sitealias_record()
  3. new installcore command needs to use sitealias. moshe will do this one.
  4. remove sql load command. any objections? could we simplify our db_spec code as a result?
  5. remove slave databases feature
  6. document how to run remote commands. for now, i guess this goes to the README. ultimately, this feels like a `man` entry for drush. is that feasible and desireable? or maybe we add a way to read 'topics' which are helps texts not associated with a single command. Or we add this topic to our drush.api.php. So many choices.
  7. address the todos in the code like // TODO: IF FAILURE THEN ABORT
  8. More validation. for example, sql sync command currently expects mysqladmin and sed on the $PATH.
greg.1.anderson’s picture

Thanks!

1. I'll remove the repair scripts.

2. Doh. I was about to say 'works on my machine', but does indeed seem to be broken. I'll do a clean checkout and fix.

4. I leave this to you. I have no objections, but it will break people who use sql load. You might want to leave it in for a while to give people a chance to migrate to sql sync.

5. I will remove the slave db feature.

6. I promised to document this on drupal.org once the code was committed. Since it now is, I'll move this to my active to-do list.

7. Will do.

8. What sort of validation do you want to see? Potential solutions seem tedious; requiring an appropriate $PATH on the remote machine seems okay to me. I could put in the full path to the commands, but there is some chance that they may be installed in different locations on different machines. 'sed' should always be at '/bin/sed', but then again, 'bin' should always be on $PATH, unless $PATH is empty. How would you like to see this handled? Call each remote command via its full path? Hard code the path to common values, or assume that remote and local machines are about the same and use 'which sed' to find the path to sed?

- Greg

greg.1.anderson’s picture

StatusFileSize
new6.63 KB

Update: #1, #2 and #5 done in this patch. (Still owe you the other half of #1: a reply to Owen's issues.)

moshe weitzman’s picture

Committed. I also removed the docs for the two options related to prepare-script.

moshe weitzman’s picture

I checked in one minor bug fix. Here are a few observations:

  1. Instinctively, I put my first site aliases into my site's drushrc.php. It turns out that this does not work. You can only define sitealiases in your homedir or on a command line specified config file. In some ways, this makes sense. After we bootstrap the site, could we again check to see if user wants to do a remote command?. I would think that this happens right after drush_load_config('site').
  2. perhaps we should get rid of !drush and derive it from dirname(!drush-script)? there is probably a reason your didn't do that. feels a little odd to speify them both ... i agree that `drush sync \!drush stage:\!drush` is neat.
  3. I'm bummed that most people will not specify a !dump dir and thus will lose the benefits of the cache on the dump side and the rsync on the receiving side. What do you think about requiring a !dump directory, at least for the receiving side.
greg.1.anderson’s picture

  1. You are right; you cannot define site aliases in site config files. An earlier version of site aliases attempted to initialize aliases at every stage of the bootstrap. However, I changed it to the way it is now because aliases that change the drupal root after we bootstrap the drupal root would require us to run through the bootstrap process again. From a code standpoint, this is very awkward due to the way the drush bootstrap works today. Aliases in site config files that do not change the drupal root work, though; I could perhaps put in a secondary alias check after we bootstrap sites, and require that the drupal root not change if the current bootstrap phase is > DRUSH_BOOTSTRAP_ROOT. Then you could put in aliases to remote machines, or other sites in the same sites folder in site config files. (I sometimes have multiple drupal roots for isolation testing, but I imagine that for most people, multiple roots is probably rare.)
  2. You rarely need to specify !drush-script; if $PATH is correct, !drush-script defaults to 'drush', and if '!drush' is defined, then !drush-script defaults to '!drush/drush', which should always be correct. I could check to see if !drush-script is defined, and initialize '!drush' from 'dirname(!drush-script)' if it was not explicitly defined. The rest of the code's behavior would not have to change, and you wouldn't have to define both unless you had an exotic setup.
  3. What if I added a flag '--temp' that indicated that drush sql sync should use a temporary file? Then we could fail unless we had !dump or --temp. This 'nag feature' could get more people to initialize !dump. (Alternately, I could print a nag warning message instead of failing).
moshe weitzman’s picture

1. Yes, I would like the secondary alias check as you describe.

2. Fact is, your rarely need to specify !drush-script or !drush. I'm just looking out for all the type A people who want to fill out a full alias record and they will see both of these items and think WTF. I'm just thinking that we should only have !drush_script. Is !drush useful for anything besides rsyncing drush like in your example? If not, then perhaps folks can just make up their own path alias for this? Not sure. Perhaps we just do nothing for now.

3. A nag message instead of failing sounds good here.

greg.1.anderson’s picture

Regarding #2, how about if I change the code as I suggested in #88, and then either (a) change the documentation to emphasize that you do not need to specify both path aliases, or (b) remove the documentation for !drush. (b) sort of obscures the drush sync !drush trick, though...

greg.1.anderson’s picture

Edit: Question removed; I thought of a better way to do it.

greg.1.anderson’s picture

StatusFileSize
new9.75 KB

Edit: Do not apply this patch; it is superseded by #94 below.

Here is a patch for your consideration. Site aliases defined in site config files work, but only if they have the same drupal root as the site the config file appears in. This limitation is caused by the fact the it is not possible to bootstrap the drupal root twice.

Edit: This limitation is difficult to overcome because the function "conf_path()" is pulled in by the drupal root bootstrap; if you don't bootstrap the drupal root, then you can't load additional drush settings from the drupal site folder, so once you have bootstrapped to the site, you cannot switch to a new drupal root.

This patch is a little rough; I can clean it up a bit if you want to keep it, but I want to know what your thoughts are first.

- Greg

greg.1.anderson’s picture

Edit: Ignore this; the solution in #94 is better. Keeping this here for reference, though.

Here is an alternate idea on how to solve the site alias problem.

  • Move the 'locate drupal root' operation to the drush bootstrap phase (put off the rest of the drupal root bootstrap operations until later)
  • Find all of the drushrc.php files anywhere under the drupal root
  • Include each one, and array_merge the site aliases from $options into the site aliases in the drush contect
  • unset $options at the end

The correct site drushrc.php file would still be found and included during the site bootstrap.

The advantage of this solution is that all aliases would work cleanly (e.g. you can set the drupal root in a site configuration file).

The disadvantage of this solution is that it would introduce a new limitation to site configuration files, as you would now have problems if you defined functions or introduced other side effects besides setting $options in one of these files. While I expect this to be rare, any drush user doing this today would be hurt by this change.

Please let me know which way you think it better: #92, #93, leaving it unfixed, or perhaps some other technique I haven't thought of yet.

- Greg

P.S. I also considered loading site aliases from a new, different configuration file stored in each site folder, following the techniques above (call this "#93b"). This is, perhaps, counterintuitive, as people will expect to put their aliases in drushrc.php. But maybe we do -both- #92 and #93b, with the warning message in #92 providing the name of an alternate file that the alias can be placed in...

greg.1.anderson’s picture

Wait, I've got it.

The code will proceed as in #92. If the alias changes the drupal root, then I will set the 'remote-host' option to 'localhost', forcing the command down the remote-execution code path in drush.php. I will special-case 'remote-host' == 'localhost' just before calling drush backend invoke, and substitute a local exec instead. If the drupal root does not change, then I'll re-bootstrap the drush site as #92 currently does.

That should work cleanly. Give me some time to code that up; I'll post it here and set the status of this incident to 'needs-review' when it's ready.

- Greg

greg.1.anderson’s picture

Status: Active » Needs review
StatusFileSize
new10.16 KB

Here is the patch for #94. It only differs from the patch in #92 by a few lines, but it clears up all of the open issues from above nicely and works well.

Please review; I recommend committing if you find no issues with it.

- Greg

moshe weitzman’s picture

Ideally, the code comment explains WHY we do this. What we do is well explained there already - thanks.

moshe weitzman’s picture

greg.1.anderson’s picture

Finally replying to Owen's comments in #72...

  1. Per Moshe, the repair scripts have been backed out as you suggested. I think this is for the best.
  2. If you take a look at the code that fetches the db_url from a remote machine (see drush_sitealias_get_record in includes/sitealias.inc), you will see the mechanism where we can fetch a php object from a remote drush instance using drush backend invoke. The invoked command uses drush_backend_set_result to pass the php object result back (see drush_sql_conf for an example), and the caller obtains the object from $values['object'], where $values is the result from backend invoke. It would be easy to use this same mechanism to load settings from a remote machine. I would put this facility in the Drush bootstrap routine _drush_bootstrap_drush just before the call to drush_sitealias_check_arg. You would need a new drush command to call on the remote machine that would return either the value of the $options variable from one config file, or perhaps the entire $cache object from context.inc. It wouldn't take too much for the receiver to then integrate either $cache or $options, as appropriate. I think I'd go with $options, as there might be too much in the whole $cache. It would probably only take a dozen or two lines to implement. However, I think that Moshe's suggestion -- to just check a drushrc.php into svn -- is better. Another option would be to get yourself started via "drush --remote-host=somewhere.com sa --full >> $HOME/.drushrc.php", and hand-edit from there. Also, as previously mentioned, the code will already fetch the db spec from the target machine if it is not included in the alias record, and this much is probably sufficient for most purposes (automatically sync with database name / password changes). So, in summary, I don't think I will pursue this route (fetching remote options every call), but it wouldn't be hard to implement if someone wanted to do it--in which case I'd be happy enough to help.
  3. I agree with Moshe on the issue of database fixup. For my part, I'd like to add that if you have a security concern about the contents of your database, it would be better to have a web service or cron interface that would clone the secure database to a work database on the same machine (perhaps via drush sql sync) and then fix up the records via the SQL interfaces. You could then allow developers to drush sql sync directly from the work database, or (perhaps more securely) sql dump the work database and only give access to the dump for import. This can be remotely imported via the --no-dump option of drush sql sync as Moshe previously suggested.
  4. Regarding telling rsync to skip large files, if it were me, I think I would want to place the large files in a particular path inside of the files section, and use the built-in rsync facility to skip files at a given path. I realise that this isn't necessarily realistic for existing file hierarchies, or in instances where users or schemas dictate where the large files are stored, but that would be the ideal situation. If this feature was widely desired, my preference would be to see it enhanced inside of rsync, but I have no insight into the rsync code complexity or the feasibility of making a contribution to that project. Because drush sync will pass through most rsync options, you could always continue to use your existing script above, but replace the call to rsync with a call to drush sync, and replace the ssh / find call with a remote drush call to a new drush command that executed the 'find' command for you. That would only be a little work, and you could then use your drush aliases to address your remote machines. Offhand, I don't think this should be an integral feature of drush sync, though.

Thank you, Owen and Moshe and Adrian for all of your feedback on this drush feature. The open issues are starting to wind down and I'm starting to make new incidents for some of the minor items due to the length of this thread. All in all, it's nearly "done", and the result is clearly much better than it would have been without all of your great assistance. Thank you so much for your help.

- Greg

moshe weitzman’s picture

Status: Needs review » Needs work

Review of most recent patch

  1. can we give 127.0.0.1 the same treatment as localhost? due to some networking problem on my mac, i have to use 127.0.0.1 on all my sites.
  2. typos: srupal, Additionaly
  3. Uppercase first letter and append period: re-bootstrap drupal site

I am satisfield with greg's replies to owen. I do think drush should tackle the 'omit large files' feature request. I have no particular opinion on how and where that should be done. Not urgent.

greg.1.anderson’s picture

Title: Remote site aliases for Drush (Proposal and preliminary implementation) » Remote site aliases for Drush (Proposal and implementation)
Status: Needs work » Needs review
StatusFileSize
new10.67 KB

Here is a new patch that addresses issues 1 - 3 in #99, above. It also fixes a bug with site aliases in site config files that change the drupal root; I'm not sure if it never worked and I just didn't notice, or if I broke it with some other change, but it definitely works here.

If I come up with a good idea for the 'omit large files' request, I'll submit a new issue.

moshe weitzman’s picture

It is OK to always do drush_set_option('remote-host', 'localhost'); even if local site prefers 127.0.0.1?

greg.1.anderson’s picture

Moshe,

In the instance you mention, the 'localhost' setting will never be shown to the local machine's network; it will be special-cased in backend invoke, and converted into a direct call to drush rather than an ssh-wrapped call to drush. The change you requested above, to also respect 127.0.0.1 in backend invoke, will only come into play should someone call drush with "--remote-host=127.0.0.1". There's no motivation to do that by hand, but I figured it might help if someone had a script that called drush with --remote-host set to one of its parameters, which might be 127.0.0.1 if that's what you habitually use on your local machine.

- Greg

greg.1.anderson’s picture

StatusFileSize
new13.66 KB

There's always a better way...

This patch is an improved version of the previous patch. I realised that I was checking for localhost in the wrong part of backend invoke, so I was not correctly testing against localhost in all parts of the function. This is now corrected, and I also factored out the localhost test to a function in environment.inc (moved from rsync.inc).

This fix unfortunately exposed an existing bug already discussed in #586466: Drush.php is no longer directly executable (can interfere with backend invoke). I put in my proposed fix for that problem in another patch that I attached to that incident; that fix is not included here. These two patches do not conflict; the patch attached to this issue will work on your system if drush_find_php is working on your system.

moshe weitzman’s picture

Status: Needs review » Fixed

Committed. Thanks.

moshe weitzman’s picture

OK, sounds good. Committed #100.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

dman’s picture

Component: Code » Documentation

For review:
I documented my own first-time walkthrough, based on what I could deduce from the --help info and (well-documented) example.drushrc to bring some light onto this work. It's all pretty cool.

Using drush to syncronise and deploy sites.

PLEASE correct or expand on anything I've written there!

ian-g’s picture

This is an excellent solution to deploying dev/staging/live sites, but I have a couple of issues:

1) I don't want some tables to go up from my dev site to my staging or live sites. These tables are the users tables (I have different administrator logins for different instances).
2) In order to get my environment working, I had to supply the db_url for all of my local and remote sites within the site_aliases array.

I'd really like to be able to keep my user access table intact depending on which site I am on. I realise that this may not be applicable to everyone, so an optional --exclude-tables parameter would be useful. User permissions aren't really an issue for me, as I only have a couple of users, and I can create them on each instance, but they might be for others, hence the optional --exclude-tables param.

Ian.

dman’s picture

I've found that the drush-rsync wrapper goes to great lengths to allow you to use any normal rsync options on the commandline, and passes them through when it actually calls rsync.
I imagine the drush-sql command could do the same (if it doesn't already) and pass any number of sql options through as it goes.
... you'll want to post this as a new feature request issue though, this thread is done