Hi,

This is a bit of an awkward bug report because I can only say some functionality is broken in Drush 4.1 and 4.0 that works in 3.3 :/

When I do a remote server verify, in Aegir, Provision syncs some parts of the filesystem to the remote server with calls to Drush rsync.

In Drush 4, this yields an error that it couldn't sync the very first file:


aegir:~/config$ /var/aegir/drush/drush.php @server_aegirweb1mig5labsnet provision-verify --debug
(snip)
Executing: rsync -e 'ssh ' -azv --exclude=".bzr" --exclude=".bzrignore" --exclude=".bzrtags" --exclude=".svn" --stats --progr
ess --relative --keep-dirlinks --omit-dir-times --delete '/var/aegir/config/includes' 'aegir@aegir-web1.mig5-labs.net:/'
  sending incremental file list
  /var/aegir/config/includes/
  Number of files: 5
  Number of files transferred: 0
  Total file size: 28 bytes
  Total transferred file size: 0 bytes
  Literal data: 0 bytes
  Matched data: 0 bytes
  File list size: 130
  File list generation time: 0.001 seconds
  File list transfer time: 0.000 seconds
  Total bytes sent: 204
  Total bytes received: 19

  sent 204 bytes  received 19 bytes  446.00 bytes/sec
  total size is 28  speedup is 0.13
/var/aegir/config/includes could not be synced to remote server aegir-web1.mig5-labs.net. Changes might not be    [error]
available until this has been done. (error: sending incremental file list
/var/aegir/config/includes/

Number of files: 5
Number of files transferred: 0
Total file size: 28 bytes
Total transferred file size: 0 bytes
Literal data: 0 bytes
Matched data: 0 bytes
File list size: 130
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 204
Total bytes received: 19

sent 204 bytes  received 19 bytes  446.00 bytes/sec
total size is 28  speedup is 0.13) [0.28 sec, 9.41 MB]

Yes, my SSH keys are set up just fine between the servers.

The thing is: it *did* sync these dirs and the file inside it to the remote.

If I run that rsync command manually from the CLI (so no Drush), it also works fine.

If I revert back to Drush 3.3, it also works fine!

Executing: rsync -e 'ssh ' -azv --exclude="*.svn*" --stats --progress --relative --keep-dirlinks --omit-dir-times    [notice]
--delete '/var/aegir/config/includes' 'aegir@aegir-web1.mig5-labs.net:/' [0.11 sec, 7.63 MB]
  sending incremental file list

  Number of files: 5
  Number of files transferred: 0
  Total file size: 28 bytes
  Total transferred file size: 0 bytes
  Literal data: 0 bytes
  Matched data: 0 bytes
  File list size: 130
  File list generation time: 0.001 seconds
  File list transfer time: 0.000 seconds
  Total bytes sent: 163
  Total bytes received: 16

  sent 163 bytes  received 16 bytes  358.00 bytes/sec
  total size is 28  speedup is 0.16
/var/aegir/config/includes has been synced to remote server aegir-web1.mig5-labs.net. [0.26 sec, 7.64 MB]            [notice]

Finally, this isn't something 'special' related to /var/aegir/config/includes: That just happens to be the first set of dirs in an array of several others that attempts to be synced, so it's just borking at the very first one and exits (despite syncing quite successfully!)

I am happy to try and debug this further, all I know is it works in Drush 3.3 where it does not in Drush 4.0 or 4.1. Any tips where I could dig deeper? Something to do with an incorrect parsing of rsync's exit code or something?

Comments

mig5’s picture

Reverting this commit fixes it, in Drush 4.1:

http://drupalcode.org/viewvc/drupal/contributions/modules/drush/commands...

Don't really know why though :(

moshe weitzman’s picture

Assigned: Unassigned » greg.1.anderson
Priority: Normal » Major
greg.1.anderson’s picture

This function is documented to return TRUE on success, FALSE on failure, and that is what it does. I downloaded drush-4.0, and it appears to be consistent with drush-HEAD.

So, this function returns a different result code than it did in drush-3.x. All I can say about that is that I suck; I should not have made that change. Unfortunately, this API change was made in drush-4, so it should stay as it is until drush-5. By that point in time, I think it would be better to keep it as-is, rather than go back to the drush-3 definition.

Sorry that this is inconvenient for Aegir, but I recommend that you just change your test to match what drush-HEAD returns today.

mig5’s picture

Hi,

OK, thanks for that..

I guess I don't really know what we're meant to do. Something in Drush 4 reports a success as a failure: that's all I can tell.

It is hard for me to know what to fix in Aegir because I can't find any logic that suggests we need to return a different result. This is our code in question:


        if (drush_core_call_rsync(escapeshellarg($path), escapeshellarg($this->script_user . '@' . $this->remote_host . ':/'), $options, TRUE, FALSE)) {
          drush_log(dt('@path has been synced to remote server @remote_host.', array('@path' => $path, '@remote_host' => $this->remote_host)));
        }
        else {
          drush_set_error('PROVISION_FILE_SYNC_FAILED', dt('@path could not be synced to remote server @remote_host. Changes might not be available until this has been done. (error: %msg)', array('@path' => $path, '@remote_host' => $this->remote_host, '%msg' => join("\n", drush_shell_exec_output()))));
        }
      }

Is there something obvious here that we need to change? I just don't really understand why the call to drush_core_call_rsync is false here when it used to work. We send a sync, it actually succeeds in transferring the data, and yet the error-side of the conditional is reported.

I know it's not from any recent change in this code in Aegir either, as I just tested our last release with Drush 4.1 and get the same result.

Feel free to not answer since it isn't your code :) i'm just somewhat desperate since the people that really understand this part of the code don't work on Aegir anymore.

Otherwise just mark this as won't fix I suppose.

mig5’s picture

It baffles me that I can just change

return $exec_result == 0;

to

return $exec_result;

and it works as expected.

I guess my PHP is not as good as I thought it was, because I just don't see how it's coming back FALSE. Unless somehow the drush_shell_exec() ($live_output is false for us) doesn't expect to get a TRUE return (?!) . Please do go ahead and close this and I'll look into how we should be handling it.

rcross’s picture

as far as I can tell, you *should* change it back to

return $exec_result;

otherwise you're inverting the logic (and returning the results of that comparison, in which case it should at least be using "FALSE" instead of 0)

greg.1.anderson’s picture

The good news is that the Aegir code you quote is written correctly, so perhaps there is a bug here after all.

From drush_op_system:

 * @return
 *   The result code from system():  0 == success.

From drush_shell_exec:

 * @return
 *   TRUE on success, FALSE on failure

Looks like there is in fact an inversion of the result code here. I'll have to spend some time and make sure that drush is not checking for the wrong result code when calling this function, etc.

mig5’s picture

We have a similar issue related to conditionals depending on TRUE/FALSE reported over in #1047922: Drush 4.x and 5.0-dev/HEAD breaks Aegir in some (hidden) ways - paths in files not rewritten on clone/migrate/rename, again that are apparently mitigated by reverting to Drush 3.3.

Unsure what the issue in this one is though, just dropping it here in case it starts to emerge any patterns/clues.

greg.1.anderson’s picture

The changes mentioned in #6 (half right) and #7 can be made, as it does not appear that drush is very good at checking the result code from drush_core_call_rsync at all.

I will make some efforts to insure that drush is more consistent about checking for its own error codes; this will insulate downstream components from spurious errors (if they are found and fixed here first).

greg.1.anderson’s picture

Status: Active » Fixed

Committed a fix: http://drupal.org/cvs?commit=494608

Please re-open if there are still issues with drush rsync and Aegir.

moshe weitzman’s picture

Version: All-versions-4.1 » All-versions-4.x-dev
Status: Fixed » Patch (to be ported)
mig5’s picture

I can confirm this appears to have fixed it, tested with a fresh pair of servers and Drush straight from CVS.

Thanks Greg!

greg.1.anderson’s picture

Just a note: it appears that, per #1047922: Drush 4.x and 5.0-dev/HEAD breaks Aegir in some (hidden) ways - paths in files not rewritten on clone/migrate/rename, Aegir HEAD is now compatible with drush HEAD again.

mig5’s picture

Yep.

Our plan, discussed in IRC the other day, was to 'formally' shift our recommendation to use the next Drush 4 release over 3.3. If the next Aegir release occurs before the next Drush release, we'll install 3.3 by default still, just in case we have missed any other incompatibilities.

greg.1.anderson’s picture

Keep us posted on when you want to release Aegir; I think it would be worthwhile to push out 4.3 just to get Aegir on the 4.x branch.

mig5’s picture

Hi Greg,

I am planning on releasing aegir 0.4-rc1 today (it's 9am friday here).

We're testing Drush HEAD now with our HEAD.

Edit: have had good results upgrading to our HEAD with your HEAD by multiple testers :)

omega8cc’s picture

Just tested install and upgrade Aegir HEAD -> HEAD with Drush 5.0-dev/HEAD and it just works, no issues! :)

greg.1.anderson’s picture

Darn; I don't think we'll get 4.3 out today. I'm glad that Drush HEAD and Aegir HEAD are working together, though.

mig5’s picture

No problem - we released 0.4-rc1 anyway. It's likely there'll be another rc soon enough as people find bugs, so we'll sync up then :)

Cheers!

msonnabaum’s picture

Status: Patch (to be ported) » Fixed

Backported to 4.x. Please test!

http://drupal.org/cvs?commit=501898

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.