Using hostmaster 1.7

Steps to reproduce:

* Created server sles0045 and verified it
* Created server sles0044 and verified it
* Created platform d7sles0045 as /var/aegir/platforms/sles0045/drupal-7.12/ and verified it
* Installed web1.domain.com to sles0045 server using the d7sles0045 platform and minimal installation profile
* Created platform d7sles0044 as /var/aegir/platforms/sles0044/drupal-7.12/ and verified it
* Tried to migrate web1.domain.com to the sles0044 platform

It fails with the error: /var/aegir/platforms/sles0045/drupal-7.12/sites/web1.domain.com could not be removed from remote server sles0045. Changes might not be available until this has been done. (error: )

Full log here: http://pastebin.com/t6vSb2Xj

The rollback itself does not fully work either, because now I have nothing working. All files have been moved to the sles0044 server (the new platform), but Aegir still thinks web1.domain.com is using the d7sles0045 platform on the sles0045 server. See the screenshots. On sles0045, the site directory lists only settings.php.


Comments

Steven Jones

Title: Migrate can't delete settings.php » Migrate rollback shouldn't happen after a certain point
Project: Hostmaster (Aegir) » Provision
Version: 6.x-1.7 » 6.x-1.x-dev
Category: bug » feature

I'm going to re-purpose this issue. During a migrate we get past a point of no return, and at that point I think we should just blunder on rather than aborting, because aborting causes the frontend to get out of sync.

wroxbox

@StevenJones: Correct. In the situation I am having, the rollback actually breaks more than it fixes.

Steven Jones

I'm not 100% sure how we should go about fixing the issue, but I think the steps would look something like this:

  1. Identify the 'point of no return' after which, even if we fail, we have to plough on with the migrate.
  2. Put some mechanism in place for capturing errors that occur after this point. Instead of causing the Drush command to fail, these errors should be logged, and the error set via drush_set_error should be cleared.
  3. We probably need to make sure that Drush doesn't invoke the rollback hooks once we're past that point.
  4. We might want to factor out the deletion of the old site into a Drush command of its own, so that we can capture the error in that command and ignore it if we want.
  5. The frontend should probably still report the command as having failed, but the migrate itself should be treated as having succeeded, since generally we'll have failed to delete the old site, not failed to migrate.
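
The control flow in steps 1 to 3 could be sketched roughly as below. This is a language-neutral illustration written in Python, not Drush or Provision code: `run_migrate`, `MigrateError`, the step tuples, and the `point_of_no_return` index are all hypothetical names, and where the point of no return actually lies in a real migrate is exactly what step 1 still has to work out.

```python
class MigrateError(Exception):
    """Raised when a step fails before the point of no return."""


def run_migrate(steps, point_of_no_return):
    """Run (name, action, rollback) steps in order.

    A failure before index `point_of_no_return` undoes the completed
    steps in reverse order and aborts. A failure at or after that index
    is only logged: no rollback is invoked and the migrate carries on.
    Returns the list of logged post-commit errors.
    """
    completed = []
    warnings = []
    for index, (name, action, rollback) in enumerate(steps):
        try:
            action()
        except Exception as exc:
            if index >= point_of_no_return:
                # Past the point of no return: log it and plough on.
                warnings.append(f"{name}: {exc}")
                continue
            # Still safe to abort: undo what has been done so far.
            for _, _, undo in reversed(completed):
                undo()
            raise MigrateError(f"{name} failed, rolled back") from exc
        completed.append((name, action, rollback))
    return warnings


# Hypothetical usage: deleting the old site fails after the switch.
log = []


def fail_delete():
    raise OSError("could not be removed from remote server")


steps = [
    ("copy site", lambda: log.append("copied"), lambda: log.append("uncopied")),
    ("switch platform", lambda: log.append("switched"), lambda: log.append("unswitched")),
    ("delete old site", fail_delete, lambda: None),
]
warnings = run_migrate(steps, point_of_no_return=2)
```

Returning the warnings separately, rather than raising, is one way a frontend could flag the failed deletion (step 5) while still recording the site on its new platform.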

pooja.sarvaiye

I was not able to reproduce this issue with the steps mentioned above. I could migrate a site from one server platform to another server platform successfully. Can you provide more information which might help in reproducing the bug?

Steven Jones

Probably the easiest way to reproduce it is to change a file within the Drupal files directory of the site you're migrating so that it is owned by 'root' and read-only; the deletion of the old site should then fail, I think.

pooja.sarvaiye

Making settings.php owned by root before starting the migration caused the migrate to roll back with descriptive error messages like this:
Could not change permissions of /var/aegir/platforms/aegir-master-D7/sites/site3-am2.example.com/settings.php to 640 (chmod to 640 failed on /var/aegir/platforms/aegir-master-D7/sites/site3-am2.example.com/settings.php)
Maybe the bug occurred because file permissions changed while the migrate was in progress.

Steven Jones

Yeah, that's not quite what I said; I think that would cause the migrate to be rolled back correctly.

You will need to create the file in the Drupal site's files directory.

Steven Jones

Version: 6.x-1.x-dev » 6.x-2.x-dev

It would be nice to get this done in 6.x-2.x.

ergonlogic

Version: 6.x-2.x-dev » 7.x-3.x-dev

New features need to be implemented in Aegir 3.x, then we can consider back-porting to Aegir 2.x.