I keep getting a "migration failed" error.

Could not back up sites directory for drupal <----- This is where it fails
Returned from hook drush_provision_drupal_provision_backup
Removed stale backup file /var/aegir/backups/oldsite-20150728.204529.tar.gz <---- yet it is still able to remove the backup

I am totally confused on this one.

Comments

mshepherd’s picture

I'm seeing this same problem with a site I've imported from a previous Aegir version (version 1.x >> 3.x). I can neither back up nor migrate the site.

drush provision-backup
Changed permissions of /var/aegir/platforms/.../t7y.illuminateweb.org.uk/settings.php to 640      [success]
Generated config in write(): Drupal settings.php file                                                           [success]
(/var/aegir/platforms/.../t7y.illuminateweb.org.uk/settings.php)
Changed permissions of /var/aegir/platforms/.../t7y.illuminateweb.org.uk/settings.php to 440      [success]
Change group ownership of /var/aegir/platforms/.../t7y.illuminateweb.org.uk/settings.php to       [success]
www-data
Platforms path /var/aegir/platforms exists.                                                                     [success]
Platforms ownership of /var/aegir/platforms has been changed to aegir.                                          [success]
Platforms permissions of /var/aegir/platforms have been changed to 755.                                         [success]
Platforms path /var/aegir/platforms is writable.                                                                [success]
Changed permissions of /var/aegir/platforms/.../t7y.illuminateweb.org.uk/settings.php to 640      [success]
Generated config in write(): Drupal settings.php file                                                           [success]
(/var/aegir/platforms/drupal-7.32/sites/t7y.illuminateweb.org.uk/settings.php)
Changed permissions of /var/aegir/platforms/.../t7y.illuminateweb.org.uk/settings.php to 440      [success]
Change group ownership of /var/aegir/platforms/.../t7y.illuminateweb.org.uk/settings.php to       [success]
www-data
Platforms path /var/aegir/platforms exists.                                                                     [success]
Platforms ownership of /var/aegir/platforms has been changed to aegir.                                          [success]
Platforms permissions of /var/aegir/platforms have been changed to 755.                                         [success]
Platforms path /var/aegir/platforms is writable.                                                                [success]
***Could not back up sites directory for drupal                                                                    [error]***
Removed stale backup file /var/aegir/backups/t7y.illuminateweb.org.uk-20150805.105352.tar.gz                    [success]
Deleted mysql dump from sites directory                                                                         [success]

It seems likely this is a permissions issue, but I can't figure it out just yet.

mshepherd’s picture

I solved my issue.

There were a number of files in .../private/temp with www-data:www-data ownership and 600 permissions. These files weren't present in the imported site, so they must have been created after import. Changing the permissions on the files to 660 meant that I could back up and migrate the site.

.../private/temp# ls -la
total 44
drwxrws--- 2 aegir    www-data  4096 Aug  5 11:16 .
drwxrws--- 5 aegir    www-data  4096 May  2  2014 ..
-rw------- 1 www-data www-data 11732 Aug  5 10:08 fileA6VuET
-rw------- 1 www-data www-data  2175 Aug  5 10:08 fileHJph6P
-rw------- 1 www-data www-data     0 Aug  5 10:08 filejWqd5s
-rw------- 1 www-data www-data  7864 Aug  5 10:08 fileK6I4Bw
-rw------- 1 www-data www-data    76 Aug  5 10:08 fileKUZ87c
-rw------- 1 www-data www-data    51 Aug  5 10:08 filewe8EA9
-r--r--r-- 1 aegir    www-data   491 Aug  5 10:01 .htaccess
.../private/temp# chmod 660 file*
.../private/temp# ls -la
total 44
drwxrws--- 2 aegir    www-data  4096 Aug  5 11:16 .
drwxrws--- 5 aegir    www-data  4096 May  2  2014 ..
-rw-rw---- 1 www-data www-data 11732 Aug  5 10:08 fileA6VuET
-rw-rw---- 1 www-data www-data  2175 Aug  5 10:08 fileHJph6P
-rw-rw---- 1 www-data www-data     0 Aug  5 10:08 filejWqd5s
-rw-rw---- 1 www-data www-data  7864 Aug  5 10:08 fileK6I4Bw
-rw-rw---- 1 www-data www-data    76 Aug  5 10:08 fileKUZ87c
-rw-rw---- 1 www-data www-data    51 Aug  5 10:08 filewe8EA9
-r--r--r-- 1 aegir    www-data   491 Aug  5 10:01 .htaccess
mshepherd’s picture

I note that after the site was migrated, the files in that directory had aegir:www-data ownership and 660 permissions.
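
For anyone who wants to repeat the fix from #2 without typing the file names, here is a minimal sketch, assuming bash and a find that supports -maxdepth (GNU or BSD). The function name fix_temp_perms is hypothetical, and it has to run as the files' owner or as root, since aegir cannot chmod files owned by www-data. It skips dotfiles so that .htaccess keeps its read-only mode.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Aegir/Provision): add group read/write
# to regular files in a temp directory that lack it.
# Must be run as the owner of the files or as root.
fix_temp_perms() {
  local temp_dir="$1"
  # "! -perm -g+rw" selects files missing at least one of the group r/w bits.
  # "! -name '.*'" leaves dotfiles such as .htaccess untouched.
  find "$temp_dir" -maxdepth 1 -type f ! -name '.*' ! -perm -g+rw \
    -exec chmod g+rw {} +
}
```

Usage would be something like fix_temp_perms .../private/temp, with the real path to the site's private temp directory filled in.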

ergonlogic’s picture

The issue described in #2, which is by far the most common cause of backups failing, is due to a bug in Drupal core; see #2496173: file_unmanaged_save_data() doesn't clean up its temp files. The patches in that issue (for both D7 and D8) resolve this one, but must be applied to hosted platforms. Consider adding them to your makefiles.

FWIW, the backup task failure in such cases is intentional. Previously, errors in tarring up site files would be masked by a pipe in our system call. While this would be relatively harmless in the case of broken file permissions (it'd still block platforms from being deleted), it could also lead to corrupt backups.

Let us know if this fixes the issue for you.
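
To illustrate the masking described above: this is a generic bash sketch, not Provision's actual system call. failing_archiver is a hypothetical stand-in for tar hitting an unreadable file. Without pipefail, the pipeline reports only the exit status of the last command (gzip), so the archiver's failure is lost.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for an archiver that writes some data and then
# fails (for example, because one file could not be read).
failing_archiver() {
  echo "partial archive data"
  return 2
}

# Print the exit status of "archiver | gzip" with pipefail on or off.
# $1 is "on" or "off".
pipeline_status() {
  (
    set +e
    if [ "$1" = "on" ]; then
      set -o pipefail
    else
      set +o pipefail
    fi
    failing_archiver | gzip -c > /dev/null
    echo $?
  )
}
```

With pipefail off the function prints 0 (the failure is masked); with pipefail on it prints 2, the archiver's status.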

mshepherd’s picture

Thanks,
I'll need to migrate several more sites in the coming days or perhaps weeks, so I should have a chance to test this out. Many thanks.
Matthew

ergonlogic’s picture

Component: Debian package » Code
Status: Active » Postponed (maintainer needs more info)

@SocialNicheGuru, can you let us know if changing the permissions/ownership of the site's files helps here? Feel free to ping me in #aegir if you'd like some help checking this out.

SocialNicheGuru’s picture

Status: Postponed (maintainer needs more info) » Active

I did a chown -R aegir files
There were a number of files that I could not change ownership on
For those files I did a sudo chmod 775
but it didn't work for my tmp directory.
I had to delete all files in the private/files/tmp directory on the private file system for it to work :(

dnotes’s picture

In my case there were other files that were not readable or writable by the user. (I'm trying to use boa aegir on a local vagrant box with the static platforms folder shared via nfs and then fused into the file system with forced permissions, and it's a bit messy for file permissions.)

Anyhow, if it turns out that the files/tmp directory is not your problem, you may find other problem files in your sites folder using e.g. find . ! -perm -u+r for files the owner cannot read, or find . ! -user o1 for files that are not assigned to the correct user (o1 here stands for the expected owner). Nice to know that this seems to be mainly a file ownership/permissions problem. Thanks for this issue.
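
The two checks can be combined into one pass. A minimal sketch, assuming bash; the function name find_backup_blockers is hypothetical, and the owner is passed in rather than hard-coded:

```shell
#!/usr/bin/env bash
# Hypothetical helper: list paths under a site directory that are either
# not readable by their owner or not owned by the expected user.
# $1 = site directory, $2 = expected owner (user name or numeric uid).
find_backup_blockers() {
  local site_dir="$1" owner="$2"
  find "$site_dir" \( ! -perm -u+r -o ! -user "$owner" \) -print
}
```

For example, find_backup_blockers . aegir run from the site's directory; an empty result means neither check found anything.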

omega8cc’s picture

We have reverted (in BOA HEAD) the patch from #2377819: Gzipping backups suppresses file permissions errors which caused just too many pseudo-problems / support tickets for us.

EDIT: I have explained why the fix is too aggressive and causes more problems than it solves in this comment.

MrAdamJohn’s picture

As noted in #2 above, running chmod 660 on the files in [path]/private/temp worked perfectly.

Is there work being done to carry the underlying bug fix into provision? Ping me off thread if needed.

Thanks, @mshepherd and @ergonlogic!

helmo’s picture

Status: Active » Closed (works as designed)

I think the catch all solution here will be in #2616426: Add 'fix permissions' task

colan’s picture

Title: Cannot migrate sites: Could not back up sites directory for drupal » "Could not back up sites directory for drupal " due to permissions on temp files
Status: Closed (works as designed) » Needs review

I don't believe there's any reason to back up temporary files. We can avoid the permissions issues altogether if we simply exclude the temporary files directory from each backup.

This has the happy side effect of reducing the size of backups (even if the permissions are correct).

Can we not simply do something like this?
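
The attached patch is not reproduced in this text. As a generic illustration of the idea only (plain tar from the shell, not Provision's code), excluding the temp directory could look like the sketch below. The function name backup_site is hypothetical, and the path private/temp is taken from the listings earlier in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: archive a site directory but leave out private/temp.
# $1 = site directory, $2 = path of the .tar.gz to create.
backup_site() {
  local site_dir="$1" archive="$2"
  # --exclude has to come before the file arguments; the pattern matches
  # member names as archived, which start with "./" because of "-C dir .".
  tar -czf "$archive" --exclude='./private/temp' -C "$site_dir" .
}
```

Because the excluded directory is never opened, files in it with restrictive permissions can no longer make the archive step fail.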

bgm’s picture

+1 on the patch. It's a bit annoying to 1) run backup, fail, 2) run verify, 3) run backup again, hoping no files were created in the meantime.

colan’s picture

There's some debate about where the exclude option should go. I went with the one that worked for me, but didn't want to commit this yet until we're fairly sure it'll work in the general case. Otherwise, it'll break backups in other places.

So please review and test on your own systems.

colan’s picture

After letting automated backups run for a while, and inspecting the contents to ensure nothing was missing, I noticed that we're also archiving CSS and JS cache files as well as site-specific Git repositories. None of these should be included so I've added them to the exclusion list. Here's the patch for that.

For the site I was reviewing, the backup shrank from 27M to 9.3M. This makes sense because the newly excluded files are mostly already compressed, so they barely shrink inside the archive (unlike the DB).

bgm’s picture

If we migrate a site from one platform to another, wouldn't this be equivalent to deleting the local git repo?

Request: Would it be possible to add "files/civicrm/templates_c" to the list? It's a smarty cache from CiviCRM. Admittedly it's not Drupal-specific, but there are many Aegir CiviCRM users, and it would really help as well.

colan’s picture

Status: Needs review » Needs work

Good catch re: Git repo. We need to remove that one.

For the Civi thing, it might make sense to add a hook here, and then modules can add their own exclusions. There will be more.

colan’s picture

This should account for both of the above items.

bgm’s picture

Looks good, based on code review. Thanks for implementing the hook!

helmo’s picture

Status: Needs review » Reviewed & tested by the community

Works as expected in a quick test.

colan’s picture

Minor side effect: When cloning (and probably migrating), this is causing some warnings to show up because the temp directory is missing. This isn't actually a problem because it gets created later in the process. So maybe we should stop issuing these?

chgrp(): No such file or directory FileSystem.php:451
Failed calling chgrp() on /var/aegir/platforms/platform1/web/sites/site1/private/temp.
filegroup(): stat failed for /var/aegir/platforms/platform1/web/sites/site1/private/temp FileSystem.php:227
Could not change group ownership of temp files in /var/aegir/platforms/platform1/web/sites/site1/private/temp to www-data (chgrp to www-data failed on /var/aegir/platforms/platform1/web/sites/site1/private/temp)

For context, it happens right after this:

Changed group ownership of private files in /var/aegir/platforms/platform1/web/sites/site1/private/files to www-data

  • colan committed 9dd5635 on 7.x-3.x
    Issue #2542236 by colan: Excluded directories that shouldn't be in site...
colan’s picture

Status: Reviewed & tested by the community » Active

Committed #18 as there were no reported problems a while after RTBC.

Setting back to Active for #21.

memtkmcc’s picture

That's correct. Since private/temp is no longer included in backups and is only recreated later during migration, this check needs to be removed so that the (otherwise harmless) warnings in the task log don't cause confusion.

memtkmcc’s picture

Status: Active » Needs review
colan’s picture

But if it does exist, shouldn't the group still be changed?
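
The alternative raised in this question, changing the group only when the directory is present, could be sketched as below. This is an illustrative shell version only: Provision itself is PHP, the function name chgrp_if_present is hypothetical, and this is not the change that was committed.

```shell
#!/usr/bin/env bash
# Hypothetical guard: change the group of a directory only if it exists,
# so a missing directory produces no warning.
# $1 = directory, $2 = group (name or numeric gid).
chgrp_if_present() {
  local dir="$1" group="$2"
  if [ -d "$dir" ]; then
    chgrp "$group" "$dir"
  fi
  # When the directory is missing there is nothing to do and the
  # function returns success.
}
```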

memtkmcc’s picture

1. It will not be included in any backup anymore, so not sure how it could exist there?
2. Its group will be changed a moment later in function _provision_drupal_create_directories anyway
3. This check doesn't change anything, it's just a check, no longer relevant, at least no longer relevant directly after the archive is expanded and before function _provision_drupal_create_directories is run.

EDIT for #3 -- This check doesn't change anything, because the directory doesn't exist at this point, so it's just a check for the directory existence, which obviously fails.

colan’s picture

That explanation works for me.

memtkmcc’s picture

I realised my point #3 was not clear enough, so added this:

EDIT for #3 -- This check doesn't change anything, because the directory doesn't exist at this point, so it's just a check for the directory existence, which obviously fails.

colan’s picture

Status: Needs review » Fixed

Looks like this was fixed in 4d870e1c2df3d204c38437e98a4624dccc037db5.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.