An Aegir installation was backing up to /var/aegir/backups (hey, I didn't set it up, though I suspect this may be the default config anyway) and was consequently running low on disk space.

Eventually it attempted a backup when the disk was full. Part of the backup process is to temporarily rewrite the settings.php ... but you can't do that on a full disk. So site(s) went down when backups ran.

Initialized Drupal site example.org at sites/example.org
Loading drushrc "/var/aegir/platforms/drupal-7.12/sites/example.org/drushrc.php" into "site" scope.
Drush bootstrap phase : _drush_bootstrap_drupal_configuration()
Adding sites directory to /var/aegir/backups/example.org-20120406.104002.tar.gz
Temporarily uncloaking database credentials for backup
Template loaded: /usr/share/drush/commands/provision/platform/provision_drupal_settings.tpl.php
Changed permissions of /var/aegir/platforms/drupal-7.12/sites/example.org/settings.php to 640
Generated config Drupal settings.php file
Changed permissions of /var/aegir/platforms/drupal-7.12/sites/example.org/settings.php to 440
Change group ownership of /var/aegir/platforms/drupal-7.12/sites/example.org/settings.php to www-data
Re-cloaking database credentials after backup
Template loaded: /usr/share/drush/commands/provision/platform/provision_drupal_settings.tpl.php
Changed permissions of /var/aegir/platforms/drupal-7.12/sites/example.org/settings.php to 640

then

file_put_contents(): Only 0 of 5022 bytes written, possibly out of free disk space provision.file.inc:430
Could not generate Drupal settings.php file

uh-oh!

Changed permissions of /var/aegir/platforms/drupal-7.12/sites/example.org/settings.php to 440
Change group ownership of /var/aegir/platforms/drupal-7.12/sites/example.org/settings.php to www-data
Could not back up sites directory for drupal
Removed stale backup file /var/aegir/backups/example.org-20120406.104002.tar.gz
Changes made in drush_provision_drupal_provision_backup have been rolled back.
Deleted mysql dump from sites directory
Changes made in drush_db_pre_provision_backup have been rolled back.
Command dispatch complete

Filing against Hostmaster because I'm unsure of correct project.

Comments

xurizaemon’s picture

Verify could write the replacement settings.php to asdf.php first then move into place; that would give us an exit strategy if the write doesn't complete.

Obviously sensible backup management strategies aren't Aegir's domain, and a full disk is hardly Aegir's fault :)

xurizaemon’s picture

Project: Hostmaster (Aegir) » Provision
Issue summary: updated

it's ok if people laugh, i laughed

anarcat’s picture

I think you are right, this is a bug. In my opinion, backup shouldn't rewrite settings.php, and I am unsure why it does.

Besides, it would probably be polite to check disk space, with something like http://ca.php.net/manual/fr/function.disk-free-space.php
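A minimal sketch of such a pre-flight check, written in Python rather than Provision's PHP purely for illustration. The function name and the idea of a fixed byte threshold are assumptions, not anything Provision does; `shutil.disk_usage` plays the role that `disk_free_space()` would in PHP.

```python
import shutil


def has_free_space(path, min_free_bytes):
    """Return True if the filesystem holding `path` has at least
    `min_free_bytes` available.

    `min_free_bytes` is a hypothetical threshold chosen by the caller;
    nothing in this thread says what a sensible value would be.
    """
    return shutil.disk_usage(path).free >= min_free_bytes


if __name__ == "__main__":
    # Example: refuse to start if less than 1 MiB is free (arbitrary figure).
    if not has_free_space(".", 1024 * 1024):
        raise SystemExit("Not enough free disk space, skipping backup")
```

A check like this can only catch a disk that is already nearly full before the backup starts. It cannot predict whether the backup itself will fit.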

Anonymous’s picture

The backup process rewrites settings.php because it temporarily unmasks the database credentials for the sake of being able to restore from the dump again later by hand if necessary on non-aegir-based hosts ( #826840: Save settings.php with uncloaked credentials on backup (for using on non-aegir hosts) )

What should we assume is 'too little disk space' then? We don't know the size of the backup until after we make it.

xurizaemon’s picture

Agreed, we can't tell if we have enough disk space for a full backup. I think we can stop a backup on a full disk from killing sites, though.

My idea in #1 is just to avoid unlinking the old settings.php before we write the new one, but rather write then mv new settings.php into place.
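A rough sketch of the write-then-move idea, in Python for illustration only (Provision itself is PHP, and the function name here is made up). It assumes the temporary file is created in the same directory as the target, so the final rename stays on one filesystem. If the write fails, for example because the disk is full, the existing file is left untouched.

```python
import os
import tempfile


def write_atomically(path, data, mode=0o440):
    """Write `data` (bytes) to `path` without ever truncating the old file.

    The new content goes to a temporary file next to `path` and is only
    renamed over the original once it has been written out completely.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".settings-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # surface write errors before the rename
        os.chmod(tmp_path, mode)
        os.replace(tmp_path, path)  # replaces the old file in one step
    except BaseException:
        # The write or rename failed: clean up and keep the original file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this approach a failed write still fails the task, but the site keeps running on its previous settings.php.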

I believe this issue also happens if the disk is already full before the backup starts.

Workaround: don't have full disks.

anarcat’s picture

I am not sure we need to rewrite settings.php at all... There is probably a way to generate a *copy* of the settings.php and add *that* to the tarball...
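Something along these lines, sketched in Python for illustration (not Provision's actual code; the function name and arguments are hypothetical). The uncloaked settings are generated in memory and added to the archive under the name settings.php, while the live settings.php on disk is skipped and never modified.

```python
import io
import os
import tarfile
import time


def backup_site(site_dir, archive_path, uncloaked_settings):
    """Archive `site_dir`, substituting a generated settings.php.

    `uncloaked_settings` is the text of the settings file with real
    database credentials. How it is generated is out of scope here.
    """
    data = uncloaked_settings.encode("utf-8")
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in sorted(os.listdir(site_dir)):
            if name == "settings.php":
                continue  # the live file is neither read nor rewritten
            tar.add(os.path.join(site_dir, name), arcname=name)
        # Add the in-memory copy under the name the restore expects.
        info = tarfile.TarInfo(name="settings.php")
        info.size = len(data)
        info.mode = 0o440
        info.mtime = int(time.time())
        tar.addfile(info, io.BytesIO(data))
```

The backup can still fail on a full disk, but in this sketch the failure stays confined to the archive and the running site is unaffected.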

dubois’s picture

Ran into this problem too. A few times now. We moved some files over from a legacy system into the Drupal files folder. Our backup size ballooned to the point of being larger than our Zabbix disk space warning threshold. Hilarity ensued when automatic backups ran while both of us (the tech team) were out of the office. Thankfully we had Varnish configured to keep serving the cached site if the backend goes down.

While I grant that suitable backup and disk space management is not Aegir's domain, backups should still fail gracefully without taking sites down.

AlfTheCat’s picture

I ran into this as well, but got a worst-case variant.

My disk became full while an automated backup was being performed on hostmaster itself. That resulted in a fresh install screen for the hostmaster distribution, meaning anyone visiting my hosting frontend URL was able to install a fresh instance of Aegir. I witnessed this twice. I now keep a keen eye on the volume on my disk, but ideally there would be some safety mechanism to prevent this from happening at all.

Imo, this can pose a significant security risk.

AlfTheCat’s picture

Updated issue summary.

ergonlogic’s picture

Version: 6.x-1.7 » 7.x-3.x-dev
Status: Active » Postponed (maintainer needs more info)

Is this still an issue in Aegir 3.x?

ergonlogic’s picture

Status: Postponed (maintainer needs more info) » Active