In some (reasonable!) mysql settings, big backups fail silently. This is because the mysql server closes the connection after some idle time. This is good, otherwise it may be blocked by zombie php processes. But: A big file backup behaves the same way (a DB backup otoh hits the database now and then ;-)).
When the mysql server closes, no dblog will be written any more about the problem.

Who is affected?

$ drush sqlq "show variables like 'wait_timeout'"
wait_timeout	180

Anyone whose wait_timeout is lower that a large backup may take.

Please,
* always use drush bam-backup to check if you really have "MySQL server has gone away" problem
* keep this issue focused on this problem (rather than other timeout problems)
* always report back with full drush output

With drush we get something like this:

$ drush bam-backup files  backup_files backup_settings
PDOStatement::execute(): MySQL server has gone away database.inc:2227                                                    [warning]
PDOStatement::execute(): Error reading result set's header database.inc:2227                                             [warning]
exception 'PDOException' with message 'SQLSTATE[HY000]: General error: 2006 MySQL server has gone away' in               [error]
.../includes/database/database.inc:2227
...
#8
.../sites/all/modules/backup_migrate/includes/filters.inc(109):
backup_migrate_filters_invoke_all('post_backup', Object(backup_file), Object(backup_migrate_profile))

Possible resolutions
a) on every request increase wait timeout (add sql like "set wait_timeout = 600")
b) only on cli requests increase wait timeout
c) make BAM increase wait timeout on backups
d) make BAM ping the DB now and then
(this may not be possible when e.g. a tar process is forked)

Proposed: c)

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

axel.rutz created an issue. See original summary.

geek-merlin’s picture

Issue summary: View changes
geek-merlin’s picture

Issue summary: View changes
geek-merlin’s picture

Issue summary: View changes
geek-merlin’s picture

Priority: Normal » Major

Setting major as this causes silent data loss.

geek-merlin’s picture

Status: Active » Needs review
FileSize
1.66 KB

Patch flying in that should fix the issue. Untested yet.

geek-merlin’s picture

Fixed method name typo in the patch.

geek-merlin’s picture

Title: Big backup breaks due to mysql wait_timeout » Big file backup breaks due to mysql wait_timeout
Issue summary: View changes
geek-merlin’s picture

Note: the standard setting of 1200s=20m may not be enough for very large file amounts (on my server >3gb).

Kris77’s picture

Will try your patch...

geek-merlin’s picture

Take this. Fixed SQL syntax.

geek-merlin’s picture

Wow. placeholders don't work here. Also fixed that and it works on my box.

devad’s picture

These are just some typical log error messages caused by low mysql wait_timeout settings. Posting for search purpose.

The first one is caused by cron backup run and the second one by manual backup run.

PHP Fatal error:  Uncaught exception 'PDOException' with message 'SQLSTATE[HY000]: General error: 2006 MySQL server has gone away' in /home/mysite/public_html/includes/database/database.inc:2204
Stack trace:
#0 /home/mysite/public_html/includes/database/database.inc(2204): PDOStatement->execute(Array)
#1 /home/mysite/public_html/includes/database/database.inc(683): DatabaseStatementBase->execute(Array, Array)
#2 /home/mysite/public_html/includes/database/database.inc(2383): DatabaseConnection->query('select name,  d...', Array, Array)
#3 /home/mysite/public_html/sites/all/modules/elysia_cron/elysia_cron.module(424): db_query('select name,  d...', Array)
#4 /home/mysite/public_html/sites/all/modules/elysia_cron/elysia_cron.module(478): elysia_cron_get('default', true, 'running', 0, false)
#5 /home/mysite/public_html/sites/all/modules/elysia_cron/elysia_cron.module(1154): elysia_cron_is_channel_running('default')
#6 [internal function]: elysia_cron_cleanup()
#7 {main}
  thrown in /home/relaxino/public_html/includes/database/database.inc on line 2204

PHP Fatal error:  Uncaught exception 'PDOException' with message 'SQLSTATE[HY000]: General error: 2006 MySQL server has gone away' in /home/mysite/public_html/includes/database/database.inc:2204
Stack trace:
#0 /home/mysite/public_html/includes/database/database.inc(2204): PDOStatement->execute(Array)
#1 /home/mysite/public_html/includes/database/database.inc(683): DatabaseStatementBase->execute(Array, Array)
#2 /home/mysite/public_html/includes/database/database.inc(2383): DatabaseConnection->query('SELECT expire, ...', Array, Array)
#3 /home/mysite/public_html/includes/lock.inc(167): db_query('SELECT expire, ...', Array)
#4 /home/mysite/public_html/includes/lock.inc(146): lock_may_be_available('schema:runtime:...')
#5 /home/mysite/public_html/includes/bootstrap.inc(433): lock_acquire('schema:runtime:...')
#6 /home/mysite/public_html/includes/bootstrap.inc(455): DrupalCacheArray->set(Array)
#7 [internal function]: DrupalCacheArray->__destruct()
#8 {main}
  thrown in /home/mysite/public_html/includes/database/database.inc on line 2204
geek-merlin’s picture

Thanks @devad for the googlefood, yes messages like that are to be expected.
Did the patch work for you?

devad’s picture

Hi @Alex.

I have used part of code from MySQL Wait Timeout module to fix my problem. I have put this line of code into my settings.php and it worked for me.

$databases['default']['default']['init_commands'] = array('SET SESSION wait_timeout = 300');

The module has Default, Drush and Cron parts, but i didn't need all that stuff, so I just used part from modules "Default" section and squeezed it into one line.

geek-merlin’s picture

@devad: If you want to help, test this patch and report back.

devad’s picture

@Alex: I can do tests via symplytest.me

However this test is rather specific due to large database need and it needs local testing engine which I am not good at.

pietrocap’s picture

Hi devad,
where did you put exactly the code "$databases['default']['default']['init_commands'] = array('SET SESSION wait_timeout = 300');" in your settings.php?

devad’s picture

@pietrocap

I have put it at the very end of my settings.php file.

But I suppose that it can be anywhere behind $database array definition.

Kris77’s picture

Hi @Axel.
I tried your patch but do not seem to work.

Just launch the bakcup remain on the page for a long time without doing anything.

pietrocap’s picture

Path doesn't work for me. Same as #20

geek-merlin’s picture

Title: Big file backup breaks due to mysql wait_timeout » Big file backup breaks with "MySQL server has gone away"
Issue summary: View changes

@Kris77, @pietrocap:

Note that big backups hit several time limits, and this patch only raises the DB time limit.
To overcome the other time limits (mainly server and php) it is best to use drush, as the webserver is not made for long-running processes.

To gather more insight into this, please test by running the backup from drush and report what you get, with something like:

# Add applicable source, destination and profile. In doubt read: drush help bam-backup
drush bam-backup 
RickJ’s picture

Just a note to say thanks for this fix - I'd been nearly tearing my hair out before I found this issue!

FYI, my scenario is nightly backups under cron, which worked fine, but then cron started failing to finish and clean up. This left the cron job flagged as still running, and the site stuck in maintenance mode.

The clue came when it occurred to me that the steady increase in site data had taken the backup time just over 3 mins, so it looked like a timeout issue - and led here. I considered just tweaking settings.php, but that affects all connections which is probably not desirable. This patch works perfectly.

geek-merlin’s picture

Status: Needs review » Reviewed & tested by the community

#23:
> ... This patch works perfectly.

Yes it was hard to debug and good that it helped you too.

couturier’s picture

Status: Reviewed & tested by the community » Patch (to be ported)

So @axel.rutz it sounds like this patch is ready to be ported?

Alex Andrascu’s picture

Status: Patch (to be ported) » Needs review

It would be good if we can have another pair of eyes reviewing this patch before setting it to RTBC. Let’s keep it on the backburner for now.

geek-merlin’s picture

#26 alex:
> It would be good if we can have another pair of eyes reviewing this patch before setting it to RTBC.

Valid requrest. You feel like code review or live test?

For the records: We use it on currently 13 sites.

couturier’s picture

I propose that we move forward with this patch. It solves a related timeout issue that users are still experiencing as of mid-November 2017, and @axel.rutz has thoroughly tested it on multiple sites.

RickJ’s picture

This patch is still needed to prevent SQL timeouts, but the existing patch file won't apply to 7.x.3.5.

I've updated it to work with the current release.

geek-merlin’s picture

Status: Needs review » Reviewed & tested by the community

It looks we've had some pairs of eyes on that so tentatively setting RTBC.

The last submitted patch, 12: backup_migrate-2829492-12-Big-backup-breaks.patch, failed testing. View results

RickJ’s picture

Hiding old outdated patch file - does that stop it being tested?

geek-merlin’s picture

This is a straight reroll that is needed after #2940451: Unlimited max_execution_time treated as 0 seconds got in.
Patch can be applied with git am and contains 2 commits for improved niceness.

DamienMcKenna’s picture

Why is the first if() statement needed in this part?

  // Also increase the time mysql keeps the connection open on file backup.
  if ($settings->get_source() instanceof backup_migrate_destination_filesource) {
    if (Database::getConnection() instanceof DatabaseConnection_mysql) {

I think it'd be fine to get rid of the outer if() statement.

RickJ’s picture

It's not my patch, but that if() condition does appear a bit superfluous. It's hard to see what difference the source type makes to the logic of extending the SQL timeout to match the page timeout.

geek-merlin’s picture

if ($settings->get_source() instanceof backup_migrate_destination_filesource) {

If we have a DB backup, nothing needs to be done, no DB timeout will be hit as the DB is busy all the time.
We can of course bikeshed about code beauty here ;-).

DamienMcKenna’s picture

If you're backing up a giant database isn't it possible it might hit a timeout in _dump_table_data_sql_to_file()? There are certainly a number of issues related to exporting giant databases.

geek-merlin’s picture

OK, that's a point. As raising the wait_timeout surely won't harm, feel free to drop the "if"!

geek-merlin’s picture

OK, removed the surrounding if, rest unchanged.

RickJ’s picture

It occurs to me that this feature is incomplete, as there's no admin facility to configure a value for the new setting backup_migrate_backup_max_time. So the value is in effect hard-coded as 1200.

I was considering adding it to the admin UI, but it would be the only global setting. All other settings are on a per-profile settings basis. For consistency this should be too. I'm happy to go ahead with the coding, but what's the consensus view?

DamienMcKenna’s picture

Status: Reviewed & tested by the community » Needs work

At the very least there should be a mention in the README.txt file about it.

RickJ’s picture

Status: Needs work » Needs review
FileSize
3.4 KB

OK, I thought I'd have a go at doing it properly!

This patch will apply on top of the one in #39, I've uploaded it for sanity-check of the concept. In my testing it works fine.

It adds a "Max backup time" value in the backup Advanced settings, which applies both for one-off settings and profiles. The run-time code uses this value instead of looking for a non-existing settings variable.

Also for completeness, I've done the same for Restore profiles, with corresponding code when Restore is executed.

Let me know what you think. If it's OK I'll merge it with patch #39.

Status: Needs review » Needs work

The last submitted patch, 42: max-backup-restore-time-42.patch, failed testing. View results

geek-merlin’s picture

> It occurs to me that this feature is incomplete, as there's no admin facility to configure a value for the new setting backup_migrate_backup_max_time.

Just to note: backup_migrate_backup_max_time exists since 6.x.
That said, nothing against documenting it.

RickJ’s picture

Status: Needs work » Needs review

Hi Axel

Yes, sorry, I mis-worded my comment, the variable backup_migrate_backup_max_time is not itself new in patch #39. However, the issue still remains that although the code was written to check the value of that variable, there is no UI to actually set it. And if there were, it would be a global value, contrary to all the other settings.

Please check out my patch #42, it creates a setting that works the same way as all the other settings and options. I should have marked it Do Not Test, as it only applies after your patch #39, so it fails to apply directly to the current dev build.

I also think it's important to have the same timeout logic for restore, because exactly the same issue could arise. A backup that failed without the DB timeout fix would also probably fail to restore for the same reason.

geek-merlin’s picture

Hey Rick,

this is classical feature creep, so i'd say:
* the original patch fixes the problem
* documenting the hidden variable, or even moving the setting like you started, is feature creep and belongs in a followup issue, to not block and bikeshed a tested problem solution

What i'd say in that followup issue:
* your code looks solid and well-done
* that said, i'd say it's too complicated, one global variable that is set high enuff for the DB size and speed is perfectly fine.
* and as this variable usually will be set in settings.php, i'd say, let's document it in README, and not add a visual setting (that will not work if overridden via settings.php!)
* that said, i guess we never had a support request about this as 1200s=20m is so high that raising it is a very exceptional issue.

tl;dr: let's discuss in a followup, and please don't over-engineer!

RickJ’s picture

Hi Axel

Thanks for your comments. I'd already realised earlier that I'd failed to consider settings.php as a means of setting the variable. And as you say, it's unlikely anyone would need to increase it, but the facility to add it to settings.php is there if required.

So I'm happy to trash my over-engineered solution!

However, I do think the code to extend the database timeout should be added to the restore case. Restores are rare compared to backups, so it's likely no-one's hit the problem, but it seems to me that a large backup that required the increased timeout would also probably need it if restored.

I'll get to this later, and add a patch to README, and the bit of code for restore.

geek-merlin’s picture

Caring for restores can not be wrong. I'll be happy to review this!

RickJ’s picture

I've added the code to handle restore. To avoid duplicate code I factored the timeout-tweaking stuff into a new function _backup_migrate_set_timeout(). This led me to look at the existing function _backup_migrate_check_timeout(), which makes an explicit check to see if the time has been exceeded.

This function did not take account of the increased timeout, and used only the PHP 'max_execution_time' value. This appears obviously wrong, so I've also changed this code to use the extended timeout value.

Also added a note to README.txt.

geek-merlin’s picture

I reviewed the patch and everything is as you described, and everything you described totally makes sense.

So if someone(tm) manually tests backup and restore, we can finally get this RTBC.

geek-merlin’s picture

DamienMcKenna’s picture

A small improvement to the README.txt file, minor tweaks to the PHP code.

Status: Needs review » Needs work

The last submitted patch, 52: backup_migrate-n2829492-52.patch, failed testing. View results

DamienMcKenna’s picture

  • DamienMcKenna committed 9b07ba6 on 7.x-3.x
    Issue #2829492 by axel.rutz, RickJ, DamienMcKenna, devad, Kris77,...
DamienMcKenna’s picture

Status: Needs work » Fixed

Committed.

geek-merlin’s picture

Yay! Thank ya all!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

adam1’s picture

Running latest BAM 7.x-3.6, I regulary get below error messages.
Since in this version, the issue should have been fixed, I 'm asking why this is happening.

PDOException: SQLSTATE[HY000]: General error: 2006 MySQL server has      [error]
gone away in
/home/mysite/public_html/includes/database/database.inc:2227
Stack trace:
#0 /home/mysite/public_html/includes/database/database.inc(2227):
PDOStatement->execute(Array)
#1 /home/mysite/public_html/includes/database/database.inc(697):
DatabaseStatementBase->execute(Array, Array)
#2 /home/mysite/public_html/includes/database/database.inc(2406):
DatabaseConnection->query('SELECT s.lid, t...', Array, Array)
#3 /home/mysite/public_html/modules/locale/locale.module(719):
db_query('SELECT s.lid, t...', Array)
#4 /home/mysite/public_html/includes/bootstrap.inc(1773):
locale('%source backed ...', '', 'de')
#5
/home/mysite/.config/composer/vendor/drush/drush/includes/output.inc(154):
t('%source backed ...', Array)
#6
/home/mysite/public_html/sites/all/modules/contrib/backup_migrate/includes/backup_migrate.drush.inc(327):
dt('%source backed ...', Array)
#7
/home/mysite/public_html/sites/all/modules/contrib/backup_migrate/backup_migrate.module(1854):
_backup_migrate_message_drush('%source backed ...', Array, 'success')
#8
/home/mysite/public_html/sites/all/modules/contrib/backup_migrate/backup_migrate.module(1728):
_backup_migrate_message('%source backed ...', Array, 'success')
#9
/home/mysite/public_html/sites/all/modules/contrib/backup_migrate/backup_migrate.module(1631):
backup_migrate_backup_succeed('%source backed ...', Array,
Object(backup_migrate_profile))
#10
/home/mysite/public_html/sites/all/modules/contrib/backup_migrate/includes/backup_migrate.drush.inc(120):
backup_migrate_perform_backup(Object(backup_migrate_profile))
#11
/home/mysite/.config/composer/vendor/drush/drush/includes/command.inc(422):
backup_migrate_drush_backup('archive', 'mybackup_complete', 'complete')
#12
/home/mysite/.config/composer/vendor/drush/drush/includes/command.inc(231):
_drush_invoke_hooks(Array, Array)
#13
/home/mysite/.config/composer/vendor/drush/drush/includes/command.inc(199):
drush_command('archive', 'mybackup_complete', 'complete')
#14
/home/mysite/.config/composer/vendor/drush/drush/lib/Drush/Boot/BaseBoot.php(67):
drush_dispatch(Array)
#15
/home/mysite/.config/composer/vendor/drush/drush/includes/preflight.inc(67):
Drush\Boot\BaseBoot->bootstrap_and_dispatch()
#16 /home/mysite/.config/composer/vendor/drush/drush/drush.php(12):
drush_main()
#17 {main}