An odd issue popped up today while running database updates via CSV imports for Ubercart products.

Server is Xen / Debian 6 / Percona / Latest BOA stable.

I've been running imports on a Drupal site all day and from time to time they stop with a "504 Gateway Time-out" error.
I hit refresh and the import seems to carry on OK, but I'm a bit concerned about what might be causing this.
I also noticed that a couple of the imports actually stopped with a "502 Bad Gateway" error.

I have:
@ini_set('max_execution_time', 0);
in the local.settings.php file for this host.

Is there something else I could add to avoid these timeouts? I can confirm that out of the many imports I've done today, only a few do this, and it appears to happen with the larger imports.

Surely there is a logical explanation.
Thanks in advance.

Comments

snlnz’s picture

Status: Active » Fixed

After trawling the issue queue it's obvious I failed to run the BOND.sh.txt tuner script to increase the default timeout values.

I have since run BOND.sh.txt and it appears to have fixed the problem: I am no longer receiving the timeout error messages and all seems to be well.

I have also noticed memory usage drop by about half since running the BOND.sh.txt tuner script, so that's a bonus!

snlnz’s picture

Status: Fixed » Active

Unfortunately I got a bit further through the import process and received a 502 Bad Gateway error so it's still not fixed.

omega8cc’s picture

Project: Octopus » Barracuda
Component: Code » Miscellaneous

You probably also need to raise other limits, as we already explained here: http://drupal.org/node/1328832#comment-5191990 - they are not yet handled by the BOND tuner.

snlnz’s picture

Thanks for your reply.

The following values are currently set.

In /opt/etc/php-fpm.ini:

<value name="request_terminate_timeout">3600s</value>

In /opt/etc/php.ini:

max_execution_time = 3600 ; Maximum execution time of each script, in seconds
max_input_time = 3600    ; Maximum amount of time each script may spend parsing request data
memory_limit = 250M      ; Maximum amount of memory a script may consume (128MB)

Since your reply, I have just adjusted:
/var/aegir/config/nginx.conf

  fastcgi_connect_timeout         300;  # up from 60
  fastcgi_send_timeout           3600;  # up from 300
  fastcgi_read_timeout           3600;  # up from 300

and restarted nginx.

Will post further progress.

snlnz’s picture

Issue tags: +504 Gateway Time-Out

So I've been unusually successful with the last lot of mass product imports, but as soon as I try a large, complex CSV import I get the error again. This time I was watching /var/log/php/php-fpm-error.log:

rotate
Nov 22 12:15:06.723529 [NOTICE] fpm_pctl_kill_all(), line 172: sending signal 15 SIGTERM to child 37078 (pool default)
Nov 22 12:15:06.723573 [NOTICE] fpm_pctl_kill_all(), line 181: 1 child is still alive
Nov 22 12:15:06.771591 [NOTICE] fpm_got_signal(), line 48: received SIGCHLD
Nov 22 12:15:06.771690 [WARNING] fpm_children_bury(), line 215: child 37078 (pool default) exited on signal 15 SIGTERM after 1803.835254 seconds from start
Nov 22 12:15:06.771703 [NOTICE] fpm_pctl_exec(), line 95: reloading: execvp("/usr/local/bin/php-cgi", {"/usr/local/bin/php-cgi", "--fpm", "--fpm-config", "/opt/etc/php-fpm.conf", "-c", "/opt/etc/php.ini"})
Nov 22 12:15:06.943509 [NOTICE] fpm_unix_init_main(), line 284: getrlimit(nofile): max:1024, cur:1024
Nov 22 12:15:06.943755 [NOTICE] fpm_sockets_init_main(), line 364: using inherited socket fd=6, "127.0.0.1:9000"
Nov 22 12:15:06.943879 [NOTICE] fpm_event_init_main(), line 88: libevent: using epoll
Nov 22 12:15:06.944035 [NOTICE] fpm_init(), line 52: fpm is running, pid 47150
Nov 22 12:15:06.946537 [NOTICE] fpm_children_make(), line 352: child 47151 (pool default) started
Nov 22 12:15:06.946591 [NOTICE] fpm_event_loop(), line 107: libevent: entering main loop
Nov 22 12:18:01.153584 [WARNING] fpm_request_check_timed_out(), line 146: child 47151, script '/data/disk/USER/distro/001/ubercart-6.x-2.7-6.22/index.php' (pool default) executing too slow (30.093366 sec), logging
Nov 22 12:18:01.153709 [NOTICE] fpm_got_signal(), line 48: received SIGCHLD
Nov 22 12:18:01.153748 [NOTICE] fpm_children_bury(), line 194: child 47151 stopped for tracing
Nov 22 12:18:01.153758 [NOTICE] fpm_php_trace(), line 139: about to trace 47151
Nov 22 12:18:01.154845 [NOTICE] fpm_php_trace(), line 167: finished trace of 47151

Hopefully this provides a bit more insight to the problem. I'm just a bit unsure what to adjust as I've tried all the areas of interest that have been discussed so far.

Thanks in advance.

omega8cc’s picture

These notices are expected: they simply say that the PHP-FPM daemon is reloaded every 30 minutes, because a reload is defined in the auto-healing script. You may want to comment out the line /etc/init.d/php-fpm reload in the /var/xdrago/clear.sh script. Or disable it completely in /var/spool/cron/crontabs/root and restart cron.
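A minimal sketch of that edit, run against a throwaway stand-in file rather than the real /var/xdrago/clear.sh (so the commands are safe to try; the sed pattern assumes the reload line appears verbatim as quoted above):

```shell
# Demonstrate commenting out the auto-healing reload line with sed.
# On a real server, replace "$demo" with /var/xdrago/clear.sh.
demo=$(mktemp)
printf '%s\n' 'sleep 5' '/etc/init.d/php-fpm reload' 'exit 0' > "$demo"

# Prefix the reload line with '#' so the 30-minute PHP-FPM reload stops firing.
sed -i 's|^/etc/init\.d/php-fpm reload|#&|' "$demo"

cat "$demo"
rm -f "$demo"
```

Keeping the line commented (rather than deleting it) makes it easy to re-enable the auto-healing reload once the bulk imports are done.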

snlnz’s picture

Really? Even this message?

fpm_request_check_timed_out(), line 146: child 47151, script '/data/disk/USER/distro/001/ubercart-6.x-2.7-6.22/index.php' (pool default) executing too slow (30.093366 sec), logging

This message occurred at exactly the same time the import failed and the 502 Bad Gateway error popped up. There's no consistency in the error messages, and it would be really good to know a better method of debugging the problem. I have this site running on a local dev environment and I don't get any issues on Apache/MySQL; it's just the Octopus server.

Hope you can help; I'd be sad to have to resort to redeploying an Apache-based solution to fix the problem.

snlnz’s picture

Update: I've cloned the site to dev.sitename.com and effectively disabled the caching altogether, as outlined in the omega8.cc documentation. Now I don't get any errors and the importer works fine!

My question is: how do I disable any caching that gets automatically applied under BOA, such as memcached, Boost and the speed booster, for just this site or any other particular site that experiences this issue?

Thanks in advance!

snlnz’s picture

Thanks for the links. I think the issue is resolved with the speed booster and cache disabled; I have also installed AdvAgg.
Will post an update over the next couple of days.

So, in future, anyone having trouble with 504 Gateway Time-out or 502 Bad Gateway errors while running Batch API bulk node manipulations should first try disabling the caching system by touching the control files:
NO.txt in sitename/modules/cache/
and
README.txt in sitename/modules/ubercart/
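A sketch of that workaround as shell commands. The site path is a placeholder, and the control-file locations are taken as given above; verify both against the BOA documentation for your platform version:

```shell
# Create the cache-control files for one site. SITE is a placeholder path;
# point it at the real site directory on an actual Aegir/BOA install
# (the modules/ subdirectories will already exist there).
SITE=${SITE:-$(mktemp -d)/sitename}
mkdir -p "$SITE/modules/cache" "$SITE/modules/ubercart"

touch "$SITE/modules/cache/NO.txt"         # disables the caching/speed booster layer
touch "$SITE/modules/ubercart/README.txt"  # second control file from the workaround above

ls "$SITE/modules/cache" "$SITE/modules/ubercart"
```

Removing the files again should restore the default caching behaviour once the bulk import is finished.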

I struggled to resolve this for a while but the learning curve has been invaluable.

omega8cc’s picture

Any news on this issue?

snlnz’s picture

Status: Active » Fixed

After it was resolved I didn't do any further testing, so simply disabling the speed booster and cache fixed it for me.
I haven't had any other issues since.
So it can be closed?

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary: View changes

added bad gateway error