Some recent instability that may have led to the outages on the weekend of 2016-02-14 is likely due to limitations in PHP 5.3. Upgrading the PHP version on the Drupal.org webnodes is primarily limited by the remaining Drupal 6 sites: qa.drupal.org, which will be statically archived shortly, and groups.drupal.org, which will receive some module updates to ensure it runs on 5.4.
"Due to this same limitation, attempts to create references to any contained data, nested or otherwise, will fail silently. So $var = &$object['foo'] will not throw an error, and $var will be populated with the contents of $object['foo'], but that data will be passed by value, not reference. For more information on the PHP limitation, see the note in the official PHP documentation at http://php.net/manual/arrayaccess.offsetget.php on ArrayAccess::offsetGet()"
Once we have tested PHP 5.4 in our staging environment, we'll upgrade a single webnode and put it in rotation. If that works well, we'll upgrade all the remaining webnodes.
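The ArrayAccess limitation quoted above can be illustrated with a minimal sketch. The `Container` class and its contents are hypothetical and exist only for this example; they are not taken from Drupal.org code. It assumes an `offsetGet()` that returns by value, which is the case the quoted note describes. Depending on the PHP version and error reporting settings, PHP may also emit an "Indirect modification of overloaded element" notice on the reference assignment.

```php
<?php
// Hypothetical container, used only to illustrate the quoted limitation.
class Container implements ArrayAccess
{
    private $data = array();

    // The #[\ReturnTypeWillChange] lines are plain comments before PHP 8
    // and silence return-type deprecation notices on PHP 8.1+.
    #[\ReturnTypeWillChange]
    public function offsetExists($offset)
    {
        return isset($this->data[$offset]);
    }

    // Returns by value, so callers receive a copy of the stored data.
    #[\ReturnTypeWillChange]
    public function offsetGet($offset)
    {
        return $this->data[$offset];
    }

    #[\ReturnTypeWillChange]
    public function offsetSet($offset, $value)
    {
        $this->data[$offset] = $value;
    }

    #[\ReturnTypeWillChange]
    public function offsetUnset($offset)
    {
        unset($this->data[$offset]);
    }
}

$object = new Container();
$object['foo'] = 'original';

// Looks like a reference assignment, but $var is bound to a copy.
$var = &$object['foo'];
$var = 'changed';

// The value stored in the container is unaffected by the write to $var.
echo $object['foo'], "\n"; // prints "original"
```

To change the stored value, assign through the object itself, for example `$object['foo'] = 'changed';`, which goes through `offsetSet()`.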
Post Mortem
Following a series of outages caused by PHP instability, we began investigating the root causes. We believe the first two outages, on Feb 1, 2016 and Feb 8, 2016, were caused by an issue unrelated to the three outages that followed. Debugging led us to believe that something caused garbage collection, in combination with APC, to error out and poison the php-fpm workers, eventually exhausting the database connections and taking the site offline until the poisoned php-fpm pool was restarted. The three later outages happened on:
- Feb 13, 2016 morning
- Feb 13, 2016 evening
- Feb 20, 2016 morning
On February 9, 2016, an update to PHP 5.3.3 distributed with CentOS 6 was pushed to the web servers. This update included a single change to PHP's garbage collection (https://rhn.redhat.com/errata/RHBA-2016-0141.html). We believe this change introduced a regression which we were hitting under load on Drupal.org.
To resolve the issue, we determined that upgrading to PHP 5.4 (already on our roadmap) was the path forward with the most benefit.
Comments
Comment #2
basic at Drupal Association commented: qa.drupal.org should be archived soon...
Comment #3
basic at Drupal Association commented: qa.drupal.org archiving is underway. The site is relatively large, so it may take a day or two to complete.
Comment #4
basic at Drupal Association commented: Quick update: qa.drupal.org archiving continues (33G of static HTML so far), and we will likely begin the PHP 5.4 upgrades early next week.
Comment #5
basic at Drupal Association commented: www1 and www7 have been put in rotation with PHP 5.4.
Comment #6
basic at Drupal Association commented: Deployed to: stagingwww1, devwww2, private-devwww, git1, git2, static1, static2, www1, www2, www6, www7