Problem/Motivation

Random test failures due to segmentation faults have been observed in numerous core tests.

The test runner reports the failure in a class that changes from time to time as new code is committed. Previously affected classes include:

  • \Drupal\node\Tests\Views\FrontPageTest::testCacheTagsWithCachePluginTime()
  • \Drupal\editor\Tests\EditorSecurityTest::testEditorXssFilterOverride()
  • \Drupal\datetime_range\Tests\DateRangeFieldTest::testDatelistWidget()
  • \Drupal\locale\Tests\LocaleUpdateTest::testEnableUninstallModule()
  • \Drupal\views_ui\Tests\HandlerTest::testErrorMissingHelp()
  • Drupal\Tests\toolbar\Functional\ToolbarCacheContextsTest::testToolbarCacheContextsCaller()

However, since the test that reports the failure from the segfault changes unpredictably, do not rely on this list. To check whether a failure is a segfault: if the test reports "Fatal error" with no other information, click "View results on dispatcher", then "Console Output", then "Full log", and search the page for "segmentation fault".

Proposed resolution

For PHP 5.6 we have clear evidence that there are problems with garbage collection whilst building the container. Compare https://dispatcher.drupalci.org/job/drupal_patches/14991/consoleText and https://www.drupal.org/pift-ci-job/667697.

Therefore we are disabling garbage collection during container rebuilds for PHP 5. As far as we can tell this works around a known PHP 5 bug that will not be fixed: https://bugs.php.net/bug.php?id=72286. The potential impact is an increase in memory usage; for a single container rebuild this has been calculated to be 1.5 MB.
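
As a rough illustration only - a minimal sketch assuming the DrupalKernel::initializeContainer() context, not the committed patch itself (see the patch excerpt quoted in the comments below) - the change amounts to:

$disable_gc = PHP_MAJOR_VERSION < 7 && gc_enabled();
if ($disable_gc) {
  // Empty the root buffer first, then switch the cycle collector off so the
  // thousands of objects created while compiling the container cannot push
  // the buffer over its limit mid-build.
  gc_collect_cycles();
  gc_disable();
}
$container = $this->compileContainer();
if ($disable_gc) {
  // Re-enable only because we disabled it above, so environments that run
  // with GC off stay off.
  gc_enable();
}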

There are other segfaults, which will be tracked in #2223459: Segmentation fault, Out of memory in LocaleUpdateTest and others and #2876895: Intermittent segfaults on PHP 7.1.

Remaining tasks

User interface changes

None

API changes

None

Data model changes

None

Original issue summary

testEditorXssFilterOverride
fail: [Completion check] Line 418 of core/modules/editor/src/Tests/EditorSecurityTest.php:
The test did not complete due to a fatal error.

Unknown
fail: [run-tests.sh check] Line 0 of :
FATAL Drupal\editor\Tests\EditorSecurityTest: test runner returned a non-zero error code (139).

Looks like we may have a new one on php5.6:

https://dispatcher.drupalci.org/job/php5.6_mysql5.5/2134/artifact/jenkin...

Both:
https://dispatcher.drupalci.org/job/php5.6_mysql5.5/2134/

And
https://dispatcher.drupalci.org/job/php5.6_mysql5.5/2140/

Are having segfaults in what look to be the same location in the code:
https://dispatcher.drupalci.org/job/php5.6_mysql5.5/2134/artifact/jenkin...

https://dispatcher.drupalci.org/job/php5.6_mysql5.5/2140/artifact/jenkin...

NULL pointer dereference in garbage collector triggered in Symfony\Component\DependencyInjection\Compiler\ServiceReferenceGraph, possibly related to #2603152: Fix PHP 7 testbot failures.


Comments

Anonymous’s picture

vaplas created an issue. See original summary.

Anonymous’s picture

catch’s picture

Anonymous’s picture

Yes, it looks like a duplicate, because I used the mentioned issue as a pattern and forgot to change the name of the method (testChanged to testEditorXssFilterOverride). Sorry for the misunderstanding.

Anonymous’s picture

Title: Random fail in Drupal\editor\Tests\EditorSecurityTest::testChanged » Random fail in Drupal\editor\Tests\EditorSecurityTest::testEditorXssFilterOverride
xjm’s picture

Title: Random fail in Drupal\editor\Tests\EditorSecurityTest::testEditorXssFilterOverride » Intermittent segfaults on DrupalCI, including in primary test environment
Version: 8.4.x-dev » 8.3.x-dev

It's a segfault, so it probably has nothing to do with the particular test and may appear to be in a different test tomorrow.

xjm’s picture

xjm’s picture

Issue summary: View changes

xjm credited Mixologic.

xjm credited tacituseu.

xjm’s picture

xjm’s picture

Title: Intermittent segfaults on DrupalCI, including in primary test environment » Intermittent segfaults on DrupalCI on PHP 5.6 (at least)
tacituseu’s picture

Anonymous’s picture

Issue summary: View changes

https://www.drupal.org/pift-ci-job/618711 looks clean

Exactly; in its place should be https://www.drupal.org/pift-ci-job/618661. Issue summary updated.

xjm’s picture

#2853905: Figure out why there were segfaults in every test run on PHP 7.1.x-dev is no longer occurring, so that issue has been closed, but adding a reference here in case.

In general, there seem to be a lot of segfaults and stability problems related to garbage collection in our tests (or at least on CI). Adding a few other related issues as well.

xjm’s picture

Title: Intermittent segfaults on DrupalCI on PHP 5.6 (at least) » Intermittent segfaults on DrupalCI

PHP 7.1 had a segfault here: https://www.drupal.org/pift-ci-job/626032

So it's not 5.6 only.

tacituseu’s picture

That PHP 7.1 trace from #16 looks like a different bug/trigger.

cilefen’s picture

Issue tags: +Triaged D8 critical

@alexpott, @catch, @cottser, @xjm and I considered this issue and agree this is suitably a critical priority bug because intermittent (seemingly random) test failures interrupt the work of project contributors.

Wim Leers’s picture

I'm the editor.module maintainer. Let me know how I can help.

Since this is only occasionally happening, and when it does, it's a segfault… I don't feel like there's much I can do right now.

tacituseu’s picture

There seem to be 2 triggers:
1. inside Symfony\Component\DependencyInjection\Compiler\ServiceReferenceGraph::connect():
https://www.drupal.org/pift-ci-job/615775
https://www.drupal.org/pift-ci-job/618661
https://www.drupal.org/pift-ci-job/619448

2. inside Drupal\Component\Utility\NestedArray::mergeDeepArray() in FrontPageTest:
https://www.drupal.org/pift-ci-job/662744
https://www.drupal.org/pift-ci-job/662998
https://www.drupal.org/pift-ci-job/663460
https://www.drupal.org/pift-ci-job/663495
They are getting more frequent.

Crashes are inside the garbage collector, around gc_zval_check_possible_root() / gc_collect_cycles(); both sometimes result in "zend_mm_heap corrupted".

Edit:
The third trigger might be inside system_update_8004() in UpdatePathTestBase, from the old issue #2828559: UpdatePathTestBase tests randomly failing, which was just worked around with gc_disable(); see dumps in #114 and #115.

Anonymous’s picture

Component: editor.module » other

Looks like twin prime segfaults :)

#19: fair remark; since #6 it is no longer associated with editor.module.

tacituseu’s picture

https://www.drupal.org/pift-ci-job/663713 is something else still; seen it too in https://www.drupal.org/pift-ci-job/664432, triggered in Symfony\Component\DependencyInjection\Compiler\ResolveDefinitionTemplatesPass::process().

mpdonadio’s picture

Not sure if this adds more info, but saw this again:

https://dispatcher.drupalci.org/job/drupal_patches/14492/console

mpdonadio’s picture

Mixologic’s picture

@mpdonadio: that's actually something strangely broken with the testbots. They are *supposed* to remove the core files once they do a gdb backtrace (because they are huge). But something, recently, has made them stay in place, so when a new container tries to backtrace those core files, none of the memory exists anymore.

I'm going to have to add some more output to see what's going on. 'sudo rm -rf $core_file' seems to have some kind of issue.

gambry’s picture

I've been redirected here from #2876281: \Drupal\node\Tests\Views\FrontPageTest::testCacheTagsWithCachePluginTime fails randomly, closed as a duplicate of #2848133: Random segfaults in Drupal\node\Tests\Views\FrontPageTest.testCacheTagsWithCachePluginTime, itself closed as a duplicate of this issue.
However, I don't actually see the Drupal\node\Tests\Views\FrontPageTest.testCacheTagsWithCachePluginTime test mentioned anywhere, so I agree with @alexpott (on #2876281) that this issue "should have a list of classes so we don't create dupes."

This just happened on #2786577: The Views integration Datetime Range fields should extend the views integration for regular Datetime fields:
\Drupal\node\Tests\Views\FrontPageTest::testCacheTagsWithCachePluginTime

fail: [Completion check] Line 211 of core/modules/node/src/Tests/Views/FrontPageTest.php:
The test did not complete due to a fatal error.

https://www.drupal.org/pift-ci-job/666090

From all reports in here looks the classes/tests affected are:
- \Drupal\node\Tests\Views\FrontPageTest::testCacheTagsWithCachePluginTime
- Editor.Drupal\editor\Tests\EditorSecurityTest::testEditorXssFilterOverride
- Views_ui.Drupal\views_ui\Tests\HandlerTest::testErrorMissingHelp
- Drupal\Tests\toolbar\Functional\ToolbarCacheContextsTest::testToolbarCacheContextsCaller

(However, the last one has a different error output from the rest, so I don't know whether it can be considered a symptom of the same issue.)

alexpott’s picture

Issue summary: View changes

Editing the summary so that we get an idea of the tests where this is occurring and so people can search and find this issue.

alexpott’s picture

FileSize
856 bytes

#2488350-35: Switch to a memory cache backend during installation seems to up the rate of segfaults in Drupal\node\Tests\Views\FrontPageTest::testCacheTagsWithCachePluginTime. Some of the core dumps show the segfaults occurring whilst doing garbage collection. One of the times we create thousands of objects that would need to be garbage collected is container building. It looks like disabling garbage collection makes things more reliable - see #2488350-37: Switch to a memory cache backend during installation.

Patch attached is the one that seems to be making the difference.

mpdonadio’s picture

This aligns with the related GC issue about an underlying problem with PHP memory management that we are just bringing to the surface, since update does a rebuild. I wonder if an explicit GC after the enable would help with memory swell during the rebuild. Since requests aren't persistent, I'm not sure this matters too much?
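
To make that idea concrete, a hypothetical fragment (this is not part of any patch here, it would need a fully bootstrapped Drupal, and 'mymodule' is a placeholder name):

// Explicitly collect cycles right after a module enable, which rebuilds the
// container, so the rebuild's garbage is reclaimed before the request goes on.
\Drupal::service('module_installer')->install(['mymodule']);
gc_collect_cycles();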

xjm’s picture

Title: Intermittent segfaults on DrupalCI » Intermittent segfaults on DrupalCI (some "did not complete due to a fatal error" with no additional info)
Issue summary: View changes

A "list of classes" is not actually going to help so much because it's just one class at a time, that changes kind of arbitrarily as we commit stuff. I think the list in the summary is misleading, because it's probably not going to be any of those classes again.

Rather than listing classes, the correct thing to do is to simply check the console log for the segfault. See step #5 in the summary of #2829040: [meta] Known intermittent, random, and environment-specific test failures for brief instructions. The results will always look something like:

Failing tests:

Entity.Drupal\system\Tests\Entity\Update\UpdateApiEntityDefinitionUpdateTest

✗ testStatusReport
  fail: [Completion check] Line 137 of
core/modules/system/src/Tests/Entity/Update/UpdateApiEntityDefinitionUpdateTest.php:
  The test did not complete due to a fatal error.

✗ Unknown
  fail: [run-tests.sh check] Line 0 of :
  FATAL
Drupal\system\Tests\Entity\Update\UpdateApiEntityDefinitionUpdateTest: test
runner returned a non-zero error code (139).

When you see a fatal with no additional details, go to the console full log and look for "segmentation fault". I'll update the summary with this information.

All that said, I've been thinking about creating this Twitter account for like four months: https://twitter.com/drupalsegfault :P

Hopefully it doesn't get to make that many tweets now that we have some debugging.

xjm’s picture

If it's helpful, I can provide a list of and links to every segfault in HEAD over the past several months, but it'd be a bit tedious and time-consuming trawling through my email. If that would be useful it's probably more efficient for Mixologic to grep logs or whatnot for it.

The pattern of clusters of one test failing at a time, then changing to something else, is pretty consistent in the HEAD fails. I bet it's different for patches because (of course) the patches are changing different things each time, and so the conditions that lead to the segfault change. I have also suspected it may not even actually be happening in the test that reports the failure, but somewhere else due to concurrency or whatnot.

alexpott’s picture

Status: Active » Needs review
FileSize
1.2 KB

@xjm I'm seeing \Drupal\node\Tests\Views\FrontPageTest::testCacheTagsWithCachePluginTime() all over the place at the moment and I was the second person to create a duplicate issue.

The patch attached increases memory usage during a minimal install, but comparing #2488350-35: Switch to a memory cache backend during installation and #2488350-37: Switch to a memory cache backend during installation, it appears to be fixing the PHP 5.6 problem we have. The evidence is clear: the patch in comment 35 failed 8 out of 13 times (on PHP 5.6) and the patch in comment 37 passed 9 out of 9 times. I think the installer patch is causing the problem because it adds a service modifier, which causes additional passes over the container.

Looking at the installs (which do plenty of container rebuilds) to see how this impacts memory usage.

Before

Minimal:

 [info] Starting Drupal installation. This takes a while. Consider using the --notify global option. [0.36 sec, 10.57 MB]
 [debug] Calling install_drupal(Object, Array) [0.37 sec, 10.78 MB]
 [success] Installation complete. [3.63 sec, 40.63 MB]

Standard:

 [info] Starting Drupal installation. This takes a while. Consider using the --notify global option. [0.51 sec, 10.58 MB]
 [debug] Calling install_drupal(Object, Array) [0.51 sec, 10.78 MB]
 [success] Installation complete. [13.92 sec, 59.35 MB]

After

Minimal:

[info] Starting Drupal installation. This takes a while. Consider using the --notify global option. [0.46 sec, 10.58 MB]
 [debug] Calling install_drupal(Object, Array) [0.46 sec, 10.78 MB]
 [success] Installation complete. [3.67 sec, 47.75 MB]

Standard:

 [info] Starting Drupal installation. This takes a while. Consider using the --notify global option. [0.32 sec, 10.58 MB]
 [debug] Calling install_drupal(Object, Array) [0.33 sec, 10.78 MB]
 [success] Installation complete. [11.7 sec, 59.8 MB]
tacituseu’s picture

Looked over quite a few for a pattern; the only thing in common I can see is that they fail in code paths / methods that manipulate large arrays or produce large numbers of objects.

The segfault is triggered by a NULL pointer dereference inside the garbage collector, in the branches of the code responsible for reclaiming memory: when the GC runs out of root buffer space (#define GC_ROOT_BUFFER_MAX_ENTRIES 10000) it tries to reclaim some by calling gc_collect_cycles().

That doesn't necessarily mean the bug is inside one of those methods; as @xjm points out, it could be anywhere in the code. Something leaves a buggy object/array dormant in the buffer, then when the buffer goes over the limit the GC tries to reclaim memory, walks over the buggy thing, and segfaults.
As the code base grows and tests get more complicated, it will start getting triggered at more and more code paths.

Listing specific tests might be of use if there's some obvious thing they have in common, like some service they all use that has the buggy code.
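
To make the mechanism concrete, here is a minimal standalone sketch (illustration only, not code from Drupal): each unset cyclic array becomes a "possible root", and once roughly 10,000 of them accumulate the engine triggers the cycle collector automatically, which is the code path that is crashing.

<?php

// Each iteration leaves one cyclic array behind as a possible GC root.
for ($i = 0; $i < 15000; $i++) {
  $a = [];
  $a['self'] = &$a;
  // Dropping the only external reference leaves a cycle; the value is recorded
  // in the root buffer instead of being freed immediately.
  unset($a);
}

// With GC enabled, an automatic collection already ran around the 10,000th
// possible root; this call cleans up the rest and reports what it collected.
var_dump(gc_collect_cycles());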

alexpott’s picture

For me, #33 is exactly why container building is the most likely culprit: it creates thousands of objects. It also explains why update.php was sensitive - plenty of container rebuilds occur during it.

tacituseu’s picture

Surprised #32 helps, from http://php.net/manual/en/features.gc.collecting-cycles.php:

When the garbage collector is turned off, the cycle-finding algorithm will never run. However, possible roots will always be recorded in the root buffer, no matter whether the garbage collection mechanism has been activated with this configuration setting.

The reason why possible roots are recorded even if the mechanism has been disabled is because it's faster to record possible roots than to have to check whether the mechanism is turned on every time a possible root could be found.

Looked over old logs; Drupal\node\Tests\Views\FrontPageTest seems to take under a minute per run, so maybe a x40 patch would be useful, as it is the most common failure at the moment.

Edit:
What it could mean is that it isn't the first gc_collect_cycles() that fails but the n-th one, and a preceding one is the one that corrupts. Since with the patch none will occur while rebuilding the container, it could imply that the corrupting one used to happen there.
Alternatively, it might mean the buffer is full of "safe" objects from container rebuilds, and the "troubled" one that happens later isn't recorded due to lack of space.

alexpott’s picture

So for me container rebuilds are totally suspect for causing garbage collection segfaults.

Using the script below to target just a container rebuild. gc_collect_cycles() returns the number of zvals freed.

<?php
use Drupal\Core\DrupalKernel;
use Symfony\Component\HttpFoundation\Request;

$autoloader = require_once 'autoload.php';

$request = Request::createFromGlobals();
$kernel = DrupalKernel::createFromRequest($request, $autoloader, 'prod');
$kernel->boot();
$kernel->preHandle($request);

var_dump(gc_collect_cycles());
$kernel->invalidateContainer();
$kernel->rebuildContainer();
var_dump(gc_collect_cycles());

With patch

int(0)
int(4018)

Without patch

int(0)
int(21399)

So the patch effectively isolates a place where we create a prodigious number of objects and contains its effects.

xjm’s picture

@alexpott: Here's the list of segfaults in HEAD over the past four months (may have missed a few). Draw your own conclusions. To me it's pretty clear that it moves around and clusters in different tests as different things are committed.

  • Jan. 21: ImageFieldDisplayTest
  • Jan. 25: ImageFieldDisplayTest
  • Feb. 16: ContactStorageTest
  • Feb. 17: ContactStorageTest
  • Feb. 18: ContactStorageTest
  • Feb. 19: ContactStorageTest
  • Feb. 20: InstallUninstallTest
  • Feb. 21: InstallUninstallTest
  • Feb. 22: InstallUninstallTest
  • Feb. 28: EditorSecurityTest
  • Mar. 1: DateRangeFieldTest (repeatedly)
  • Mar. 6: EditorSecurityTest
  • Mar. 27: ConfigTranslationUiTest and DisplayTest
  • Mar. 28: ConfigTranslationUiTest
  • Mar. 28: DisplayBlockTest
  • Apr. 4: ShortcutSetsTest (repeatedly)
  • Apr. 14: InstallUninstallTest (repeatedly)
  • May 7: UpdateApiEntityDefinitionUpdateTest
  • May 7: FrontPageTest
alexpott’s picture

@xjm thanks for the list. It's really useful to help avoid duplicates.

Here's a patch that only disables GC when building the container. Running the script in #36 results in the same improvement, with garbage collection doing far less work: with the patch, after a container rebuild we have 4018 objects to garbage collect; without the patch we have 21399.

I'm also pretty sure that we can remove the gc_disable() from update.php with this patch too but I think we should investigate that in #2842393: Discover why gc_disable() improves update.php stability

alexpott’s picture

@catch suggested some improvements to the comment so that it is not PHP 5.6-centric. The reduction in the number of objects to garbage collect happens regardless of PHP version.

Also removed a test I was playing around with while trying to make the segfaults occur.

catch’s picture

Status: Needs review » Reviewed & tested by the community

Patch looks great. Since we have a little bit of time before the next patch release I'd be happy committing this to both 8.4.x and 8.3.x - the only way we can tell conclusively whether this helps test stability is if the rate of segfaults goes down. It would also be nice if we could remove the one-off fix in update.php, but agreed on doing that in its own issue; the fewer variables at once the better.

Bumping to RTBC.

Gábor Hojtsy’s picture

I agree this looks like it makes a ton of sense based on the above analysis.

tacituseu’s picture

The difference in number of zvals freed by gc_collect_cycles() is due to http://php.net/manual/en/features.gc.collecting-cycles.php:

If the root buffer becomes full with possible roots while the garbage collection mechanism is turned off, further possible roots will simply not be recorded. Those possible roots that are not recorded will never be analyzed by the algorithm. If they were part of a circular reference cycle, they would never be cleaned up and would create a memory leak.

Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Needs work

@tacituseu: hm, if this causes memory leaks then that does not sound RTBC to me?!

tacituseu’s picture

It's more of an increase in memory usage in this case; it should be freed at the end of the request. The way this patch differs from the workaround in update.php is that it isn't limited to the test environment (the code in update.php is behind drupal_valid_test_ua()).

alexpott’s picture

Status: Needs work » Needs review

So re the additional memory used: as #32 shows, during a standard install we actually don't use more memory. We do during a Drush minimal install, but not more than the 64 MB target.

Profiling just a rebuild shows us this:

HEAD

int(0)
int(21399)

default: 23.50 MiB - 625 ms

Patch

int(0)
int(4018)

default: 25.00 MiB - 618 ms

This is for a single rebuild, which is generally the case when installing a module via the UI. Personally I think this is an acceptable trade-off for something that only occurs during installation, extension changes and full cache clears. Also, regarding targeting only DrupalCI in update.php: the problem with this is that it assumes the segfaults only affect DrupalCI. Yes, it is doing things - especially the number of installations and extension changes - that no normal site is doing, but it is entirely possible that other sites are triggering segmentation faults here. We know that PHP garbage collection runs out of root buffer space at 10,000 objects for a normal PHP build. We know that rebuilding the standard container - on DrupalCI or not - will create 20,000+ objects.

tacituseu’s picture

@alexpott: speaking of segfaults, gave testbot a bit of a headache (see #2876705: No space left on device / broken test runners), sorry.

alexpott’s picture

So let's try that here :) It might break another bot but it is worth it.

tacituseu’s picture

Must be the issue @Mixologic mentioned in #25.

alexpott’s picture

So the test-only part of #47 completely cratered the test bot - see https://dispatcher.drupalci.org/job/drupal_patches/14991/

The patch in #47 is way, way more green than that.

I discussed whether or not we should add a drupal_valid_test_ua() check to the code - both of us are -1 on it, because whilst we are only seeing this in the testbot environment, it is highly likely that this affects real sites too. For me the additional memory cost is worth the stability.

Re-uploading #39.

tacituseu’s picture

#47 has no segfaults; it's unlikely to finish, though, only because it has a second pass under concurrency=1 as PHPUnit-FunctionalJavascript.

mpdonadio’s picture

+++ b/core/lib/Drupal/Core/DrupalKernel.php
@@ -870,6 +870,15 @@ protected function initializeContainer() {
+      $disable_gc = gc_enabled();
+      if ($disable_gc) {
+        gc_collect_cycles();
+        gc_disable();
+      }
       $container = $this->compileContainer();

I've been staring at this for a bit. Why do we need to save off whether GC is enabled here or not?

I trawled through the PHP issue queue. I think this is the bug we are hitting: https://bugs.php.net/bug.php?id=72286 May be worthwhile to add to the comment. Triggered a PHP7 test.

tacituseu’s picture

@mpdonadio: so that GC won't be force-enabled by gc_enable() later on environments that mean to have it disabled.

From https://bugs.php.net/bug.php?id=72286:

[2017-01-02 14:07 UTC] nikic@php.net
Closing this bug, as the issue is fixed in PHP 7.0 and PHP 5.6 is going out of active support.

The last submitted patch, 47: 2859704-47.patch, failed testing.

tacituseu’s picture

#47 timed out, results:
274 passes of \Drupal\node\Tests\Views\FrontPageTest for 8.3.x and 278 for 8.4.x, without a single segfault.
The control (test-only) patch had a ~20% failure rate (out of 50 tests) before the test bot gave up.

alexpott’s picture

Adding to the comment as per #51.

So we could add a version compare and only do this for PHP 5. Then we'd have no memory impact on PHP 7, and there are no reports of segfaults here on PHP 7. That seems like a good idea, given that https://bugs.php.net/bug.php?id=72286 is a known PHP 5.6 issue and there will not be a fix.

alexpott’s picture

mpdonadio’s picture

Status: Needs review » Reviewed & tested by the community

I like where this patch ended up, and am glad an upstream problem was identified (even though it won't be fixed) so there is less guessing at the cause.

The last submitted patch, 55: 2859704-55.patch, failed testing.

xjm’s picture

Status: Reviewed & tested by the community » Needs review
Issue tags: +Needs issue summary update

"Cratering the bot" with a timeout but no segfault doesn't seem to demonstrate the fix to me, just that we exceeded the timeout. I can see RTBCing the patch based on other reasons if we can't reliably prove a fix with CI, but I want to make sure we're not RTBCing on incorrect info.

there's no reports of segfaults here on PHP7

See #16 where I retitled the issue because I saw a segfault on PHP7. If @tacituseu is correct and a different bug caused that segfault, then at least we need to open a followup for the 7.1 segfault if it still exists. InstallUninstallTest has segfaulted on at least 5.5 and 7.1.

Also, in general, remember that the relative abundance of reports on 5.6 will always be higher because it is used for the primary test environment and so runs probably hundreds of times more tests than the other PHP versions.

Setting NR for (at a minimum) confirmation that neither #47 nor the statement about PHP 7 changes the RTBC, and to get a followup for the "other" segfault if there are two, as @tacituseu has suggested a couple of times.

tacituseu’s picture

@xjm: here's the "cratering" log https://dispatcher.drupalci.org/job/drupal_patches/14991/consoleText. It is the same patch as in #47, just without the workaround; it made so many core dumps that the test bot ran out of space. #47 times out because the number of repetitions is too high (100 would pass).

Re #16: the 7.1 failure is in the unserializer; I didn't see a single one like that in the 5.5-5.6 traces.

Edit: I wouldn't run #47 on 5.6 without the patch until #25 is resolved, as it disrupts other issues since the instance is alive for some time after and tries to schedule other patches.

alexpott’s picture

Status: Needs review » Reviewed & tested by the community

Thanks @tacituseu. So this is clearly RTBC, because timing out on DrupalCI with no segfaults proves the fix: without it we break a testbot and cause subsequent test runs to fail, while with the fix the testbot continues to work.

alexpott’s picture

Wrt the PHP 7.1 segfault observed in #16, perhaps we should just leave this issue open after the commit and monitor things for a while? And if we've seen no new segfault after 2 weeks then we can close it?

The alternative is to commit the patch without the PHP version check, because whilst we don't have a test case for PHP 7 failing like PHP 5, there's not much harm other than increased memory usage when doing a container rebuild, which is not exactly the most performant part of the code base anyway.

alexpott’s picture

I'm in favour of committing #56 and leaving the issue open. The nature of these faults is that they are intermittent; there's a lot of work and information contained in this issue, and we only have good evidence that the patch fixes one particularly common case. There's a good chance it will fix the other PHP 5 segfaults, but we cannot be sure. Also, there's nothing to stop us removing the version check at a later date if we can prove this makes PHP 7 more stable too.

catch’s picture

For me, I'd commit this, mark it fixed, and open a major task to remove the version check.

Garbage collection between PHP 5 and 7 is completely different, so I don't think we'll get the same segfault as such, but it still might be worth removing for other reasons.

xjm’s picture

Thanks @tacituseu, that's really helpful!

alexpott’s picture

Issue summary: View changes
alexpott’s picture

Issue summary: View changes

  • catch committed 83741b3 on 8.4.x
    Issue #2859704 by alexpott, tacituseu, xjm, vaplas, mpdonadio, catch,...

  • catch committed f798b80 on 8.3.x
    Issue #2859704 by alexpott, tacituseu, xjm, vaplas, mpdonadio, catch,...
catch’s picture

Status: Reviewed & tested by the community » Fixed

Opened #2876895: Intermittent segfaults on PHP 7.1 to keep going on the PHP 7.1 segfaults.

Committed/pushed to 8.4.x and cherry-picked to 8.3.x, thanks!

Wim Leers’s picture

Wow, impressive research here!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

xjm’s picture