Fatal PHP errors don't cause tests to fail [#560646]

Comment	File	Size	Author
#36	simpletest-completion-check.2.patch	2.88 KB	carlos8f
#30	simpletest-completion-check.png	15.7 KB	carlos8f
#28	simpletest-completion-check.patch	1.9 KB	carlos8f
#23	560646-runtests-concurrency.patch	1.52 KB	boombatower
#12	catch_fatals.patch	2.6 KB	chx
#11	catch_fatals.patch	2.6 KB	chx
#9	when_in_trouble_write_in_c.patch	2.06 KB	chx
#7	when_in_trouble_write_in_c.patch	2.05 KB	chx

Comment #1

boombatower commented 22 September 2009 at 00:53

Project:	Project Issue File Review	» Drupal core
Version:	6.x-1.x-dev	» 7.x-dev
Component:	Code	» simpletest.module

I thought about this the other day, and you most likely have confirmed it.

I think the current way of detecting fatal errors works 100% on single process running, but when running multiple processes concurrently like bot does it may not always work when the errors occur on the other side of an HTTP request. This is not a pifr issue, but instead a limitation of the error catching with simpletest.

This is a really fun issue to solve, but the solution here is to make it keep track of all the prefixes or something like that and then check all the error logs, or to use a common one, but I'm not sure if that will make write errors.

Log in or register to post comments

Comment #2

boombatower commented 22 September 2009 at 00:54

Title:

PIFR not catching fatal error in this patch

» SimpleTest does not *always* catch PHP fatal errors on HTTP request side when running in concurrency mode

Log in or register to post comments

Comment #3

damien tournoud commented 22 September 2009 at 06:35

Title:	SimpleTest does not always catch PHP fatal errors on HTTP request side when running in concurrency mode	» PIFR 1.x do not catch PHP fatal errors on parent site
Project:	Drupal core	» Project Issue File Review
Version:	7.x-dev	» 6.x-1.x-dev
Component:	simpletest.module	» Code
Status:	Active	» Closed (won't fix)

The only issue is that PIFR 1.x parse the result of running the run-tests shell script, and do not load the assertions that are in the DB. Because of the fatal error, the test results for a particular test are truncated. This will be solved by rolling PIFR 2.x. By the way, this only happens when fatal errors occurs on the parent site.

Log in or register to post comments

Comment #4

boombatower commented 22 September 2009 at 23:21

Title:	PIFR 1.x do not catch PHP fatal errors on parent site	» SimpleTest does not always catch PHP fatal errors on HTTP request side when running in concurrency mode
Project:	Project Issue File Review	» Drupal core
Version:	6.x-1.x-dev	» 7.x-dev
Component:	Code	» simpletest.module
Status:	Closed (won't fix)	» Active

Yes, it does only happen on parent site, which is what I meant by on the request side.

I thought it dump out the message in run-tests...I'll double check, but I am fairly certain it does.

Also the limitation I brought up is an actually issue and can occur.

Either way both of these problems are with SimpleTest and not PIFR.

So we need to double check run-tests.sh and fix the issue I mentioned about concurrency.

Log in or register to post comments

Comment #5

carlos8f commented 10 December 2009 at 03:12

subscribe.

Log in or register to post comments

Comment #6

carlos8f commented 10 December 2009 at 03:32

Title:	SimpleTest does not always catch PHP fatal errors on HTTP request side when running in concurrency mode	» Fatal PHP errors don't cause tests to fail
Priority:	Normal	» Critical

I hope we can get some attention to this issue, since it's obviously horrible for quality assurance. See #655814: Test: Error reporting with SimpleTest/PIFR where I did some testing on this.

Log in or register to post comments

Comment #7

chx commented 10 December 2009 at 16:19

Status:

Active

» Needs review

Status	File	Size
new	when_in_trouble_write_in_c.patch	2.05 KB

I wonder what the bot would say. This is a very C inspired patch hence the name :) Note that i registered printf to shutdown with because print is not callable like that.

Log in or register to post comments

Comment #8

10 December 2009 at 16:41

Status:

Needs review

» Needs work

The last submitted patch failed testing.

Log in or register to post comments

Comment #9

chx commented 10 December 2009 at 16:45

Status:

Needs work

» Needs review

Status	File	Size
new	when_in_trouble_write_in_c.patch	2.06 KB

Bah.

Log in or register to post comments

Comment #10

10 December 2009 at 17:11

Status:

Needs review

» Needs work

The last submitted patch failed testing.

Log in or register to post comments

Comment #11

chx commented 10 December 2009 at 18:55

Status:

Needs work

» Needs review

Status	File	Size
new	catch_fatals.patch	2.6 KB

We now test for content-type for text/html before printing.

Log in or register to post comments

Comment #12

chx commented 10 December 2009 at 19:15

Status	File	Size
new	catch_fatals.patch	2.6 KB

OK, OK

Log in or register to post comments

Comment #13

carlos8f commented 11 December 2009 at 03:08

Status:

Needs review

» Needs work

+++ modules/simpletest/drupal_web_test_case.php	2009-12-10 19:14:57 +0000
@@ -1285,8 +1285,18 @@ class DrupalWebTestCase extends DrupalTe
+    $headers = $this->drupalGetheaders();
+    if (isset($headers['Content-Type']) && $headers['Content-Type'] == 'text/html; charset=utf-8') {

Header names returned from drupalGetHeaders() are normalized to lowercase. This has to be 'content-type' to work.

+++ includes/bootstrap.inc	2009-12-10 18:54:43 +0000
@@ -2247,3 +2250,9 @@ function drupal_static_reset($name = NUL
+  if (drupal_get_http_header('Content-Type') == 'text/html; charset=utf-8') {

Should probably be drupal_get_http_header('content-type'), to be consistent with lowercase normalization on HTTP header names.

+++ includes/bootstrap.inc	2009-12-10 18:54:43 +0000
@@ -2247,3 +2250,9 @@ function drupal_static_reset($name = NUL
+    print chr(0);

Why use a null byte? Seems kinda sneaky :) Why not use an HTML comment like ?

This patch is creative, but after getting it to work, unfortunately I couldn't verify that it improved error reporting at all. It seems like it only attempts to address incomplete drupalGet() responses due to fatal errors. From my testing today, PIFR currently catches these, but run-tests.sh does not (with concurrency mode on *or* off).

I'm working with run-tests.sh right now and noticing that it actually passes tests that fail both the web test interface and the PIFR bot. See http://qa.drupal.org/pifr/test/22368 for one example: this actually passed run-tests.sh with 16/0/0 passes/fails/exceptions in DrupalErrorCollectionUnitTest. How strange...

This review is powered by Dreditor.

Log in or register to post comments

Comment #14

boombatower commented 11 December 2009 at 04:31

Assuming this is still about what I wrote the issue is when:

run-tests.sh --concurrency > 1

Log in or register to post comments

Comment #15

carlos8f commented 11 December 2009 at 04:47

@boombatower: i used three things: run-tests.sh with --concurrency 1, --concurrency 2, and no concurrency set. run-tests.sh passed one test class 16/0/0 while PIFR failed it 16/1/0 in each case. Unfortunately I can't reproduce exactly what command syntax I was using and with what patches, but I'll see if I can document it more.

I suspect it might have to do with batch size, i.e. running with a TestGroup or --class TestClass.

Anyway, I'm pretty determined to get to the bottom of this :)

Log in or register to post comments

Comment #16

boombatower commented 11 December 2009 at 05:19

As I attempted to document above, I can encounter the error or not even if failed test and concurrency > 1.

** > 1 simply means there is a chance. **

Essentially what has to happen is that a test with the fatal error has to complete after another test has started, thus the "last_prefix" field no longer points to the test prefix that had the fail, and it is never picked up.

There are two solutions:
1) Different method of detecting fatal errors, (not terribly simple)
2) Better way of grabbing errors from files.

As for #2 it would by simpler to pull from all files, but then we loss track of what test they are for and such, so we either need a place to store that array, maybe run-tests.sh only (prefer not, but web ui can't run multiple concurrency so doesn't have this issue).

Log in or register to post comments

Comment #17

carlos8f commented 11 December 2009 at 06:15

Thanks for the elaboration. As far as fatals in the .test code (which I think is the main issue, since PIFR has caught the ones in drupalGet() responses so far) it's extremely reproducible that entire test classes drop out of the PIFR results, so my thought is to have PIFR mark the entire class as failed if it enumerated it at the beginning and it didn't come back in the results. That would be like a 'Detect aborted test' check.

I will have more to say after I've installed PIFR locally and played around with it.

Log in or register to post comments

Comment #18

chx commented 11 December 2009 at 06:44

carlos8f, i used a 0 byte because we are writing after the so anything you write is illegal html anyways and need to be sliced off. I suspected something was wrong with the code and lower case it is :/ the problem with the latest path is the lack of Content-Type does not trigger the error handler, this needs investigation.

Log in or register to post comments

Comment #19

carlos8f commented 11 December 2009 at 10:11

@chx: since you're detecting and filtering out the null byte before parse() is called on the HTML, there seems to be no worry of invalid HTML. Not sure what you mean by lack of Content-Type, the content type is 'text/html; charset=utf-8' as far as I see. Anyhow, glad to have you working on this.

Log in or register to post comments

Comment #20

chx commented 11 December 2009 at 12:12

Well I would be glad if others would now work on this :) What I meant is that the current patch does if (isset($headers['Content-Type']) && $headers['Content-Type'] == 'text/html; charset=utf-8') but what if you have a fatal error and no Content-Type header? The previous patch ommitted this and then I got many failures. This requires a bit of investigation.

Log in or register to post comments

Comment #21

carlos8f commented 11 December 2009 at 20:06

Status:

Needs work

» Active

OK, so the latest incident involves #653940: Clean-up: Tests do not report all errors in which a fatal error was introduced in comment.test. It crashes browser-based SimpleTest as well as run-tests.sh, but PIFR somehow marked the patch as 'passed' and it was committed. This is unacceptable and really requires some attention.

Log in or register to post comments

Comment #22

glacialheart commented 11 December 2009 at 20:33

subscribing

Log in or register to post comments

Comment #23

boombatower commented 11 December 2009 at 22:55

Status	File	Size
new	560646-runtests-concurrency.patch	1.52 KB

This should fix problem once I figure out how to pull db_prefixes used.

Edit: patch is a bit off, but doesn't work yet anyway.

Log in or register to post comments

Comment #24

boombatower commented 11 December 2009 at 23:53

Yea, so I've been staring at the code...there is no way to relate prefixes to test info, unless we log in in some manor. The reason being that the prefix is created in DWTC and if that process dies..there goes the relation.

The data must either be logged in some manor (several possibilities), or we refactor the code as I have suggested and attempted to create the environments outside of test run and place test run in isolated process.

This provides a number of advantages including:
1) performance gain from re-using schemes
2) easy tracking of errors
3) complete and proper isolation which will remove static issues completely

Log in or register to post comments

Comment #25

12 December 2009 at 04:28

Status:

Active

» Needs review

biocomp.pat requested that failed test be re-tested.

Log in or register to post comments

Comment #26

carlos8f commented 12 December 2009 at 04:38

Status:

Needs review

» Needs work

I'm at work on a patch for PIFR that will implement the sanity check I described in #17. That will plug the gap, but the SimpleTest concurrency improvements will still be necessary for proper error reporting.

Log in or register to post comments

Comment #27

sun

German

Karlsruhe

commented 12 December 2009 at 21:13

subscribing

Log in or register to post comments

Comment #28

carlos8f commented 12 December 2009 at 21:22

Assigned:	Unassigned	» carlos8f
Status:	Needs work	» Needs review

Status	File	Size
new	simpletest-completion-check.patch	1.9 KB

OK, I think I have found a definitive solution in SimpleTest:

     // Iterate through all the methods in this class.
-    foreach (get_class_methods(get_class($this)) as $method) {
+    foreach (get_class_methods($class) as $method) {
       // If the current method starts with "test", run it - it's a test.
       if (strtolower(substr($method, 0, 4)) == 'test') {
+        // Insert a fail record. This will be deleted on completion to ensure
+        // that testing completed.
+        $method_info = new ReflectionMethod($class, $method);
+        $caller = array(
+          'file' => $method_info->getFileName(),
+          'line' => $method_info->getStartLine(),
+          'function' => $class . '->' . $method . '()',
+        );
+        DrupalTestCase::insertAssert($this->testId, $class, FALSE, t('The test did not complete due to a fatal error.'), 'Completion check', $caller);

My idea is to preemptively fail every test method. If execution does not complete, the fail record is left in the database!

I spent some quality time with PIFR last night (strictly professional of course) and verified boombatower's reports that fatal errors don't get read out of the logs reliably with concurrency > 1. Since my test machine had 2 cores, I set concurrency=2. A fatal error was reported about half the time. With the production test clients, the situation is different because of higher concurrency settings. It seems that the fatal error is almost never caught in production, where it is most vital to catch it.

The test results of this patch are very conclusive, namely that the fails are not caused by the patch itself, but by fatal errors in HEAD. The Comment tests are indeed currently broken as I described in http://drupal.org/node/653940#comment-2366180, and this is the first time it has been revealed!

I also benchmarked the effect of the extra queries, and determined that it doesn't noticeably increase the runtime of tests. So that's also good.

Log in or register to post comments

Comment #29

12 December 2009 at 22:01

Status:

Needs review

» Needs work

The last submitted patch failed testing.

Log in or register to post comments

Comment #30

carlos8f commented 12 December 2009 at 22:06

Status:

Needs work

» Needs review

Status	File	Size
new	simpletest-completion-check.png	15.7 KB

Attached a screenshot of what the results look like.

Log in or register to post comments

Comment #31

carlos8f commented 13 December 2009 at 02:25

Note that #28 will need a re-test after http://drupal.org/node/653940#comment-2369428 has landed.

Log in or register to post comments

Comment #32

sun

German

Karlsruhe

commented 13 December 2009 at 03:42

+1 from me, very nice patch

Log in or register to post comments

Comment #33

dries commented 13 December 2009 at 12:28

Comment #24 sounds compelling -- care to elaborate boombatower?

Log in or register to post comments

Comment #34

chx commented 13 December 2009 at 15:24

I emailed Jimmy because I had no queue access about something similar

       $this->setUp();
       db_insert('simpletest_fatals')->values(array('testId' => $this->testId, 'method' => $method))->execute();
       try {
         $this->$method();
         // Finish up.
       }
       catch (Exception $e) {
         $this->exceptionHandler($e);
       }
       db_delete('simpletest_fatals')->condition('testId', $this->testId)->condition('method', $method)->execute();
       $this->tearDown();

And then at the end iterate simpletest_fatals and there you have it. I am not sure I like the idea of inserting and deleting an assert like that. You are taking away any OOP advantages we had by breaking the encapsulation.

Log in or register to post comments

Comment #35

carlos8f commented 13 December 2009 at 20:10

@chx, see my patch in #28 that implements this :) i've pinged boombatower and waiting to hear back.

I think what boombatower's overhaul entails is basically isolating the test runs as much as possible. That would mean generating the testId in a parent process, spawning a child process to run the test and nothing more, and then doing cleanup in the parent process which would not have crashed if a fatal error occurred in the test. This is perfectly logical since one of SimpleTest's traditional flaws is that it isn't properly isolated from PHP, thus a fatal error ruins everything unless there are multiple layers of PHP to provide a fallback.

I didn't understand the 'performance gain from re-using of schemes' part though. Elaboration, yes.

Log in or register to post comments

Comment #36

carlos8f commented 13 December 2009 at 20:08

Status	File	Size
new	simpletest-completion-check.2.patch	2.88 KB

chx and I talked in IRC and decided to add a DrupalTestCase::deleteAssert() method since deleting it manually was kind of messy. DrupalTestCase::insertAssert() now returns a message ID to make it easier to delete.

Log in or register to post comments

Comment #37

13 December 2009 at 20:21

Status:

Needs review

» Needs work

The last submitted patch failed testing.

Log in or register to post comments

Comment #38

carlos8f commented 13 December 2009 at 20:39

Once again, the fail in #36 is caused by BROKEN TESTS ELSEWHERE:

The test did not complete due to a fatal error.	Completion check	options.test	60	OptionsWidgetsTestCase->testRadioButtons()

Running Option widget tests locally:

Fatal error: Call to undefined method OptionsWidgetsTestCase::assertFieldValues()

assertFieldValues() is in FieldTestCase, but OptionsWidgetsTestCase does NOT extend FieldTestCase. Problem!

This patch is worth it's weight in gold, as far as I'm concerned :)

Log in or register to post comments

Comment #39

carlos8f commented 13 December 2009 at 22:57

Status:

Needs work

» Needs review

Latest development in the saga:

http://drupal.org/node/552436#comment-2371556

Log in or register to post comments

Comment #40

14 December 2009 at 05:40

chx requested that failed test be re-tested.

Log in or register to post comments

Comment #41

chx commented 14 December 2009 at 18:48

Status:

Needs review

» Reviewed & tested by the community

This is very pretty especially with a specific id being deleted.

Log in or register to post comments

Comment #42

webchick

she/they

English

Vancouver 🇨🇦

commented 14 December 2009 at 19:25

Status:

Reviewed & tested by the community

» Fixed

This is awesome-tastic awesome souflé smothered in awesomesauce.

Thanks for your great work on this, carlos! Committed to HEAD.

Log in or register to post comments

Comment #43

boombatower commented 14 December 2009 at 19:50

So we don't actually catch the fatal error, hmm. This is a nice temporary patch, but it should record the actual error, otherwise the fact that qa.d.o displays assertions will be useless and I will get complaints that the bot fails and they don't know why.

Glad to see some initiative in providing a quick fix.

Log in or register to post comments

Comment #44

carlos8f commented 14 December 2009 at 20:07

boombatower:

It now points them to the file, method, and start line, and tells them 'fatal error', although obviously the actual error would be helpful. In a perfect world, every developer that writes a patch would run simpletest locally (at least with the tests that relate to their patch) before they submit to d.o. That way they would see the actual error 100% of the time and save our bot queue the trouble :)

Log in or register to post comments

Comment #45

boombatower commented 14 December 2009 at 20:28

That was the reason I didn't both including the assertion messages in the first place (and had issues with XML-RPC scaling). Yet, from my experience and complaints received this definitely isn't a perfect world. I agree with you, developers should debug tests locally.

Log in or register to post comments

Comment #46

chx commented 15 December 2009 at 01:04

So yeah but we needed pifr v2 and its cool detailed report because sometimes the bot fails differently to local to begin with. But, it's fine to know at least which method fatal'd.

Log in or register to post comments

Comment #47

29 December 2009 at 01:10

Status:

Fixed

» Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Log in or register to post comments

Comment #48

chrisns commented 31 March 2011 at 13:56

I did this patch which solved it for us
https://gist.github.com/896344/
simple but seems to work for now

Log in or register to post comments

Fatal PHP errors don't cause tests to fail

Comments