Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Status: Needs review » Needs work

The last submitted patch, drupal8.run-tests-fatal-clean.0.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
4.38 KB

Added the test-specific error.log to the review log output.

Apparently, I do not see any output from the new cleanup function in the review log for patch #0 in the testbot review log. However, this works correctly on my local machine. That possibly means that the simpletest_test_id table does not contain a db prefix on the testbot. (?)

Status: Needs review » Needs work

The last submitted patch, drupal8.run-tests-fatal-clean.2.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
4.55 KB

Review log for #2 contains the expected cleanup operations after the fatal error.

However, there's a second fatal error in another test case now.

The $test_id appears to be the same for all test runners? How is that possible? For Simpletest module, it changes for every test case being executed in the batch.

sun’s picture

I fail to see how {simpletest_test_id}.last_prefix is able to work when tests are executed concurrently.

run-tests.sh calls into simpletest_last_test_get() (like Simpletest module itself), but that is totally not designed for concurrency in parallel threads.

Status: Needs review » Needs work

The last submitted patch, drupal8.run-tests-fatal-clean.4.patch, failed testing.

sun’s picture

Status: Needs work » Needs review

The concurrency problem with last_prefix cannot be fixed without a database schema change, which inherently means it cannot be fixed without adjusting the entire PIFR stack.

The PIFR stack must work across major versions of Drupal, so it would need a conditional, version-agnostic adaption all over the place.

I've no interest in pursuing that.

So while this patch here fixes a tough problem, which causes @jthorson to have to manually reset and clean individual testbots every second day, it's not possible to commit the fix.

sun’s picture

Alternatively, let's see what happens if we assign a new test_id for every class. This might blow up though.

Status: Needs review » Needs work

The last submitted patch, drupal8.run-tests-fatal-clean.8.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
7.83 KB
8.44 KB

Looks like that stop-gap fix worked. Not sure about PIFR implications, but attached patch completes the fix and properly removes the bogus assumption that there's only one $test_id.

sun’s picture

Priority: Normal » Critical
Issue tags: +Needs backport to D6, +Needs backport to D7
sun’s picture

oink. Now the review log contains the additional cleanup debug messages for each test case:

Node access and fields 50 passes, 0 fails, and 0 exceptions

- Found database prefix 'simpletest644034' for test ID 280.
Node access rebuild 19 passes, 0 fails, and 0 exceptions

- Found database prefix 'simpletest983969' for test ID 282.
Node access pagination 30 passes, 0 fails, and 0 exceptions

- Found database prefix 'simpletest60768' for test ID 281.
Node access records 26 passes, 0 fails, and 0 exceptions

- Found database prefix 'simpletest433378' for test ID 283.
Node Access on any table 159 passes, 0 fails, and 0 exceptions

That should only be output in case of a fatal error. Need to think how to fix this.

sun’s picture

Attached patch simplifies the changes and should resolve the verbose debug output issue for positive tests.

Status: Needs review » Needs work

The last submitted patch, drupal8.run-tests-fatal-clean.13.patch, failed testing.

sun’s picture

Status: Needs work » Needs review

So #13 actually works as intended.

However, as of now, DrupalUnitTestCase only sets and changes the database prefix, but does not update it correctly in Simpletest's test_id table. The patch in #1563620: All unit tests blow up with a fatal error fixes that.

catch’s picture

So #13 exposes the unit tests bug that's been hidden for months?

Berdir’s picture

Status: Needs review » Needs work
+++ b/core/scripts/run-tests.shundefined
@@ -347,8 +342,12 @@ function simpletest_script_execute_batch($test_id, $test_classes) {
         if ($status['exitcode']) {
-          echo 'FATAL ' . $test_class . ': test runner returned a non-zero error code (' . $status['exitcode'] . ').' . "\n";
+          echo 'FATAL ' . $child['class'] . ': test runner returned a non-zero error code (' . $status['exitcode'] . ').' . "\n";

Separate issue, but is there a way to extract the actual error and display it somehow? It can be very hard to see what the actual error is currently.

I'm not sure what exactly is the state here, does this need to go in first or the linked issue? Or doesn't it matter? Anyway, this certainly needs a version of the patch without the forced fatal error in node.test.

sun’s picture

Status: Needs work » Needs review

The actual error is supposed to be captured and inserted into SimpleTest's assertion table. In case of fatal errors the actual error is read from the PHP error.log, and this happens after the fact (and is only possible in run-tests.sh, since simpletest.module runs in the same process as the test).

Note that this issue/change is about web/integration tests, not unit tests.

In order to be able to clean up the test environment after a test run with fatal error(s), the test runner needs to know the database prefix that was used to set up the test.

Simpletest normally records the database prefix for each test being run. Simpletest itself attempts to clean up the test environment after each test run. In case of a fatal error, those Simpletest functions are no longer executed.

The current run-tests.sh script does not contain any clean up code currently. This patch basically adds the same clean-up operations to run-tests.sh, which are executed whenever a test runner completes.

However, run-tests.sh supports a special notion of concurrently executing multiple test runner threads in sub-processes, which was only added to the run-tests.sh script, but not to Simpletest. Right now, run-tests.sh uses the same test ID for all tests being executed. This means that, in concurrency, run-tests.sh is not able to retrieve the database prefix that was used to setup the test site, because the concurrently executed test runners overwrite the recorded database prefix. Therefore, this patch changes run-tests.sh to use a separate unique test ID for each test runner, so each test runner can retrieve the database prefix, and in turn, clean up the environment after test execution.

This also implies a functional change -- the configuration option that controls whether the test environment should be cleaned after a test run is ignored.

Unit tests do not properly set up a database prefix currently - even though they are supposed to record any errors and fatal errors in the same way like web tests. That's why this patch depends on #1563620: All unit tests blow up with a fatal error

sun’s picture

Note: We're testing these patches in #1591812: Sun's Simpletest patches currently.

sun’s picture

Technically, this should work correctly now.

Status: Needs review » Needs work

The last submitted patch, drupal8.run-tests-fatal-clean.20.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
8.55 KB

Excellent.

And now. Let's try to kill a bot.

Status: Needs review » Needs work

The last submitted patch, drupal8.run-tests-fatal-clean.22.patch, failed testing.

sun’s picture

No improvement. :( Sorry.

http://qa.drupal.org/pifr/test/135299 is stuck with another, subsequent test now.

The review log still contains MySQL data file I/O errors. I don't know why.

Technically, the concurrency fix for run-tests.sh worked. But at some point, MySQL suddenly starts with I/O errors on data files.

I wonder whether the testbot client itself attempts to perform additional clean-up operations on its own, which might conflict?

EDIT: The only actual improvement is that the patch/test didn't come back as "green" (with 0 passes).

Berdir’s picture

That does sound like a full file system (RAM in our case) issue...

sun’s picture

Right. However, that's exactly what this patch tries to prevent.

Berdir’s picture

Well, I think it's lying, because it's not actually removing the tables.

I've applied the latest patch and started executing all tests with a concurrency of 4. And my table list kept growing and growing despite it telling me that it removed them. Did some SHOW TABLES like 'prefix%'; and the exact amount of tables that it claimed to have removed where still there.

Berdir’s picture

Ah, I know why.

  foreach (db_find_tables($db_prefix . '%') as $table) {
    db_drop_table($tables);
    $count++;
  }

Guess what's wrong here ;)

jthorson’s picture

Status: Needs work » Needs review
FileSize
10.39 KB

Nice catch!

Let's try this one. (Same patch, without the 's'.)

Status: Needs review » Needs work

The last submitted patch, drupal8.run-tests-fatal-clean.29.patch, failed testing.

jthorson’s picture

Nice ... expected failure, many fatals, but no PDO exceptions or "bot-killing" /tmpfs errors this time! That's easily the most exciting 'failed' test I've seen in a LONG time! :)

jthorson’s picture

This version strips out the entity_create('foo'), and the commented code inside of index.php EDIT: Half of a patch which shouldn't have been here in the first place! I believe that should be enough to run green (and thus represent the patch for actual commit).

The last submitted patch, drupal8.run-tests-fatal-clean.32.without-fatal.patch, failed testing.

jthorson’s picture

Doh ... looks like a previous patch snuck into #29.

Here's two patches ... one with the test-bot killing fatals (#22 with the extra 's' removed), and one without (should be the 'green' patch for application).

Status: Needs review » Needs work

The last submitted patch, drupal8.run-tests-fatal-clean.34.without-fatal.patch, failed testing.

jthorson’s picture

Status: Needs work » Needs review

Odd ... the without-fatal test failed the first time through, with PDO errors on an image-related test. Re-test run came back clean.

sun’s picture

To make sure we're not introducing random test failures, I'm resending this patch for re-test all over again a bit... (already done so two times)

sun’s picture

Dries’s picture

Status: Needs review » Reviewed & tested by the community

I reviewed this patch and it looks good. I'm going to mark it RTBC and give it a bit more time before getting this committed.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to 8.x. Thanks.

Berdir’s picture

Looking a http://qa.drupal.org/pifr/test/279838, which has some fatal errors but also working tests. The testbot however now doesn't list anymore which tests failed and which don't, so you have to go through the detail log to find out what exactly happened. So the nice thing is that you actually can do that and actuall see a PHP fatal error there but there is no nice overview anymore.

Can we do anything anywhere (run-tests.sh, testbot, ..) to improve that?

catch’s picture

Version: 8.x-dev » 7.x-dev
Status: Fixed » Patch (to be ported)
jthorson’s picture

aspilicious’s picture

Status: Patch (to be ported) » Needs review
FileSize
2.07 KB
8.32 KB

Patch and a diff to proof the patch is identical as the D8 version.

sun’s picture

Status: Needs review » Postponed

heh. Just wanted to cry and ask why and how on earth #44 was able to pass, but "fortunately", only the reported test result is misleading - the review log contains fatal errors.

That is, because this patch depends on the changes in #1563620: All unit tests blow up with a fatal error, so postponing on that :)

sun’s picture

The primary dependency landed: #1541958: Split setUp() into specific sub-methods

The second dependency is: #1563620: All unit tests blow up with a fatal error

After that, we can backport this one (whereas this patch is the most straightforward of the chain).

sun’s picture

Version: 7.x-dev » 8.x-dev
Status: Postponed » Closed (fixed)
Issue tags: -Needs backport to D6, -Needs backport to D7

The primary purpose of backporting these changes was to retain a comparable testing framework between D8 and D7, so as to make future backports easier.

Getting close to D8 feature freeze, and facing a heavily diverged simpletest framework between D8 and D7, it's close to impossible to retain "similarity" between major core versions, and doing so involves a huge risk of breaking things badly in D7.

Therefore, I'm calling out the end to testing framework backports.

For D7, you get to work with the current mess. At the very least, that's "predictable."

Needless to say, actual bug fixes are excluded from that. However, given that there are 30,000+ test assertions that pass against D7, people will have to make pretty good arguments and show solid proof that there is actually a bug somewhere. ;)

Let's focus on the future. (Which isn't Simpletest either way.)

Thanks everyone for working on this and making Drupal awesome! :)