Webchick pointed out a situation where no patches could work, and two commits had just been pushed, but 8.x *said* it was green.

On investigation, it turned out that the second commit had an error, and a retest demonstrated this.

And of course, I've seen qa say "Retest request ignored because test already in progress". But I never thought much about how bad this could be. If I'm describing this correctly, then there are lots of opportunities for false "branch clean" scenarios, where the branch is in fact not clean; the result is a flaw in the branch *and* all patches failing.

If I understand this correctly, it's a relatively serious error that we should fix.


Comments

jthorson’s picture

Priority: Normal » Major

Yup ... we saw this a few weeks ago, and again at Munich.

Any commits which are pushed while that particular branch is already testing do not get tested individually. Thus, it is possible for an untested commit to break HEAD without being caught by a branch test, after which all patches fail.

However, refactoring this to ensure that *every* commit gets tested individually will be quite a challenge ... the current code does not allow re-queuing a test which is already queued (i.e. until PIFT receives a result), and the testbots do a generic checkout of the latest code when testing, with no existing logic for checking out a specific commit. Jimmy and I have discussed adding commit IDs to tests in the next iteration of the code, but the current code was not structured around this requirement.

Instead of 'every' commit getting tested, I'd propose a solution where we place some code in a commit hook to check whether the branch is currently under test, and if so, flag that branch as needing re-testing once the current run completes. We could then check this 're-test' flag when building the test array sent from PIFT to PIFR. This approach would still require some database changes if we wanted to apply it against all projects ... but if we were to build it just for Drupal core (for the time being), we could use a variable to store the flag and get away without touching the database (which I want to avoid for reasons of stability during the push to D8 code freeze).
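A minimal sketch of that flag logic (in Python for illustration; the actual PIFT/PIFR code is PHP, and all of these names are hypothetical, not from the real codebase):

```python
# Hypothetical sketch of the proposed "retest flag" approach.
# None of these names come from the actual PIFT/PIFR code.

branch_under_test = set()   # branches with a test currently running
retest_flag = set()         # branches needing re-testing once done

def queue_test(branch):
    """Stand-in for PIFT sending a branch test request to PIFR."""
    branch_under_test.add(branch)

def on_commit(branch):
    """Commit hook: if the branch is already testing, flag it instead."""
    if branch in branch_under_test:
        retest_flag.add(branch)
    else:
        queue_test(branch)

def on_test_complete(branch):
    """When results arrive, immediately re-queue if the flag was set."""
    branch_under_test.discard(branch)
    if branch in retest_flag:
        retest_flag.discard(branch)
        queue_test(branch)
```

The point of the flag is that any number of commits during a test run collapse into a single follow-up run, so the branch always ends up tested at (or after) its latest commit.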

penyaskito’s picture

I'm not very familiar with the internals of the current CI system, but consider that ensuring *every* commit is tested and shown on the testing page on qa.d.o would also make patch rerolling easier, which is a nice improvement for speeding up development.

If this is feasible in the mid term, it deserves to be reconsidered.

rfay’s picture

Actually, I'd just propose a reliable "Cancel testing" technique be introduced (which we need for various reasons), and then any commit which encounters "already testing", which we already catch, can do "Cancel" and "Test".

jthorson’s picture

Untested, but this may do it.

jthorson’s picture

Status: Active » Needs review
rfay’s picture

So if I read it right, that could get several tests going at once. And the last to complete would be the last to set the branch status. But there's no guarantee that the last to complete would be the latest commit...

boombatower’s picture

Status: Needs review » Active

From what we discussed in IRC this makes sense: the testbot will continue testing and simply have its results ignored, so we just keep resetting the tests. The downside is that we never get results when commits are made very rapidly, but I don't think that is something we should worry about, especially given that conduit has revisions on results, so we can handle this cleanly there.

nit: unnecessary parentheses

boombatower’s picture

Status: Active » Needs review

cross-post

@rfay: the results from any testbots still running when the test is reset should be ignored. PIFR should only accept results from the "current" active testbot for a specific test.

testbot #1 starts on test #2
commit
test #2 requeued
testbot #3 starts on test #2
testbot #1 reports and is ignored, performs a next() like normal
testbot #3 reports and test #2 is finished

Alternatively, if testbot #1 reports after testbot #3, the results from #1 should still be ignored.
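The sequence above boils down to an "only the current assignee counts" rule. A sketch of it (Python for illustration; the real PIFR code is PHP and these names are hypothetical):

```python
# Hypothetical sketch: the server ignores results from any testbot that
# is no longer the current assignee of a test. Names are illustrative.

current_assignee = {}  # test_id -> client_id of the testbot now assigned
results = {}           # test_id -> accepted result

def assign(test_id, client_id):
    """(Re)assign a test; any testbot previously running it goes stale."""
    current_assignee[test_id] = client_id

def report(test_id, client_id, result):
    """Accept a result only from the currently assigned testbot."""
    if current_assignee.get(test_id) != client_id:
        return False  # would be logged as "Test not assigned to client"
    results[test_id] = result
    return True
```

With this rule, report order no longer matters: whether the stale testbot reports before or after the current one, its result is rejected either way.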

rfay’s picture

@boombatower, Works for me then.

boombatower’s picture

If we do this, then we might want to come up with a way (possibly by reading the log) to distinguish these cases: either change the log message status, or leave the message as is: Test not assigned to client: (t: [test_id], c: [client_id]).

jthorson’s picture

boombatower’s picture

Status: Needs review » Reviewed & tested by the community

Any reason for the extra space above?:

define('PIFR_SERVER_LOG_TEST_REQUEST_RETEST', 11);

Otherwise, it looks like what we discussed, only better, since it has a new log message that will make things nice and clear. (Needs testing.)

jthorson’s picture

The rest of the log definitions were grouped with like constants, using whitespace to delineate the groups.

The extra space was to separate this definition from the previous logical grouping.

boombatower’s picture

Makes sense; I just can't see any in context, and it's been a long time. :) Nice work.

rfay’s picture

To test it, let's add the output of git rev-parse HEAD for the project under test to the log (just add a new git command along with the rest of them). That way we'll know exactly which commit was tested, not just which branch. We should have done this a long time ago. Also, I didn't look carefully, but the watchdog should show the test interruption and replacement.
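A minimal sketch of capturing that commit ID (Python for illustration only; the actual testbot scripts are PHP and the function name here is made up):

```python
# Hypothetical sketch: record the exact commit under test by running
# "git rev-parse HEAD" in the checkout directory, alongside the other
# git commands the testbot already runs.
import subprocess

def commit_under_test(checkout_dir="."):
    """Return the full SHA-1 of HEAD in the given checkout."""
    return subprocess.check_output(
        ["git", "rev-parse", "HEAD"], cwd=checkout_dir, text=True
    ).strip()
```

Logging this one line per test run is enough to tie every reported result back to a specific commit.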

jthorson’s picture

Since we don't really have a staging site with full git integration, I'm going to commit this, roll an alpha release, and deploy it to prod to verify the functionality.

jthorson’s picture

Committed to 6.x-2.x. (Commit ID: 77402e8)

jthorson’s picture

Status: Reviewed & tested by the community » Fixed

Deployed to staging and prod. Will untag the '-alpha1' on the release after a week of soak time.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.