Webchick pointed out a situation where no patches could work, and two commits had just been pushed, but 8.x *said* it was green.

On investigation, it turned out that the second commit had an error, and a retest demonstrated this.

And of course, I've seen qa say "Retest request ignored because test already in progress". But I never thought much about how bad this could be. If I'm describing this correctly, then there are lots of opportunities for false "branch clean" scenarios, where the branch is in fact not clean; the result is a flaw in the branch *and* all patches failing.

If I understand this correctly, it's a relatively serious error that we should fix.


Comments

jthorson’s picture

Priority: Normal » Major

Yup ... we saw this a few weeks ago, and again at Munich.

Any commits which are pushed while that particular branch is already testing do not get tested individually. Thus, it is possible for an untested commit to break HEAD without being caught by a branch test, after which all patches fail.

However, refactoring this to ensure that *every* commit gets tested individually will be quite a challenge ... the current code does not allow re-queuing a test which is already queued (i.e. until PIFT receives a result), and the testbots do a generic checkout of the latest code when testing, with no existing logic for checking out a specific commit. Jimmy and I have discussed adding commit IDs to tests in the next iteration of the code, but the current code was not structured around this requirement.

Instead of 'every' commit getting tested, I'd propose a solution where we place some code in a commit hook to check whether the branch is currently under test, and if so, flag that branch as needing re-testing once the current run completes. We could then check this 're-test' flag when building the test array sent from PIFT to PIFR. This approach would still require some database changes if we wanted to apply it against all projects ... but if we were to build it just for Drupal core (for the time being), we could use a variable to store the flag and get away without touching the database (which I want to avoid for reasons of stability during the push to D8 code freeze).
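A minimal sketch of that flag logic (in Python for illustration; the actual PIFT/PIFR code is PHP, and all of these names are hypothetical, not from the real codebase):

```python
# Hypothetical sketch of the proposed "retest flag" approach.
# None of these names come from the actual PIFT/PIFR code.

branch_under_test = set()   # branches with a test currently running
retest_flag = set()         # branches needing re-testing once done

def queue_test(branch):
    """Stand-in for PIFT sending a branch test request to PIFR."""
    branch_under_test.add(branch)

def on_commit(branch):
    """Commit hook: if the branch is already testing, flag it instead."""
    if branch in branch_under_test:
        retest_flag.add(branch)
    else:
        queue_test(branch)

def on_test_complete(branch):
    """When results arrive, immediately re-queue if the flag was set."""
    branch_under_test.discard(branch)
    if branch in retest_flag:
        retest_flag.discard(branch)
        queue_test(branch)
```

The point of the flag is that any number of commits during a test run collapse into a single follow-up run, so the branch always ends up tested at (or after) its latest commit.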

penyaskito’s picture

I'm not very familiar with the internals of the current CI system, but consider that ensuring *every* commit is tested and shown on the testing page on qa.d.o would also make patch rerolling easier, which is a nice improvement for speeding up development.

If this is feasible in the mid term, it deserves to be reconsidered.

rfay’s picture

Actually, I'd just propose a reliable "Cancel testing" technique be introduced (which we need for various reasons), and then any commit which encounters "already testing", which we already catch, can do "Cancel" and "Test".

jthorson’s picture

Untested, but this may do it.

jthorson’s picture

Status: Active » Needs review
rfay’s picture

So if I read it right, that could get several tests going at once. And the last to complete would be the last to set the branch status. But there's no guarantee that the last to complete would be the latest commit...

boombatower’s picture

Status: Needs review » Active

From what we discussed in IRC this makes sense: the testbot will continue testing and simply have its results ignored, so we just keep resetting the tests. The downside is that we never get results when commits are made very rapidly, but I don't think that is something we should worry about, especially given that conduit has revisions on results, so we can handle this cleanly there.

nit: unnecessary parentheses

boombatower’s picture

Status: Active » Needs review

cross-post

@rfay: the results from any testbots still running when the test is reset should be ignored. PIFR should only accept results from the "current" active testbot for a specific test.

testbot #1 starts on test #2
commit
test #2 requeued
testbot #3 starts on test #2
testbot #1 reports and is ignored, performs a next() like normal
testbot #3 reports and test #2 is finished

Alternatively, if testbot #1 reports after testbot #3, the results from #1 should still be ignored.
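The sequence above boils down to an "only the current assignee counts" rule. A sketch of it (Python for illustration; the real PIFR code is PHP and these names are hypothetical):

```python
# Hypothetical sketch: the server ignores results from any testbot that
# is no longer the current assignee of a test. Names are illustrative.

current_assignee = {}  # test_id -> client_id of the testbot now assigned
results = {}           # test_id -> accepted result

def assign(test_id, client_id):
    """(Re)assign a test; any testbot previously running it goes stale."""
    current_assignee[test_id] = client_id

def report(test_id, client_id, result):
    """Accept a result only from the currently assigned testbot."""
    if current_assignee.get(test_id) != client_id:
        return False  # would be logged as "Test not assigned to client"
    results[test_id] = result
    return True
```

With this rule, report order no longer matters: whether the stale testbot reports before or after the current one, its result is rejected either way.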

rfay’s picture

@boombatower, Works for me then.

boombatower’s picture

If we do this, then we might want to come up with a way (possibly by reading the log) to distinguish these cases: either change the log message status, or leave the message as is: Test not assigned to client: (t: [test_id], c: [client_id]).

jthorson’s picture

boombatower’s picture

Status: Needs review » Reviewed & tested by the community

Any reason for the extra space above?:

define('PIFR_SERVER_LOG_TEST_REQUEST_RETEST', 11);

Otherwise, it looks like what we discussed, only better, since it has a new log message that will make things nice and clear. (Needs testing.)

jthorson’s picture

The rest of the log definitions were grouped with like constants, using whitespace to delineate the groups.

The extra space was to separate this definition from the previous logical grouping.

boombatower’s picture

Makes sense; I just can't see any in context, and it's been a long time. :) Nice work.

rfay’s picture

To test it, let's add the output of git rev-parse HEAD for the project under test to the log (just add a new git command along with the rest of them). That way we'll know exactly which commit was tested, not just which branch. We should have done this a long time ago. Also, I didn't look carefully, but the watchdog should show the test interruption and replacement.
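A minimal sketch of capturing that commit ID (Python for illustration only; the actual testbot scripts are PHP and the function name here is made up):

```python
# Hypothetical sketch: record the exact commit under test by running
# "git rev-parse HEAD" in the checkout directory, alongside the other
# git commands the testbot already runs.
import subprocess

def commit_under_test(checkout_dir="."):
    """Return the full SHA-1 of HEAD in the given checkout."""
    return subprocess.check_output(
        ["git", "rev-parse", "HEAD"], cwd=checkout_dir, text=True
    ).strip()
```

Logging this one line per test run is enough to tie every reported result back to a specific commit.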

jthorson’s picture

Since we don't really have a staging site with full git integration, I'm going to commit this, roll an alpha release, and deploy it to prod to verify the functionality.

jthorson’s picture

Committed to 6.x-2.x. (Commit ID: 77402e8)

jthorson’s picture

Status: Reviewed & tested by the community » Fixed

Deployed to staging and prod. Will untag the '-alpha1' on the release after a week of soak time.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.