Problem/Motivation
DrupalCI currently uses a number of work-arounds to signal itself, Jenkins, and D.O what's going on.
These different strategies are inconsistent and hard to maintain.
Proposed resolution
Based on #2843562-5: Patch plugin doesn't signal an application error
Some design requirements for DrupalCI exception handling and other signaling:
A D.O-level ‘CI Error’ should mean that the CI system broke, not that there was an testing or validation error during CI.
All drupalci code should strive to avoid ‘CI Error’ at D.O level. ‘CI Error’ comes from drupalci returning anything other than 0.
This means BuildTaskException
should always be handled, and should always be handled consistently by Build::executeBuild()
. Intermediaries can catch and process these exceptions, but then must re-throw them.
The Build
object eventually turns a BuildTaskException
into buildstate.json
. (todo: rename buildstate, give BTE a way to generate its own json.)
buildstate.json
is a single quoted string which can be read by D.O. It should contain a user-facing explanation of what happened. It could appear on D.O in the in-comment test run results, such as "Build Successful" or "CI Error".
Plugins should handle exceptions from code they call, for likely issues such as a container is missing. These should then be turned into a BuildTaskException
which is then thrown to be handled by Build
.
Comments
Comment #2
Mile23Comment #3
Mile23Note that
Build::executeBuild()
says this:That is, it always returns 2 for an exception.
Comment #4
MixologicRight, thats what we want.
This:
Reads funny.
It is only *currently* that way, because anything other than 0 causes jenkins to FAIL the build. Drupal.org does not *currently* know what to do with FAILURE.
So, We first have to give drupal.org something it can use when it sees FAILURE - i.e. buildstate.json needs to be populated with a short message describing the nature of the failure: Patch Failed to Apply, PHP syntax Error, Composer Require Error, etc etc.
Second thing is change drupal.org to not just default to CI ERROR when it sees FAILURE. It should see FAILURE and then parse buildstate.json and display that message Instead of "DrupalCI Error"
Then, once that is working, we can return to having drupalCI spit out proper signal responses. I..e it should always spit out a 2 if there is a BuildTaskException, and 1 if there were failed assessments, and 0 if everything is hunky dory. And probably something other than 0, 1, or 2 if a portal in the spacetime continuum opens up to unlighted chambers beyond time and space amidst the muffled, maddening beating of vile drums and the thin monotonous whine of accursed flutes, and Azathoth gets up off his throne and declares a segfault.
Comment #5
MixologicRe-reading it seems like you were already saying what I said. Carry on.
Comment #6
MixologicIm gonna take a quick stab at this. It was actually listed on my sprint as a thing to do before next week anyhow.
Comment #8
MixologicOkay, I havent actually ran any tests on this yet, but I did wanna get this WIP out there.
Some TODOS: calls to terminateBuild ought to be < 50 chars and make sense, and spit out good data in the exceptionDetails. for example, the Patch plugin should fix this: #181174: Included rejected patch file as text on results page by sticking whatever patch failure message we have into the details.
Im gonna go ahead and 'unplan' this and convert it to a task.
Comment #9
Mile23I like this architectural change to drupalci, but I'm switching this back to a plan so we can be clear on what issues need to happen in other projects.
BTW the tests fail. I'd fix them but I still don't understand a plan of action, which is why this issue is a plan. :-)
OK, so we have three things, and this will largely restate what I think you're saying:
So if the constraint where Jenkins gives us the choice between 0 or ARMAGEDDON is hard (or at least not something we want to muck with), then we can do the following:
This would mean that we should never return anything but 0 from drupalci if we can help it, but it also means we have to modify D.O to read in our status file.
This leads to two questions:
Comment #11
MixologicOkay, I fixed the tests - some were due to things erroring a little harder than they did previously (like git cloning over the top of a replicated local checkout). I forgot to add in the Issue number when I pushed the commits so they are here:
jenkins has three relevant statusus, that are shouty for some reason, but I reshout them here:. PASS, UNSTABLE and FAILED. Since we are executing drupalci via a jenkins bash plugin, It assumes that 0 = success, and 1 or greater = FAILED. The only way we get to UNSTABLE (which means, drupalci ran okay, but some tests did not pass), is by returning 0, and also having some test xml that jenkins can interpret as meaning "mark unstable"
So the constraint jenkins gives us isnt hard, its just that we dont want to muck with it *yet*.
D.O is currently interpreting FAILED as ARMAGEDDON, and yes, I've discussed it with Neil, but haven't opened an issue, to have d.o. read in our status file. I wanted to have our status file contain valid info, be named the right thing, and have a format defined inside it. I.e. we need to button down what the API is going to be.
But basically instead of D.O seeing FAILED via the jenkins api and reporting END_OF_WORLD, we want it to see FAILED at the api and read in our buildoutcome.json, and tell the user why any plugin decided to terminate the build.
So to answer your second question: drupalci shouldnt be fixed to only ever return 0. It should eventually return 0 for everything passed, 1 for tests failed, and 2 for exceptions encountered. The only thing that we *currently* care about overriding to be 0 is applying patches as that is currently the only thing that *should* be a 2, but isnt so that d.o. can get its dummy xml file to communicate to the user that a patch failed to apply.
So the next steps are
1. Make sure that buildoutcome.json is populated with a sane message every time something craps out. (I just renamed it)
2. Once 1 is done, Open an issue with pift to start consuming that data when it detects a FAILURE.
3. Once 2 is done, we can reset all of the signals that drupalci emits back to their ideal state (0 - all things go,1 - assesments failed,2 - exception thrown).
Once all of that is finished we can think about creating a service for the buildResult, and allowing plugins to *also* insert summary news? It might be better for d.o. to not have to figure out what it should display in a little green/red/white pill. e.g. when we add codesniffer, we'll have our X tests passed, X tests failed pills. Do we add new, codesniffer pills? if so, what do they say? or do we just leave that up to d.o. to sort out and not have plugin's define that?
Comment #13
Mile230436a1e uses
\JsonSerializable
and turnsBuildResultsInterface
into an immutable, by removing setters, because no one should be changing result outputs.Added a test to ensure that we have the keys we expect in
BuildResults
, because eventually that will becomebuildoutcome.json
.Still needs a test to ensure that
buildoutcome.json
was written and that it contains the keys we need. We'd probably add that to existing pass and fail tests.OK, so if the rule is that
buildoutcome.json
can only contain one status output, then D.O just says what it says. The user can then poke around for more info, and we're done. :-)If we want to have a sort of warning status along with the pass/fail/explode, then we might add that. For instance, if coder fails, but doesn't fail the test run, then it could have its own pill. We'd want to structure that into
buildoutcome.json
, so it could look like this:Then D.O could pick that up and figure out what to do with it.
This would mean that if you send errors-dont-stop-build to a given plugin, then errors get demoted to warnings.
Comment #15
Mile23OK, so after chatting in IRC I seem I'm barking up the wrong tree WRT having more warning stuff in
buildoutcome.json
. Jenkins has us covered.Added tests for the presence of
buildoutcome.json
in artifacts to all the Application tests.Comment #16
MixologicDood. b56007c is awesome. Thanks.
Also, This altogether looks like a good first step. Im going to push it to dev and watch all the tests.
I'll mark the 'plan' needs work, because I need to open some follow ups for D.O.
Also, we need to add a couple of things into simpletest that detect other failed states, like 'no tests found' - that one is sorta tricky because we call the plugin twice, and there might not be javascript tests, so its almost like if *both* simpletest plugins failed we need to set the build outcome to "No tests executed"
Comment #17
MixologicOther situations for buildstate:
Fatal errors running run-tests to get the groups either because of faulty php, or because of missing @groups:
https://dispatcher.drupalci.org/job/default/284809/console
Invalid composer dependency: (note that broken_dependency isnt a thing anymore, but real, wrong ones may still exist somehow)
https://dispatcher.drupalci.org/job/default/284623/console
The requested package drupal/broken_dependency could not be found in any version
Another good example of that is something that adds a 'repositories' key to their modules composer.json (a nono) and then 'requires' that : http://cgit.drupalcode.org/yamlblock/tree/composer.json
Comment #18
MixologicAllright, now we're getting somewhere.
Lots of things have been added back in, so now we should audit all of the BuildTask plugins to make sure that when things fail, and if they fail hard (like cannot keep processing) we 'terminate' the build.
CreateDatabase is a good example of this. If it cannot create the database, it spews an error message, and keeps on truckin, and it ought not do that.
Comment #19
Mile23Well the origin branch for this has been deleted, and when I merged dev into it locally there was no diff between it and dev. So marking this closed. :-)