Testrunner code cleanup and refactoring [#2558929]

While working on the logic for version-specific job template selection, I encountered some issues with the order that logic was being executed within the RunCommand and JobBase classes; and realized that some of the logic there was going to need to be restructured.

Given that this was going to be a require a fairly invasive restructuring, I decided to also tackle some long-planned code cleanup activities which were going to be required at the same time.

I'll push my resulting work up to the repository for review and comments ... While the branch does run successfully locally, I anticipate there may be a number of merge conflicts which will need to be resolved before these changes can be moved across to production; given the changes which have occurred on that branch while this refactoring was taking place. I think the best approach would be (once this is tested as a standalone branch) for me to go back to the point where this diverged from production, determine where the merge conflicts are, and incorporate the logic associated with each of those commits into this new structure in order to 'catch up the logic' before a final merge to the production branch.

Comment	File	Size	Author
#4	2558929-jeremy_code_cleanup_and_refactor.patch	102.66 KB	Mixologic

Comments

Comment #1

28 August 2015 at 01:50

jthorson created an issue. See original summary.

Comment #2

jthorson commented 8 September 2015 at 03:49

Priority:	Normal	» Critical
Status:	Active	» Needs review

This needs testing and merging. It's holding up other work.

Branch: 2558929-jeremy_code_cleanup_and_refactor

Comment #3

isntall commented 9 September 2015 at 00:07

I ran several branch jobs thru using this code and they turned out the way I expected, with one exception. Upon running another job of that same type that passed as well.

Next will be for some contrib jobs, class jobs, etc.

Comment #4

Mixologic

He/Him/His

English

Portland, OR

commented 9 September 2015 at 18:24

Status	File	Size
new	2558929-jeremy_code_cleanup_and_refactor.patch	102.66 KB

Uploading a patch of this branch for dreditor code review purposes.

Comment #5

Mixologic

He/Him/His

English

Portland, OR

commented 9 September 2015 at 20:29

From what I see the following things are changing in the branch:

Re-architected how build step processing runs, and adding “argument prioritization”.

This is not fully implemented yet

Added JobCodeBase
Updating how error handling works for job buildsteps.
Job Definition validation.
Various code cleanup and renaming of variables.

Renaming all the drupalci to DrupalCI stuff, renaming things like jobType, getBuildVar etc.

It would be best to have variable renaming and such in a separate branch from structural/functional changes.

Changing DCI_Variable names

This can have impacts on the Jenkins jobs, as they set those environment variables

.
These changes do move us in a positive direction:

Having the Codebase as an object with properties will be very useful.
Encapsulating a JobRequest and separating out the JobDefinition is definitely the way we’d like to go, we can eliminate passing around arrays of data as arguments.
Error handling can be centralized for a job

There are a few things I would like to see improved or help improve:

Avoid array and string manipulation where possible for known patterns: Use pathinfo to get the extension of a file.
Use a yaml parser to turn a yaml file into data.
We should not rename variables to be less descriptive (avoid using a variable name like ‘$arg or $data’. Dockerphp made this mistake with a $data variable recently which caused the output stream to sever itself (https://github.com/ryanaslett/docker-php/pull/1)).
Uncaught exceptions that are known should be caught in a try/catch block.
Ideally we want to avoid using class methods. Output::error is something we want to get rid of, as we cannot test for it.
Consistent coding standards (PSR2 or Drupal Coding standards) would help speed up review time.
Creation of the job codebase, job definition, and build_id, etc. - I believe all of that should go into the constructor of the job, rather than the execute portion of RunCommand. I see those as all parts of the job, not parts of the run command. Using RunCommand spreads logic about what’s happening around, instead of encapsulating what’s going on with a job into one place.
Wherever possible, smaller, compartmentalized branches for changes will make review much easier and faster. Many of the bullet points above could have been separate branches.

Comment #6

hestenet

He/Him

Portland, OR 🇺🇸

commented 9 September 2015 at 20:44

As we’ve talked about rather extensively outside this issue - we’d prefer to have input into new features, refactoring, changes in architectural direction before the time is put into making those changes. Your roadmap and vision for DrupalCI is not the same as our own roadmap. When there are changes of this magnitude it will take us some time to review them. With the current state of the codebase lacking tests we are very paranoid about deploying any code and are trying to ensure we’ve done thorough internal review and as much manual testing as we can.

Overall, this is code we can move forward with, and as mixologic outlined above it does some great things for us. However, with webchick’s expressed concerns about risking the stability of the test runner we need to be certain that doing so won’t be disruptive to the RC process or Barcelona sprints. Because it is not critical to RC1 it simply feels safer to merge after Barcelona. Ideally, we should be writing tests so that we can verify we don’t regress the codebase in refactoring.

Please understand that we’re hard focused on the priorities raised by the core maintainers- and the most important priority right now is stability. This is why we’re leery about any refactor or rearchitecture happening at this stage. Though it may be the ‘right’ way to solve some of the issues that are on our priority list, less invasive solutions are preferable wherever possible. If you will make an effort to include us in your thinking, we can and will make an effort to be more detailed and specific in our technical feedback. Mixologic’s comment is a good faith effort to do that and I hope it will be taken as such.

Comment #7

jthorson commented 9 September 2015 at 22:25

I have a fundamental objection to spending the rest of my efforts on this project in a standalone branch until after Barcelona. This is a broken process that only results in wider and wider divergence between my work and the "production" branch, exponentially increasing the effort required to finally review and accept the code, as well as increasing the risk of stranded code/investment.

The DA staff works during the week, I work on the weekends. When I sit down to 10 hours of coding over the course of a Saturday/Sunday, waiting until Monday to get your pre-approval of changes for roadblocks I come up against while developing a new feature, before I actually code a solution, is simply not an option. Imposing this gate cripples my ability to contribute to this project ... Given the sense of urgency around completing this project, and given that these changes lay the foundation needed for several key improvements to the project (including those on the two prioritized lists, which I'm only told have been descoped 'after the fact'; despite there being no mention of this descoping on the associated meta issue which hosts those lists), I can not rationalize the resistance to putting these in place *before* Barcelona ... I expect a significant amount of additional refactoring to happen during the Barcelona sprints, and would much prefer that we iterate on top of the improvements contained here instead of plowing ahead on top of the existing 'production' branch and then having to go back and backport all of those changes to align with the new classes introduced in this refactoring.

The iterations should be small, consumable, and merges frequent ... and a look at the commit log on this branch will demonstrate that there is no piece of this branch that can not be easily reviewed in isolation. I was kicked off the -dev branch and asked to work in a issue-specific branch, so I did ... but what it sounds like you would have preferred in this case are 32 individual commits/changes, each in their own branch, all branching off the previous change. Forgive me if I consider that a little ridiculous; as two-thirds of those commits represent <20 line diffstats.

Not everyone in the Drupal community works 9-5 on Drupal contribution. I am a batch worker ... large periods of time, at irregular intervals, when I can find it. This needs to be accounted for when establishing your preference "to have input into new features, refactoring, changes in architectural direction before the time is put into making those changes" ... this simply is not realistic, and I'd challenge that as the primary architect and developer behind the entire test runner codebase (and, I would suggest, as the only person who really understands it well enough to evaluate any proposed changes without a significant time investment), a certain amount of leeway is warranted.

Based on the comments above, this code has been fully tested by Archie on staging, and to the best of my knowledge, passed 100%. No other commit in the last thirty days has been subject to more than a couple of staging runs; and I question why this one is any different.

The best answer I can come up with is 'fear of the unknown'; and frankly, as the individual responsible for the vast majority of this codebase, I'm saying that as long as it works on staging, yes ... Trust me. I ran the PIFR bots for years, alone, without being subject to any code review. I'm not trying to argue that my code is beyond review ... but I am saying that I don't need to be lectured for days about the risk of deploying code without unit tests to the live testbot environment, nor do you need to subject my code to a higher level of analysis and review than your own. Ironically, this code is a huge step forward towards fixing the code so that those tests CAN be written - thus, IMO, the 'extra caution because we can't unit test things' is a flawed argument from the outset.

Today, I'm now being told that this is blocked based on the risk of possibly breaking something in production ... based on externally received comments that stability needs to be our number one priority going into Barcelona. No offense intended towards the source of those comments, but those comments were made without having any context regarding the changes in question, nor the benefit that they provide us moving forward ... instead they are based on an exaggeration perception of the scope and risk associated with the requested changes. Again ... I managed the PIFR bots for years ... those risks are nothing new to me. I've assessed those risks and the associated rewards, and am stating that my recommendation is that committing this and moving forward *before Barcelona* is in the best interests of the project and the community. Not only because it will facilitate more efficient progress during the Barcelona sprints, but also because it significantly improves the learnability, stability, and security of the codebase over what is currently in place.

The repeated 'changes of this magnitude' comments are a misrepresentation which contributes to this exagerated risk ... The 2000 line diffstat is a result of mutiple cut-and-pastes, as a lot of existing code was moved and encapsulated within child objects. But outside of the RunCommand class (which is infinitely more readable now), and variable renames for consistency with the rest of the codebase, there have not been any significant changes to the actual functional code. I'd argue that calling the build processing logic a 're-architecture' is a misrepresentation as well ... as all I've done is broken the processing up into smaller functional blocks and re-ordered some of the calls; but the underlying functional code hasn't really changed.

The concerns expressed in Ryan's comment above are all trivial enough to pursue in follow-up issues. Four of the seven concerns listed are inherited from the existing code, one is a process concern that is external to the code, and the last two are matters of personal preference. I don't see anything here that justifies this having sat for over a week after I specifically asked that it be prioritized in order to help move the rest of my work forward ... nor do I agree with the suggestion that it wait until after Barcelona. This project has consistently made 90% of it's progress in the week before, during, and after each DrupalCon; and I feel that too much is lost by putting off foundational changes such as these until after that period passes, instead of letting us iterate on top of those improvements during the pre/during/post-con sprints.

Differing visions or priorities aside, this is the first time that I have requested such a favor ... and more time has been spent trying to convince me why it shouldn't move forward than it would have taken to review things in the first place. Ryan's post above is the first time that an actual technical objection to any piece has been communicated to me, and frankly, 12 days is entirely to long to wait for what amounts to a few nitpicks that can easily be solved in follow-up issues after the fact.

Comment #8

webchick

she/they

English

Vancouver 🇨🇦

commented 10 September 2015 at 05:40

Since I'm being drawn up into this from my comments as a core maintainer...

I can't say whether the changes represented in that branch present risk or not. That's for you guys to work out. I just know we are severely under the gun over here to try and get to zero critical issues (it's currently 5) by Sunday the 13th, so Wednesday, Sep. 16 is available as a possible RC1 date. That's 65 hours from now, so even 1 hour caused by testbot instability is too much.

This deadline is important, because it is the only remaining release candidate window left prior to Barcelona, and meeting it makes the difference between Barcelona being a celebration of 4.5+ years of hard work by 3000+ people, or it being yet another bug fixing slog fest where dozens of already burnt-to-a-crisp people sit toiling away in a dark room at all hours while the rest of the attendees are out on a beach somewhere. And, pardon my French, but fuck that. :P

That said, it does make a lot of sense to have the code base in a place that will enable the best sprinting in Barcelona that is possible. You would have a window on Sep 17/18 which is a) after the RC1 deadline of Sep. 16 and b) prior to the extended Barcelona sprints, which start Saturday the 19th. The only caveat is key people might be traveling those days.

Comment #9

joshuami

he/him

English

Portland, OR

commented 11 September 2015 at 02:55

I understand this has been a bit challenging for all those involved. Everyone is working towards the same goal of making DrupalCI better.

@basic has committed to merging the refactor feature branch into dev while we continue making minor hotfix / bugfix releases to production (and dev) branches. He and I also took a bit of time this afternoon to update the Git branching for Drupal.org projects documentation. It will be best for everyone if we follow a process that roughly matches how the team interacts with code contributions for inclusion into a production service. We prefer small, incremental code changes—which is the whole idea behind continuous integration.

I'll check with the team and see how we feel about a merge on the 17th. Right now we are stable enough to provide the testing needed for the Con... which is an awesome milestone.

@jthorson, all of us truly and genuinely appreciate all of the time you put into keeping the old testbots alive, for the work you did to hand off those testbots when staff could pick up the maintenance and support, and for the work you've done to help get DrupalCI testbots into a state where the team could push that into a true production environment. That work was huge.

Your role on the DrupalCI should change a little now that we are well past the point that 90% of the work getting done at the Cons. Staff are carrying the pager (metaphorically, as there are all these great apps now) and getting the calls when the service is now. Our budget (thanks sponsors) is funding the environment (this level of CI stuff ain't cheap). It means we take the heat when bots crash or act up :)

Let's talk more at Barcelona face to face rather than relying on the issue queues for this.

Comment #10

basic commented 11 September 2015 at 03:12

Priority:	Critical	» Major
Status:	Needs review	» Reviewed & tested by the community

I've merged this to dev. This should give the foundational branch for you to continue working on a syntax linting feature branch.

commit fa309f84d8e346f12743141eb24c5a2d7ca042b8
Merge: d95e2d9 ef5bbda
Author: Rudy Grigar <basic@drupal.org>
Date:   Thu Sep 10 20:07:01 2015 -0700

    Merge branch '2558929-jeremy_code_cleanup_and_refactor' into dev

Comment #11

Mixologic

He/Him/His

English

Portland, OR

commented 29 September 2015 at 21:53

Issue tags:

+Needs deployment

Comment #12

isntall commented 2 November 2015 at 22:02

Status:	Reviewed & tested by the community	» Fixed
Issue tags:	-Needs deployment

dev has been merged into production, and production is live.

Comment #13

16 November 2015 at 22:04

Status:

Fixed

» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Testrunner code cleanup and refactoring

Comments

Comment #1

Comment #2

Comment #3

Comment #4

Comment #5

Comment #6

Comment #7

Comment #8

Comment #9

Comment #10

Comment #11

Comment #12

Comment #13

Change record drafts

Referenced by

News items

Our community

Documentation

Drupal code base

Governance of community