Define conventions around drupal core git interaction [#1583840]

Disclaimer: I really appreciate all the work drupal core developers do. This is about suggesting ideas for the benefit of all.

Problem/Motivation

Since we moved to git, we have a lot more flexibility about how to accept changes for the project. Originally, on CVS, each maintainer was responsible for commit messages, merge strategies and tagging.
Now that we are using git, those aspects have changed, and now the responsibility is sometimes shared by authors.

Workflow

With Git, we can decide on how to include the changes into branches(instead of just use commit on CVS).
This is related to the workflow(s) we are using. Currently, we seem to be using two workflows:

one for normal issues: Centralized, applying the patches manually(avoiding git-am when possible, see the attribution section too).
and one for initiatives: Dictator and Lieutenants, merging from public initiative sandboxes.

It's OK to use more than one workflow, but we should use the same conventions for commit messages and merge strategy levels.

Merge strategy

About the centralized workflow: it does not make sense to have merge commits(e.g. f356dc2), because there is no really original information in commits.
These merges are probably auto-generated when pull'ing before push, and in that case it makes sense to just rebase the changes(e.g. do git fetch, git rebase origin/8.x). In general it is a bad idea to make merges without reason.

About the lieutenants workflow:

It does make sense to have merge commits from initiative sandboxes to upstream.
Initiative sandboxes should not merge from upstream unless needed(code added upstream needed for the initiative), see references for details.
Initiative owners should ensure they receive a clean history, see references for details.

References:

Linus opinion on clean history, and a little summary of .
Integrating contributed work section on the Pro Git book.

Commit messages

Historically, the drupal project has enforced coding standards, and I remember a lot of issues being marked to needs work because of empty spaces, or typos on documentation.
Also, a lot of the history of drupal core code embraces the commit messages handbook page.

So, it seems like as a community, we support the idea of paying attention to detail.

But, recently, mainly from initiative merges, we have included a lot of commits that do not conform with the commit messages handbook page.
It's not only about the attribution and issue linking, but instead about the contents of the commits, they include trivial changes such as 4a8adb2(those should be kept private when possible). Think i.e. is that commit really one that should be at the side of 0275068 where attribution is shared?

I have also noticed that some commit messages start with "- #nid" instead of "Issue #nid", I guess that's an old version of dreditor on a pusher maintainer?

So, we need to decide if we are going to use commit messages handbook page for all commits, or we just drop it in favor of any other convention(s). Please notice this includes commits merged from initiative sandboxes.

Attribution

On drupal core issue queue, we usually have many people participating on one issue, so we usually want to make attribution for many people on one commit.

On CVS, we were forced to include attribution on the commit message, because there was no VCS supported way to do it.
Now with Git, we can use real attribution, but sadly only for one author and one commiter.

Some projects like the linux kernel or git itself, uses special strings at the end of the commit message for that propose, maybe we should consider using the same.
sdboyer also propossed another way using git notes.

Any solution we choose for this, we need to be consistent for every commit added upstream(also all commits coming form initiative merges).

See #2323715: [policy, no patch] Determine format for commit credit for individuals/organizations/customers for a possible implementation.

Proposed resolution

(description of the proposed solution, the rationale behind it, and workarounds for people who cannot use the patch)

Discuss about the three topics to agree on conventions.

Remaining tasks

Receive feedback about the three points
Agree on conventions
Public conventions as a handbook page on documentation.

Comments

Comment #1

dww

we/he/they

commented 16 May 2012 at 10:36

I think this is a really important issue, although I can't bump it to major in good consciousness. Given how completely obsessed we are with the fine details in every aspect of the code (including comments), it shocks me how sloppy and apparently unconcerned we are about the history of the changes. To me, this is immensely valuable information. When debugging a problem, git blame and git bisect are two of your best friends, but unless the history is clean and coherent, these tools are much less useful, and their output doesn't necessarily bring you to the right issue where the decisions that led to the buggy code were actually made. After spending hours, sometimes 100s of them, on a given change, we leave it up to the over-worked core maintainers to construct the right commit message, which they often do totally improv and rushed, they instantly commit / push with 0 input/review from the community, and then it's locked that way forever. (Note: I know webchick often asks in IRC for help on the commit message, but it's inconsistent and she doesn't always have time for that nor get the help she deserves).

As marvil07 points out, this occasionally sloppy history is compounded many times via merging from initiative sandboxes, where there's even less attention to the detail of the history itself (and divergent standards/practices for maintaining the history).

As I wrote almost exactly 1 year ago on g.d.o I think it *must* be the responsibility of sandbox maintainers and/or initiative leads to ensure clean proper history that can be safely merged into core. We can't possibly expect the core maintainers themselves to clean up the messes in the sandboxes.

In terms of the commit message itself, a very simple improvement might be to include a section in the recommended issue summary template where everyone participating in an issue can be constructing a proper, well-formatted, clear, concise commit message to accompany the issue. That would be useful both for our patch workflow and issues that involve merging commits from a sandbox. We should let the patch authors and issue reviewers/participates help draft and refine the commit messages -- ensures some peer review, should hopefully save the core maintainers time, and help improve the overall quality and consistency of the commits. Of course, the sandboxes themselves also need good commit messages (and the right granularity of commits) for those merges not to introduce crap into the upstream history. But even still, considering the commit message part of what has to get written for a real issue (in addition to the code, comments, potentially change notice, etc) seems like a win all around.

I know some people advocate the "commit early, commit often, even trivial small things, and don't waste your time with issues for every change" approach. Okay, fine. Do that on your local clone. You can even push that dreck to your sandbox if you must. But don't ask a core maintainer to merge it into core. Make a new branch where you consolidate "fixed stuff", "fixed more stuff", "got it working", "sort of", "whoops", and "try this!" into a single "Issue #1234567: Implemented the frobulator." commit that actually tells someone looking at git blame output what happened, why, and where to read the details. If you say "but the detailed history is really important" then don't delete your sandbox once it's merged and reference the "raw" branch where all your flailing was happening in your clean commit/merge message. But don't force me to wade through dozens of your incoherent changes unless I actually need to crawl that deeply into your brain to figure out WTF you were thinking. ;)

Attribution is a harder problem since Git natively only supports a single author and committer. I don't think the Linux approach of "signed-off-by" conventions at the bottom of the commits is the way to go, nor the "Issue #X by Y, Z" we do now. In both cases, you can't fix it after the fact. Our convention is particularly problematic since it eats up valuable space in what should be the very short 1-line (e.g. ~50 chars) summary of what changed. I think sdboyer's proposal for git notes is the only sane way to actually do attribution right given our situation. However, I honestly think that deserves to be spun out into a separate issue (which can be linked from here) since it's a hefty conversation to have on its own.

Bottom line: the history of when/why/how we're changing our code is at least as valuable as the latest version of the code itself, and yet as a community, we rarely value this history and give it much attention at all. I strongly support changing this and fostering a culture where clean history is nearly as important as clean code. The good news is it's a *lot* easier to keep the history clean than the code itself, so long as people understand the tools we're using. ;) And if we can make any changes to our tools to make it easier, I'm ready to help.

Thanks,
-Derek

Comment #2

catch

he/him

English

commented 16 May 2012 at 11:07

In terms of branch merges, this is a duplicate of #1462472: [policy] Define standard for merging in branches from core sandboxes.

On the centralized workflow, it does not make sense to have merge commits(e.g. f356dc2), because there is not really original information on the commits.
This merges are probably auto-generated when pull'ing before push, and in that case it make sense to just rebase the changes(e.g. do git fetch, git rebase origin/8.x). In general it is a bad idea to make merges without reasons.

I think this one is "remind Dries to do git pull before committing patches".

Some projects like the linux kernel or git itself, uses special strings at the end of the commit message for that propose, maybe we should consider using the same.
sdboyer also propossed another way using git notes.

I'd quite like to move to something like this, and completely agree we should use one method of attribution and stick to it (at least for things like contributor stats), and that author of commit messages won't work for us.

Comment #3

jhodgdon

she/her

English

commented 16 May 2012 at 15:23

Component:	documentation	» other
Issue tags:		+Coding standards

Although this is not really "coding" standards, it is a process standard... and it's not documentation. So tagging to trigger that people who are interested in standards and processes can find this issue, and changing component.

Comment #4

Crell commented 17 May 2012 at 04:42

A couple of thoughts in no particular order:

Core committers should get in the habit of doing a pull --rebase. That does a pull but a rebase of the local commits instead of a merge, resulting in the same end result as if you had pulled first before committing anything locally. This is a "safe" use of rebase.

I largely agree with Linus's stance, but not entirely. There are two issues.

One, it assumes that I'm working in a vacuum. On almost anything interesting I am pushing my branch regularly to a sandbox of some sort, for simple backup purposes if nothing else. But usually, I want to collaborate with others. Collaborating with others means rebasing becomes a very bad idea. In fact, rebasing a branch and then pushing the rebased branch is something that d.o actively prevents you from doing. There are ways around it, but they're highly annoying. We may need to give up on that "safety check".

Two is the definition of "when necessary". The kernel branch for WSCCI, for instance, has a LOT of merges from upstream; more than I want, really. Generally they happened for one of two reasons: Some code changed upstream that I needed (affecting something in the kernel's line of site), or because I want to roll a patch for testbot. The latter is an artifact of the problem that there is really only one viable way to test a change to Drupal, and that is to post a head-compatible patch to an issue and let testbot handle it. That requires merging from upstream first to generate a viable patch at all. While that could be done in a temporary side branch, that is a bigger hassle and also can lead to misleading results; very often there are merges that need to be manually resolved when merging from upstream, which in turn change the very code I'm testing. That manual merge really should be done in the mainline kernel branch, not in a throw-away branch, because we'll just have to redo it anyway.

So for initiatve and other sandbox work, making testbot work for arbitrary branches in sandboxes is a prerequisite for not abusing Git merges. Given how extensive the changes are that we're talking about in some cases, I simply cannot maintain a tidy history and get validation that we haven't just broken the entire system.

For downstream branches, I have been strongly encouraging people to use feature/issue branches in the WSCCI sandbox that I can merge. That way Git tracks their credit for me. I am, frankly, not all that concerned about commit counts so "does this change deserve 2 commit objects or 3" is not something I lose sleep over. Lately I have been rebasing feature branches, then ff-merging them into my mainline, then deleting the feature branch. I also sometimes remove or squash intermediary commits if I subjectively think they make sense. (Some developers would commit something, then commit a revert, then do it a different way. I'd generally rip out both the original commit and the revert.) That keeps the history reasonably linear aside from the upstream syncing that I cannot avoid, as above.

If we can figure out a way to handle the upstream syncing better than we do now (which I agree entirely results in a messier history than we want, but I cannot avoid right now), we'll be in a much better place.

Comment #5

marvil07 commented 4 October 2012 at 17:48

Issue tags:

+Contributor Experience (CX)

Adding a new tag because of the attribution part of this issue.

Comment #6

marvil07 commented 11 October 2012 at 18:46

Issue tags:

+consistency

Another tag I saw on a related issue: #1791872: [Policy, no patch] Add special tag to identify issues ready for a high level only review

Comment #6.0

marvil07 commented 11 October 2012 at 18:46

Issue summary:

View changes

Fixed language-proper typos up to half of the article.

Comment #7

marvil07 commented 30 October 2014 at 16:57

Issue summary:	View changes
Related issues:		+#2323715: [policy, no patch] Determine format for commit credit for individuals/organizations/customers

Link to possible implementation ticket for attribution.

Comment #8

30 October 2014 at 16:57

Version:

8.0.x-dev

» 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #9

30 October 2014 at 16:57

Version:

8.1.x-dev

» 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #10

30 October 2014 at 16:57

Version:

8.2.x-dev

» 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #11

30 October 2014 at 16:57

Version:

8.3.x-dev

» 8.4.x-dev

Drupal 8.3.6 was released on August 2, 2017 and is the final full bugfix release for the Drupal 8.3.x series. Drupal 8.3.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.4.0 on October 4, 2017. (Drupal 8.4.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.4.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #12

30 October 2014 at 16:57

Version:

8.4.x-dev

» 8.5.x-dev

Drupal 8.4.4 was released on January 3, 2018 and is the final full bugfix release for the Drupal 8.4.x series. Drupal 8.4.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.5.0 on March 7, 2018. (Drupal 8.5.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.5.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #13

30 October 2014 at 16:57

Version:

8.5.x-dev

» 8.6.x-dev

Drupal 8.5.6 was released on August 1, 2018 and is the final bugfix release for the Drupal 8.5.x series. Drupal 8.5.x will not receive any further development aside from security fixes. Sites should prepare to update to 8.6.0 on September 5, 2018. (Drupal 8.6.0-rc1 is available for testing.)

Bug reports should be targeted against the 8.6.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #14

30 October 2014 at 16:57

Version:

8.6.x-dev

» 8.8.x-dev

Drupal 8.6.x will not receive any further development aside from security fixes. Bug reports should be targeted against the 8.8.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.9.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Comment #15

30 October 2014 at 16:57

Version:

8.8.x-dev

» 8.9.x-dev

Drupal 8.8.7 was released on June 3, 2020 and is the final full bugfix release for the Drupal 8.8.x series. Drupal 8.8.x will not receive any further development aside from security fixes. Sites should prepare to update to Drupal 8.9.0 or Drupal 9.0.0 for ongoing support.

Bug reports should be targeted against the 8.9.x-dev branch from now on, and new development or disruptive changes should be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Comment #16

30 October 2014 at 16:57

Version:

8.9.x-dev

» 9.2.x-dev

Drupal 8 is end-of-life as of November 17, 2021. There will not be further changes made to Drupal 8. Bugfixes are now made to the 9.3.x and higher branches only. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #17

30 October 2014 at 16:57

Version:

9.2.x-dev

» 9.3.x-dev

Comment #18

30 October 2014 at 16:57

Version:

9.3.x-dev

» 9.4.x-dev

Drupal 9.3.15 was released on June 1st, 2022 and is the final full bugfix release for the Drupal 9.3.x series. Drupal 9.3.x will not receive any further development aside from security fixes. Drupal 9 bug reports should be targeted for the 9.4.x-dev branch from now on, and new development or disruptive changes should be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #19

30 October 2014 at 16:57

Version:

9.4.x-dev

» 9.5.x-dev

Drupal 9.4.9 was released on December 7, 2022 and is the final full bugfix release for the Drupal 9.4.x series. Drupal 9.4.x will not receive any further development aside from security fixes. Drupal 9 bug reports should be targeted for the 9.5.x-dev branch from now on, and new development or disruptive changes should be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #20

quietone commented 20 March 2023 at 23:44

I came across this issue last night. I think a lot has happened in the 8 years since this has been open, there are either other issues for individual topics or a policy statement. Here are the issues/policies I have found.

Workflow
https://www.drupal.org/about/core/policies/core-change-policies/backport...

Merge strategy
#1462472: [policy] Define standard for merging in branches from core sandboxes

Commit messages
#2323715: [policy, no patch] Determine format for commit credit for individuals/organizations/customers

Attribution
#2230579: [meta] Allow crediting reviewers (and other non-coders) as first-class contributors
https://www.drupal.org/about/core/policies/maintainers/how-is-credit-gra...
https://www.drupal.org/core/maintainers/issue-credit

Tags
Did not find anything about using annotated tags.

Anyone know if there is a discussion somewhere about the use of annotated tags?

Comment #21

quietone commented 21 March 2023 at 02:01

Status:

Active

» Closed (outdated)

I talked to xjm about this and we agreed that this can be closed and to make a follow up to discuss the use of annotated tags.

The followup is #3349217: [policy, no patch] Should core use gpg-signed git annotated tags

@marvil07, thanks for making this issue and helping to improve Drupal. I am closing as outdated because much of this has been resolved in the meantime in other issues,

Define conventions around drupal core git interaction

Problem/Motivation

Workflow

Merge strategy

Commit messages

Attribution

Tags

Proposed resolution

Remaining tasks

Comments

Comment #1

Comment #2

Comment #3

Comment #4

Comment #5

Comment #6

Comment #6.0

Comment #7

Comment #8

Comment #9

Comment #10

Comment #11

Comment #12

Comment #13

Comment #14

Comment #15

Comment #16

Comment #17

Comment #18

Comment #19

Comment #20

Comment #21

Related issues

Referenced by

News items

Our community

Documentation

Drupal code base

Governance of community