[policy] Define standard for merging in branches from core sandboxes [#1462472]

Spin-off from #1321540-83: Convert DBTNG to namespaces; separate Drupal bits

Problem

Ugly Drupal core commit log/history after merging in a larger complex branch from a sandbox.

Goal

Actively support and endorse merging of larger/complex changes from core sandboxes.
Retain a clean commit log/history for Drupal core.
Define a standard for merging in branches from core sandboxes.

Details

The commit log/history looks very awful after merging in #1321540: Convert DBTNG to namespaces; separate Drupal bits. Drupal core's commit history makes little sense due to this merge now. It's close to impossible to determine and review the individual commits that belong to this change(set).
The common practice is to prepare a to-be-merged branch in a separate merge branch upfront.
Read a high-level description of this in Rebasing and merging: some git best practices and the often referenced Linus Torvalds' direction on clean history.
That dedicated merge branch is rebased onto the latest core branch. Hence,
1. all change commits are known to be based on the latest code (which may be the case here now, but we factually don't know)
2. the git commit log/history is clean; i.e.:
  - all commits from the merged branch cleanly appear after each other
  - no upstream merge commits from the sandbox
Additionally, pull request branch maintainers are asked to clean up the merge branch during the rebase; e.g.,
- combine multiple related commits into one (avoiding plenty of "Add more/fix use statements" commits)
- squash other needlessly verbose commits
That is, because such commits are completely stupid and pointless with regard to the long-term history. Simply have a quick look at node.module's history now — next to a plethora of senseless upstream merge commits of the sandbox, you also see commit messages like "Add more use statements and class renaming.", which are completely missing context and thus don't make sense in terms of Node module's history.
If there's any need for retaining the original branch for downstream repos/branches, then that branch remains to be unaffected.

We should really follow that practice of prepared merge branches, because, in fact, it made more sense to review the patch file in the issue, which ultimately contradicts the merge of the branch (and thus, it would have made more sense to commit the patch instead). A properly prepared merge branch avoids all of these issues.

Docs on the web

Linux Kernel patch submission guidelines

Comments

Comment #1

xjm

she/her

English

CreditAttribution: xjm commented 1 March 2012 at 13:12

Closed #1462712: Discuss best practices for managing core merges as duplicate. Here's what I said over there:

Rebasing the main sandbox/merge branch is probably not a good idea because it alters the actual history of the branch development.

However, if we are going to do more of these merges in the future, here's the workflow I'd suggest:

Have a main branch for the sandbox, the one we intend to merge into core.
Branch off this main branch when making changes, possibly maintaining a parallel "development" branch.
When a change is "ready," rebase or squash any number of commits from the change back into the main branch, in a single "nice" commit with a helpful diff, more like we have in core currently.

This would also apply to individual developers making a self-contained change. The change might include multiple commits in that developer's branch, but the developer would squash it into a single commit when merging back into the main sandbox branch.

Comment #2

sun

German

Karlsruhe

CreditAttribution: sun commented 1 March 2012 at 13:33

Rebasing the main sandbox/merge branch is probably not a good idea because it alters the actual history of the branch development.

No, the rebase changes the history of the (temporary) branch to be merged upstream only. The original development branch(es) can happily remain as is.

To roughly clarify what to do, once the sandbox branch is ready:

# 'bigchange' is our sandbox development branch.
git checkout bigchange
git checkout -b bigchange-merge
git rebase -i 8.x

After that, bigchange still contains the unmodified original development branch containing the entirety of the history, including upstream merge commits, individual commits, and everything else. In short: Everything in bigchange remains as-is.

However, bigchange-merge is completely rebased against latest HEAD of Drupal core, and can be further cleaned up and modified as needed, before the branch is merged into core.

After bigchange-merge has been merged into core, if needed, bigchange can simply merge in upstream again to take over the new history. Its actual content/history doesn't matter at all, because if there's any other dependent otherbigchange topic branch in the sandbox that will be merged into core, then that will be equally branched and rebased into an otherbigchange-merge, too.

Comment #3

xjm

she/her

English

CreditAttribution: xjm commented 1 March 2012 at 13:48

I'm still not sure I agree about rebasing the main branch against 8.x before the merge. What about commits that are in response to changes in core? If you rebase, suddenly those commits are out of context.

Comment #4

Crell CreditAttribution: Crell commented 1 March 2012 at 16:38

One huge squash merge also offers zero advantages over just having a patch. You get no history, no context, no nothin'. The history looks no different than having a patch file, so we gain nothing. That doesn't mean we shouldn't rebase out some commits, but we should not destroy the entire history.

We should also consider that making a "tidy" merge branch and then merging it is extra steps for the branch/patch submitter and committer need to synchronize on. That could easily add an extra week to the time-to-commit of an issue as that gets sorted out. That harms development velocity.

I will also note that the "log doesn't make sense" issue is only true if you're using a linear log view. If you're using a branch-aware log view (gitk, gitX for Mac, or just git log --graph), you get a more accurate representation of the history. Really, a linear log for a non-trivial git project makes no sense in the first place unless you're just looking at a small part of recent history.

Comment #5

xjm

she/her

English

CreditAttribution: xjm commented 1 March 2012 at 16:55

To clarify, I did not mean "one huge squash merge"--I meant many small squash merges into the main sandbox branch, and then a single full merge from that branch into core HEAD.

Edit: E.g.:

(On branch 8.x-thingy-xjm)
Fix spelling.
Missed one.
Add some more.
Whitespace.
Add foobar widgets.

Get squashed into:

(On branch 8.x-thingy)
Add foobar widgets.

Which gets merged into:

(On branch 8.x)
Merge remote branch '8.x-thingy' into 8.x
...other commits...
...other commits...
Add foobar widgets.
...other commits...
...other commits...

I should note this is similar to the workflow we use at work, so I'm partial. :)

Edit: Also, it sounds like Crell is thinking this is something that would be done at the end of development. I'd disagree with any proposal along those lines. I am saying a small-scale part of the daily workflow that involves bundling fixing one's whitespace and typos in something with the work on that thing before one foists that commit off on other devs. :)

Comment #6

neclimdul

He/Him

CreditAttribution: neclimdul commented 1 March 2012 at 19:05

Rebasing the main sandbox/merge branch is probably not a good idea because it alters the actual history of the branch development.

No, the rebase changes the history of the (temporary) branch to be merged upstream only. The original development branch(es) can happily remain as is.

rebasing destroy's history. end of story. dates are changed, order is rearranged, history is changed. If you're on a private branch and preparing a patch or even a series of patches this destruction is sometimes acceptable. With large changes it is not acceotabke, this is how weird regressions happen.

While a straight chronological view can get a /little/ weird with merges, they are actually more clear because they show when actual development was happening and tools like gitk, giggle and the others crell mentioned can help you inspect when, where and why regressions happened which would be completely impossible with a rebased branch.

As far as having 2 branches for changes, this doesn't actually help it, if anything it makes it worse because there's no way to actually tell that to resolve some commits' history you actually need to look on a different branch to get the full history. As I said before, history is lost.

edit: fix quotes

Comment #7

gdd

he/him

English

Portland, OR

CreditAttribution: gdd commented 1 March 2012 at 19:25

I pretty much completely agree with everything Crell said. I'm not necessarily against rebasing out some pointless commits, but on the other hand I do think it is a little bit of a waste of time on the part of the maintainers managing those sandboxes. Git inherently lends itself to small atomic commits, and I'm going to be less likely to do them if I know I'm going to have to go through and clean them all out again later. This doesn't mean that perhaps we shouldn't all become a little more cognizant of our commit messages and cleanup commits (I haven't really been giving that much thought to be honest.) I completely agree that one squash-merge is a total waste. Note that at the very beginning of the initiative process, I opened this topic up for discussion here:

http://groups.drupal.org/node/148184

A lot of that discussion is still directly relevant.

I plan on having CMI merged into core. The 20-odd contributors to the branch deserve full credit and I want to see them get it.

Comment #8

Damien Tournoud CreditAttribution: Damien Tournoud commented 1 March 2012 at 20:50

What we need to learn here is how to build patchsets.

The history of the trial and error of the developer working on a patch has absolutely zero value. On the other hand, splitting a big change into separate smaller changes that make sense in isolation and can be reviewed separately has a lot of value.

In that particular case, I don't see how a proper patchset would have been done (it's basically moving a lot of code around), but at the minimum the merge history of the development branch should have been removed:

| *   1bc5b98 Merge remote-tracking branch 'dries/8.x' into dbtngtng
| * |   e9453d8 Merge branch '1400748-namespaces' into dbtngtng
| | * \   c616a7b Merge branch '8.x' into 1400748-namespaces
| * | | |   a731ac1 Merge branch '1400748-namespaces' into dbtngtng
| | * | |   89a813c Merge branch '8.x' into 1400748-namespaces
| | * | | |   0e7882b Merge remote-tracking branch 'dries/8.x' into 1400748-namespaces
| * | \ \ \ \   b5e33f6 Merge branch '8.x' into dbtngtng
| * | | | | |   0ee2386 Merge branch '1400748-namespaces' into dbtngtng
| | * | | | |   910535c Merge branch '8.x' into 1400748-namespaces
| * | | | | | |   86c93f3 Merge branch '1400748-namespaces' into dbtngtng
| * | | | | | |   3f0cb72 Merge branch '1400748-namespaces' into dbtngtng
| * | | | | | |   adbd7c9 Merge branch '8.x' into dbtngtng
| * | | | | | |   0879ffd Merge branch '8.x' into dbtngtng
| * | | | | | | |   0017077 Merge remote-tracking branch 'dries/8.x' into dbtngtng
* | | | | | | | | |   3c78ca4 Merge branch '8.x' of git.drupal.org:project/drupal into 8.x

Comment #9

neclimdul

He/Him

CreditAttribution: neclimdul commented 1 March 2012 at 21:10

I have a hard time disagreeing with anything Damien said. We definitely as a community don't understand managing development branches or auto merge commits. Education and probably some helpful process guidelines for getting a clean merge to core are inline. Maybe we're all arguing the same point and disagreeing on the semantics. Thanks Damien.

Comment #9.0

Damien Tournoud CreditAttribution: Damien Tournoud commented 1 March 2012 at 21:13

Issue summary:

View changes

Updated issue summary.

Comment #10

cyberswat CreditAttribution: cyberswat commented 12 March 2012 at 18:12

I use the --no-ff flag to accomplish something similar to this type of grouping. It gives us the ability to quickly evaluate a group of merges for a feature and keep them grouped together so they can be targeted efficiently. Each developer uses a regular merge as work is happening with the release manager using a merge --no-ff. This has proven to be invaluable where I work.

Comment #11

webchick

she/they

English

Vancouver 🇨🇦

CreditAttribution: webchick commented 9 January 2013 at 06:27

Priority:

Major

» Normal

We've merged from sandboxes a few times now, and no one's bumped this issue in almost a year, I'm going to go out on a limb and say it's not "major" anymore.

It would be good to get this wrapped up one way or another, though.

Comment #12

webchick

she/they

English

Vancouver 🇨🇦

CreditAttribution: webchick commented 9 January 2013 at 06:27

Component:

other

» documentation

Also, feels like documentation to me.

Comment #13

neclimdul

He/Him

CreditAttribution: neclimdul commented 11 January 2013 at 15:27

Seems realistic though I'd like to see more interest in things like this. I feel like passing up and ignoring things like this leads to things like this https://twitter.com/Dries/status/288682798636756993 not happening.

disclaimer: I'm not in a position to really complain as I'm in as much of a time crunch for community efforts as anyone else I just wanted to voice my opinion on the importance.

Comment #14

marvil07 CreditAttribution: marvil07 commented 15 March 2013 at 23:21

Backlink to #1583840: Define conventions around drupal core git interaction, which consider merges and also other aspects on git interaction.

Comment #14.0

marvil07 CreditAttribution: marvil07 commented 15 March 2013 at 23:21

Issue summary:

View changes

Updated issue summary.

3 March 2014 at 14:37

Version:

8.6.x-dev

» 8.8.x-dev

Drupal 8.6.x will not receive any further development aside from security fixes. Bug reports should be targeted against the 8.8.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.9.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Comment #23

3 March 2014 at 14:37

Version:

8.8.x-dev

» 8.9.x-dev

Drupal 8.8.7 was released on June 3, 2020 and is the final full bugfix release for the Drupal 8.8.x series. Drupal 8.8.x will not receive any further development aside from security fixes. Sites should prepare to update to Drupal 8.9.0 or Drupal 9.0.0 for ongoing support.

Bug reports should be targeted against the 8.9.x-dev branch from now on, and new development or disruptive changes should be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Comment #24

3 March 2014 at 14:37

Version:

8.9.x-dev

» 9.2.x-dev

Drupal 8 is end-of-life as of November 17, 2021. There will not be further changes made to Drupal 8. Bugfixes are now made to the 9.3.x and higher branches only. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

3 March 2014 at 14:37

Version:

9.5.x-dev

» 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.