Problem/Motivation

The commit message format is not easy to scan to find the reason the change was made.

From this comment drumm states the GitLab is unlikely to support a list of user names in the commit message. GitLab provides Commit message templates which use a defined set of variables.

There have been several issues about the format of the commit message, some related to credit and some not. However, the credit system is implemented on Drupal.org and doesn't not rely on parsing the commit message. Thus, the message can change.

The solution can be multiple lines, agreed to in #2811033: Discuss if git commit messages should be multiple lines.

For different options lets use this commit as the test case

Issue #3420997 by sorlov, quietone, DanielVeza, smustgrave, alexpott, mstrelan: Convert MediaSource plugin discovery to attributes

Steps to reproduce

Visit https://git.drupalcode.org/project/drupal/-/commits/11.x?ref_type=heads

Proposed resolution

Option D

See #44

Implementation being done in #3486271: Change format for git commit message for the Drupal Core project..

Remaining tasks

Decide on an option from the following:

Option A, uses conventional commits

task(media): #3420997: Convert MediaSource plugin discovery to attributes

Authored-by: sorlov <xxx@no-reply...>
Authored-by: quietone <xxx@no-reply...>
Authored-by: DanielVeza <xxx@no-reply...>
Reviewed-by: smustgrave <xxx@no-reply...>
Reviewed-by: alexpott <xxx@no-reply...>
Reviewed-by: mstrelan <xxx@no-reply...>

See comment #3.


Option B, uses conventional commits

task(media): Convert MediaSource plugin discovery to attributes
Closes: #3420997

Authored-by: sorlov <xxx@no-reply...>
Authored-by: quietone <xxx@no-reply...>
Authored-by: DanielVeza <xxx@no-reply...>
Reviewed-by: smustgrave <xxx@no-reply...>
Reviewed-by: alexpott <xxx@no-reply...>
Reviewed-by: mstrelan <xxx@no-reply...>

See comment #6.


Option C, uses conventional commits

[#999999] task(media): Convert MediaSource plugin discovery to attributes

Authored-by: sorlov <xxx@no-reply...>
Authored-by: quietone <xxx@no-reply...>
Authored-by: DanielVeza <xxx@no-reply...>
Reviewed-by: smustgrave <xxx@no-reply...>
Reviewed-by: alexpott <xxx@no-reply...>
Reviewed-by: mstrelan <xxx@no-reply...>

See comment #8.


Option D, uses conventional commits

Same as C, but removes optional scope.

[#999999] task: Convert MediaSource plugin discovery to attributes

Authored-by: sorlov <xxx@no-reply...>
…

See comments #8, #23.


Option E

a simplified-extended variant of our current custom

[#999999] Convert MediaSource plugin discovery to attributes

Authored-by: sorlov <xxx@no-reply...>
Authored-by: quietone <xxx@no-reply...>
Authored-by: DanielVeza <xxx@no-reply...>
Reviewed-by: smustgrave <xxx@no-reply...>
Reviewed-by: alexpott <xxx@no-reply...>
Reviewed-by: mstrelan <xxx@no-reply...>
Component: media

See comment #32.

User interface changes

API changes

Data model changes

Release notes snippet

Comments

quietone created an issue. See original summary.

quietone’s picture

Issue summary: View changes
dww’s picture

To me, the two most essential things we need are:

  1. A valid 1-line summary that shows you what the commit changed, without an incomprehensible wall of usernames.
  2. That summary must contain a way to find the issue where the change was proposed, discussed, implemented.

https://www.conventionalcommits.org/en/v1.0.0/ is gaining popularity, and there's a lot to recommend it. It basically handles point #1, but doesn't solve for #2. Here's the TL;DR of the spec:

<type>[optional scope]: <description>

[optional body]

[optional footer(s)]

In a Slack thread on this topic a while ago, @larowlan and I decided that we could still put the issue ID into the "description" area without breaking anything. For example, here's a recent core commit:

Issue #3420997 by sorlov, quietone, DanielVeza, smustgrave, alexpott, mstrelan: Convert MediaSource plugin discovery to attributes

In the new world, this would become:

task(media): #3420997: Convert MediaSource plugin discovery to attributes

Authored-by: sorlov <xxx@no-reply...>
Authored-by: quietone <xxx@no-reply...>
Authored-by: DanielVeza <xxx@no-reply...>
Reviewed-by: smustgrave <xxx@no-reply...>
Reviewed-by: alexpott <xxx@no-reply...>
Reviewed-by: mstrelan <xxx@no-reply...>

Yes, I fully understand that we don't need the credits embedded in the commit history anymore (and haven't for a long time). But part of the joy of Git is that you can work offline with it. I don't want to need a live internet connection to d.o to see who was involved in a given change. I still think the history should speak for itself, as much as possible.

I'll also note that the type aspect of conventional commits is exactly what I've been clamoring for in other issues on this topic, to stop using the meaningless "Issue" bytes, and replace that with the category of the issue (bug, task, feature, etc). In conventional, the type is supposed to be "fix" for bug and "feat" for feature, but you can use whatever other values you want, e.g. "task". I'm fine sticking to the standard for "fix" and "feat", although personally, I think "bug" and "feature" are more self-documenting. 🤓

The optional scope would map to an issue's Component. That's a nice touch, but I'm not as set on including that as the other aspects.

So that's my modified conventionalcommits proposal. Not putting it in the summary yet since it's just me, but putting it here for consideration.

Thanks!
-Derek

nod_’s picture

I was pleasantly surprised by Typo3 commit messages that link to the review page (their setup is different than ours so grain of salt and everything). They're nice and informative, for example: https://forge.typo3.org/projects/typo3cms-core/repository/1749/revisions...

quietone’s picture

Issue summary: View changes

Moved the suggestion in #3 to an option in the issue summary.

Where does 'media' come from in "task(media):" ?

cmlara’s picture

For me a good commit message should tell you the main points you need to know, and should be self sufficient. The assumption for commit messages is that D.O. looses all of its data tomorrow and we have to rebuild from just the Git Logs.

The issue number I would suggest would be better placed just above the signoffs and could use the classic "Closes: #isssue or !mr" syntax. It is useful if you need to get into the deep nitty gritty of an issue discussion, however you should only need to do that after fully reading the commit message that may answer your question.

I believe we absolutely should drop usernames from the first line.

Yes, I fully understand that we don't need the credits embedded in the commit history anymore (and haven't for a long time). B

This does help maintain the copyright audit logs which are necessary for licensing. Our current use of the MR's from a shared issue fork means only the person to open the MR is credited as the author which is a legal problem long term.

dww’s picture

Re #5: Thanks for putting #3 into the summary.

Where does 'media' come from in "task(media):" ?

The spec:

<type>[optional scope]: <description>

My interpretation from #3:

The optional scope would map to an issue's Component. That's a nice touch, but I'm not as set on including that as the other aspects.

Re: #6: Maybe it's only because I've been doing this way in Drupal and every other kind of thing I code for decades, but having the "issue" ID prominent in the summaries is essential. There are lots of places in Git (CLI or otherwise) where *all* you see is the 1-line summary. I'd hate to not know the unique ID for the issue when scanning those.

bbrala’s picture

At our company we also use conventional commits in a form that DWW suggests. Only adjustment we use is brackets around the issue number. We do issue number at the front because of readability. The issue number size is consistent most of the times, so it makes for a cleaner git log if that is at the front.

[jira-ticket] <type>[optional scope]: <description>
[optional body]
[optional footer(s)]

For example:

[SWIS-1234] feat: add blog overview
[SWIS-1234] refactor: show a message when there are no results

Some categories don't have any blog posts at the moment.

So i really really like the suggestion to switch to that. There is a few things this could also help with.

For realease management you could just check all commits with feat and find a list of all added features that release. Same for different issue types. There are even automated tools that could generate that list for you.

Other than that, the suggestion by DWW also keeps all the info we have right now, which is great to have the complete history in git.

Using the scoping could be usefull, this could be subsystem. Just trying to think what would happen if it's a global think, not directly tied to a subsystem.

I have not looked at how automatable it is, but i think this could actually be built from most information that an issue will have.

bbrala’s picture

Issue summary: View changes
StatusFileSize
new62.81 KB

Real log exmaple for reference:

cmlara’s picture

Fair point on the one line review.

Interesting how different development lines happen where some
have always done it first line and some never have.

Last sample by @bbrala does look fairly clean IMHO and helps visually focus on the parts one is most interested in.

kim.pepper’s picture

Another example of a commit message that has no room for a meaningful message because it's just a list of contributors:

commit dae385398eb9860650b50f2aac087c33ee35083b (HEAD -> 11.x, origin/11.x)
Author: Lauri Eskola <lauri.eskola@acquia.com>
Date:   Sat Apr 27 00:43:13 2024 +0300

    Issue #3438895 by finnsky, larowlan, plopesc, m4olivei, ckrina, mherchel, catch, alexpott, nod_, deviantintegral, jwitkowski79, nod_, jponch, jwitkowski79, rkoller, Ana Barcelona, YurkinPark, finnsky, javi-er, alvarito75, ctrlADel, AaronMcHale, Emma Horrell, akshayadhav, claireristow, baluv3, bronzehedwick, NikMis, deviantintegral, hot_sauce, kostyashupenko, gnuschichten, keyboardcowboy, gdd, silviu, Prashant.c, ehsann_95, pjudge, rkoller, Shreya_th, starshaped, bal krishna, meeni_dhobale, mherchel, jo→
quietone’s picture

Issue summary: View changes

I have interpreted #6 and #8 and added them as options to the issue summary.

bbrala’s picture

For option #8 we need a # in front to supprt issue linking in gitlab. See: https://docs.gitlab.com/ee/user/project/issues/crosslinking_issues.html

bbrala’s picture

Issue summary: View changes
bbrala’s picture

Issue summary: View changes

Ok, that breaks the issue summary because of d.o. replacement... o_O

Not sure how to communicated that then.

bbrala’s picture

Issue summary: View changes

Found a way (non existing node)

bradjones1’s picture

I am in favor of a first line that is descriptive of the thing being done, and an issue number. I _think_ it would need to be in the form of "Closes #XXX" for GitLab to auto-close the issue, however we might decide we don't need that magic.

The authored-by annotations are great for linking users to commits and could be used for commit credit. Even if it's not necessary for commit credit (I know that's being wired up somehow) it's still helpful in so far as it is an audit log of who was involved, and even can flow through to other things like github where we have a mirror, and you would get "credit" in their glowing green bar thingie too.

quietone’s picture

Issue summary: View changes
Status: Active » Needs review

There are few people discussing this but of those that have I think there is a preference for Option B. I have updated the issue summary with that as the proposed resolution.

kim.pepper’s picture

Option B +1

feyp’s picture

Looking at commits in GitLab, it would be great if I didn't have to expand the commit message to get to the most important information.

I like that authors and reviewers are moved into the description and that the information is presented in a more structured manner. This information is valuable to have, because it gives credit where credit is due, but it is not the information I'm immediately interested in when looking at commits.

I'd like to have the following information on the first line:

  1. The issue number
  2. A meaningful commit summary

The issue number should be in the first part of the first line. That way, I can click right through to the issue without having to expand the commit message.

Ideally, the commit summary should start with an imperative verb in upper case (e.g. Add xyz, Remove xyz, Convert xyz).

Honestly, I don't care at all about the conventional commit stuff, I don't need that. It just takes away valuable space on the first line in the part of the commit message that is immediately visible without expanding the message and is hard to correct later, if there was a mistake and it is wrong for some reason (e.g. a change not correctly identified as a breaking change on commit). It means we have to be even more careful checking issue metadata before marking an issue as RTBC, right now we often don't even manage to adjust the issue title into a meaningful commit summary. And I really don't like the "all lower-case" commit summary line in conventional commits, although that doesn't seem to be part of the standard, it is just in all the examples?

W/r/t #17, the issue closing pattern in GitLab is just a regular expression that can be configured. The default issue closing pattern matches a whole bunch of key words already, but none of the proposals will auto-close issues in GitLab by default. B) is close to what would be needed, but won't match because of the colon. But I'm also not sure that we really want auto-closing for issues.

Looking at the proposals, A) and C) match some of my requirements. B) does not.

Edit: Overall, I think all of the proposals are already a huge improvement on what we use currently.

acbramley’s picture

I very much agree with #20, it would be a big loss IMO to lose the issue number from the beginning of the commit.

IMHO "task(media):" is just cruft that I don't care about. Much more important to have the issue number in that place as 99% of the time when I'm looking at a commit message it's to find the issue that it belonged to.

Other than that, I'm a fan of Option B but please reconsider moving the issue number

poker10’s picture

Currently the issue number is the most important part of the commit message in our company, because it connects with the specific issue, where all details are stored. Issue titles can be sometimes pretty general (or not fully indicative), but the issue numbers is the one item here, which can provide all relevant information.

So I agree with previous comments to keep the issue number in the beginning of the commit, thus we prefer option A (or B with the issue number in the beginning).

dww’s picture

Issue summary: View changes

Huge -1 to B. As I and others have said, the only things I REALLY care about in the first line are the issue ID and a short, meaningful summary of the change.

I also don’t really care about conventional commits as such. However, I think (task|feat|bug) is vastly more useful in the summary than “Issue”.

If the “scope” part of conventional commits is considered noise, let’s drop it. That was always intended to be optional. Added option D to the summary, which is basically #8.

I’m very strongly in favor of D over B.

Thanks,
-Derek

joachim’s picture

I agree with all the comments about the issue number being the most important. That's the thing I want from a git log, if I'm doing `git log -p` or `git blame` then `git show`.

I'm not sure about the usefulness of the scope part in option C. Would it be the module name? What about the core components? How do we distinguish between Core-Core and Core-Components, especially when they have the same name (e.g. there's Core/Plugin and Component/Plugin).

Generally, if I want to see what parts of Drupal a commit touched, I do `git show --name-only`.

+1 to D.

fathershawn’s picture

Thank you @dww for Option D. The scope is definitely noise in contrib as it would be the same for every commit in the project. Even in core, anything not in a module might be scope (core)?

+1 for Option D.

bramdriesen’s picture

Big + for conventional commits! We're using the one from Angular in our projects at work. That paired with semantic release is really super easy to work with!

I think option D with an optional scope would work best. So you could still do things like "tagging" submodules.

[#999999] task(submodulename): Changed to dependency injection
dww’s picture

Re #26: that’s exactly option C.

But folks are worried about “optional scope”. I tend to agree it’s mostly going to be noise, often duplicate with the summary itself (Eg the summary should tell you basically what part(s) of core are changing), and if you really need to know, we’ll have it in the diff itself and the linked issue.

That said, I’d be happy with D, C or A. Anything but B. 😂

feyp’s picture

That said, I’d be happy with D, C or A. Anything but B. 😂

Same here, preference now for D, so thanks for adding that :). My comment was triggered by comment #18 really, which stated that the consensus seemed to be option B. Until then, I was just a quiet follower ;).

dww’s picture

P.s. I’m -1 to trying to have GitLab auto close issues based on commits. I don’t think that’s a useful goal for us, especially in core. We often need at least 2 different MRs for any given change depending on backports. There are change records to review and publish. While auto-close might make sense for smaller contribs (who can already format their commits however they/we want), this issue is for core, and I don’t think we ever can rely on auto close just because any single commit was pushed.

dww’s picture

Issue summary: View changes
  • Removed the duplicate B from proposed resolution and set that to “TBD”.
  • Added a little clarification to D for context.
bramdriesen’s picture

Re #27: Ah ok, that was not clear to me in the format of the IS as this is not following the way conventional commits is doing things.

The context you added to the comments where this is in that format helps! Thanks ;-).

baluertl’s picture

StatusFileSize
new23.48 KB

I welcome this discussion as the Drupal Core (and many contrib) project's commit message customs definitely can be improved.

Moving as one

I see that this issue has been created for Drupal Core. However, in my view introducing a new pattern only for Drupal Core but leaving out the contrib space would inevitably alienate the two from each other. Many community members who contribute to both kinds of projects (therefore composing commit messages for their MRs) would have to keep in mind two different customs. Moreover, further development would be needed on Drupal․org to display the suggested commit message at the bottom on each issue page.

Whichever proposal wins I'd suggest tying its introduction to the date of a Core version release (optimally a major one but if D12 is too far away on the timeline, then the next minor). Regarding the introduction to contrib space, we can choose a calendar day to appoint as the recommendation being in effect (eg. from Jan 1, 2025).

Practicality first, fanciness after

I can imagine that adopting the standards of Conventional Commits might ease newcomer contributors to join the Drupal community from the global developer society by decreasing the number of “drupalisms”. However, I also agree with previous comments prioritising the practical value of each message part to our needs rather than just bringing in some fancy brand name to put on the list showing off how standardised the Drupal project is.

Bear with the people

Let's face the fact: most Drupal contributors have no practice yet in writing commit messages. In our decade-old patch uploading workflow, a community member trying to fix the code was not required to compose a commit message but to share the suggested code change only in its raw form. Only a smaller subset of our community, the maintainers were dealing with commit messages while the significance of keeping commit history clean remained hidden for the wider audience.

During this transitional period shifting over to a more modern MR-based workflow, I see in my everyday work even this change causes friction for many. Therefore we need to keep our commit message pattern dead simple as possible if we expect past and current contributors to adopt something totally new for themselves.

Decrease focus on scope

Partly based on my previous point, regarding the optional scope (the (media) part in the examples): if something causes questions (comment #5) within the circle of experienced developers then projecting it to the scale of the wider masses will definitely cause general confusion.

If any maintainer sees added value in including this information in the commits of their projects then offer them the opportunity to do so at the end of the commit description. This way commits will still remain searchable but do not spoil the visual consistency when multiple commits get listed some containing and some not.

Do not “hardcode”* changeable information

Regarding the (task|feat|bug) type: if I understand correctly this information would be in connection with the value of the Category field on the issue edit page.
Screenshot
If my assumption is correct, then I find it unfortunate to copy over data from an easy-to-change source system into a less easily changeable system. This can (and I bet will) cause discrepancies leading to confusion again. I would love to see some numerical statistics or hear maintainers' voices about how important role this issue category field plays in their everyday job. Such feedback would help us in this discussion a lot to make a more prepared decision about the necessity of sacrificing these 4–5 characters in a valuable spot of the commit message.
*Quotation marks because existing commits can be amended.

baluertl’s picture

Issue summary: View changes

Updating IS with option E.

marvil07’s picture

Status: Needs review » Needs work
Related issues: +#1583840: Define conventions around drupal core git interaction

Current commit message format

There use to be documentation around commit message format, looking a bit around that use to be node nid 52287, but now that redirects to node nid 3156588, which is about issue credits.
AFAIK that format is the following.

Issue #[issue number] by [comma-separated usernames]: [Short summary of the change].

The current official drupal core hint about what the commit message format should be, following Granting credit to issue contributors is:

Commit the issue using the default commit message in the "Credit & committing" widget

That is currently part of drupal.org custom code that produces a template for maintainer, including core maintainers.
So a message like the example on the description is produced.

Issue #3420997 by sorlov, quietone, DanielVeza, smustgrave, alexpott, mstrelan: Convert MediaSource plugin discovery to attributes

The problem

The commit message format is not easy to scan to find the reason the change was made.

Yes and no.

Commit messages have historically been succinct.
The pattern in Drupal core development commit messages has been for almost all of its existence, especially since Drupal 5.x development around 2006, relying on drupal.org issues as the main way to have information around a given change.
The issue usually has most details, especially around decisions, approaches considered, and more.

So, for some of us it is just a reflex to go to the related issue for any commit we want to know more about.
Granted, that is a something that needs to be learned if it is the first time you look at drupal core git history.
That may not be the best developer experience, but the path is quite clear: go to the issue referred on the commit message.
Also, the issue is the first of second thing on the first line of the commit, so it is part of the git subject, shown on most of the tooling that reads git.

The other thing here, and likely the reason not directly highlighted on the issue description is that our current format implies listing all contributors of an issue, and that list can be long.
That is a practice that comes from times where we did not have issue credits, #2288727: [meta] Provide credit to organizations / customers who contribute to Drupal issues, which is now the used solution.

Drupal is a really special project, in which it is the norm that multiple people help a given issue.
That is great, and helps in many ways.
In fact in the last year more than 79% of the commits involve 2 or more mentions vs. the rest being one mention; which usually means two, i.e. the author and the reviewer/commiter. See the end of the comment for the source of this affirmation.

If the amount of names, especially because they are before the actual description, is the actual problem that wants to be solved here, yes, moving that into git trailers is a good idea.

Instead if by "not easy to scan to find the reason the change was made" we think about self-containment of a given commit message, then we may need some extra steps.
Notice that other major software projects use self-contained commit messages, more on that below.

Self-containment of a commit message can help to actually read commit messages without looking externally to drupal.org issues.
That would be really great to have, but it would be a change in process.
Given the collaborative nature of the drupal community, building that summary as part of the issue process could be a good idea.
The simplest way is likely to use a new section on the issue summary description to do that.

On proposed solutions

All the existing alternatives proposed above use Conventional Commits 1.0.0 as its base, but they do not really address how we would like to embrace that in detail.
Yes, there is one example per option; but that is far from a clear guideline, especially about git trailers.

Also, for this to happen, the drupalorg project related code needs to be changed to provide the desired format.
Otherwise commiters will need to do the user formatting on git trailers manually.

Namely, what is missing would be to establish how to use \<type\>, [optional body], [optional scope], and [optional footer(s)].

We should decide on the set of types to use here, or a guideline on what to use.
E.g. embracing other project set or not.

The optional scope is also not clear.
The examples point to "media", which is the main module changed on drupal core git commit 7201d4bfba4aaedf5ff68ab9ccfa035554fa88be, from where the commit messag sample seems to come from.
The natural fit may be the core subsystem, as defined over MAINTAINERS.txt, but that may needs some standardization of names.

If we want to make the commit message more self-contained, then the body should be filled most of the time.

The optional footers also need to be defined, conventional commits point to a custom BREAKING CHANGE: \<description\> footer and then point to other git trailers, which mean any git trailer.
Examples suggest to use Authored-by: and Reviewed-by:.

The list of possible git trailers should be defined, so I guess that is reason enough to move this issue back to NW.

I would suggest to embrace other major project git trailers instead of creating new ones.
Both linux kernel and git projects use Reviewed-by:, but Authored-by: is not used.
For that linux kernel project use Co-developed-by:, and git uses Co-authored-by:.

Apart from those, we may want to define other possible trailers, and if we want to embrace or not BREAKING CHANGE: \<description\>.

To keep or not to keep issue number

I think this depends on what we want to do on commit message self-containment.

If we want to keep the existing approach to have most of the information on the issue, then the issue number should still be the first thing on the commit message.

If if we move to self-contained commit messages, in our current process, the issue discussion is still useful to reference, so keeping it at least as a git trailer is a good idea.

It may also be worth to point to a previous relevant comment around why Issue may still be useful:

Because prepending "Issue" adds some semantic value to the self-contained commit message (even if it's implied, it makes explicit the fact that the following number refers to an issue).

On some inspiration from other major projects

Other major projects could be a source of inspiration on how to do things.

The following links point to how the linux kernel and the git projects define their git trailers, along with more information about how they structure the commit messages.

https://www.kernel.org/doc/html/latest/process/submitting-patches.html#w...
https://git-scm.com/docs/SubmittingPatches#sign-off

A couple of examples of each of them follow.

netlink: specs: mptcp: fix port endianness from linux's commit 09a45a5553792bbf20beba0a1ac90b4692324d06

netlink: specs: mptcp: fix port endianness

The MPTCP port attribute is in host endianness, but was documented
as big-endian in the ynl specification.

Below are two examples from net/mptcp/pm_netlink.c showing that the
attribute is converted to/from host endianness for use with netlink.

Import from netlink:
  addr->port = htons(nla_get_u16(tb[MPTCP_PM_ADDR_ATTR_PORT]))

Export to netlink:
  nla_put_u16(skb, MPTCP_PM_ADDR_ATTR_PORT, ntohs(addr->port))

Where addr->port is defined as __be16.

No functional change intended.

Fixes: bc8aeb2045e2 ("Documentation: netlink: add a YAML spec for mptcp")
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240911091003.1112179-1-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

wrapper: introduce log2u() from git's commit d343068e4abc5e43d1ef1d5fed42bf4d7aa8cff4

wrapper: introduce log2u()

We have an implementation of a function that computes the log2 for an
integer. While we could instead use log2(3P), that involves floating
point numbers and is thus needlessly complex and inefficient.

We're about to add a second caller that wants to compute log2 for a
`size_t`. Let's thus move the function into "wrapper.h" such that it
becomes generally available.

While at it, tweak the implementation a bit:

  - The parameter is converted from `int` to `uintmax_t`. This
    conversion is safe to do in "bisect.c" because we already check that
    the argument is positive.

  - The return value is an `unsigned`. It cannot ever be negative, so it
    is pointless for it to be a signed integer.

  - Loop until `!n` instead of requiring that `n > 1` and then subtract
    1 from the result and add a special case for `!sz`. This helps
    compilers to generate more efficient code.

Compilers recognize the pattern of this function and optimize
accordingly. On GCC 14.2 x86_64:

    log2u(unsigned long):
            test    rdi, rdi
            je      .L3
            bsr     rax, rdi
            ret
    .L3:
            mov     eax, -1
            ret

Clang 18.1 does not yet recognize the pattern, but starts to do so on
Clang trunk x86_64. The code isn't quite as efficient as the one
generated by GCC, but still manages to optimize away the loop:

    log2u(unsigned long):
            test    rdi, rdi
            je      .LBB0_1
            shr     rdi
            bsr     rcx, rdi
            mov     eax, 127
            cmovne  rax, rcx
            xor     eax, -64
            add     eax, 65
            ret
    .LBB0_1:
            mov     eax, -1
            ret

The pattern is also recognized on other platforms like ARM64 GCC 14.2.0,
where we end up using `clz`:

    log2u(unsigned long):
            clz     x2, x0
            cmp     x0, 0
            mov     w1, 63
            sub     w0, w1, w2
            csinv   w0, w0, wzr, ne
            ret

Note that we have a similar function `fastlog2()` in the reftable code.
As that codebase is separate from the Git codebase we do not adapt it to
use the new function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In fact in the last year more than 79% of the commits involve 2 or more mentions vs. the rest being one mention; which usually means two, i.e. the author and the reviewer/commiter. See the end of the comment for the source of this affirmation.

To arrive there, the following was used as the base.

$ date -u
vie 13 set 2024 17:58:16 UTC
$ git describe
11.0.0-alpha1-541-g1c67d3b559
$ git log --format=%s --since="1 year ago"| wc -l
2141
$ git log --format=%s --since="1 year ago"| sed -e 's/:.*//' -e 's/^Issue #[[:digit:]]* by //'|awk -F, '{print NF}'| sort |uniq -c |sort -nr
    491 2
    440 1
    338 3
    247 4
    183 5
    116 6
     97 7
     56 8
     39 9
     23 11
     20 10
     14 12
     13 14
     10 13
      8 16
      7 17
      6 18
      4 21
      4 20
      3 24
      3 22
      3 19
      3 15
      2 29
      2 25
      1 82
      1 55
      1 52
      1 46
      1 35
      1 34
      1 32
      1 28
      1 26

P.S. It would be so nice to have <details> accepted on the comment format.

dww’s picture

Status: Needs work » Needs review
  1. This is a policy issue for core. We cannot and do not enforce what happens in the "Wild west" of contrib. Yes, many/most contrib maintainers follow core's lead on such things. Certainly the bespoke tooling on d.o issues is largely to "blame", since folks do what's easy. But it's entirely a distraction to worry about differences, enforcement, wringing our hands over how hard it will be if contrib maintainers continue to do their own thing, etc. I just want to be able to read the core Git history and understand WTF is going on, which is close to impossible with the current state of things. Please let's focus on that problem, and not delay this another 10 years (yes, the efforts to fix core's commit conventions have been going on at least that long) by worrying about contrib. If core starts doing this, contrib will (hopefully!) quickly catch up...
  2. This may or not happen before we migrate entirely to GitLab issues. Either way, "we" are almost certainly not going to be changing any drupalorg.module code to better support this. Sadly, there's likely to be some manual effort for crafting the right attribution footers for core commits. That's probably a deal breaker, but I/we can still dream. If it comes to it, we'll have to drop the attributions. Or hopefully there are ways within GitLab to make that possible/easy. I haven't hacked around enough in GitLab to know. There's a chance that GitLab has templates and variables that might be sufficient. So I don't think we need an elaborate bikeshed discussion on the exact formatting of the footer, since all that might have to be dropped for expediency, anyway.
  3. I've only said it dozens of times, but I'll say it again: "Issue" is a waste of bytes. What kind of issue? That's part of the point of conventional commits and semantic versioning -- if we see "feat" in the Git logs for a given tag, it should be a new minor version. If we only see "task" and "fix", we know it's still viable for a patch release. Yes, people can / do get in shoving matches over bug vs. feature, and the d.o fields can change. Same with credit + attributions. Things change, but Git history does not. That said, I'd rather the Git history did the best it could to reflect and document what we thought was true at the time the commit was pushed. I don't think it's the end of the world if someone goes back and retroactively changes an issue from a task to a bug but the Git commit still says task.

Back to NR. I don't see anything of that warrants needing more work here. We need to keep reviewing the options, and ideally converge on agreement. NW will doom this to further delays and more nearly useless core commit messages... 😅

Thanks,
-Derek

P.s. edited for clarity: I believe the word “Issue” itself is a waste of bytes, and we’d be better off using the category (fix|feat|task), but the ID (#999999 in all examples) is absolutely essential as part of the 1-line summary.

dww’s picture

p.s. Re: #33 and option E: If your concern is that Category and Component can change in the issue but not in the Git history, why put them in the commit message at all (even if only in the footer)? That seems to undo the whole point you were trying to make at the end of #32. I think the "optional scope" here has been a distraction, which is why I dropped it entirely in option D. "Category" is valuable, which is the premise of conventional commits. See #35.3. But option E seems to have all the drawbacks and none of the benefits of putting the category directly into the 1st line summary.

baluertl’s picture

Issue summary: View changes

#36 Logically correct, option E updated.

joachim’s picture

> Therefore we need to keep our commit message pattern dead simple as possible if we expect past and current contributors to adopt something totally new for themselves.

I don't really see how this is an issue -- we usually squash MR commits into one, don't we?

> There use to be documentation around commit message format, looking a bit around that use to be node nid 52287

There was. Then I think what happened is that we added a commit message suggestion to d.org, and that just takes the issue title.

This IMO has made commit messages a LOT WORSE, because issue titles are typically not worded in the same way. They state a problem, whereas the commit message should state an action done.

baluertl’s picture

“I don't really see how this is an issue -- we usually squash MR commits into one, don't we?”

My point is regarding commits existing on feature branches before the MR gets merged. Your statement refers to the single one commit getting created after the MR got merged into the main branch.

joachim’s picture

I didn't think this issue was concerned with commits on a MR feature branch?

dww’s picture

Re #40: 100% correct. This is for the core Git history, the result of the squashed merges that get pushed to the canonical core repo.

Re: #39: No one cares about intermediary commit messages in MRs, neither core nor contrib. Those aren’t “history”.

baluertl’s picture

#41 Keep your temper, please.

I understand I misinterpreted the scope of the topic. To outline the base concept here: running a community decision about the pattern of how those few core committers will use when they merge MRs into the Core, right?

What is it not about?

  • The Contrib space at large
  • Commits on feature branches
tim.plunkett’s picture

@baluertl, I see how you mistook @dww's comment as him "losing his temper", but he was stating a fact. Feature branch commit messages can be anything, and that's okay.

The point I am not clear on is why this wouldn't affect contrib.

Contrib projects have the same commit message format as core, and have for as long as I can remember.

I would expect contrib to adopt this change too.

quietone’s picture

Issue summary: View changes

Thank you to everyone who clearly indicated their preference. I have updated the Issue Summary to indicate that Option D is the preference

A: 1
B: 1
C: 1
D: 5
E: 1

kim.pepper’s picture

D looks great. I'm going to change my preference to D. 😁

dww’s picture

Thanks, y’all! Indeed, I’m not angry, and still have my temper right here by my side where I left it. 😂

To be extra clear: I agree that contrib will hopefully follow core’s lead on this, especially if we work out tooling to make it easier to get this right. But contrib can already do whatever they want. I’m acknowledging that freedom, and I’m not claiming this issue will mandate contrib to do the same.

The thing we can change is the format core committers use when they push to the canonical core repo. Once that’s better (and easy), I imagine most of contrib will do the same.

Thanks again,
-Derek

acbramley’s picture

+1 for D

bramdriesen’s picture

#46 Totally agree!

Would be nice if we can suggest the squashed commit message in the recommended format.

baluertl’s picture

Issue summary: View changes
bbrala’s picture

I'm all for using my proposed solution, specifcally D.

Seems there a decent consensus on D

joachim’s picture

> [#999999] task: Convert

I kinda feel that the 'type' of fix/feat/task is a waste of bytes too.

- fix: Commit message will be of the form 'Fixed the broken thingy.'
- feat: 'Added support for the thingy'.
- task: Any other verb as the first word.

We used to have a commit message standard (~ 15 years ago!) which said the first word should be a verb like 'Added / Fixed / Changed'.

So I prefer E to D, but D is my second choice.

dww’s picture

#51 I hear you that the component can usually be inferred by the summary text. But the promise of conventional commits and related tooling is that if you fully standardize on it you can actually grep the history to learn things, make release notes, etc, instead of parsing text and having to “understand” it.

bramdriesen’s picture

Re #51 Without those, you won't ever be able to benefit from semantic-release for example. Which is something we use on all our projects.

https://github.com/semantic-release/semantic-release

bbrala’s picture

Status: Needs review » Reviewed & tested by the community

In my opinion we have consolidated on option D. Updated the list so helpfully gathered by @xjm

A: 1
B: 1
C: 1
D: 8 (plus one second choice)
E: 2

I'll go so far to say this is RTBC. <3

larowlan’s picture

I will add this to the committers next meeting agenda

bbrala’s picture

quietone’s picture

Issue summary: View changes
Status: Reviewed & tested by the community » Fixed

I checked on the next step here and that is to open an issue in the infrastructure project. That issue is, #3486271: Change format for git commit message for the Drupal Core project..

So, the policy decision has been made and there is an issue to implement, so this issue can be closed as Fixed.

Thanks everyone!

bbrala’s picture

Pro tip, print the list in comment 57 and hang it on your monitor ;) Saves a lot of time while getting used to it.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

xjm’s picture

Amending attribution.

xjm’s picture

I will be tempted to call every issue I dislike a chore. ;)

quietone’s picture

Version: 11.x-dev » 11.1.x-dev

Changing to latest version when this was closed.

quietone’s picture

nod_’s picture

Concerned were raised in #3543076: Simplify agreed commit format and comply with Conventional commits spec about the fact that we're not following conventional commits spec and is proposing to move the issue ID after the commit type information.