Problem/Motivation

#2288727: [meta] Provide credit to organizations / customers who contribute to Drupal issues aims to provide commit credit to organizations.

This issue is postponed (per #56) on having a few months of comment attribution data to play with.

Proposed resolution

Change the commit message template to a multiline format, similar to the format used by several other open source projects (Linux, OpenStack, Eclipse, Typo3). Put credit information in the "footer" of the commit message, like:

Supported-by: Drupal Association <drupal-association@1646370.org.no-reply.drupal.org>

This follows the pattern [Organization Node Title] <[Organization Alias]@[Organization Node ID].org.no-reply.drupal.org>, which allows tying the credit back to organization nodes.
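As a rough illustration of that pattern, here is a minimal Python sketch that builds one footer line from an organization's title, alias, and node ID. The function name `org_trailer` and its parameters are hypothetical, and the angle brackets around the address are assumed from the full commit message example in this summary.

```python
def org_trailer(title: str, alias: str, nid: int, key: str = "Supported-by") -> str:
    """Build one commit-footer line for an organization node.

    The address encodes the organization alias and node ID so the credit
    can later be tied back to the node (names here are illustrative only).
    """
    return f"{key}: {title} <{alias}@{nid}.org.no-reply.drupal.org>"


if __name__ == "__main__":
    print(org_trailer("Drupal Association", "drupal-association", 1646370))
```

Keeping the node ID in the address, rather than relying on the title, means a renamed organization would still resolve to the same node.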

An example of a full generated commit message might be for an issue such as #2183983: Find hidden configuration schema issues:

git commit -m 'Issue #2183983: Find hidden configuration schema issues

Contributed-by: Gábor Hojtsy <goba@4166.no-reply.drupal.org>
Contributed-by: Vijayachandran Mani <vijaycs85@93488.no-reply.drupal.org>
Contributed-by: Wim Leers <wimleers@99777.no-reply.drupal.org>
Contributed-by: Sascha Grossenbacher <berdir@214652.no-reply.drupal.org>
Contributed-by: Florian Weber <webflo@254778.no-reply.drupal.org>
Contributed-by: Alex Pott <alexpott@157725.no-reply.drupal.org>

Supported-by: Acquia <acquia@1204416.org.no-reply.drupal.org>
Supported-by: Capgemini <capgemini@1772260.org.no-reply.drupal.org>
Supported-by: MD Systems <md-systems@1979456.org.no-reply.drupal.org>
Supported-by: UEBERBIT GmbH <ueberbit@1838438.org.no-reply.drupal.org>
Supported-by: Chapter Three <chapter-three@1121246.org.no-reply.drupal.org>

Supported-by: Some Mysterious Drupal 8 Client <mysterious-d8-client@XXXX.org.no-reply.drupal.org>
Supported-by: Some Other Mysterious Drupal 8 Client <mysterious-d8-client2@XXXX.org.no-reply.drupal.org>
'

We can use historical Git commit mentions to populate historical commit entities to give credit to individual contributors for past contributions. This should keep history in sync going forward. Because of the way that the supported-by element works, we can only track organization commit mentions going forward from the release of this feature.

Other solutions that have been discussed and put aside include using Git notes (while it's the right tool for the job, there is not rich enough support in various clients, and it requires engineering effort on Drupal.org infra side as well), and using special characters to denote individuals vs. employers vs. customers (this is a non-standard "Drupalism," it creates weirdly formatted commit messages that can't ever be changed).

Remaining tasks

Discuss, agree on exact format for implementation.

User interface changes

Auto-generated commit message template in the "Credit & Committing" UI will change; https://www.drupal.org/node/52287 will need to be updated.

API changes

Unknown. In #48 joshuami notes: "All credit will be associated with a commit rather than the issue. We will need to confirm that the commit entity (version control operation) that we store on Drupal.org can have fields, or other storage can be implemented cleanly. We can parse the users and organizations from the commit message, and associate them with the commit entity as user and node entity references."
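The parsing step joshuami describes could look roughly like the following Python sketch. It is not how Drupal.org does or will do it: the regular expression, the function name `parse_credits`, and the returned tuple shape are all assumptions made for illustration, based only on the footer format proposed in this summary.

```python
import re

# Matches footer lines such as:
#   Contributed-by: Wim Leers <wimleers@99777.no-reply.drupal.org>
#   Supported-by: Acquia <acquia@1204416.org.no-reply.drupal.org>
# The optional ".org" segment is what distinguishes a node ID from a user ID.
TRAILER = re.compile(
    r"^(?P<key>Contributed-by|Supported-by):\s+(?P<name>.+?)\s+"
    r"<(?P<alias>[^@>]+)@(?P<id>\d+)(?P<org>\.org)?\.no-reply\.drupal\.org>$"
)


def parse_credits(message: str):
    """Return (users, organizations) as lists of (name, numeric ID) tuples."""
    users, orgs = [], []
    for line in message.splitlines():
        match = TRAILER.match(line.strip())
        if not match:
            continue  # subject line, blank line, or a placeholder ID like XXXX
        entry = (match["name"], int(match["id"]))
        (orgs if match["org"] else users).append(entry)
    return users, orgs


if __name__ == "__main__":
    example = (
        "Issue #2183983: Find hidden configuration schema issues\n"
        "\n"
        "Contributed-by: Wim Leers <wimleers@99777.no-reply.drupal.org>\n"
        "Supported-by: Acquia <acquia@1204416.org.no-reply.drupal.org>\n"
    )
    print(parse_credits(example))
```

The numeric IDs are what would become the user and node entity references; the display names are carried along only for readability.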

Original report by @webchick

#2288727: [meta] Provide credit to organizations / customers who contribute to Drupal issues aims to provide commit credit to organizations. There are a few ways this can be done:

Embed credit in commit message itself

This is the format proposed at http://buytaert.net/a-method-for-giving-credit-to-organizations-that-con... (slightly tweaked to put the credit at the end in order to keep commit messages sensible, as discussed and agreed upon at #2288727-31: [meta] Provide credit to organizations / customers who contribute to Drupal issues and following):

$ git commit -am "Issue #n: message by INDIVIDUAL@AGENCY*CUSTOMER."

For example (this would be auto-generated so just a copy/paste):

$ git commit -m "Issue #42: fixed performance bug by Sam@Acquia, Megan, Tim*Pfizer."
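To make the INDIVIDUAL@AGENCY*CUSTOMER notation concrete, here is a small Python sketch that splits such a message back into people, agencies, and customers. The function `parse_inline_credit` and its output shape are hypothetical. It assumes the credit follows the last " by " in the message, that credits are comma-separated, and that names contain no commas, "@" or "*" characters.

```python
def parse_inline_credit(message: str):
    """Split 'Issue #n: message by A@Agency, B, C*Customer.' into credit records."""
    _, sep, tail = message.rpartition(" by ")
    if not sep:
        return []  # no " by " marker, so no credit to extract
    credits = []
    # Drop the trailing period, then handle one comma-separated credit at a time.
    for chunk in tail.rstrip(". ").split(","):
        who, *customers = [part.strip() for part in chunk.split("*")]
        person, _, agency = who.partition("@")
        credits.append(
            {"person": person, "agency": agency or None, "customers": customers}
        )
    return credits


if __name__ == "__main__":
    msg = "Issue #42: fixed performance bug by Sam@Acquia, Megan, Tim*Pfizer."
    for credit in parse_inline_credit(msg):
        print(credit)
```

The assumptions listed above show how much a parser has to guess with this format; an organization whose name ends in a period or contains a comma would already be misread by this sketch.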

Pros

  • No special tooling required; could start using this format immediately.
  • Credit is immediately obvious, in a very developer-facing place.
  • Credit is also visible in third-party tools: GitHub, cgit, Git GUIs such as Tower, etc. Basically every single thing that works with Git knows how to display commit messages.

Cons

  • On issues with multiple authors (especially complex core issues) you could easily end up with a 2048+ character commit string.
  • A "Drupalism"; no other project does it this way.

Embed credit in Git notes

git notes is a way to add metadata to commits without actually affecting the commits themselves. See https://groups.drupal.org/node/161659 for some prior art on this discussion.

For example (this would be auto-generated so just a copy/paste):

git commit -m "Issue #2125621: Fixed Contrast between title/slogan and header is too low." --author="David Hernandez <foo@bar.com>"
git notes add -m "davidhernandez*New Jersey Something Something
mgifford@Open Concept
jessebeach@Acquia
mparker17@Open Concept*Government of Canada*Whitehouse
anandps"

Pros

  • Git notes is specifically for storing additional metadata, so we're using Git as Linus (or whoever) intended. :P
  • Commit messages can go back to just "Issue #1234: Fixed the thing." which is standard with nearly all other projects.
  • Can even go back to old commits and add this information. Doesn't change the commit hash!
  • Because this information is not intended to be human-readable per se, can embed much more detailed information, for example actual user / organization IDs, which can ease parse-ability/improve accuracy (however, this seems to be a no-op because according to drumm in #19, the UI being developed at #2295411: Auto-generate Git attribution info / commit messages on Drupal.org will populate tables).

Cons

  • Credit not immediately visible without some special tricks, e.g. git show -s or custom Drupal.org listings.
  • May require significant tooling to get custom listings working on Drupal.org; #1282040: Add support for reading git-notes sounds a bit spooky (though could tie it to the issue rather than the Git commit to work around VCAPI).
  • Not a widely used feature, so unknown level of support in third-party tools (Git GUIs, source code browsers, etc.) for displaying Git notes. As a prominent example, GitHub used to have and then recently removed support for displaying git notes.
  • Also a "Drupalism"; no other project does it this way.

Embed credit in commit message itself (another way)

Add a block of text describing attribution to the commit message, as the Linux kernel and the Git project itself do.
Note that we would be extending what those projects do by adding a new keyword, "Supported-by".

Detailed:

	Signed-off-by: Coder1 Name <coder1@git.mail>
	Signed-off-by: Coder2 Name <coder2@git.mail>
	Reported-by: Reporter1 Name <reporter1@git.mail>
	Tested-by: Tester1 Name <tester1@git.mail>
	Reviewed-by: Reviewer1 Name <individual2@git.mail>
	Suggested-by: Suggester1 Name <suggester1@git.mail>
	Supported-by: Agency1 Name <agency@agency-domain.com>
	Supported-by: Customer1 Name <customer@customer-domain.com>

Not detailed:

	Signed-off-by: Coder1 Name <coder1@git.mail>
	Signed-off-by: Coder2 Name <coder2@git.mail>
	Signed-off-by: Reporter1 Name <reporter1@git.mail>
	Signed-off-by: Tester1 Name <tester1@git.mail>
	Signed-off-by: Reviewer1 Name <individual2@git.mail>
	Signed-off-by: Suggester1 Name <suggester1@git.mail>
	Supported-by: Agency1 Name <agency@agency-domain.com>
	Supported-by: Customer1 Name <customer@customer-domain.com>

Pros

  • No special tooling required; could start using this format immediately.
  • Credit is immediately obvious, in a very developer-facing place.
  • Credit is also visible in third-party tools: GitHub, cgit, Git GUIs such as Tower, etc. Basically every single thing that works with Git knows how to display commit messages.
  • Not a "Drupalism"!

Cons

  • On issues with multiple authors (especially complex core issues) you could easily end up with a 2048+ character commit string.

Comments

markcarver’s picture

I am personally more in favor of using git notes. Another good resource to read: http://git-scm.com/blog/2010/08/25/notes.html

Regardless of whether these are in [a multiline] commit message or a git note, I do not think that doing the equivalent of username@company is a very sustainable approach. Not to mention we already have the standard git attribution format (in the form of an email address). I think just by using CSV we can determine when an attribution is associated with a company, client or organization. The easiest way to parse this: the first entry is always a user, and any additional "emails" simply point to the nodes of the respective entities.

The following is what I would like to propose. Keep in mind this isn't real data, it's just for formatting sake:

Initial commit:

git commit -m "Fixed #2125621: Contrast between title/slogan and header is too low." --author="davidhernandez <davidhernandez@261131.no-reply.drupal.org>"

Followed by:

git notes --ref review append -m "davidhernandez <davidhernandez@261131.no-reply.drupal.org>, New Jersey Something Something <756332@756332.no-reply.drupal.org>"
git notes --ref accessibility append -m "mgifford <mgifford@782321.no-reply.drupal.org>, Open Concept <543211@543211.no-reply.drupal.org>"
git notes --ref review append -m "mparker17 <mparker17@1717171.no-reply.drupal.org>, Government of Canada <99999@99999.no-reply.drupal.org>, Whitehouse <00000@00000.no-reply.drupal.org>"
git notes --ref documentation append -m "jessebeach <jessebeach@27123.no-reply.drupal.org>, Acquia <21322@21322.no-reply.drupal.org>"
git notes --ref review append -m "anandps <anandps@123456.no-reply.drupal.org>"

Also please keep in mind that #2295411: Auto-generate Git attribution info / commit messages on Drupal.org will help to automate this so it would literally just be a copy and paste.
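The "first entry is a user, the rest are nodes" rule could be read back out of a note roughly as follows. This is only a Python sketch of the idea: `parse_note` is a made-up name, and the regular expression assumes every entry has the form name <something@ID.no-reply.drupal.org> as in the illustrative commands above.

```python
import re

# One "Name <local@ID.no-reply.drupal.org>" entry inside a comma-separated note.
ENTRY = re.compile(r"([^<,]+?)\s*<[^@>]+@(\d+)\.no-reply\.drupal\.org>")


def parse_note(note: str):
    """Return (user, nodes): the first entry is the user, the rest are node references."""
    entries = [(name.strip(), int(ident)) for name, ident in ENTRY.findall(note)]
    if not entries:
        return None, []
    return entries[0], entries[1:]


if __name__ == "__main__":
    note = (
        "davidhernandez <davidhernandez@261131.no-reply.drupal.org>, "
        "New Jersey Something Something <756332@756332.no-reply.drupal.org>"
    )
    user, nodes = parse_note(note)
    print(user)
    print(nodes)
```

Note that nothing in the entry itself says whether an ID is a uid or a nid; the sketch relies purely on position, which is exactly the convention proposed here.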

webchick’s picture

The e-mail addresses and individual "append" rules I get, but how exactly would --ref documentation, --ref accessibility et al be automated? Seems like a human would need to take an extra N minutes to read through the entire issue and attribute those aspects. How would automating that work?

One aspect that Dries feels is very important but gets lost in your proposal (probably easily fixed though) is ensuring there is a distinction between employers vs. customers when giving organization credit. While I worked at Lullabot for a long time, any given patch was more than likely for e.g. Sony BMG or Grammy or whatever website I was building at the time, and both orgs should be credited. This is an important aspect of the proposal, because it encourages the orgs actually paying for Drupal work (and looking to hire top talent) to give back.

markcarver’s picture

Re: --ref
I would imagine that a certain amount of this could be automated. For example, we could have a code namespace that would just detect the individuals that have uploaded patches. Crazy idea here, but we could perhaps put the "burden", so to speak, on the commenter and give them an option to choose which type of comment they are providing (i.e. code review, accessibility review, documentation review, patch, etc.). I would be more in favor of that personally, as I am the one who knows what my true intentions are. Regardless, this is just to help the automated process. I suspect that in #2295411: Auto-generate Git attribution info / commit messages on Drupal.org we should still have the ability to manually alter the individuals attributed before copying and pasting the CLI commands.

Re: employers vs. customers
As far as I am aware, both of these are (or will be) just nodes. Regardless of the content type, they both have nids. What I was saying is that the first entry is always a user (the person who provided the patch, review or whatever) and any subsequent CSV "email" entries would be assumed to be nids of the respective entities.

xjm’s picture

I still feel strongly that not overloading the commit message and making a backwards-compatible format (with git notes) is more valuable than some nebulous hope that non-git projects will adopt our d.o-specific implementation and tool. Commit messages cannot be updated in the future, so we would have to add special-casing to parse our own git log if we changed the format. In contrast, git notes allow us to retroactively add/standardize credit while still using the git log we currently have.

webchick’s picture

Yes. Unfortunately it also means a metric ass-load of work on Version Control API to make it handle Git notes, so looks from the other issue like we'll go with commit messages for now (though moving credit text to the end of the message so we keep meaningful logs).

drumm’s picture

metric ass-load of work on Version Control API

Huh? Where did someone familiar with Version Control API say this?

moshe weitzman’s picture

For the record, I agree with xjm that commit messages are a poor place for this info. If we get outvoted for some reason, then *please* put the actual text of what happened at the beginning of the message. It is silly to put the bookkeeping stuff beforehand. So,

"message. Issue #n by INDIVIDUAL@AGENCY*CUSTOMER:"

webchick’s picture

@drumm: I was judging by what Sam wrote in the issue summary at #1282040: Add support for reading git-notes:

It's gonna be an interesting challenge, since notes are internally very good at chasing around rebased commits - something we haven't had to be diligent about *at all*, even conceptually.

@moshe: Yep, that's almost exactly what's proposed in the parent issue.

moshe weitzman’s picture

Sounds good. Adding tag.

xjm’s picture

Does the version control API have to be able to parse git notes for us to adopt a git notes format? Like, does adding notes break our existing parsing, or would it just be more work to richly integrate it both directions? If the latter, it seems we could add the notes generation from the start, and then do the other work later.

markcarver’s picture

Does the version control API have to be able to parse git notes for us to adopt a git notes format?

I do not believe so, no. AFAIK, VCAPI doesn't actually parse the commit message at all currently. It just determines the committer and the author (--author) of the commit itself. This is why no one has contribution credit for core on their profiles aside from those who have direct commit access, yes? Adding an additional layer to parse git notes doesn't sound like it would, in any way, conflict with existing code. It would simply be an additional event that is processed.

drumm’s picture

Can https://www.drupal.org/node/52287 be updated for things that are decided? Such as, moving "by …" to the end.

webchick’s picture

No, we'd only update the docs once the functionality was deployed, IMO. Until then the commit credit doesn't get long enough to warrant breaking everyone's habits.

moshe weitzman’s picture

OK, based on #8 and #12, it does not look like we need to do much in order to put contribution metadata into Notes. I suggest we go that way right from the start. If we don't, we're going to have angry devs who claim that we are using Git incorrectly.

As far as I can tell, this comes down to emitting two commands instead of one. We would emit git commit [STUFF] && git notes add [STUFF]
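A generator for that copy-and-paste pair of commands might look like this Python sketch. The function `credit_commands` and the one-credit-per-line note layout are assumptions for illustration; `git notes add -m` attaches the note to HEAD, which is the commit just created.

```python
import shlex


def credit_commands(subject: str, note_lines: list[str]) -> str:
    """Return a single shell line that commits and then attaches a credit note."""
    note = "\n".join(note_lines)
    # shlex.quote keeps '#', spaces, and newlines safe when pasted into a shell.
    return (
        f"git commit -m {shlex.quote(subject)} && "
        f"git notes add -m {shlex.quote(note)}"
    )


if __name__ == "__main__":
    print(credit_commands(
        "Issue #1234: Fixed the thing.",
        ["mgifford@Open Concept", "anandps"],
    ))
```

Chaining with && means the note is only added if the commit itself succeeded.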

xjm’s picture

See also #2295411: Auto-generate Git attribution info / commit messages on Drupal.org. @webchick pointed out that we don't need to resolve this issue before adding that functionality, since adding the UI and the feature that already exists in dreditor could easily be the first step, followed by adopting a git notes format later.

webchick’s picture

One bummer about git notes is it looks like as of August 2014, GitHub doesn't display them anymore (source: https://github.com/blog/707-git-notes-display, via Moshe). Not a reason not to use them, per se, since they definitely make the most technical sense in a world where we have a way to display them on Drupal.org, but we lose the part of Dries's proposal where the same credit system works wherever Drupal development is done (and a non-trivial amount of Drupal development is done off d.o).

catch’s picture

I'm not sure which is the correct issue to post this on, there are at least three, however cross-posting my comment from #2230579-30: [meta] Allow crediting reviewers (and other non-coders) as first-class contributors

If we use git notes, since that can be edited without changing git history, I'd suggest the following:

1. We can auto-generate the message, and it'll be correct 95% of the time per #2295411: Auto-generate Git attribution info / commit messages on Drupal.org.

2. If we can auto-generate it, we can also have a script do this retrospectively for previous commits against core - as long as there's an issue nid associated with the commit that should be possible. That'd retrospectively fix some of the missing reviewer credit etc.

3. If we can auto-generate it, then I think the git notes part doesn't even necessarily have to be done by a core committer - we could automate it with a git commit hook, and then manually change things if we need to. Can always have the git commit hook not change anything if it's already there.

4. That leaves the current commit message - could remove attribution from that, or keep it to patch authors in the sense that's similar to what you'd see if we were merging pull requests.

I really think overloading the commit message is the wrong approach here - and it is going to add technical debt (in our commit messages of all things) for later parsing given it adds yet another format to deal with. Whereas with git notes we can potentially automate and back-date (and even change the format later and update that retrospectively too if we really needed to).

drumm’s picture

For Drupal.org, we aren't actually going to parse commit information. Since we will have the UI for generating it, we'll store it right away rather than re-parsing everything. (We can do a one-time reparse to get the history populated.) That storage isn't part of the first round of work, #2295411: Auto-generate Git attribution info / commit messages on Drupal.org. What Drupal.org stores as structured data (in fields) will be available in our RestWS API.

If this is a Drupal thing, the commit message isn't technically important in the end. It is good to visibly credit people. I'd expect 3rd-party parsing to use RestWS - JSON parsing will be more reliable than string parsing. You could query for credits to a person or organization right away, without building your own full database.

If this is something that non-Drupal Git users would also use, we'll need Git community buy-in.

webchick’s picture

Issue summary: View changes
Issue tags: -Needs issue summary update

Moshe and I just had a discussion about this earlier today. In response, I've updated the issue summary with a clearer picture of the two options and their pros/cons.

Moshe's preference, and I think most people in this issue's preference, is to use Git notes for this, because storing this kind of additional metadata on commits is exactly what Git notes is intended to do. And it can also do so retroactively without affecting commit hashes or the like, which is a huge benefit.

In my mind there are only three downsides to Git notes:

1) It's unclear what the support is for Git notes in third-party tools, including source code browsers (GitHub, which dropped support for displaying Git notes recently, being just one prominent example) but also things like Git GUIs, etc. Can you even enter Git notes into a GUI like Tower or PHPStorm, for example, or must you resort to command-line? (if so, this would be a huge loss of workflow for many users.)

2) With our current method of embedding credit in commit messages, you see your name show up in places like https://www.drupal.org/commitlog, http://cgit.drupalcode.org/drupalmoduleupgrader/log/, and https://github.com/drupal/drupal/commits/7.x automagically. Moving to Git notes, without also introducing a way to surface those names at least on Drupal.org, would introduce a regression because these names would go from being visible to everyone with zero effort to being visible only to developers with command-line tools. (I'm not sure how much people actually care about this in practice, but I would suspect that at least some subset of people care an awful lot.)

3) AFAIK, it's still undefined how much work would need to be done on our existing Drupal.org Git integration to incorporate Git notes, and whether that's even relevant to getting the visualizations mentioned in point 2. Someone needs to do that research.

I don't think any of these are insurmountable; however, Git notes introduces some more barriers in front of rolling out features like #2295411: Auto-generate Git attribution info / commit messages on Drupal.org and #2288727: [meta] Provide credit to organizations / customers who contribute to Drupal issues that commit message munging simply doesn't have to deal with, since it's what we already do. So personally, my preference is to go forth with commit message munging for now, and roll out Git notes at a later date when the various dependencies are in place so we can keep iterating. But if someone digs in on 1-3 and finds out none of those are a huge deal, and moving to Git notes instead wouldn't materially push off the timeline for deployment of these features, then cool.

drumm’s picture

How is this a blocker for #2295411: Auto-generate Git attribution info / commit messages on Drupal.org? The goal isn't to fully implement the formatting. I am trying to be forward-looking in the formatting, moving by … to the end.

webchick’s picture

Right, I would argue that this issue is not a blocker, and is something that could come at any time, once agreement is reached and the dependent pieces are ready. However, there are some concerns about screwing around with the commit credit system twice, and a desire to do it once and do it right, so wanted to note them for completeness.

catch’s picture

For Drupal.org, we aren't actually going to parse commit information. Since we will have the UI for generating it, we'll store it right away rather than re-parsing everything. (We can do a one-time reparse to get the history populated.)

Why would we do a one-time re-parse of commit messages (which are missing the full attribution information that this is supposed to add), rather than using the generation mechanism on older issues to populate the data set? Can't add companies for that but it can definitely help with issue openers, reviewers, issue summary updates, images posted etc.

drumm’s picture

It will be a one-off update closer to using the generation mechanism on older issues to populate the data set. Instead of a human using the form to populate the commit data, we'll parse it out of commit messages, stuff it into field(s) and save. Not a versioncontrol module re-parse.

marvil07’s picture

Issue summary: View changes

Added a new option: follow what linux and git project do.

YesCT’s picture

So this issue is about the format for commit credit... where are we discussing where the data comes from/how to tell what company gets credit?

YesCT’s picture

#1968480: Keep historical tracking of users' employers had good points on how we decide when to credit companies. The relevance here is that we can't know what company a user worked for when they did work on an issue. We only know that "now" (on commit) they work for their "current" company.

This complicates seeding old data, and going back to old issues and filling in git notes.

It also means that a user who has spent their own time on issues (because their employer does not give them time to work on issues) has only one choice: to mark their current company as a past company.

[edit: added the following]

Ah, there are good thoughts on the complication of this on the meta #2288727: [meta] Provide credit to organizations / customers who contribute to Drupal issues which has one option which suggests attribution with a UI on the issue at the time of each comment.

drumm’s picture

Yep, #2340363: Add issue comment attribution is likely the next issue to be implemented.

Dries’s picture

I gave it more thought over the weekend (and for months before that), and I really think we should go with option #1 over option #2.

$ git commit -m "Issue #42: fixed performance bug by Sam@Acquia, Megan, Tim*Pfizer."

Here is why:

  • It is the easiest to use for committers which is critical for this to succeed.
  • It is the easiest to read for people looking at commit messages. I personally don't subscribe to the idea that commit credits are metadata that should be hidden. I think it's really important information in our community. Almost all other Open Source projects store credit in the commit message too.
  • It is the easiest to implement on Drupal.org.
  • It is by far the most portable to other Git-hosting platforms and non-Git source code management systems.

All the other pros and cons are less important and if we are wrong, we can always change it later.

catch’s picture

It is the easiest to use for committers which is critical for this to succeed.

It's not easier than git notes, which if automated would entail no changes at all for committers.

It is the easiest to implement on Drupal.org.

Impossible to retrospectively apply to older commits since it would require rewriting git history. While we can't do that for organisations, we can do it for issue authors, testers, reviewers etc. automatically.

catch’s picture

Title: Determine format for commit credit for individuals/organizations/customers » [policy, no patch] Determine format for commit credit for individuals/organizations/customers

I'm going to move this to core for the following reasons:

1. It affects the core code base.

2. It's a best practices change across all projects hosted on Drupal.org, and generally those get discussed in the core queue (at least for coding standards which commit messages are closest to).

3. #2230579: [meta] Allow crediting reviewers (and other non-coders) as first-class contributors is closely related and is already in the core queue.

alexpott’s picture

One of the advantages of the git notes approach is that it separates the organisation - allowing both individual and organisations to stand on their own two feet. The ability to retrospectively credit people for reviewing an issue should not be discounted either. Getting good reviews on issues is one of our biggest problems and giving credit for it retrospectively could be massive.

webchick’s picture

I'm a bit confused as to why we need git notes in order to retroactively credit for commits/reviews, since a key part of rolling all of this out is storing the credit in a database table of some kind which we can then utilize in Views, badges, etc. for both individual and organization profiles. A script that automates populating git notes could just as easily automate populating this table(s).

The credit would only show up on d.o and not in Git, but since there are no UIs I could find to easily access git notes data, and the only format I've seen proposed for git notes is something specific to our project anyway, either way we're talking about exposing this credit on Drupal.org only. So we get the same "end user" benefit by auto-back-parsing credit out of issues, as well as the other benefits Dries mentions in #29 by sticking with commit messages.

catch’s picture

since a key part of rolling all of this out is storing the credit in a database table of some kind which we can then utilize in Views, badges, etc. for both individual and organization profiles.

If the database table is retroactively populated based on different information to the commit messages in git, then it's going to be out of sync with what's in the codebase.

That means http://ericduran.github.io/drupalcores/, Certified to Rock etc. will either have to use the information from the database table on Drupal.org, or they'd be out of sync permanently with official data on Drupal.org

In that case, why does it have to be in the commit message at all? Why not just populate the table first with as much information as possible, then decide what if anything to reflect in git notes or the commit message later?

Dries’s picture

It's not easier than git notes, which if automated would entail no changes at all for committers.

It is easier than Git Notes for me. The argument might be subjective.

We can't truly rely on Git Notes as it is not supported in other source code management systems.

We could design the system such that it picks up the commit credit from the commit message *or* from the git notes.

catch’s picture

It is easier than Git Notes for me. The argument might be subjective.

I don't think the ability to automate is a subjective argument. Either something is done entirely in the background so you don't have to interact with it, or it's not and you do.

We can't truly rely on Git Notes as it is not supported in other source code management systems.

Why is that a concern?

webchick’s picture

That's a fair point about the Git data vs. database data being out of sync. (Though OTOH, if this data was shown on Drupal.org itself, I think we wouldn't necessarily need ericduran's site anymore. :)) We'd need something added eventually to https://www.drupal.org/api with this data so that external parsing of credit data could still work if we don't populate it in Git at some point.

As far as why commit messages, I think all of #29. It's the most accessible way to find credit, it requires the fewest changes to the existing workflows/infrastructure, and it's easily utilizable outside of Drupal.org as well.

I want to point out I have no particular horse in this race, and as a developer I would love to see Git notes used ultimately, as it's definitely the right tool for the job. But I also have a lot of experience (especially with Drupal.org) with good ideas being completely killed and remaining stagnant craters for years because we didn't come up with the technically perfect solution the first go-around. I'm really trying to avoid that here in the interest of making progress, because it's clear to me from the feedback about Dries's keynote in AMS that this credit system, even if implemented imperfectly, is key to scaling the project for the next N years.

catch’s picture

it requires the fewest changes to the existing workflows/infrastructure

By definition, using the database only and nothing new in git would require fewer changes.

it's easily utilizable outside of Drupal.org as well.

That's simply not true if the information is outdated/out of sync compared to what's being used on Drupal.org

catch’s picture

Project: Drupal.org customizations » Drupal core
Version: 7.x-3.x-dev » 8.0.x-dev
Component: Code » documentation

chx’s picture

Regarding git commit message length:

  1. git itself limits it to size_t. You are not going to be hitting that -- we are talking billions of characters at least.
  2. In practice many tools (github for example) treat the first line as a "subject" and the rest as body.

I conclude: well formatted long messages are not a problem.

See for example http://stackoverflow.com/questions/2290016/git-commit-messages-50-72-for... .

chx’s picture

Oh and by the way, I absolutely love "Embed credit in commit message itself (another way)"

We could add Security-signed-off and Performance-signed-off and UX-signed-off and so forth later, which would make me very, very happy indeed.

catch’s picture

"Embed credit in commit message itself (another way)" helps a lot on the format. i.e. if two people working for the same company and also in their own time contribute to an issue, the company gets one mention rather than two in the attribution.
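The "one mention rather than two" behaviour falls out naturally if the footer is built from de-duplicated lists. The Python sketch below shows this. `build_footer` is a hypothetical helper, and the "Signed-off-by" / "Supported-by" keys are taken from the "not detailed" variant in the issue summary.

```python
def build_footer(people: list[str], organizations: list[str]) -> str:
    """Build a commit footer, listing each person and organization once.

    dict.fromkeys() removes duplicates while preserving first-seen order.
    """
    lines = [f"Signed-off-by: {person}" for person in dict.fromkeys(people)]
    lines += [f"Supported-by: {org}" for org in dict.fromkeys(organizations)]
    return "\n".join(lines)


if __name__ == "__main__":
    # Two contributors from the same agency: the agency is credited once.
    print(build_footer(
        ["Coder1 Name <coder1@git.mail>", "Coder2 Name <coder2@git.mail>"],
        [
            "Agency1 Name <agency@agency-domain.com>",
            "Agency1 Name <agency@agency-domain.com>",
        ],
    ))
```

Because organizations are listed separately from people, a contributor who worked partly on company time and partly on their own still appears once, with the company credited on its own line.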

Anything that happens with commit messages, we also need to remember that if we ever move to a merge flow, unless every merge is from a rebased branch with a single commit, the individual commit messages against the branch will also need to follow this format.
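To illustrate the one-mention-per-company point above, here is a minimal Python sketch of generating such a message. It is not Drupal.org's implementation: the function name `build_commit_message`, the tuple layout, and the choice to deduplicate on the numeric ID are assumptions made for this example. Only the `Contributed-by:` / `Supported-by:` line format comes from the issue summary.

```python
def build_commit_message(subject, contributors, organizations):
    """Build a multiline commit message with credit lines in the footer.

    contributors:  iterable of (display name, username, Drupal.org user ID)
    organizations: iterable of (node title, alias, organization node ID)
    Duplicates are dropped, keeping first-seen order (an assumption).
    """
    lines = [subject, ""]

    seen_users = set()
    for name, username, uid in contributors:
        if uid in seen_users:
            continue
        seen_users.add(uid)
        lines.append(f"Contributed-by: {name} <{username}@{uid}.no-reply.drupal.org>")

    lines.append("")

    seen_orgs = set()
    for title, alias, nid in organizations:
        if nid in seen_orgs:
            continue
        seen_orgs.add(nid)
        lines.append(f"Supported-by: {title} <{alias}@{nid}.org.no-reply.drupal.org>")

    return "\n".join(lines)


message = build_commit_message(
    "Issue #2183983: Find hidden configuration schema issues",
    contributors=[
        ("Wim Leers", "wimleers", 99777),
        ("Alex Pott", "alexpott", 157725),
    ],
    # Acquia is listed twice, e.g. because two comments attributed it.
    organizations=[
        ("Acquia", "acquia", 1204416),
        ("Chapter Three", "chapter-three", 1121246),
        ("Acquia", "acquia", 1204416),
    ],
)
print(message)
```

However many contributors attribute the same organization, it ends up with a single `Supported-by:` line.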

catch’s picture

Also, any issue with more than one commit is going to run into problems, not just merges, if there's a 1-1 correlation between commit message -> database field for the issue. alexpott decided last week he'd had enough of follow-up-patches-on-the-same-issue because they mess up the metadata (task vs. bug etc.), but that was only last week.

Making the field multiple value might work though?

marvil07’s picture

I definitely agree with comment #18 that there are several related issues about this topic, and it is hard to discuss this since all of them are interconnected/interdependent.

First, some quick replies:

  • Re #12: the vcsapi git backend does not currently store or recognize any git-notes information. It does recognize both author and committer data, and maps them to users by email address.
  • Re #17: I do not see how GitHub support is relevant here. We currently show commit messages in two ways: inside Drupal via vcsapi (which does not currently support notes) and in our git web viewer, cgit (which does support displaying notes, at least for the default refs/notes/commits ref. Based on local testing it does not seem to support custom refs, but I'm not sure yet).

Now, let me step back a little; hopefully this comment can add some more context to this issue.

On multiple authorship

Git allows only two internal representations of people associated with a commit: author and committer.

In the Drupal community, changes are often worked on by several people, so to give attribution to all of them we have used a Drupal convention for writing commit messages since CVS times.
Currently in Drupal core, author and committer are used almost synonymously, so that we can "equally give credit" to several people, and only people with push access to the project should appear in the git author and committer data. None of the proposals seems to change this.

I have been searching the git mailing list archives for information about assigning multiple authorship to one commit, but I could find only one relevant thread from 2012, "How best to handle multiple-authorship commits in GIT?".
There, the main git maintainer proposes that the author field be treated only as a primary contact for a particular commit in case of questions.

Based on Junio's comment mentioned above, support for multiple authors in the git metadata itself will probably not be added to git in the near future, so we should definitely go with an alternative.

On granularity of contribution

As mentioned in comments #2, #3, and #4, we would need to decide whether we want to add additional information about what each contributor helped with.
This would naturally need some manual analysis, since a tool cannot really automate this (at least for now).

This applies to both the 2nd and 3rd alternatives, which can store the information either in different git-notes ref names or with different prefixes on each line.
The first alternative simply does not have that granularity.

If we decide to have this extra metadata, we will need to define a set of git notes ref names or prefix keywords to use.

On extracting/showing attribution information

There are two components here:
A) Storing the attribution data
B) Mapping attributions to users in d.o

For step A:

  • The in-commit-message alternatives need nothing, since the data is in git and both vcsapi and cgit show commit messages.
  • The git notes alternatives need vcsapi to be taught to read git-notes (as mentioned in comment 20 item "2)"), cgit to be taught about non-default notes refs (if we decide to use a non-default ref), or both.

For B, all of the mentioned alternatives require custom steps to actually map the data, whether in the commit or in notes, to d.o users.

Re #6-#7 (vcsapi work needed for git-notes): I proposed a way to do it in #1282040-10: Add support for reading git-notes, and it should be doable. Also see comment #12 there.

webchick’s picture

Thanks for the detailed reply, marvil07!

Quickly:

- If you want Even Moar Issues... ;) #2230579: [meta] Allow crediting reviewers (and other non-coders) as first-class contributors attempts to cover the granularity of credit you're talking about. I chose to split it off from the org credit discussion because the two aren't strictly related and it could be rolled out any time independently of what happens with the other stuff. It's also a bit of a thorny issue with many facets (for example, how do we delineate different types of contribution meaningfully, without relegating some of them to "second class" contributions?).

So for the purposes of this issue (and also to eliminate things to argue about :D) we should assume the "Not detailed" version.

- GitHub is relevant for two reasons: a) it's where the vast majority of open source development happens these days, and one of Dries's desires was for whatever format to be transferable across FLOSS projects and b) It's also where about 30% of Drupal development is happening these days, according to some analysis I did the other week with http://www.githubarchive.org/. So trying to invent a solution without factoring in outside tools like GitHub (and CGit if you want to pick a "closer to home" example) is only catering to part of the picture.

Sounds like there might be some coalescing happening around option #3, which keeps the credit in commit messages vs. git notes (so eliminates the downsides of git notes), doesn't result in funky characters that need to be parsed to get something meaningful, and is used by at least two other major projects.

joshuami’s picture

The Drupal.org engineering team met to talk through implementation a bit last week. We are in agreement that option 3 gives us the most direct way forward. (Option 3 is using commit messages, but separating contribution credits onto separate lines.)

Here are a couple of additional implementation details to work into the issue summary:

All credit will be associated with a commit rather than the issue. We will need to confirm that the commit entity (version control operation) that we store on Drupal.org can have fields, or other storage can be implemented cleanly. We can parse the users and organizations from the commit message, and associate them with the commit entity as user and node entity references.

We can use historical Git commit mentions to populate historical commit entities to give credit to individual contributors for past contributions. This should keep history in sync going forward. Because of the way that the supported-by element works, we can only track organization commit mentions going forward from the release of this feature.

Generating a commit credit from an issue will have a UI that allows the maintainer to select from contributors users and the organizations they have given credit. This will be important as the Git attribution format is specific. For an organization to have their contribution tied to their organization profile, they will need the maintainer to use the Git attribution email ID for that organization. For example...

Supported-by: Drupal Association drupal-association@1646370.org.no-reply.drupal.org

This is a representation of [Organization Node Title] [Organization Alias]@[Organization Node ID].org.no-reply.drupal.org.
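As a rough illustration of why the node ID in the address makes parsing reliable, the following Python sketch pulls the title, alias, and node ID out of a `Supported-by:` line. The regular expression and the helper name `parse_supported_by` are assumptions made for this example, not an agreed format specification. Angle brackets are treated as optional because the example line above is shown without them.

```python
import re

# Assumed pattern: "Supported-by: <title> <alias>@<node ID>.org.no-reply.drupal.org",
# with optional angle brackets around the address.
SUPPORTED_BY = re.compile(
    r"^Supported-by:\s+(?P<title>.+?)\s+<?"
    r"(?P<alias>[^@\s<>]+)@(?P<nid>\d+)\.org\.no-reply\.drupal\.org>?$"
)


def parse_supported_by(line):
    """Return (title, alias, node ID) for a Supported-by line, or None."""
    match = SUPPORTED_BY.match(line.strip())
    if match is None:
        return None
    return match.group("title"), match.group("alias"), int(match.group("nid"))


print(parse_supported_by(
    "Supported-by: Drupal Association "
    "drupal-association@1646370.org.no-reply.drupal.org"
))
```

Because the lookup key is the numeric node ID, a renamed organization or a slightly different title in the commit message would not break the match.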

A commit message will be generated automatically on the issue, so maintainers will not have to manually add IDs. For those that want to credit an organization in a workflow outside of Drupal.org, they will need to create an issue to easily tie together the commit credit to the associated commit. This is common practice—if not best practice—for most projects on Drupal.org. The upcoming issue workspace initiative will be an opportunity for us to look at workflows that pull in commits from external repositories such as GitHub.

The organization node ID is critical for successful parsing. Parsing based on an exact match with Organization Node Title is possible, but more prone to errors.

We will be able to use fields on the organization node to determine if the organization is a Drupal agency or a Drupal customer/end-user.

If there are no major disagreements remaining, I’d like to get this on my team’s schedule so that we can begin collecting these credits for future display. Can we put a time box around remaining comments to be added between now and the end of the year with a final decision by that time? That will allow us to begin implementation the first week of 2015.

webchick’s picture

On the surface that sounds pretty good to me! I should be able to discuss with Dries in the morning.

His concern is likely to be around "What happens for projects that start from GitHub / private SVN/Git repos for client sites / etc." (which I think is actually by far the norm, versus starting a project on Drupal.org from commit #1, especially with the daunting project application process).

One way to address it is to use Drupal.org as the issue tracker, as you pointed out. But if development is primarily happening on GitHub (which we know happens for a non-trivial amount of Drupal code thanks to how ubiquitous GitHub is), that won't actually help... you'd need to get your co-workers/clients/whoever else is collaborating on the code to duplicate their work entry over here, unless I'm missing something.

In lieu of that, we could probably also expose the org and individual credit strings on individual/organization profiles somehow, so at least you could copy/paste it easily into your commit messages in outside systems. (Which is kind of funny, because I was advocating for removing those a few months back due to the clutter. :P)

webchick’s picture

Hm. Ok, so in trying to update the issue summary, I read this more in-depth, and found some more concerns. :D

  1. We're missing one other key component from Dries's original proposal which is the ability to segment out customers from employers (this is an attempt to encourage Drupal end users to contribute back as part of their site builds). The initial proposal used @ vs. * for the distinction. The new one is just "Signed-off" with no distinction in type. Is the thinking that we'd differentiate these two on an organization profile level, or how does this match up with Employer vs. Customer distinction at #2288727: [meta] Provide credit to organizations / customers who contribute to Drupal issues?
  2. Can you go into a bit more detail on why "All credit will be associated with a commit rather than the issue?" We should have all the data associated with the issue already if #2288727: [meta] Provide credit to organizations / customers who contribute to Drupal issues happens (which AFAIK it needs to in order to populate the commit message template?). Is the idea to only credit people/companies who actually get an issue through to completion and something committed to a project? That seems a bit suboptimal, since it laser-focuses contributions around code (while there are many other types of issue contribution that don't necessarily result in a commit, and we've voiced a desire to be more inclusive of diverse contribution types). And even at that, a patch contribution is still a valid contribution, even if the maintainer of a module is AWOL and never commits it. [EDIT: Maybe this concern is resolved by using a "Credit" entity w/ reference fields vs. a Commit or Issue node, because this way credit could span both, as well as other types of contributions.]
  3. As far as deadlines go, it sounds great on paper, but since the communication protocol I recommended at #2295411-83: Auto-generate Git attribution info / commit messages on Drupal.org two months ago to raise awareness about this initiative was never done (or if it was done, I missed it), I foresee this not going exactly well. You're effectively choosing the two weeks of the year when D.o has the lowest traffic (because most of our community is off spending time with their families and on vacation and various other stuff that involves not thinking about Drupal.org) as the only available timeframe to discuss this far-reaching change, only among the 32 people who happened to accidentally stumble across this issue. :\ I'd say bump it at least to mid-January as a result, and once again plead for some communication from the DA pointing people at this effort prior to cutting off all discussion.
webchick’s picture

Issue summary: View changes

Ooookay. I attempted to rewrite the issue summary based on Josh's comment and the latest replies here. I am not sure if what's there is 100% accurate though, since Josh only touched on the organization credit part, not the individual credit.

I added an example commit message based on what I think is the current consensus, for the issue at #2183983: Find hidden configuration schema issues. This is a fairly good example, since it has a number of contributors from a bunch of different companies, and I think at least one of them is also working for a Drupal 8 customer as well, so I threw that in too, for the full picture.

If the intent is not to monkey with the individual credit and keep that a CSV, and instead only add the "footer" of the commit message for organization credit, then that will need adjusting. But I think all the reasons you gave for wanting to include the organization node ID for easy tie-back apply to individuals as well. Does make the commit message string long as hell, though. :\

The one other thing I did compared with the old option 3 is I changed "Signed-off-by" to "Contributed-by" because a) signed-off has a specific, "Gittish" meaning already, and it mainly applies to subsystem/core maintainers and b) often times someone is credited on commit because they contributed to a patch, but they may have stopped for some negative reason (someone else took over and wildly changed their initial solution, they moved on from Drupal, etc.) so it seems like it's bordering on libel to say they "signed off" on a given patch. ;) We can always get more granular with the *-by if we want in #2230579: [meta] Allow crediting reviewers (and other non-coders) as first-class contributors.

webchick’s picture

Oh, one final thing in the "channeling Dries" department. He's likely to be very concerned about the amount of work it takes to compile a commit message like this by hand. (I have all the supercow permissions on Drupal.org and it still took me roughly 20-25 minutes to compose the one in the issue summary; not sure how others would get those IDs :\) Making these messages too difficult to type by hand means most people won't bother, which means we'll get inconsistent application of credit, which may be worse than having none at all. Projects already on Drupal.org whose maintainers painstakingly use issues for every commit will obviously be fine, but a quick glance at https://www.drupal.org/commitlog and https://github.com/search?utf8=%E2%9C%93&q=drupal (not to mention private repos that we can't see, where contrib projects generally start) shows that's clearly not generally the case.

webchick’s picture

Ok, spoke with Dries about this, as promised. The main concern he focused on (since all the other concerns were already covered) was one of the main original goals of the proposal: providing visibility and transparency around how work actually gets done in our community, and who's funding what and whom.

Once again using #2183983: Find hidden configuration schema issues as the example, while we get a nice list of companies (both employers and customers), what we can't answer with this format is:

- Who from a particular company (let's say MD systems) worked on the patch?
- Was that person working for MD systems on "contribute time" or was it part of a customer engagement?
- Was it MD systems acting in the capacity as a "customer" and paying someone from Capgemini to do the work?
- Who contributed on this issue as a volunteer?

The "munge special characters together" format would be able to answer these questions, because we'd end up with something like:

vijay*Mysterious Client # freelancer, directly sponsored through customer work
Berdir@MD systems*Mysterious Client # agency employee, sponsored through customer work
Wim@Acquia # agency employee, direct sponsorship of core work
webflo # volunteer
alexpott alexpott@Chapter Three # both volunteer and agency sponsored time spent on issue

Since we do not have consensus on the "munge special characters together" format, we should still try and capture this "contribution chain" some other way. Maybe (please note that I'm making this up to illustrate various scenarios, not based in reality):

git commit -m 'Issue #2183983: Find hidden configuration schema issues

Contributed-by: Gábor Hojsty <goba@4166.no-reply.drupal.org> (@Acquia, !volunteer)
Contributed-by: Vijayachandran Mani <vijaycs85@93488.no-reply.drupal.org> (@CapGemini, *Some Mysterious Drupal 8 Client)
Contributed-by: Wim Leers <wimleers@99777.no-reply.drupal.org> (@Acquia)
Contributed-by: Sascha Grossenbacher <berdir@214652.no-reply.drupal.org> (@MD Systems, *Drupal Association) # sponsored by Drupal Association D8 accelerate fund
Contributed-by: Florian Weber <webflo@254778.no-reply.drupal.org> (!volunteer)
Contributed-by: Alex Pott <alexpott@157725.no-reply.drupal.org> (!volunteer, @Chapter Three, *Some Other Mysterious Drupal 8 Client)

Supported-by: Acquia <acquia@1204416.org.no-reply.drupal.org>
Supported-by: Capgemini <capgemini@1772260.org.no-reply.drupal.org>
Supported-by: MD Systems <md-systems@1979456.org.no-reply.drupal.org>
Supported-by: UEBERBIT GmbH <ueberbit@1838438.org.no-reply.drupal.org>
Supported-by: Chapter Three <chapter-three@1121246.org.no-reply.drupal.org>

Supported-by: Some Mysterious Drupal 8 Client <mysterious-d8-client@XXXX.org.no-reply.drupal.org>
Supported-by: Some Other Mysterious Drupal 8 Client <mysterious-d8-client2@XXXX.org.no-reply.drupal.org>
'

(Note, that looks like a huge mess. I started without using special symbols but wasn't sure how to delineate "I work for the DA" from "I was sponsored by the DA." Hmmm.)
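For anyone weighing how parseable the made-up format above would be, here is a small Python sketch. It assumes exactly the conventions used in that example (`@` for employer, `*` for customer, `!volunteer`, and an optional trailing `# comment`). The regular expression and the name `parse_contribution_chain` are hypothetical.

```python
import re

CONTRIBUTED_BY = re.compile(
    r"^Contributed-by:\s+(?P<name>.+?)\s+"
    r"<(?P<username>[^@\s<>]+)@(?P<uid>\d+)\.no-reply\.drupal\.org>"
    r"(?:\s+\((?P<tags>[^)]*)\))?"  # optional "(@Employer, *Customer, !volunteer)"
    r"(?:\s+#.*)?$"                  # optional trailing "# free-text comment"
)


def parse_contribution_chain(line):
    """Parse one annotated Contributed-by line into a dict, or return None."""
    match = CONTRIBUTED_BY.match(line.strip())
    if match is None:
        return None
    credit = {
        "name": match.group("name"),
        "uid": int(match.group("uid")),
        "employers": [],
        "customers": [],
        "volunteer": False,
    }
    for tag in (match.group("tags") or "").split(","):
        tag = tag.strip()
        if tag.startswith("@"):
            credit["employers"].append(tag[1:])
        elif tag.startswith("*"):
            credit["customers"].append(tag[1:])
        elif tag == "!volunteer":
            credit["volunteer"] = True
    return credit


print(parse_contribution_chain(
    "Contributed-by: Sascha Grossenbacher <berdir@214652.no-reply.drupal.org> "
    "(@MD Systems, *Drupal Association) # sponsored by Drupal Association D8 accelerate fund"
))
```

Note that the sketch still has to match organizations by name inside the parentheses, which is the name-matching weakness the ID-based addresses were meant to avoid.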

Anyway. Josh, Dries, and I have a call set up to discuss this in real-time tomorrow and try and work through some of these issues but wanted to leave a note for everyone else on where things are currently at.

webchick’s picture

Status: Active » Postponed

During the call we decided to shift focus a bit to get this figured out for Drupal.org issues first, everything else second. More details will be posted at #2288727: [meta] Provide credit to organizations / customers who contribute to Drupal issues in the next couple of weeks, but for now we can put this on the back-burner.

joshuami’s picture

We started with three primary goals for crediting individuals and organizations:

  1. Give credit to the funders of work in the community. The hope is that exposing this credit will give companies incentive to give back more.
  2. Study how the community works, by studying the relationships between customers, agencies, and volunteer contributions and how and where the money is flowing.
  3. Clearly identify "teams" of people who work for the same employer/customer (and thus raise potential conflicts of interest on controversial issues)

To this we are adding the following requirements for issue credits:

  1. Credit is given on Drupal.org. We recognize the ecosystem is growing to include repositories on GitHub, and even private repositories, but the canonical repositories live on Drupal.org for projects that can call home to update services. Further, user and organization profiles for Drupal community members are on Drupal.org.
  2. Credits should be given at the issue level for organizations and individuals. This will allow us to give first-class credit for non-code issue contributions. (We are also exploring non-issue credits separately for contributions such as event sponsorship, documentation, answering support requests, etc.)

With these principles in mind, we are going to shift our focus to #2288727: [meta] Provide credit to organizations / customers who contribute to Drupal issues and work on the UI for a user to give credit to an organization and for maintainers to award the credits to contributors.

Once this UI is ready, we will begin storing credit at the issue level. This will allow us to collect data that can be used for making decisions about how credits are displayed on user profiles and organization profiles.

As we will not need to parse data from the commit message with this solution, we will come back to the commit credit format after we have a couple months of credit data.

David_Rothstein’s picture

Contributed-by: Sascha Grossenbacher <berdir@214652.no-reply.drupal.org> (@MD Systems, *Drupal Association) # sponsored by Drupal Association D8 accelerate fund
.....
Contributed-by: Alex Pott <alexpott@157725.no-reply.drupal.org> (!volunteer, @Chapter Three, *Some Other Mysterious Drupal 8 Client)

(Note, that looks like a huge mess. I started without using special symbols but wasn't sure how to delineate "I work for the DA" from "I was sponsored by the DA." Hmmm.)

Yeah, I agree. Way too many symbols :)

With what I just proposed at #2288727-67: [meta] Provide credit to organizations / customers who contribute to Drupal issues this would be a lot simpler, I think, and easy to delineate the differences. It could become something like this:

Contributed-by: Sascha Grossenbacher <berdir@214652.no-reply.drupal.org> (@MD Systems#employer, @Drupal Association#contractor)
.....
Contributed-by: Alex Pott <alexpott@157725.no-reply.drupal.org> (#volunteer, @Chapter Three#employer, @Some Other Mysterious Drupal 8 Client#client)

Or maybe "contractor" in the first would be "sponsor" instead - but the point is (a) it's definitely not "employer"; there'd be a defined list to choose from (and it would be easy to add or subtract from that list when necessary), and (b) you only need two symbols, not four :)
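A short Python sketch of how the two-symbol variant could be parsed and validated against a defined list. The role names in `ALLOWED_ROLES` are hypothetical (the comment above only says that such a list would exist), and `parse_affiliations` is a name invented for this example.

```python
# Hypothetical role list; the proposal only says a defined list would exist.
ALLOWED_ROLES = {"employer", "contractor", "sponsor", "client", "volunteer"}


def parse_affiliations(text):
    """Parse '@Org Name#role, #volunteer' into a list of (org or None, role)."""
    affiliations = []
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        # The role is whatever follows the last '#'.
        org_part, separator, role = item.rpartition("#")
        if not separator or role not in ALLOWED_ROLES:
            raise ValueError(f"unrecognized affiliation: {item!r}")
        if org_part.startswith("@"):
            affiliations.append((org_part[1:], role))
        elif org_part == "":
            affiliations.append((None, role))
        else:
            raise ValueError(f"unrecognized affiliation: {item!r}")
    return affiliations


print(parse_affiliations(
    "#volunteer, @Chapter Three#employer, "
    "@Some Other Mysterious Drupal 8 Client#client"
))
```

Adding or removing a role then only means editing the list, not inventing another symbol.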

David_Rothstein’s picture

Question: With any of the proposals above, how does a module maintainer give credit to an organization, assuming their work to maintain the module is being sponsored by an employer or client?

Would they have to put their own name into the commit message for every patch they review and commit?

Or would we just assume that this never goes into Git (and the organization only gets credit on the project page instead)?

drumm’s picture

For contrib maintainers, the project's supporting organizations do show on the organization pages today. I'd say that should stay higher on the page than issue credits, since supporting a project is a bigger deal. Each maintainer is free to credit individual issues however they wish. We can provide the UI and advice on using it, but each project is its own project in the end.

YesCT’s picture

This was postponed in #55 in December.
Given the progress on #2340363: Add issue comment attribution,
and that #2369159: Extend crediting UI to include organizations & customers will be / is starting,
do we want to
make this active again,
and,
- figure out the format here, so 2369159 can use it,
- or, let 2369159 work on the table and have a separate issue to implement whatever is decided here?

YesCT’s picture

Issue summary: View changes

Oh, #56 @joshuami

As we will not need to parse data from the commit message with this solution, we will come back to the commit credit format after we have a couple months of credit data.

So, then the intention was to NOT change the commit messages right away.
Updating the issue summary with my guess at what this is postponed on.

drumm’s picture

Status: Postponed » Active
Issue tags: +Needs issue summary update

I think we can go ahead and make this active. We need to refine the format to specifically map to the data we're starting to collect in the other issues. Some of these questions might have been answered in comments already, we should get them into the issue summary.

Currently, the issue summary has a list of users, then a list of companies, with no user → organization → customer mapping. Is that okay? Would ordering the lines "user, company, customer, repeat" be better?

Both organizations and customers are put in Supported-by: Is that okay? Is there a more-appropriate header?

Do these same formats apply just as well to contrib?

joshuami’s picture

What if we grouped them by contributor with a new line in between each grouping? That would keep the format similar and leave open the option for someone to visually scan the git log or parse it for the same data we are storing in the database.
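To make the grouping idea concrete, here is a hedged Python sketch that writes and reads a message with one blank-line-separated block per contributor. The exact layout (a `Contributed-by:` line followed by that person's `Supported-by:` lines) is an assumption about what the grouping could look like, and `format_grouped` / `parse_grouped` are hypothetical names.

```python
def format_grouped(subject, groups):
    """groups: list of (contributor value, [organization values])."""
    blocks = [subject]
    for contributor, organizations in groups:
        lines = [f"Contributed-by: {contributor}"]
        lines += [f"Supported-by: {org}" for org in organizations]
        blocks.append("\n".join(lines))
    # One blank line between the subject and each contributor block.
    return "\n\n".join(blocks)


def parse_grouped(message):
    """Inverse of format_grouped: return (subject, groups)."""
    subject, *blocks = message.strip().split("\n\n")
    groups = []
    for block in blocks:
        contributor, organizations = None, []
        for line in block.splitlines():
            key, _, value = line.partition(": ")
            if key == "Contributed-by":
                contributor = value
            elif key == "Supported-by":
                organizations.append(value)
        if contributor is not None:
            groups.append((contributor, organizations))
    return subject, groups


groups = [
    ("Sascha Grossenbacher <berdir@214652.no-reply.drupal.org>",
     ["MD Systems <md-systems@1979456.org.no-reply.drupal.org>"]),
    ("Florian Weber <webflo@254778.no-reply.drupal.org>", []),
]
message = format_grouped("Issue #2183983: Find hidden configuration schema issues", groups)
print(message)
```

With this layout the user → organization mapping survives in the git log, at the cost of repeating an organization once per contributor it supported.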

I believe issue credits will be a more accurate view of organizational contribution than commit logs, but I want to recognize the need to keep the history in git for flexibility in the future.

mlncn’s picture

All of this looks great. But for contrib modules, most of the time being able to give someone full credit for a patch they provide is enough, and there has been a regression on d.o in losing the copy-paste help to give direct attribution to a commit (e.g. --author="joe_user <joe_user@123456.no-reply.drupal.org>"), as described on this documentation page. It's possible to construct this author attribution manually, but a) it's quite inconvenient and b) people who had opted out of the no-reply e-mail attribution to use an e-mail address they use for git attribution in other places no longer have that preference honored. I don't see anything in this discussion that suggests a policy change from --author credit; indeed it's used repeatedly in examples. So would loss of the --author attribution line from profiles be a bug that should be fixed?

... and now i see, as i post this, that the "Credit & committing" section will provide the --author line. I merely scoured documentation, went on IRC where no one hanging out happened to know about it, and wrote this whole comment before seeing. Updating documentation now!

The ability to use a real e-mail in attribution does appear to have been lost, however. Is that an unintended regression?

drumm’s picture

This is all tangential to --author. That will still be available as radio buttons in Credit & committing, and this changes what the checkboxes will do. Each project maintainer can give out credit as they see fit.

webchick’s picture

Priority: Normal » Major

Elevating this one to major, since the ability to assign individual/org credit *before* projects end up on Drupal.org (private repo, GitHub, etc.) is something Dries has explicitly asked for, and I can't see any other way it's possible than by deriving credit info through commit messages.

joshuami’s picture

The related conversation in #2474609: Not possible to credit people who didn't comment in an issue reminded me that we might need a little clarification around assigning org credits outside of Drupal.org.

In comment 56 of this thread, we had decided that we would only record credits related to canonical issues or commits on Drupal.org. While users can track intent in commit messages on GitHub or private repos, we cannot parse those repos. We are relying on the maintainer/committer to the Drupal.org repo to include information in the commit for us to recognize that as credit we should track.

Ideally, relationships between issues and commits on D.o repos are best practice. That gives us a place to store the credits and tie back all of the data to the D.o organization and D.o user. Can we require that every commit is related to an issue so that we can capture the data in an issue credit?

Any other ideas for how we store parsed data from commit messages?

webchick’s picture

The proposed resolution suggests an "expanded" git commit format like this, based on what other projects do (and based on the unreliability of git notes across various external tools):

git commit -m 'Issue #2183983: Find hidden configuration schema issues

Contributed-by: Gábor Hojsty <goba@4166.no-reply.drupal.org>
Contributed-by: Vijayachandran Mani <vijaycs85@93488.no-reply.drupal.org>
Contributed-by: Wim Leers <wimleers@99777.no-reply.drupal.org>
Contributed-by: Sascha Grossenbacher <berdir@214652.no-reply.drupal.org>
Contributed-by: Florian Weber <webflo@254778.no-reply.drupal.org>
Contributed-by: Alex Pott <alexpott@157725.no-reply.drupal.org>

Supported-by: Acquia <acquia@1204416.org.no-reply.drupal.org>
Supported-by: Capgemini <capgemini@1772260.org.no-reply.drupal.org>
Supported-by: MD Systems <md-systems@1979456.org.no-reply.drupal.org>
Supported-by: UEBERBIT GmbH <ueberbit@1838438.org.no-reply.drupal.org>
Supported-by: Chapter Three <chapter-three@1121246.org.no-reply.drupal.org>

Supported-by: Some Mysterious Drupal 8 Client <mysterious-d8-client@XXXX.org.no-reply.drupal.org>
Supported-by: Some Other Mysterious Drupal 8 Client <mysterious-d8-client2@XXXX.org.no-reply.drupal.org>
'

There's nothing Drupal.org-specific about this at all, other than the representative commit message of "Issue #12345: Blah blah" but that could just as easily be "Fix some crap" or whatever. :) So people coding in GitHub or private SVN repo or elsewhere could easily (well, not easily, but with a Dreditor-esque plugin or something...) format their Git commit messages in the same way.

Then the thinking was we'd have some kind of post-commit hook (*hand-waving*) run when these commits are imported into D.o (after the client project is over and you're ready to make the code public) which would "back-parse" those credits and assign them to the proper people/organizations.
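The "back-parse" step is hand-waved above. As one possible shape for it, this Python sketch extracts user IDs and organization node IDs from a commit message in the proposed format. The function name `extract_credits` and both regular expressions are assumptions, and how the message would reach such a function (a hook, an import job) is deliberately left out.

```python
import re

# Assumed patterns; only addresses with a numeric ID are recognized.
USER_ADDRESS = re.compile(
    r"^Contributed-by:.*?<[^@\s<>]+@(\d+)\.no-reply\.drupal\.org>", re.MULTILINE
)
ORG_ADDRESS = re.compile(
    r"^Supported-by:.*?<[^@\s<>]+@(\d+)\.org\.no-reply\.drupal\.org>", re.MULTILINE
)


def extract_credits(message):
    """Return the Drupal.org user IDs and organization node IDs in a message."""
    return {
        "user_ids": [int(uid) for uid in USER_ADDRESS.findall(message)],
        "organization_ids": [int(nid) for nid in ORG_ADDRESS.findall(message)],
    }


message = """Fix some crap

Contributed-by: Wim Leers <wimleers@99777.no-reply.drupal.org>
Contributed-by: Alex Pott <alexpott@157725.no-reply.drupal.org>

Supported-by: Acquia <acquia@1204416.org.no-reply.drupal.org>
Supported-by: Some Mysterious Drupal 8 Client <mysterious-d8-client@XXXX.org.no-reply.drupal.org>
"""
print(extract_credits(message))
```

Lines whose address carries no numeric ID (like the placeholder client above) are silently skipped in this sketch. A real importer would probably need to report them instead.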

webchick’s picture

As for this:

Ideally, relationships between issues and commits on D.o repos are best practice. That gives us a place to store the credits and tie back all of the data to the D.o organization and D.o user. Can we require that every commit is related to an issue so that we can capture the data in an issue credit?

Sadly, not. It is highly unusual that a client allows development out in the open on Drupal.org from commit #1. And even if they do allow it, many shops will not do this until they've been paid for the work, which is pretty sensible. Further, GitHub is the go-to spot for more and more projects these days, such as https://github.com/drush-ops/drush and https://github.com/RESTful-Drupal/restful and https://github.com/Gizra/og.

As far as only assigning commit credit for projects that are at least mirrored on Drupal.org, that makes a lot of sense to me. Drupal.org should still be the central place all of this is aggregated. We just need to acknowledge that a lot of work happens outside of Drupal.org these days, especially in the beginning phases of projects.

Edit: to add point about GitHub, too

joshuami’s picture

I think the need for a mirrored repo on D.o is a minimum requirement. Every username and organization name is very specific to Drupal.org or we cannot map it to a canonical entity to add it all up and show on a profile. The same applies to GitHub. You don't see commit frequency for a user unless that repo is on GitHub and matches to the user on GitHub. I wonder how the tool that makes it easy for a GitHub commit to get the necessary username and organization data and pull that in would work? Would we need to work with GitHub on an integration between our user accounts to make that happen easily?

With commit message parsing, do we end up keeping a two tier system? Issue credits expanded the notion of "contributor" to be anyone that participated in the comment thread. It also allowed users to explicitly show their organizational attribution intent for each comment. When that intent is summarized into an issue credit, you see users and organizations getting one credit per issue they helped resolve. So multiple commits on an issue still equate to one issue credit awarded by the maintainer that pulls that work into the final repo.

Commit mentions are per commit. Will parsing commit mentions into a data structure lead to a greater number of total credits per user or organization based on frequency of commit versus issue? Do we only count commits to the project head? I would imagine the git commit history for an internal project that gets packaged up to publish on Drupal.org looks really different than the commit history for Drupal core.

I'm trying to wrap my head around that part. It is so important to how we can display the resulting data and weigh its meaning.

So many questions... sorry about that.

webchick’s picture

Every username and organization name is very specific to Drupal.org or we cannot map it to a canonical entity to add it all up and show on a profile.

Well, one of the nice things about the new proposed commit message format is the user ID/organization ID is embedded directly in the commit message via the "no-reply" email addresses, so we would not have to resort to any nasty username/company name matching. For example:

Contributed-by: Gábor Hojsty <goba@4166.no-reply.drupal.org>: "Contributed-by" is always an individual, and 4166 represents Gábor's Drupal.org user ID.
Supported-by: Acquia <acquia@1204416.org.no-reply.drupal.org>: "Supported-by" is always an organization, and 1204416 is Acquia's Drupal.org organization ID.
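To make the point about embedded IDs concrete, here is a minimal sketch of how a parser could pull user and organization IDs out of the proposed footer lines. The `Credit` type and `parse_credits` function are hypothetical names for illustration, not an existing Drupal.org API, and the regular expression assumes the address format exactly as shown in the examples above.

```python
import re
from typing import NamedTuple


class Credit(NamedTuple):
    kind: str       # "user" or "organization"
    name: str       # display name as written in the commit message
    alias: str      # local part of the no-reply address
    entity_id: int  # Drupal.org user ID or organization node ID


# alias@ID.no-reply.drupal.org      -> individual
# alias@ID.org.no-reply.drupal.org  -> organization
TRAILER = re.compile(
    r"^(Contributed-by|Supported-by):\s*(.+?)\s*"
    r"<([^@<>\s]+)@(\d+)(\.org)?\.no-reply\.drupal\.org>\s*$"
)


def parse_credits(message: str) -> list[Credit]:
    """Return the credits found in a commit message, in order of appearance."""
    credits = []
    for line in message.splitlines():
        match = TRAILER.match(line)
        if not match:
            continue
        key, name, alias, entity_id, org = match.groups()
        is_org = org is not None
        # "Contributed-by" must be an individual, "Supported-by" an organization.
        if (key == "Supported-by") != is_org:
            continue
        kind = "organization" if is_org else "user"
        credits.append(Credit(kind, name, alias, int(entity_id)))
    return credits
```

Lines whose ID is not numeric (such as the `XXXX` placeholders in the issue summary) are simply skipped, so no name matching is ever attempted.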

On the downside, because of this, they're basically impossible to type by hand without an incredible amount of patience and tenacity (unless all your commits are for one company, in which case they could develop a simple template to copy/paste). So some sort of "Dreditor-esque" browser plugin that calls out to d.o API to get user ID/organization IDs based on names and inserts them into a textbox automatically would be pretty important. This is why Dries pushed hard for the Name@Company*Customer format, because that's incredibly easy to type, but that has the nasty name-matching problem, and also push-back compared to the current proposal.

With commit message parsing, do we end up keeping a two tier system? [...] users and organizations [get] one credit per issue they helped resolve. [...] Commit mentions are per commit.

Yeah, that's an interesting twist. In some respects, I think per-commit credit is better. For example, often the work to do a D7 backport of a D8 patch is a whole entire 40+ comment effort of itself, more than worthy of an additional individual/org credit. Yet they often occur on the same issue in order to not fork the discussion about how to solve the problem.

However, it's a problem/feature (depending on your POV :D) of Drupal.org's use of Git that effectively we only do "squash" commits of what is often tons and tons and tons of work that, when using Git properly, would be numerous individual commits.

I think what you're saying is we would not want to give Acquia 5 commit credits for:

git commit -m "Look! a README! I'm rocking and rolling now!"
git commit -m "A module file. Behold!"
git commit -m "Moar code!"
git commit -m "Aw, crap. Parse error."
git commit -m "Aw, crap. Spelling error."

...since that's pretty much exactly what all of my projects start out with. ;)

OTOH, nearly all Drupal.org projects also have issues for "fix spelling error" and "fix parse error" and "add a README." So I'm wondering how much we need to be concerned about this.
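For what it's worth, the difference between the two counting models is easy to sketch. The snippet below is only an illustration under assumptions: `credit_totals` is a made-up helper, it looks for an `Issue #NNN` reference anywhere in the message, and it treats the organization name in a `Supported-by:` line as the key.

```python
import re
from collections import Counter

ISSUE = re.compile(r"Issue #(\d+)")
SUPPORTED = re.compile(r"^Supported-by:\s*(.+?)\s*<", re.MULTILINE)


def credit_totals(messages):
    """Count organization credits per commit and per distinct issue."""
    per_commit = Counter()
    per_issue = Counter()
    seen = set()  # (issue number, organization) pairs already credited
    for message in messages:
        issue_match = ISSUE.search(message)
        issue = issue_match.group(1) if issue_match else None
        for org in SUPPORTED.findall(message):
            per_commit[org] += 1
            # Per-issue credit: at most one per organization per issue.
            if issue is not None and (issue, org) not in seen:
                seen.add((issue, org))
                per_issue[org] += 1
    return per_commit, per_issue
```

With five small commits that all reference the same issue, an organization would get five credits under the per-commit model and one under the per-issue model. Commits with no issue reference earn nothing in the per-issue count here, which is one of the open questions above.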

moshe weitzman’s picture

A couple refinements from what Angie said.

  1. I would think that we would be able to affiliate the commit with a d.o. user via any recognized email address, not just the no-reply email addresses. So, Contributed-by: Moshe Weitzman would affiliate with me. It is onerous to require use of d.o. emails.
  2. I would like to move the Issue #2183983: to later in the commit message, as it gets in the way when you scan single-line commit summaries. I know that habits are hard to change, but we can do it. So:
git commit -m 'Find hidden configuration schema issue. Issue #2183983.
Contributed-by: Gábor Hojsty 
Contributed-by: Vijayachandran Mani 
Contributed-by: Wim Leers 
Contributed-by: Sascha Grossenbacher 
Contributed-by: Florian Weber 
Contributed-by: Alex Pott 


Supported-by: Acquia 
Supported-by: Capgemini 
Supported-by: MD Systems 
Supported-by: UEBERBIT GmbH 
Supported-by: Chapter Three 


Supported-by: Some Mysterious Drupal 8 Client 
Supported-by: Some Other Mysterious Drupal 8 Client 
'
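A rough sketch of the first refinement, assuming Drupal.org could supply a table of the email addresses each user has registered. The `known_emails` mapping and the `resolve_uid` helper are hypothetical, and the addresses in the usage note are placeholders.

```python
import re
from typing import Mapping, Optional

# Individual no-reply form: alias@UID.no-reply.drupal.org
NO_REPLY = re.compile(r"^[^@\s]+@(\d+)\.no-reply\.drupal\.org$")


def resolve_uid(email: str, known_emails: Mapping[str, int]) -> Optional[int]:
    """Map an email address to a d.o user ID, or None if unrecognized."""
    address = email.strip()
    match = NO_REPLY.match(address)
    if match:
        # The user ID is embedded directly in the no-reply address.
        return int(match.group(1))
    # Otherwise fall back to addresses the user has registered (assumed table,
    # keyed by lowercased address).
    return known_emails.get(address.lower())
```

For example, `resolve_uid("goba@4166.no-reply.drupal.org", {})` needs no lookup at all, while `resolve_uid("Someone@Example.com", {"someone@example.com": 12345})` relies entirely on the registered-address table.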

moshe weitzman’s picture

Google has open sourced Git Appraise - a distributed code review application that stores its data in Git Notes. I'd say that Notes are robust enough to handle our metadata needs.

catch’s picture

Also to add that git notes was proposed on one of the very first discussions around this back in 2011: https://groups.drupal.org/node/161659
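As an illustration of the Git Notes idea, credit metadata could be attached to a commit after the fact without touching the commit message. This is a sketch only: the `credits` notes ref, the JSON layout, and the `credit_note_command` helper are all assumptions, not something Drupal.org or Git Appraise defines.

```python
import json


def credit_note_command(commit, contributors, organizations, ref="credits"):
    """Build the `git notes` command that stores credit data on a commit."""
    payload = json.dumps(
        {"contributed_by": contributors, "supported_by": organizations},
        sort_keys=True,
    )
    # -f overwrites an existing note on the same commit under this ref.
    return ["git", "notes", "--ref", ref, "add", "-f", "-m", payload, commit]
```

Inside a repository, the result could be run with `subprocess.run(cmd, check=True)`. Notes live under their own ref (`refs/notes/credits` here), so they have to be pushed and fetched explicitly.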

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

klausi’s picture

Now that we handle issue credits on Drupal.org itself instead of in Git, can we close this issue?

Somewhat related to the git message format aspect discussed here: #2802947: [meta] Use the Git commit message format from AngularJS

hestenet’s picture

There are a few suggested improvements to the credit system from Dries' recent blog that suggest we may still want to consider this issue.

The credit system gives us quantifiable data about where our community's contributions come from, but that data is not perfect. Here are a few suggested improvements:

  1. We need to find ways to recognize non-code contributions as well as code contributions outside of Drupal.org (i.e. on GitHub). Lots of people and organizations spend hundreds of hours putting together local events, writing documentation, translating Drupal, mentoring new contributors, and more—and none of that gets captured by the credit system.
  2. We'd benefit by finding a way to account for the complexity and quality of contributions; one person might have worked several weeks for just one credit, while another person might have gotten a credit for 30 minutes of work. We could, for example, consider the issue credit data in conjunction with Git commit data regarding insertions, deletions, and files changed.

Source: http://buytaert.net/who-sponsors-drupal-development
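The second suggestion mentions insertions, deletions, and files changed. Those numbers are available from the summary line that `git log --shortstat` prints for each commit, and a small parser could look like the sketch below. `parse_shortstat` is a hypothetical helper, and how these numbers would be weighed against issue credits is left open.

```python
import re

# Matches summary lines such as:
#   " 3 files changed, 10 insertions(+), 2 deletions(-)"
#   " 1 file changed, 1 insertion(+)"
SHORTSTAT = re.compile(
    r"(\d+) files? changed"
    r"(?:, (\d+) insertions?\(\+\))?"
    r"(?:, (\d+) deletions?\(-\))?"
)


def parse_shortstat(line):
    """Return files/insertions/deletions from a shortstat line, or None."""
    match = SHORTSTAT.search(line)
    if not match:
        return None
    files, insertions, deletions = match.groups()
    return {
        "files": int(files),
        "insertions": int(insertions or 0),  # part is omitted when zero
        "deletions": int(deletions or 0),
    }
```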

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.