With an approach finally settled in #898816: Consider using real names and/or e-mail addresses for the author/committer metadata?, we need a small, custom module that collects the necessary information from d.o users. Much of what that module will need to capture is described in #898816-15: Consider using real names and/or e-mail addresses for the author/committer metadata?, but I'll recap here.
The module must gather the following data from current d.o CVS account holders:
- If they'd like to use the pseudo-email [username]@[uid].no-reply.drupal.org (e.g., I would be sdboyer@146719.no-reply.drupal.org). This is the default.
- Or, if they'd like to use a real email address. If so, they should be able to select between their primary d.o email address or any email address registered with the Multiple Email Addresses module.
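To make the two choices concrete, here is a minimal sketch in Python (not the module's PHP). The function names, and the idea of validating the selection against a list of registered addresses, are assumptions for illustration; only the address format comes from the description above.

```python
def no_reply_address(username, uid):
    """Build the anonymized default: [username]@[uid].no-reply.drupal.org."""
    return f"{username}@{uid}.no-reply.drupal.org"


def choose_email(username, uid, wants_real, selected=None, registered=()):
    """Return the address to record for one account holder.

    Falls back to the anonymized default unless the user opted in to a real
    address AND the address they picked is one registered to their account
    (primary d.o address or one from Multiple Email Addresses).
    """
    if wants_real and selected in registered:
        return selected
    return no_reply_address(username, uid)


# The example from the issue summary.
default = no_reply_address("sdboyer", 146719)
```

Treating anything unexpected as "use the default" keeps the anonymized address as the safe fallback, which matches the requirement that it be the default.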
I don't really care if this module form_alters the account registration form we see at http://drupal.org/user/!uid/edit/cvs , or if it defines its own form on the user account - whatever's easiest for getting the data.
What makes this a bit more complicated is the data storage requirements. Ordinarily, it'd be fine to just store this on $user->data or something, but there are a few requirements that make it a little funky:
- We need to ensure that there's an entry for every CVS account holder, including those who haven't ever touched this form, and that the entry for them is the default, anonymized email. That probably means a hook_enable() implementation which fills a db table with all the initial values.
- We need to be able to make arbitrary additions to the table and be assured that they're going to stick around. For example, Ken Rickard changed his CVS username somewhere waaay back from 'agentken' to 'agentrickard', and it'd be nice if we could manually capture that. Of course, that'd be data we manually enter, no need to accommodate it in the UI.
- We need to be able to write this data to a flat file on disk, so that the migration scripts can use it easily. The way to do this that'll work best for our infra is creating a drush command that dumps all the data into the flat file. Just gotta make sure the drush command takes a path to the desired output file location as a parameter.
I'm figuring this module will probably need a table with no more than three fields - uid, name, email. That's all we really need. Though if uid is made a PK, it'll complicate scenarios like the agentken/agentrickard one above...well, figure it out :P
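One way the storage requirements could fit together, sketched in Python with an in-memory SQLite database standing in for the d.o database. Everything here is an assumption for illustration: the table and column names, keying on the CVS username instead of uid (so one uid can carry several names, as in the agentken/agentrickard case), and the uid 1001, which is made up. It is not the module's actual schema.

```python
import csv
import io
import sqlite3


def create_table(conn):
    # Keyed on cvs_user rather than uid, so one uid may own several CVS names.
    conn.execute(
        "CREATE TABLE cvs_migration ("
        " cvs_user TEXT PRIMARY KEY,"
        " uid INTEGER NOT NULL,"
        " email TEXT NOT NULL)"
    )


def seed_defaults(conn, accounts):
    """Insert the anonymized default for each (cvs_user, uid) pair.

    INSERT OR IGNORE leaves existing rows alone, so re-running the seed never
    overwrites a choice a user (or an admin, by hand) has already stored.
    """
    for cvs_user, uid in accounts:
        default = f"{cvs_user}@{uid}.no-reply.drupal.org"
        conn.execute(
            "INSERT OR IGNORE INTO cvs_migration (cvs_user, uid, email)"
            " VALUES (?, ?, ?)",
            (cvs_user, uid, default),
        )


def export_csv(conn, out):
    """Write every row to an open text stream as fully quoted CSV."""
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    rows = conn.execute(
        "SELECT uid, cvs_user, email FROM cvs_migration ORDER BY cvs_user"
    )
    writer.writerows(rows)


def export_to_path(conn, path):
    """Dump the table to a flat file; the path is supplied by the caller."""
    with open(path, "w", newline="") as fh:
        export_csv(conn, fh)


conn = sqlite3.connect(":memory:")
create_table(conn)
# uid 1001 is a placeholder, not a real d.o uid.
seed_defaults(conn, [("agentken", 1001), ("agentrickard", 1001), ("sdboyer", 146719)])
```

Calling `export_to_path(conn, "/some/output/file.csv")` mirrors the requirement that the dump command take the output location as a parameter; `export_csv` on its own is handy for checking the output in memory.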
Note that I'm assigning this to neclimdul because he's the one primarily working on the migration scripts right now, but we're hoping for a volunteer on this one. So if you wanna do it, feel free to reassign to yourself :)
Comment | File | Size | Author |
---|---|---|---|
#13 | drush-cvsmigration-export.csv_.png | 149.12 KB | Josh The Geek |
#12 | drush-cvsmigration-export.zip | 346 bytes | Jonathan Webb |
Comments
Comment #1
Jonathan Webb commented
Are the anonymized email addresses based on values from {versioncontrol_accounts}? For example, would the rough idea for the initial data population be:
Would this be a D6 module?
Comment #2
Josh The Geek commented
Drupal.org runs D6, so yes.
Comment #3
Jonathan Webb commented
Assuming I'm on the right track with the query above, I should have some code ready for testing by tomorrow afternoon.
Comment #4
marvil07 commented
{versioncontrol_accounts} is probably going to disappear (see #983926: Remove account class), though that data will probably also live in versioncontrol_account_status.
Anyway, the data should be written by an independent module, and naturally you cannot assume that it lives there (versioncontrol is not on d.o now).
The data about accounts is now in the {cvs_accounts} table (see the cvslog project). The PK in that table is the "cvs_user" field, so, @sdboyer, it seems like it is tracking the many accounts per user :-)
Comment #5
Jonathan Webb commented
Thank you for the info, Marco!
I've posted a preliminary version at: https://github.com/webbj74/cvsmigration
[removed verbose status info --JW]
So far it appears to be operating as desired. I should have the Multiple Email Address functionality in time for the stand-up tomorrow afternoon.
Comment #6
Jonathan Webb commented
This module is ready for review: https://github.com/webbj74/cvsmigration
What it does presently:
- cvs_user@uid.no-reply.drupal.org email addresses with all of the cvs_user entries that don't already have a repository mail associated with them.
- cvs_user@uid.no-reply.drupal.org email (if it isn't already listed)
- drush cvsmigration-export [filepath]
Tried to upload an example of the csv file output, but d.o is not allowing it.
Comment #7
Josh The Geek commented
Name the file blah.txt and it should allow it. Shouldn't the output include the uid?
Comment #8
neclimdul commented
@Josh The Geek technically it does, in the fake email address. I don't see how having it discretely would help me.
Comment #9
Josh The Geek commented
@neclimdul OK.
The code looks good, but I don't have a CVS repo to test it with.
Comment #10
sdboyer commented
This is really fantastic. I've tested it out, and it appears to all work quite swimmingly. There are just a couple, pretty small issues:
user_load() on d.o, that'll be in $user->full-name. Yeah, I dunno what's up with that CLEARLY invalid property name, but whatever...you can access it by transforming $user into an array or by assigning a var with the value and then using it to access the property (e.g., $var = "full-name"; $fullname = $user->$var - though note that, VERY frustratingly, I was only able to access it if I did $var = "full-name"; $var = 'full-name' consistently failed). So, I dun much care how it gets implemented - maybe the drush command loads the user account and stitches it in, or maybe we add another column to the {cvs_migration} table.
I'd SWEAR there's a third quick thing, but I can't remember it to save my life. I've made a pull request with changes that cover item #1: https://github.com/webbj74/cvsmigration/pull/1
Regardless, this is close enough that I intend to demo it tomorrow.
Comment #11
Jonathan Webb commented
Thanks for the feedback. After I pull in your changes, I'll update the .install to also perform the change based on Drupal user name (since that is where the emails are created en masse).
I may modify the set of invalid email username characters to be a little more conservative "just in case" (right now invalid chars are converted to underscores). I'm just a little concerned about the liberalness of the characters allowed in Drupal usernames (such as apostrophes) and whether they translate into actually valid email addresses. I currently use the same regular expression as Drupal's valid_email_address, but this expression isn't a perfect match to RFC 2822.
Also I think it makes the most sense to add $user['full-name'] in the Drush stage. So I will make that change as well (and I will include the uid to the output for good measure).
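A minimal sketch of the underscore substitution discussed in this comment, in Python rather than the module's PHP. The allowed set used here (letters, digits, underscore, hyphen, plus) is the reduced set reported later in this thread; the function name is hypothetical, and this makes no claim to match RFC 2822.

```python
import re

# Anything outside the conservative allowed set counts as invalid.
_INVALID_LOCAL_CHARS = re.compile(r"[^a-zA-Z0-9_\-+]")


def sanitize_local_part(username):
    """Replace each character outside the allowed set with an underscore."""
    return _INVALID_LOCAL_CHARS.sub("_", username)


def anonymized_address(username, uid):
    """Build the no-reply address from a sanitized username and a uid."""
    return f"{sanitize_local_part(username)}@{uid}.no-reply.drupal.org"
```

Replacing one character at a time (rather than collapsing runs) keeps the result the same length as the username, which makes the mapping easier to eyeball in the exported file.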
Comment #12
Jonathan Webb commented
Updated: https://github.com/webbj74/cvsmigration to include sdboyer's changes, and some minor mods.
The drush output has 4 fields: "d.o uid","d.o Full Name","CVS Username","Email" Example (q.v. attached zip):
In the event that the full name field is not populated, drush will fall back to $user->name. I use addslashes() to clean the "Full Name" field.
I also reduced the set of valid email address characters to [a-zA-Z0-9_\-\+]+. If $user->name has chars outside of this set, they are converted to underscores.
Comment #13
Josh The Geek commented
You should probably reduce the full name to a-zA-Z0-9_-+ too. The slashes really mess up Quick Look on the Mac. I'll fork your repo and fix this. My GitHub username is JoshTheGeek.
Comment #14
Josh The Geek commented
See my pull request at https://github.com/webbj74/cvsmigration/pull/2 . It also allows spaces.
Comment #15
Jonathan Webb commented
I merged this change to the drush output, but modified the regular expression to strip just control chars and quotes (rather than transform them to underscores). Thanks!
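For illustration, stripping (rather than replacing) control characters and quotes from the full name could look like the Python sketch below. The exact expression the module uses is not shown in this thread, so the pattern here (ASCII control characters plus single and double quotes) and the function name are assumptions.

```python
import re

# Assumed pattern: ASCII control characters plus single and double quotes.
_STRIP_FROM_NAME = re.compile(r"""[\x00-\x1f\x7f"']""")


def clean_full_name(name):
    """Drop control characters and quotes; everything else, spaces included, stays."""
    return _STRIP_FROM_NAME.sub("", name)
```

Removing quotes outright means the "Full Name" column no longer needs backslash escaping in the CSV output, which is what was tripping up the preview mentioned above.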
Comment #16
sdboyer commented
This all looks good to me. Let's get this module up, make it a real thing. Once we fix up multiple_email, we'll get it on d.o.
Comment #17
eliza411 commented
Tagging for Git Sprint 7 with the intention of having this poised for launch right at the beginning of Sprint 8.
Comment #18
Jonathan Webb commented
I will create a project on d.o and upload it to CVS.
Comment #19
Jonathan Webb commented
Project CVS Migration Prefs has been added. I imported the project history from GitHub using git-cvsexportcommit (not perfect, but I was able to maintain credits for the contribs by sdboyer & joshthegeek). Dev snapshot is pending. Thanks for the assistance, everyone!
Comment #20
marvil07 commented
Dev snapshot available :-) http://ftp.drupal.org/files/projects/cvsmigration-6.x-1.x-dev.tar.gz
Comment #21
Josh The Geek commented
Shouldn't we set this to fixed after it's launched on d.o?
Comment #22
eliza411 commented
The module is created and hosted on d.o, which satisfies the issue. When and how it is deployed will be accounted for in a different issue as part of the specific deployment planning process.