Based on a discussion in IRC, it might be nice to store date info about when tags were added, so that we could interleave tags on the commit listing pages. E.g.:
http://drupal.org/project/cvs/3235
It'd be nice to see when tags were created for official releases, etc.
It's going to be a pain to parse the entire contrib repo and try to figure out tag timestamps. It'd have to use some crazy heuristics. I'm not even sure it's possible, but I'm not going to worry about that now. At least we could start collecting this data when the tags are created via xcvs-taginfo.php.
I think it'll be easier to just change cvs_show_messages_format() to intersperse tag messages along with the commits, instead of trying to mess with the query itself.
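To make the idea concrete, here is a minimal sketch of interleaving tags with commits by timestamp after the query has run. It is written in Python for brevity and is not the actual cvs_show_messages_format() code; the `Entry` type, the `interleave()` helper, and the sample commit messages and tag name are all hypothetical.

```python
from dataclasses import dataclass
from heapq import merge


@dataclass
class Entry:
    datestamp: int  # epoch seconds
    kind: str       # "commit" or "tag" (hypothetical labels)
    text: str


def interleave(commits, tags):
    """Merge two lists that are each sorted newest-first into one newest-first list."""
    return list(merge(commits, tags, key=lambda e: e.datestamp, reverse=True))


# Made-up sample data: two commits and one tag created between them.
commits = [Entry(300, "commit", "Fix typo"), Entry(100, "commit", "Initial import")]
tags = [Entry(200, "tag", "EXAMPLE-TAG")]
timeline = interleave(commits, tags)
```

Because both inputs are already ordered, the merge needs no change to the commit query; the tag rows only have to be fetched for the same time window.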
| Comment | File | Size | Author |
|---|---|---|---|
| #11 | 339054_2008-11-25_cvs_tag_datestamp_errors.txt | 16.07 KB | dww |
| #10 | 339054_xcvs-import-tag-dates.2.php_.txt | 3.2 KB | dww |
| #9 | 339054_xcvs-import-tag-dates.php_.txt | 2.85 KB | dww |
| #8 | contrib-tags-3.txt | 421.52 KB | damien tournoud |
| #5 | contrib-tags-2.txt | 421.26 KB | damien tournoud |
Comments
Comment #1
dww: Untested patch that at least adds a datestamp field to {cvs_tags} and starts populating that field as tags are added via xcvs-taginfo.php.
Comment #2
dww: This one's actually tested for cvs_install() and the DB update. ;)
Comment #3
damien tournoud commented: Here are the timestamps of the tags in the current contribution repositories (projects/*, themes/* and translations/*), as well as the extraction script.
Comment #4
dww: Slick. I didn't know about cvsps or that it had these sorts of heuristics already solved. Cool.
Now we need that data in more of a format like this:
nid: project node id of the corresponding project name.
tag_name: with none of cvsps's commentary about **FUNKY** or **INVALID**
datestamp: the raw epochtime.
nid might be a pain in the ass for you to extract since you don't have the {project_projects} table to work from. ;) So, at least this would be more useful:
project_uri:tag_name:datestamp
e.g.:
Comment #5
damien tournoud commented: Done.
Comment #6
dww: Ok, I just set up a dummy CVS repo on project.d.o and tested #2 there. Works like a charm. I think I'm going to start there and commit/deploy that on d.o + cvs.d.o, so we start collecting data on new tags. Meanwhile, I'll work on p.d.o to come up with an import script to parse the data from #5 and update the {cvs_tags} table for the historical tags.
Comment #7
dww: Ok, I committed #2 to cvs.module in HEAD and DRUPAL-5, deployed on d.o, ran the update, etc. All is good. New tags now have datestamps. I'm going to work on that script to parse the historical data and update the records... Now that d.o is getting datestamps on new tags, it'd be nice to get one last set of data from Damien so that we don't miss any tags that happened to be created between when he last ran his extractor script and when I deployed the change to collect datestamps on new tags...
Setting this back to active since we still need a) the import script (which I'm working on) and b) a patch to cvs.module to actually display these datestamps on the cvs listings, etc.
Comment #8
damien tournoud commented: Here is the latest version of the extract.
cvsps does some very strange things when dealing with "bizarre" situations like this one:
- media mover 5.x-1.0-beta5 was released on September 22, 2008
- but this commit the same day: http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/media_mover... is tagged with nearly all beta versions (except BETA5), even later ones (up to BETA14!)
But given that CVS does not store tagging operations per se, this is our best shot.
Comment #9
dww: @Damien: thanks for the updated data. This script is working great on my laptop. Now I gotta try it on project.d.o.
Comment #10
dww: Now with some error checking so that if the UPDATE query fails, we write the tag info line to an error log file so we can see rows that aren't valid for some reason...
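For readers without the attachment, the following is a rough sketch of that import loop, assuming the `project_uri:tag_name:datestamp` line format requested in #4. It is written in Python with an in-memory SQLite database rather than the actual PHP script, and the column names (`uri`, `nid`, `tag`, `datestamp`), the `import_tag_dates()` helper, and the sample rows are assumptions for illustration only.

```python
import sqlite3


def import_tag_dates(conn, lines, error_log):
    """Apply project_uri:tag_name:datestamp lines as UPDATEs; collect lines that fail."""
    updated = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            project_uri, tag_name, datestamp = line.split(":")
            datestamp = int(datestamp)
        except ValueError:
            error_log.append(line)  # malformed line
            continue
        row = conn.execute(
            "SELECT nid FROM project_projects WHERE uri = ?", (project_uri,)
        ).fetchone()
        if row is None:
            error_log.append(line)  # no project matches the first column
            continue
        cur = conn.execute(
            "UPDATE cvs_tags SET datestamp = ? WHERE nid = ? AND tag = ?",
            (datestamp, row[0], tag_name),
        )
        if cur.rowcount == 0:
            error_log.append(line)  # no existing row for this project/tag pair
        else:
            updated += 1
    return updated


# Made-up schema and data, just enough to exercise the three outcomes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project_projects (nid INTEGER, uri TEXT)")
conn.execute("CREATE TABLE cvs_tags (nid INTEGER, tag TEXT, datestamp INTEGER DEFAULT 0)")
conn.execute("INSERT INTO project_projects VALUES (1, 'example')")
conn.execute("INSERT INTO cvs_tags (nid, tag) VALUES (1, 'EXAMPLE-TAG')")
errors = []
count = import_tag_dates(
    conn,
    [
        "example:EXAMPLE-TAG:1200000000",
        "example:MISSING-TAG:1200000001",
        "unknown:EXAMPLE-TAG:1200000002",
    ],
    errors,
)
```

Keeping the rejected lines verbatim means the error log stays in the same format as the input, so it can be fed back into a later pass.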
Comment #11
dww: After some more testing and a mysqldump of the live {cvs_tags} table, I just ran this on d.o itself. Mostly it went fine:
Attached is the resulting error log, with 475 tags it couldn't insert (since there weren't already rows in {cvs_tags} for them, or the script couldn't find a project nid to match the first column). Some are obvious, like the "cck.pre-rename" directory (which I should just remove from the repo -- that was a backup from before cck was reorganized). The large # of OG tags it couldn't find is a little scary. I just looked and d.o doesn't know about many of OG's tags:
Not sure how that happened... there are clearly release nodes for more than this. Because of how the schema works for this table (a single row for each project/tag pair), there's a problem where if you remove a given tag from a specific file, the DB thinks you removed that tag from all files for that project. There's validation that's supposed to prevent you from deleting tags when there's a release node pointing at a tag, but maybe Moshe subverted that somehow. ;) Anyway, we should probably just take all the OG-related rows from this error log and insert them into {cvs_tags} directly.
Oh, I see there are similarly missing tags in signup module, too (which I maintain). And seeing that, I now know the problem: both of these modules recently had big "rename all the files in HEAD" operations done to them, and by using cvs_rename.pl, that involved removing tags, which made xcvs-taginfo.php remove them from {cvs_tags}, too. Whoops. I just modified the script so it read in those rows and did an INSERT instead of an UPDATE for each one. So og and signup are fixed.
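A sketch of what that INSERT pass might look like, again in Python with SQLite rather than the real PHP script. The `insert_missing_tags()` helper, the column names, the nids, and the sample tag names are hypothetical; only the idea of re-reading the error-log rows for specific projects and inserting them comes from the description above.

```python
import sqlite3


def insert_missing_tags(conn, error_lines, projects):
    """INSERT error-log rows for the given projects instead of trying to UPDATE them."""
    inserted = 0
    for line in error_lines:
        parts = line.strip().split(":")
        if len(parts) != 3 or parts[0] not in projects:
            continue  # malformed, or a project we are not repairing
        project_uri, tag_name, datestamp = parts
        if not datestamp.isdigit():
            continue
        row = conn.execute(
            "SELECT nid FROM project_projects WHERE uri = ?", (project_uri,)
        ).fetchone()
        if row is None:
            continue  # still no matching project node
        conn.execute(
            "INSERT INTO cvs_tags (nid, tag, datestamp) VALUES (?, ?, ?)",
            (row[0], tag_name, int(datestamp)),
        )
        inserted += 1
    return inserted


# Made-up schema and rows; real nids and tag names would differ.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE project_projects (nid INTEGER, uri TEXT)")
db.execute("CREATE TABLE cvs_tags (nid INTEGER, tag TEXT, datestamp INTEGER)")
db.executemany("INSERT INTO project_projects VALUES (?, ?)", [(10, "og"), (11, "signup")])
log_lines = [
    "og:EXAMPLE-TAG-A:1200000000",
    "signup:EXAMPLE-TAG-B:1200000100",
    "img_filter:EXAMPLE-TAG-C:1200000200",
]
restored = insert_missing_tags(db, log_lines, {"og", "signup"})
```

Restricting the pass to a named set of projects leaves the other rejected rows (removed or re-created projects) untouched.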
img_filter looks like it was removed at one point, and re-created. :( http://drupal.org/project/img_filter. Not sure I care.
Ditto these:
http://drupal.org/project/messenger
http://drupal.org/project/magicsquares
Anyway, I think we're in good shape now. We've got real data for nearly all projects. So, now all we need is to display it. ;)
Comment #12
dww: Hrm, although...
;) Looks like your script didn't like branches. Maybe cvsps can't figure out dates for when those were created? Again, I'm not positive it's worth spending much time on, but if there's a quick/easy explanation/fix for this, I'd be happy to hear about it.