Based on a discussion in IRC, it might be nice to store date info about when tags were added, so that we could interleave tags on the commit listing pages. E.g.:

http://drupal.org/project/cvs/3235

It'd be nice to see when tags were created for official releases, etc.

It's going to be a pain to parse the entire contrib repo and try to figure out tag timestamps. It'd have to use some crazy heuristics. I'm not even sure it's possible, but I'm not going to worry about that now. At least we could start collecting this data when the tags are created via xcvs-taginfo.php.

I think it'll be easier to just change cvs_show_messages_format() to intersperse tag messages along with the commits, instead of trying to mess with the query itself.
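A minimal sketch of that interleaving idea, in Python rather than the module's PHP, and not the actual cvs_show_messages_format() code. It assumes both lists are already sorted newest-first and that each entry is a hypothetical (timestamp, text) pair:

```python
import heapq
from operator import itemgetter


def interleave_by_time(commits, tags):
    """Merge two newest-first lists of (timestamp, text) into one newest-first list."""
    # heapq.merge with reverse=True expects each input sorted largest-first.
    return list(heapq.merge(commits, tags, key=itemgetter(0), reverse=True))


# Hypothetical sample data; only the tag datestamp comes from this issue.
commits = [
    (1226761600, "commit: fix signup form"),
    (1226705000, "commit: update README"),
]
tags = [(1226705960, "tag: DRUPAL-5--2-6")]

timeline = interleave_by_time(commits, tags)
for when, text in timeline:
    print(when, text)
```

The point is that the commit query stays untouched: tags are fetched separately and merged at display time.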

Comments

dww’s picture

Status: Active » Needs review
Attachment (new): 1.85 KB

Untested patch that at least adds a datestamp field to {cvs_tags} and starts populating that field as tags are added via xcvs-taginfo.php.

dww’s picture

This one's actually tested for cvs_install() and the DB update. ;)

damien tournoud’s picture

Attachment (new): 696 bytes
Attachment (new): 613.88 KB

Here are the timestamps of the tags in the current contribution repositories (projects/*, themes/* and translations/*), as well as the extraction script.

dww’s picture

Slick. I didn't know about cvsps or that it had these sorts of heuristics already solved. Cool.

Now we need that data in more of a format like this:

nid:tag_name:datestamp

nid: project node id of the corresponding project name.
tag_name: with none of cvsps's commentary about **FUNKY** or **INVALID**
datestamp: the raw epochtime.

nid might be a pain in the ass for you to extract since you don't have the {project_projects} table to work from. ;) So at least this would be more useful:
project_uri:tag_name:datestamp
e.g.:

signup:DRUPAL-5--2-6:1226705960
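
For illustration, one line of that format could be parsed roughly like this. This is a Python sketch, not the import script attached to this issue; the names TagStamp and parse_tag_line are hypothetical, and it assumes neither the project URI nor the tag name contains a colon:

```python
from typing import NamedTuple


class TagStamp(NamedTuple):
    project_uri: str
    tag_name: str
    datestamp: int  # raw epoch time, in seconds


def parse_tag_line(line):
    """Parse 'project_uri:tag_name:datestamp'; raises ValueError if malformed."""
    project_uri, tag_name, datestamp = line.strip().split(":")
    return TagStamp(project_uri, tag_name, int(datestamp))


example = parse_tag_line("signup:DRUPAL-5--2-6:1226705960\n")
print(example)
```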

damien tournoud’s picture

Attachment (new): 421.26 KB

Done.

dww’s picture

Ok, I just set up a dummy CVS repo on project.d.o and tested #2 there. Works like a charm. I think I'm going to start there, commit/deploy that on d.o + cvs.d.o, so we start collecting data on new tags. Meanwhile, I'll work on p.d.o to come up with an import script to parse the data from #5 and update the {cvs_tags} table for the historical tags.

dww’s picture

Status: Needs review » Active

Ok, I committed #2 to cvs.module in HEAD and DRUPAL-5, deployed on d.o, ran the update, etc. All is good. New tags now have datestamps. I'm going to work on that script to parse the historical data and update the records... Now that d.o is getting datestamps on new tags, it'd be nice to get one last set of data from Damien so that we don't miss any tags that happened to be created between when he last ran his extractor script and when I deployed the change to collect datestamps on new tags...

Setting this back to active since we need a) the script (I'm working on) and b) a patch to cvs.module to actually display these datestamps on the cvs listings, etc.

damien tournoud’s picture

Attachment (new): 421.52 KB

Here is the latest version of the extract.

cvsps does some very strange things when dealing with "bizarre" situations like this one:

- media mover 5.x-1.0-beta5 was released on September 22, 2008
- but this commit the same day: http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/media_mover... is tagged with nearly all beta versions (except BETA5), even later ones (up to BETA14!)

But given that CVS does not store tagging operations per se, this is our best shot.

dww’s picture

Status: Active » Needs review
Attachment (new): 2.85 KB

@Damien: thanks for the updated data. This script is working great on my laptop. Now I gotta try it on project.d.o.

dww’s picture

Attachment (new): 3.2 KB

Now with some error checking so that if the UPDATE query fails, we write the tag info line to an error log file so we can see rows that aren't valid for some reason...
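
The general shape of that error checking, as a hedged Python sketch against an in-memory SQLite table (the real script is PHP against MySQL and is attached to this issue; this is not it). It assumes the project name has already been mapped to a nid, and it treats an UPDATE that matches no row as a failure. The function name and the sample datestamps are hypothetical:

```python
import sqlite3


def import_datestamps(db, rows, error_log):
    """Apply (nid, tag, datestamp) rows; log any row whose UPDATE matched nothing."""
    imported = 0
    for nid, tag, datestamp in rows:
        cur = db.execute(
            "UPDATE cvs_tags SET timestamp = ? WHERE nid = ? AND tag = ?",
            (datestamp, nid, tag),
        )
        if cur.rowcount == 1:
            imported += 1
        else:
            # Keep the original line format so failed rows can be re-processed.
            error_log.append(f"{nid}:{tag}:{datestamp}")
    db.commit()
    return imported


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE cvs_tags (nid INTEGER, tag TEXT, branch INTEGER, "
    "timestamp INTEGER DEFAULT 0)"
)
db.execute("INSERT INTO cvs_tags (nid, tag, branch) VALUES (13446, 'DRUPAL-4-5', 1)")

errors = []
rows = [
    (13446, "DRUPAL-4-5", 1100000000),     # hypothetical datestamp
    (13446, "DRUPAL-5--1-0", 1170000000),  # hypothetical tag with no row yet
]
count = import_datestamps(db, rows, errors)
print(count, errors)
```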

dww’s picture

Status: Needs review » Active
Attachment (new): 16.07 KB

After some more testing and a mysqldump of the live {cvs_tags} table, I just ran this on d.o itself. Mostly it went fine:

2008-11-26 02:10:52: Starting to import tag info from /home/dww/tag_dates/contrib-tags-3.txt
2008-11-26 02:10:59: Imported datestamps for 11051 tags

Attached is the resulting error log, with 475 tags it couldn't insert (since there weren't already rows in {cvs_tags} for them, or the script couldn't find a project nid to match the first column). Some are obvious, like the "cck.pre-rename" directory (which I should just remove from the repo -- that was a backup from before cck was reorganized). The large # of OG tags it couldn't find is a little scary. I just looked and d.o doesn't know about many of OG's tags:

mysql> select * from cvs_tags where nid=13446;
+-------+-------------------+--------+------------+
| nid   | tag               | branch | timestamp  |
+-------+-------------------+--------+------------+
| 13446 | DRUPAL-4-5        |      1 |          0 | 
| 13446 | DRUPAL-4-6        |      1 |          0 | 
| 13446 | DRUPAL-5--3       |      1 |          0 | 
| 13446 | DRUPAL-5--8-0-RC1 |      0 | 1225824947 | 
| 13446 | DRUPAL-6--1-0-RC7 |      0 | 1225985596 | 
| 13446 | DRUPAL-6--1-0-RC8 |      0 | 1226310938 | 
| 13446 | DRUPAL-6--1-0-RC9 |      0 | 1226761537 | 
+-------+-------------------+--------+------------+
7 rows in set (0.00 sec)

Not sure how that happened... there are clearly release nodes for more than this. Because of how the schema works for this table (a single row for each project/tag pair), there's a problem where if you remove a given tag from a specific file, the DB thinks you removed that tag from all files for that project. There's validation that's supposed to prevent you from deleting tags when there's a release node pointing at a tag, but maybe Moshe subverted that somehow. ;) Anyway, we should probably just take all the OG-related rows from this error log and insert them into {cvs_tags} directly.

Oh, I see there are similarly missing tags in signup module, too (which I maintain). And seeing that, I now know the problem -- both of these modules recently had big "rename all the files in HEAD" operations done to them, and by using cvs_rename.pl, that involved removing tags, which made xcvs-taginfo.php remove them from {cvs_tags}, too. Whoops. I just modified the script so it read in those rows and did an INSERT instead of UPDATE for each one. So og and signup are fixed.
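
Roughly what that INSERT pass looks like, sketched in Python with SQLite rather than the modified PHP script itself. The function name is hypothetical, the restored tag and its datestamp are made up for the demo, and restored rows are assumed to be plain tags (branch = 0) because the error-log lines carry no branch flag:

```python
import sqlite3


def restore_missing_tags(db, rows):
    """Insert (nid, tag, datestamp) rows that have no (nid, tag) row yet."""
    inserted = 0
    for nid, tag, datestamp in rows:
        exists = db.execute(
            "SELECT 1 FROM cvs_tags WHERE nid = ? AND tag = ?", (nid, tag)
        ).fetchone()
        if exists is None:
            # Assumption: restored rows are tags, not branches.
            db.execute(
                "INSERT INTO cvs_tags (nid, tag, branch, timestamp) "
                "VALUES (?, ?, 0, ?)",
                (nid, tag, datestamp),
            )
            inserted += 1
    db.commit()
    return inserted


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE cvs_tags (nid INTEGER, tag TEXT, branch INTEGER, timestamp INTEGER)"
)
db.execute(
    "INSERT INTO cvs_tags VALUES (13446, 'DRUPAL-5--8-0-RC1', 0, 1225824947)"
)

rows = [
    (13446, "DRUPAL-5--8-0-RC1", 1225824947),  # already present, left alone
    (13446, "DRUPAL-5--1-0", 1170000000),      # hypothetical missing tag
]
restored = restore_missing_tags(db, rows)
print(restored)
```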

img_filter looks like it was removed at one point, and re-created. :( http://drupal.org/project/img_filter. Not sure I care.

Ditto these:
http://drupal.org/project/messenger
http://drupal.org/project/magicsquares

Anyway, I think we're in good shape now. We've got real data for nearly all projects. So, now all we need is to display it. ;)

dww’s picture

Hrm, although...

mysql> SELECT COUNT(*) FROM cvs_tags WHERE timestamp = 0;
+----------+
| count(*) |
+----------+
|     7047 | 
+----------+
1 row in set (0.01 sec)
mysql> SELECT COUNT(*) FROM cvs_tags WHERE timestamp = 0 AND branch = 1;
+----------+
| COUNT(*) |
+----------+
|     5928 | 
+----------+
1 row in set (0.01 sec)
mysql> SELECT COUNT(*) FROM cvs_tags WHERE branch = 1;
+----------+
| COUNT(*) |
+----------+
|     5951 | 
+----------+
1 row in set (0.01 sec)
mysql> SELECT COUNT(*) FROM cvs_tags WHERE timestamp != 0 AND branch = 1;
+----------+
| COUNT(*) |
+----------+
|       23 | 
+----------+
1 row in set (0.01 sec)

;) Looks like your script didn't like branches. Maybe cvsps can't figure out dates for when those were created? Again, I'm not positive it's worth spending much time on, but if there's a quick/easy explanation/fix for this, I'd be happy to hear about it.
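
For what it's worth, the four counts above can be collected with one grouped query. A small Python/SQLite sketch, assuming the same column names as {cvs_tags}; the function name is hypothetical and the sample rows are the seven OG rows listed earlier in this issue:

```python
import sqlite3


def datestamp_coverage(db):
    """Return {(branch, has_datestamp): row_count} for cvs_tags."""
    query = (
        "SELECT branch, timestamp != 0, COUNT(*) "
        "FROM cvs_tags GROUP BY branch, timestamp != 0"
    )
    return {(branch, bool(has)): n for branch, has, n in db.execute(query)}


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE cvs_tags (nid INTEGER, tag TEXT, branch INTEGER, timestamp INTEGER)"
)
db.executemany(
    "INSERT INTO cvs_tags VALUES (13446, ?, ?, ?)",
    [
        ("DRUPAL-4-5", 1, 0),
        ("DRUPAL-4-6", 1, 0),
        ("DRUPAL-5--3", 1, 0),
        ("DRUPAL-5--8-0-RC1", 0, 1225824947),
        ("DRUPAL-6--1-0-RC7", 0, 1225985596),
        ("DRUPAL-6--1-0-RC8", 0, 1226310938),
        ("DRUPAL-6--1-0-RC9", 0, 1226761537),
    ],
)

coverage = datestamp_coverage(db)
print(coverage)
```

On the live numbers above, the same breakdown would show 5928 of 5951 branches without a datestamp, which is what points at branches as the gap.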