Closed (fixed)
Project:
Drupal.org infrastructure
Component:
Other
Priority:
Normal
Category:
Bug report
Assigned:
Unassigned
Reporter:
Created:
1 Nov 2009 at 18:09 UTC
Updated:
29 Nov 2009 at 18:13 UTC
Not sure if this issue goes here. I tried to search if it was already reported, but I found nothing.
I tend to monitor how project stats evolve, and it seems there was a problem in the process this Sunday: project stats for October 25th have not been populated correctly.
Comments
Comment #1
dww commented: The infra team has been talking about changing how the stats are computed. Someone might have changed something and broken the existing stat processing system. I haven't been involved, so I'm not sure of the status. Moving to a more appropriate queue, in the hope that the folks who've been touching this stuff recently will see it and reply...
Comment #2
gerhard killesreiter commented: There haven't been any changes I know of. We were only discussing changes, not implementing them.
Comment #3
Anonymous (not verified) commented: Looks like it affects all modules. I thought I made a mistake with mine or something lol
Comment #4
Anonymous (not verified) commented: Changing status to critical.
Comment #5
markus_petrux commented: It looks like some data was really collected, but summary data is missing.
As an example, if we look at CCK, the "Weekly project usage" table reports 0 for October 25th, and there is nothing in the "Recent release usage" table for that period:
http://drupal.org/project/usage/cck
However, if we look at data related to a particular release, the usage stats are there:
http://drupal.org/project/usage/484068
http://drupal.org/project/usage/539128
The same happens to all other projects I've seen.
So maybe there was an error last time the process that generates summary data was executed?
Comment #6
Andrew Schulman commented: subscribe
Comment #7
Anonymous (not verified) commented: Looks like it happened again...
http://drupal.org/project/usage/uc_ticket
http://drupal.org/project/usage/622226
It's counting the usage clearly, but not compiling stats for the whole project.
Comment #8
seutje commented: subscribe
Comment #9
aidanlis commented: subscribe
Comment #10
pvhee commented: subscribe
Comment #11
mohammed j. razem commented: subscribe
Comment #12
mrfelton commented: subscribe. Affects all of my modules.
Comment #13
mohammed j. razem commented: Actually, all Drupal modules are affected.
Comment #14
markus_petrux commented: There seems to be an additional problem: usage stats are not reported for new releases.
So, on the usage stats page for a project, the "Weekly project usage" table reports 0 for new weeks, no column is rendered in the "Recent release usage" table for those periods, AND no new row is reported for new releases either.
However, usage stats are reported if we look at the usage stats pages for individual releases, even if these do not show on the project stats page.
So the problem seems to be confined to the data that's used to build the usage stats page at project level.
Hope that helps.
Comment #15
michelle commented: I searched for "usage statistics" and didn't find this issue, so I'm adding that to the title for others who may search on the actual page title. Also updating the title since there is now another week of no usage statistics.
Sorry, no help fixing the problem, but maybe this will prevent dupes. :)
Michelle
Comment #16
falcon commented: subscribe
Comment #17
globetrotter commented: subscribe
Comment #18
dww commented: Spent a while looking into this today. The basic summary:
- Once again, our usage data has become so huge that even our gigantic 70-gigs-of-RAM DB server (or whatever it's got now) can't actually process the weekly summaries. The queries just explode and time out. :(
- The alternative system for processing all this data outside the DB is mostly working fine now (yay for Bdragon!). He's got summary data for all the missing weeks. We're going to import that data sometime in the next day or two.
- We're basically ready to turn off the old system for handling this, whereby each request for release history XML files is logged to the d.o DB. In the near future, all those requests are going to be served by our varnish cache nodes, and logged in the varnish access logs. Bdragon's script knows how to parse these logs to generate the summary usage data, instead of doing it all in the DB. We'll roll out the new system over the next few days, too. That should significantly reduce the load on the main d.o DB, and hopefully make the usage stats much less fragile in the future.
- The OSL folks have configured things to preserve the varnish access logs for up to 30 days.
- We've already turned off the calls in the script that serves the release history files to bootstrap drupal and record the usage in the DB. We're seeing those requests in the varnish logs, and Bdragon's script is successfully parsing them. Logging this data to the DB is now pointless, since we can't actually process it anymore. So, that should hopefully reduce the load on the DB server quite a bit already. Soon we'll be able to prevent these requests from even hitting Apache at all by letting varnish handle them completely. It just requires a bit of header re-writing to get it right. We *need* to ensure that update.module requests for files on updates.drupal.org at least hit varnish so we can log them; otherwise, we break the usage stats. But Bdragon and DamZ now understand this, so I have full confidence they'll get this right as they move forward.
We're just hashing out some of the details in #drupal-infrastructure IRC right now, but things are totally on track. I have full confidence in our new usage stat czars. ;)
Yay for progress!
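For readers following along, the log-based approach described in the bullets above (parse the varnish access logs, then build weekly usage summaries outside the DB) might look roughly like the sketch below. This is not Bdragon's script. The log line format, the `/release-history/<project>/<core>` path, the `site_key` and `version` query parameters, and the function names `week_start` and `summarize` are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta
from urllib.parse import urlsplit, parse_qs

# ASSUMPTION: NCSA-style access log lines; the real varnish log format may differ.
LOG_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "GET (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def week_start(ts):
    """Return the date of the Sunday that begins the week containing ts."""
    days_since_sunday = (ts.weekday() + 1) % 7  # weekday(): Monday=0 ... Sunday=6
    return (ts - timedelta(days=days_since_sunday)).date()


def summarize(lines):
    """Count distinct sites per week, both per release and per project.

    Returns (per_release, per_project), keyed by (week, project, version)
    and (week, project) respectively.
    """
    release_sites = defaultdict(set)
    project_sites = defaultdict(set)
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # not a log line we understand
        parts = urlsplit(m.group("url"))
        segs = parts.path.strip("/").split("/")
        # ASSUMPTION: hypothetical URL shape /release-history/<project>/<core>
        if len(segs) != 3 or segs[0] != "release-history":
            continue
        query = parse_qs(parts.query)
        site_key = query.get("site_key", [None])[0]
        if site_key is None:
            continue  # cannot de-duplicate sites without a key
        version = query.get("version", [""])[0]
        ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        week = week_start(ts)
        project = segs[1]
        release_sites[(week, project, version)].add(site_key)
        project_sites[(week, project)].add(site_key)
    per_release = {k: len(v) for k, v in release_sites.items()}
    per_project = {k: len(v) for k, v in project_sites.items()}
    return per_release, per_project
```

Counting distinct site keys in sets, rather than counting requests, means a site that checks for updates several times in one week is only counted once. Both summaries come from the same pass over the log, so the per-project numbers cannot go missing while the per-release numbers exist.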
Comment #19
michelle commented: Awesome! Thanks for the hard work. It's appreciated. :)
Michelle
Comment #20
markus_petrux commented: Ditto. Thanks a lot! :)
Comment #21
dww commented: The actual bug reported here is now fixed:
http://drupal.org/project/usage
If we need to coordinate anything regarding the roll-out of the new system I described in #18, we should just open new issues for those things, and not keep piling on here.
Cheers,
-Derek
Comment #22
pasqualle commented: thanks Derek and everyone involved
Comment #23
seutje commented: awesome, guess there's too much using going on ;)
Comment #24
Andrew Schulman commented: All working now. Please pass the thanks on to the usage stats team.
Comment #25
anrikun commented: Broken again :(
Comment #26
mrfelton commented: Not only are the Drupal statistics broken again, but I noticed yesterday that they were not actually properly fixed in the first place. Well, the D5 stats weren't, anyway. From Nov 8 onwards, all my modules show 0 users for 5.x-all-versions, which I know is not true.
Comment #27
pasqualle commented: @mrfelton: the summary for 5.x seems good; only the usage per release is incorrect.
Comment #28
markus_petrux commented: It seems like usage stats per release were not collected, and the summaries still try to render a line about them, but all columns are 0. Rows per release for the new week are not even rendered, so it seems there's no data about them.
Comment #29
dww commented: The new problem is being discussed at #643754-4: New usage statistics system sometimes fails to update per-release data. Closing this and locking comments. I'd rather not keep reopening old issues every time something goes wrong. Thanks.