Not sure if this issue goes here. I tried to search if it was already reported, but I found nothing.

I tend to monitor how project stats evolve, and something seems to have gone wrong in the process this Sunday: project stats for October 25th have not been populated correctly.

Comments

dww’s picture

Project: Project » Drupal.org infrastructure
Version: 6.x-1.x-dev »
Component: Usage statistics » Other

The infra team has been talking about changing how the stats are computed. Someone might have changed something and broken the existing stat processing system. I haven't been involved, so I'm not sure of the status. Moving to a more appropriate queue, in the hope that the folks who've been touching this stuff recently will see it and reply...

gerhard killesreiter’s picture

There haven't been any changes I know of. We were only discussing changes, not implementing them.

Anonymous’s picture

Looks like it affects all modules. I thought I made a mistake with mine or something lol

Anonymous’s picture

Priority: Normal » Critical

Changing priority to critical.

markus_petrux’s picture

It looks like some data was really collected, but summary data is missing.

As an example, if we look at CCK, the "Weekly project usage" table reports 0 for October 25th, and the "Recent release usage" table shows nothing for that period:

http://drupal.org/project/usage/cck

However, if we look at data related to a particular release, the usage stats are there:

http://drupal.org/project/usage/484068
http://drupal.org/project/usage/539128

The same happens to all other projects I've seen.

So maybe there was an error last time the process that generates summary data was executed?

Andrew Schulman’s picture

subscribe

Anonymous’s picture

Title: Project stats for October 25th not populated » Project stats for October 25th and Nov 1st not populated

Looks like it happened again...

http://drupal.org/project/usage/uc_ticket

http://drupal.org/project/usage/622226

It's counting the usage clearly, but not compiling stats for the whole project.

seutje’s picture

subscribe

aidanlis’s picture

subscribe

pvhee’s picture

subscribe

mohammed j. razem’s picture

subscribe

mrfelton’s picture

subscribe. Affects all of my modules.

mohammed j. razem’s picture

actually all Drupal modules are affected.

markus_petrux’s picture

There seems to be an additional problem: usage stats are not reported for new releases.

So, in the usage stats page for a project, the table "Weekly project usage" reports 0 for new weeks, no column is rendered in the "Recent release usage" table for those periods, AND no new row is reported for new releases either.

However, usage stats are reported if we look at the usage stats pages for individual releases, even if these do not show on the project stats page.

So the problem seems to be confined to the data that's used to build the usage stats page at the project level.

Hope that helps.

michelle’s picture

Title: Project stats for October 25th and Nov 1st not populated » Project stats (usage statistics) last reported for week of October 18

I searched for "usage statistics" and didn't find this issue, so I'm adding that to the title for others who may search on the actual page title. Also updating the title since there is now another week of no usage statistics.

Sorry, no help fixing the problem, but maybe this will prevent dupes. :)

Michelle

falcon’s picture

subscribe

globetrotter’s picture

subscribe

dww’s picture

Spent a while looking into this today. The basic summary:

- Once again, our usage data has become so huge that even our gigantic 70-gigs-of-RAM DB server (or whatever it's got now) can't actually process the weekly summaries. The queries just explode and time out. :(

- The alternative system for processing all this data outside the DB is mostly working fine now (yay for Bdragon!). He's got summary data for all the missing weeks. We're going to import that data sometime in the next day or two.

- We're basically ready to turn off the old system for handling this, whereby each request for release history XML files is logged to the d.o DB. In the near future, all those requests are going to be served by our varnish cache nodes, and logged in the varnish access logs. Bdragon's script knows how to parse these logs to generate the summary usage data, instead of doing it all in the DB. We'll roll out the new system over the next few days, too. That should significantly reduce the load on the main d.o DB, and hopefully make the usage stats much less fragile in the future.

- The OSL folks have configured things to preserve the varnish access logs for up to 30 days.

- We've already turned off the calls in the script that serves the release history files to bootstrap Drupal and record the usage in the DB. We're seeing those requests in the varnish logs, and Bdragon's script is successfully parsing them. Logging this data to the DB is now pointless, since we can't actually process it anymore. So, that should hopefully reduce the load on the DB server quite a bit already. Soon we'll be able to prevent these requests from even hitting Apache at all by letting varnish handle them completely. It just requires a bit of header rewriting to get it right. We *need* to ensure that update.module requests for files on updates.drupal.org at least hit varnish so we can log them; otherwise, we break the usage stats. But Bdragon and DamZ now understand this, so I have full confidence they'll get this right as they move forward.
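
For anyone curious what "generate the summary usage data from the access logs" could look like, here is a rough sketch. It is not Bdragon's script. It assumes NCSA/Apache-style log lines, a request path of the form /release-history/PROJECT/CORE, and a site_key query parameter identifying each site; the function names and the Sunday-based week are illustrative choices, not anything confirmed in this thread.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta
from urllib.parse import urlsplit, parse_qs

# ASSUMPTION: NCSA/Apache-style lines; the real log format may differ.
LOG_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "GET (?P<url>\S+) [^"]*" (?P<status>\d{3})'
)


def week_start(ts):
    """Return the Sunday (as a date) that begins the week containing ts."""
    day = ts.date()
    # weekday() is Monday=0 ... Sunday=6, so Sunday maps to an offset of 0.
    return day - timedelta(days=(day.weekday() + 1) % 7)


def summarize(lines):
    """Count distinct site keys per (week, project, core) from log lines."""
    sites = defaultdict(set)
    for line in lines:
        match = LOG_RE.match(line)
        if not match or match.group("status") != "200":
            continue
        url = urlsplit(match.group("url"))
        parts = url.path.strip("/").split("/")
        # ASSUMPTION: path layout is /release-history/PROJECT/CORE.
        if len(parts) != 3 or parts[0] != "release-history":
            continue
        # ASSUMPTION: each site identifies itself with a site_key parameter.
        site_key = parse_qs(url.query).get("site_key", [None])[0]
        if site_key is None:
            continue
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        sites[(week_start(ts), parts[1], parts[2])].add(site_key)
    return {key: len(keys) for key, keys in sites.items()}
```

Collecting site keys in a set means a site that checks for updates several times in one week is only counted once, and none of the counting touches the database.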

We're just hashing out some of the details in #drupal-infrastructure IRC right now, but things are totally on track. I have full confidence in our new usage stat czars. ;)

Yay for progress!

michelle’s picture

Awesome! Thanks for the hard work. It's appreciated. :)

Michelle

markus_petrux’s picture

Ditto. Thanks a lot! :)

dww’s picture

Status: Active » Fixed

The actual bug reported here is now fixed:

http://drupal.org/project/usage

If we need to coordinate anything regarding the roll-out of the new system I described in #18, we should just open new issues for those things, and not keep piling on here.

Cheers,
-Derek

pasqualle’s picture

thanks Derek and everyone involved

seutje’s picture

awesome, guess there's too much using going on ;)

Andrew Schulman’s picture

All working now. Please pass the thanks on to the usage stats team.

anrikun’s picture

Title: Project stats (usage statistics) last reported for week of October 18 » Usage statistics broken again (November 29)
Status: Fixed » Active

Broken again :(

mrfelton’s picture

Not only are the Drupal statistics broken again, but I noticed yesterday that they were not actually properly fixed in the first place. Well, the D5 stats weren't anyway. From Nov 8 onwards, all my modules show 0 users for 5.x-all-versions, which I know not to be true.

pasqualle’s picture

Priority: Critical » Normal

@mrfelton: the summary for 5.x seems good; only the usage per release is incorrect.

markus_petrux’s picture

It seems like usage stats per release were not collected, and the summaries still try to render a line about them, but all columns are 0. Rows per release for the new week are not even rendered, so it seems there's no data about them.

dww’s picture

Title: Usage statistics broken again (November 29) » Usage statistics last reported for week of October 18
Status: Active » Closed (fixed)

The new problem is being discussed at #643754-4: New usage statistics system sometimes fails to update per-release data. Closing this and locking comments. I'd rather not keep reopening old issues every time something goes wrong. Thanks.