I've made a small change to cacherouter.inc which enables the gathering of page view statistics when pages are served to anonymous users using fast page caching. This isn't normally possible, since the statistics module requires a database connection. Instead, I'm storing these views in memcached and then persisting the data to the database during cron execution.

This snippet gets inserted into cacherouter.inc around line 191, right before $page->data is printed and the function returns.


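  // increment page view counter for uri
  // (start from an empty array so a cache miss doesn't leave $counts undefined)
  $counts = array();
  if ($cached = cache_get('cached_page_view_counts')) {
    $counts = $cached->data;
  }
  $uri = request_uri();
  // first view of a uri starts at 1 instead of incrementing an undefined index
  $counts[$uri] = isset($counts[$uri]) ? $counts[$uri] + 1 : 1;
  cache_set('cached_page_view_counts', $counts);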

This is the hook_cron() implementation which persists this data in the database. This assumes we're using pathauto or some other path-aliasing mechanism.


function cacheupdater_cron() {
  // write out cached page view counts to database
  if ($cached = cache_get('cached_page_view_counts')) {
    // immediately clear the cache entry for page view counts
    cache_clear_all('cached_page_view_counts');
    $counts = $cached->data;
    $limit = variable_get('cached_page_updates_per_cron', 1000);
    $processed = 0;
    // iterate over each page view count and add it to existing statistics
    foreach ($counts as $uri => $views) {
      $pieces = explode('/', drupal_lookup_path('source', substr($uri, 1)));
      if (count($pieces) == 2 && $pieces[0] == 'node' && $nid = $pieces[1]) {
        $timestamp = time();
        if ($counter = db_fetch_object(db_query('SELECT daycount, totalcount FROM {node_counter} WHERE nid=%d', $nid))) {
          db_query('UPDATE {node_counter} SET daycount=%d, totalcount=%d, timestamp=%d WHERE nid=%d', $counter->daycount + $views, $counter->totalcount + $views, $timestamp, $nid);
        }
        else {
          db_query('INSERT INTO {node_counter} (nid, daycount, totalcount, timestamp) VALUES (%d, %d, %d, %d)', $nid, $views, $views, $timestamp);
        }
        $processed++;
      }
      unset($counts[$uri]);
      // reached the limit, stop processing
      if ($limit > 0 && $processed >= $limit) {
        break;
      }
    }
    // merge unprocessed views into any views which have been made during this process
    // (this could *potentially* result in some lost page views, but amount should be trivial)
    if (count($counts) > 0) {
      if ($new_cached = cache_get('cached_page_view_counts')) {
        foreach ($new_cached->data as $key => $value) {
          $counts[$key] = isset($counts[$key]) ? $counts[$key] + $value : $value;
        }
      }
      cache_set('cached_page_view_counts', $counts);
    }
  }
}

Steve, this could easily be encapsulated within cacherouter or a sub-module for all those using page fast-caching AND statistics. This got me thinking about the design of a new hook which could provide some dynamic data to pages without the need for a database connection. Let me know if anyone is interested in discussing.

Comments

moshe weitzman

This is a terrific feature. I actually think core should adopt this pattern when memcache/apc are available. We really need to avoid writes to the stats table on every request. Similarly, I would love to keep stats on cache hits/misses for the core cache.inc. Until then, we swap cache.inc.

yhager

Cron is not the only place you can lose counts. Several web servers might increment the same counter at the same time and overwrite each other's results. Depending on the load on the site and the number of web servers, you might lose more than a trivial amount.
As long as the count is not an atomic test-and-set operation, your counts might be awfully off.
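
One way to avoid that lost-update race, as a minimal sketch only: keep one counter key per URI and let the cache server do the increment, instead of reading, modifying and rewriting a shared array. The helper name cacheupdater_count_view(), the 'page_views:' key prefix and the ArrayCounterStore class are all made up for illustration. The class is just an in-memory stand-in so the snippet runs on its own; the assumption is that a real setup would pass in a connected PECL Memcache object, whose add() and increment() calls take the same leading arguments.

```php
<?php
// In-memory stand-in (hypothetical) exposing the two calls the sketch needs.
// Assumption: in production this would be a PECL Memcache instance, where
// add() and increment() are performed atomically by the memcached server.
class ArrayCounterStore {
  public $data = array();

  // Store $value only if $key is absent; return FALSE if it already exists.
  public function add($key, $value) {
    if (isset($this->data[$key])) {
      return FALSE;
    }
    $this->data[$key] = $value;
    return TRUE;
  }

  // Add $by to an existing counter; return FALSE if the key is missing.
  public function increment($key, $by = 1) {
    if (!isset($this->data[$key])) {
      return FALSE;
    }
    $this->data[$key] += $by;
    return $this->data[$key];
  }
}

// Hypothetical helper: one counter per URI rather than one shared array.
function cacheupdater_count_view($store, $uri) {
  $key = 'page_views:' . md5($uri);
  // add() succeeds only for the first view of this URI.
  if ($store->add($key, 1)) {
    return 1;
  }
  // Every later view is a single increment call on the existing key.
  return $store->increment($key, 1);
}

$store = new ArrayCounterStore();
cacheupdater_count_view($store, '/about');
cacheupdater_count_view($store, '/about');
$views = cacheupdater_count_view($store, '/about');
```

The trade-off is that cron can no longer read a single cache entry: it would need some separate record of which URIs have counters, which this sketch does not cover.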

McErr

The main idea was super. It even worked once I changed my code, but...

this line does not clear this f* cache variable (I just figured it out with var_export()):

cache_clear_all('cached_page_view_counts'); // the problem is here
error_log(var_export(cache_get('cached_page_view_counts'), true)."\n", 3, './files/myvarlogs.txt');

So every time cron runs, it adds the current count plus all the previous ones.
Boss is killing me. I need help.
Skype: grigoriy.babenko

jtrudeau

I don't have time to debug this, but for the time being you can try this instead:

cache_set('cached_page_view_counts', 0);

McErr

Super! Patched, works, makes me happy!
Many, many thanks.
---
No, it doesn't. This is Drupal 6.
This one does:
cache_clear_all('cached_page_view_counts', 'cache');

andypost

I suppose we need some hook for this feature.