This patch is intended as a logical extension to http://drupal.org/node/230101. There's a huge performance boost when serving pages from the pagecache, but on a very busy website each time the pagecache is flushed you take a performance hit from many simultaneous requests all building a new cached version of the various popular pages.

This (currently untested) patch was sponsored by Pagesix.com. It relies on a new cache_lock table (not currently created by this patch), and no longer hard flushes the pagecache table. Instead, all pagecache objects are given timestamps, and a new global cache-flushed timestamp is introduced. Instead of flushing the cache, the global timestamp is updated to the current time. When serving cached pages, logic checks the timestamp of the cache object -- if older than the global timestamp we grab a lock unique for the uri and rebuild the page. If another request comes in for the same page while it is being built, the attempt to grab a lock fails and so we serve the older version of the page.

Here's some ASCII art to better describe the logic behind and reason for this patch:

  Timer  Action    
  0ms:
   |----> Page request #1, obtain lock, rebuild page
  1ms:
   |----> Page request #2, failed to obtain lock, serve page from cache
   |                       
   |----> Page request #n, failed to obtain lock, serve page from cache
   |                       
  250ms:
   |      Page request #1 finished building new page cache, page cache updated
   |----> Page request #n+1, cache is up-to-date, nothing to rebuild, serve page from cache
CommentFileSizeAuthor
#4 pagecache_lock3.patch3.28 KBJeremy
pagecache.patch3.71 KBJeremy
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

firebus’s picture

does this patch require the changes in http://drupal.org/node/230101 to work? if so, could you give me a very brief explanation of why? i'm not intimate enough with the bootstrp process to immediately get the relationship.

i think this change makes sense for advcache

if there is a dependency with http://drupal.org/node/230101, then perhaps it does make sense to include http://drupal.org/node/230101 as part of advcache, even though http://drupal.org/node/230101 alone provides no benefits to advcache users who don't also use memcache.

###

since this patch modifies both memcache and advcache

i think the least jarring way to commit a feature like this to both projects would be to have separate, related, feature requests in both advcache and memcache projects, with separate patches for each project.

if the main code is part of advcache, set a variable here indicating that this feature is enabled.

then on the memcache side, conditionally check for the variable and only invoke the new memcache page_cache handling if the variable is TRUE.

m3avrck’s picture

subscribing

Owen Barton’s picture

I am actually wondering if this might actually make sense as a Drupal core patch? It's a small amount of code, has no major changes to the user experience, and yet can save a pretty significant amount of server resources during busy periods.

Jeremy’s picture

FileSize
3.28 KB

The earlier patch was actually quite broken. I'm attaching a working version.

This new (simpler) version is for Drupal 5.11, and assumes you're using memcache with this patch already applied.

Updated patch sponsored by ConsumerSearch.

jojje’s picture

Very interested in this patch. Is it going to be available for Drupal 6?

Jeremy’s picture

Yes, the patch has been ported to 6.x. The port was sponsored by ParentsClick / Lifetime Digital.

slantview’s picture

Status: Needs review » Needs work

hey, this patch is not atomic. by the time you do your cache_get request, another client could have gotten the lock and now you overwrite it with your cache_set. See #251792: Implement a locking framework for long operations for reference on how to make a fully automic lock. I only say this cause this is how i was trying to write that patch. HTH

doublejosh’s picture

Was really hoping this would solve our problem on a non-PressFlow Drupal 6 with reasonably serious traffic.

However, this moved the problem of deadlocking into the semaphore table.

doublejosh’s picture

Version: 5.x-1.x-dev » 6.x-1.x-dev

Wanted to bring up that the majority of cache deadlocking for us (with a nice big cache lifetime) was a large variable cache.
If you're lucky you will be able to expunge a bunch of crap and drop that down. We went from 1.3mb to 0.2mb.

doublejosh’s picture

So in addition to preventing deadlocks with cache build locking, this patch also prevents bulk flushing of expired content and rather logs the cron attempt. Logs the 'Flush all caches'
But shouldn't this graceful latent rebuild honor the min lifetime? Shouldn't because cache was explicitly cleared, keeping page cache around just allows avoiding a stampede after a full flush.
Otherwise this patch causes page cache to ignore the lifetime and let the last cron timestamp rule expires during page_get_cache.

Our problem was cache stampedes regardless of 'flush all caches'; just normal 30 minute clearing was causing the stampedes.
This patch is useful for the semaphore lock, but if performance is hurting enough to deadlock from frequent cron expires the cache build locking isn't enough. Smarter expires has been our path to success.