Just eyeballing the code, it looks like this doesn't work out of the box with core's DB caching; instead it calls an expire_cache hook with a list of URLs.

So to make this work with core it should be as simple as implementing the hook in a module and calling cache_clear_all($url, 'cache_page') for each of the passed URLs?

I have been fiddling with Rules and Cache Actions for a while, and it's a tedious process to set up all the permutations. This module could be a winner.


Comments

mr.j’s picture

Yes indeed, this is all that is needed to make this work with core db caching.

/**
 * Implementation of hook_expire_cache from expire module
 */
function yourmodule_expire_cache($urls) {
  
  // Clear each affected URL from the page cache
  foreach ($urls as $url) {
    cache_clear_all($url, 'cache_page');
  }
}

Maybe a candidate for adding to the module code, or as a sub-module?

djbobbydrake’s picture

cache_clear_all() is already called on node saves in core, so the entire page and block caches are cleared on any node save.

mr.j’s picture

Not exactly. If you just rely on cache_clear_all() in node_save then all nodes might be flushed, but only if the cache lifetime has been exceeded.

This is a more targeted and immediate solution because if you pass a specific cache ID to cache_clear_all, as in this code, that single cached page is flushed immediately regardless of the cache lifetime setting.

That is what makes this module so beneficial: we can bump up the cache lifetime so that pages are cached for longer and rely on this module to pick out the exact pages we need to flush from the cache. The module could be improved in lots of ways, but it's still a better approach than a blanket cache_clear_all.

I have added a small hack that deletes all "?page=n" variations for a node, as requested here: #1299722: Purge all comment pages. It is hackish because it relies on the order of the URLs being passed into the hook. With the current code, they are always like "node/6" followed by "alias/to/node6". So as long as that order does not change, you can reliably use the aliased path along with wildcard matching to flush all pages belonging to a single node, assuming you are using path aliases of course. It is not that sophisticated though: it would also delete any other node whose path alias begins with the same string, e.g. "alias/to/node6/another/node", but that is the best you can do with core's wildcard matching, and it is something I am not too concerned with on my site.

function yourmodule_expire_cache($urls) {
  
  $wildcard = FALSE;
  
  foreach ($urls as $url) {
    
    // Clear URL from the page cache
    cache_clear_all($url, 'cache_page', $wildcard);
    
    // If this matches a $url like '/node/657', the next $url in the loop will
    // be an alias for the node. Delete the cached page alias using wildcard 
    // matching to also clear all pages of comments on the node.
    $wildcard = preg_match('@\/node\/\d+$@i', $url) === 1;
  }
}
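
To illustrate the over-matching caveat described above, here is a minimal, self-contained sketch. It does not call Drupal at all: the $cache_page array is a hypothetical stand-in for the cache_page table, example_prefix_clear() is an invented helper, and the example.com URLs are made up. It assumes core's wildcard clear behaves like a simple "starts with" match on the cache ID.

```php
<?php
// Hypothetical stand-in for the cache_page table: cache ID => cached data.
$cache_page = array(
  'http://example.com/alias/to/node6' => 'first page of comments',
  'http://example.com/alias/to/node6?page=1' => 'second page of comments',
  'http://example.com/alias/to/node6/another/node' => 'an unrelated node',
  'http://example.com/alias/to/node7' => 'a different node',
);

/**
 * Illustrative only: removes every entry whose cache ID starts with $prefix.
 */
function example_prefix_clear(array $cache, $prefix) {
  foreach (array_keys($cache) as $cid) {
    // Match on the leading characters only, like SQL "cid LIKE 'prefix%'".
    if (strpos($cid, $prefix) === 0) {
      unset($cache[$cid]);
    }
  }
  return $cache;
}

$remaining = example_prefix_clear($cache_page, 'http://example.com/alias/to/node6');
print_r(array_keys($remaining));
```

Only the node7 entry survives. The "?page=1" variant is removed as intended, but so is "alias/to/node6/another/node", which is the side effect mentioned above.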

doublejosh’s picture

"...bump up the cache lifetime so that pages are cached for longer and rely on this module to pick out the exact pages we need to flush from the cache."

This is exactly the case I was planning to use this module for. Before setting Varnish up, I'd like to be able to set my page cache lifetime to 1 day and still let content editors work on pages and see immediate results.
If it isn't flushing core's DB page cache, then doesn't Varnish just get old pages anyway whenever a cache lifetime is set?

mr.j’s picture

I don't use Varnish so I can't answer that, sorry.

doublejosh’s picture

I wrote this before I realized that core cache integration requires the use of Rules.
Couldn't quite figure that out though: #1054580: Rules integration
I'm trying to get Expire to work BEFORE I set up Varnish.

doublejosh’s picture

I'm confused.
Looking through expire_nodeapi(), expire_node(), and expire_cache_derivative(), it seems like this module should be clearing a node's page cache.
However, I don't see the actual line in expire_cache_derivative() that literally does the cache update call (I'm assuming I'm just missing it), even though that seems to be the main purpose of most of this module. Yet, with my core performance minimum lifetime set to 9 hours, editing a node does not update its cached page.

Would love to provide a patch with some more understanding. Or maybe it's just a Rules thing that I'm just not configuring right yet.

It seems that adding cache_clear_all($urls, 'cache_page', FALSE); to expire_cache_derivative() around line 467, wrapped with an admin option, would work, but it doesn't. What am I missing?

SORRY FOR ALL THE COMMENT EDITS!!!

doublejosh’s picture

Perhaps something is wrong with my setup. Even the drush command doesn't clear pages even when I add cache_clear_all($urls, 'cache_page', FALSE); to expire_cache_derivative().

My site is on a subdomain like: beta.mysite.com, my cid entries include that, and when I dsm() the $urls array I see the full subdomain path just fine.

doublejosh’s picture

Think I get it. Can't clear the cache_page if cache_content is stale... or perhaps another cache table?
n/m, the created timestamp isn't increasing.

doublejosh’s picture

I'm sorry I just needed...

// line 467
foreach ($urls as $u) {
  cache_clear_all($u, 'cache_page', FALSE);
}

...duh.
Will make an admin setting and provide a patch.
OR would this be better as a submodule using the expire_cache hook?

doublejosh’s picture

Status: Active » Needs review

Seems cleaner as a sub-module. Might I recommend choosing "Expire menus: > No" as a default. (off topic)

doublejosh’s picture

This sub-module seems rather worthwhile to me, as it can allow those without the resources or expertise to set up Boost, Varnish, etc. to set their cache lifetime very high.

For those needing this functionality... and not wanting to use the patch you can also use Cache Actions plus Rules.
This sub-module will do this with just the new admin setting turned on. Good for beginners who want to cache for longer.
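
For readers who want a picture of what such a sub-module might look like, here is a rough sketch. It is not the attached patch: the setting name yourmodule_flush_page_cache is a hypothetical placeholder, and the two stand-in functions at the top exist only so the snippet runs outside Drupal (inside Drupal 6 the real variable_get() and cache_clear_all() are already defined and the stand-ins are skipped).

```php
<?php
// Stand-ins so the sketch runs outside Drupal. They only record what happened.
if (!function_exists('variable_get')) {
  function variable_get($name, $default) {
    return isset($GLOBALS['example_settings'][$name]) ? $GLOBALS['example_settings'][$name] : $default;
  }
}
if (!function_exists('cache_clear_all')) {
  function cache_clear_all($cid = NULL, $table = NULL, $wildcard = FALSE) {
    $GLOBALS['example_cleared'][] = $cid;
  }
}

/**
 * Implementation of hook_expire_cache from the expire module.
 */
function yourmodule_expire_cache($urls) {
  // Hypothetical admin setting: do nothing unless the site opted in.
  if (!variable_get('yourmodule_flush_page_cache', FALSE)) {
    return;
  }
  // cache_clear_all() is given one cache ID at a time.
  foreach ($urls as $url) {
    cache_clear_all($url, 'cache_page');
  }
}

// Usage outside Drupal, with made-up URLs.
$GLOBALS['example_settings'] = array('yourmodule_flush_page_cache' => TRUE);
$GLOBALS['example_cleared'] = array();
yourmodule_expire_cache(array('http://example.com/node/6', 'http://example.com/alias/to/node6'));
print_r($GLOBALS['example_cleared']);
```

With the setting off, the hook returns early and nothing is cleared, so installing the sub-module would not change behavior until someone enables it.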

doublejosh’s picture

I guess this module is semi-abandoned. Just looked at the last commit dates :)

msonnabaum’s picture

Status: Needs review » Closed (won't fix)

#3 is incorrect. cache_clear_all() will clear the page cache only if you set the *minimum* cache lifetime. If you really need a high minimum setting, you can use something like cache actions to clear node pages.

And just fyi, varnish only respects the cache max setting, so this will do nothing in that case.

mr.j’s picture

Status: Closed (won't fix) » Needs review

No. You are wrong. #3 is correct. Please don't be so dismissive.

I'll even take the time to explain it to you. Here is the code from cache.inc with irrelevant bits removed.

  if (empty($cid)) {
    if (variable_get('cache_lifetime', 0)) {

      // MINIMUM CACHE LIFETIME IS SET - CHECK IF IT HAS EXPIRED

      $cache_flush = variable_get('cache_flush_'. $table, 0);
      if ($cache_flush == 0) {
        // This is the first request to clear the cache, start a timer.
        variable_set('cache_flush_'. $table, time());
      }
      else if (time() > ($cache_flush + variable_get('cache_lifetime', 0))) {
        // Clear the cache for everyone, cache_lifetime seconds have
        // passed since the first request to clear the cache.
        db_query("DELETE FROM {". $table ."} WHERE expire != %d AND expire < %d", CACHE_PERMANENT, time());
        variable_set('cache_flush_'. $table, 0);
      }
    }
    else {
      // No minimum cache lifetime, flush all temporary cache entries now.
      db_query("DELETE FROM {". $table ."} WHERE expire != %d AND expire < %d", CACHE_PERMANENT, time());
    }
  }

When cache_clear_all is called without parameters, $cid is always NULL, so it enters that conditional block. If cache_lifetime is set, then it deletes ALL cached pages once cache_lifetime has elapsed since the last cache flush (and yes, that includes pages that may have been cached 10 seconds ago). If neither $cid nor cache_lifetime is set, it deletes all cached pages (it says temporary ones only, but the reality is they are all saved as temporary). So you either delete all cached pages or none every time the function is called like that. It's not very smart, but that's the way it is, and that is why I am using this module.

If $cid is set however, that whole block is skipped and the specific cache entry matching the $cid is deleted. That is the whole point of why I wrote this code. It allows you to pick out and immediately delete a cached node page if it has been updated.
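
The branching described above can be summarized in a small standalone model. This is only a sketch: example_clear_decision() is an invented name, it touches no database, and it returns a description of the action instead of running a query. The "start timer" branch reflects my understanding of core's cache.inc and should be treated as an assumption.

```php
<?php
/**
 * Hypothetical model of which action a page-cache clear would take.
 *
 * $cid            Cache ID, or NULL for a blanket clear.
 * $cache_lifetime Minimum cache lifetime in seconds (0 means none).
 * $cache_flush    Time the flush timer was started (0 means not started).
 * $now            Current Unix timestamp.
 */
function example_clear_decision($cid, $cache_lifetime, $cache_flush, $now) {
  if (empty($cid)) {
    if ($cache_lifetime) {
      if ($cache_flush == 0) {
        // First clear request: only a timer is started.
        return 'start timer, delete nothing';
      }
      if ($now > $cache_flush + $cache_lifetime) {
        // The minimum lifetime has passed since the timer started.
        return 'delete all temporary entries';
      }
      return 'delete nothing';
    }
    // No minimum lifetime: every blanket clear flushes immediately.
    return 'delete all temporary entries';
  }
  // A specific cache ID skips the lifetime check entirely.
  return 'delete only ' . $cid;
}

// A 9 hour minimum lifetime (32400 seconds), timer started at t=1000, now t=2000.
echo example_clear_decision(NULL, 32400, 1000, 2000), "\n";
echo example_clear_decision(NULL, 0, 0, 2000), "\n";
echo example_clear_decision('http://example.com/node/6', 32400, 1000, 2000), "\n";
```

The first call reports that nothing is deleted, the second that all temporary entries are deleted, and the third that only the named page is deleted, which is the case the hook implementation relies on.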

Believe me, I wrote, tested, and use the code myself and I know that it works.

Why the won't fix? It would be a tiny sub-module that can be optionally used if someone wants to use this with core DB caching like me. What is not to like about it? Did you miss the part in my original post where I mentioned how tedious it is to set up all the cases that this module covers using Rules and Cache Actions (not sure it is even possible)?

SqyD’s picture

I'll add my 2 cents to this discussion:
I would recommend making this module a separate project, or including it in another project that does more with Drupal's database caching layer.

Since its conception by mikeytown2, this project has been a general API module without any specific use case, one that helps other external modules like Varnish, Purge and possibly Boost 7.x to control various caches. Your proposed database cache implementation would act on that same level. Including it as a sub-module of this project would move away from that path.

Don't get me wrong, I'm not being dismissive. Your use case and solution sound very valid to me but this project is probably not the right place to publish and maintain it. Why don't you branch out into a separate project and I'm sure we'll be happy to mention it as one of the modules implementing the hooks the Expire module provides.

I propose to keep this issue open so you can post your response and possibly link to your new project.

msonnabaum’s picture

I misspoke in #14. I meant to say that cache_clear_all() WON'T clear the entire page cache only if you have set the *minimum* cache lifetime.

I'm just trying to clarify that the issue you're describing in #3 only applies when using the internal page cache with a minimum cache lifetime. This module aims to provide a way to expire URLs from reverse proxy caches where the minimum cache lifetime is not used and in most cases internal page caching is disabled completely.

In my opinion, the functionality provided here is out of scope for this module.

nicholasThompson’s picture

Version: 6.x-1.0 » 7.x-1.x-dev

For those interested...
http://drupal.org/project/expire_cache_page

Also bumping the version as it looks like 6.x is no longer receiving any attention.

Spleshka’s picture

Version: 7.x-1.x-dev » 7.x-2.x-dev
Status: Needs review » Fixed

Done in 7.x-2.x branch.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.