I'm using Boost with Expire and it's working fine.

One thing I need to do in addition is use Rules to clear the cache for pages created by Views.

The "Clear URL(s) from page cache" action works, but you need to enter each URL individually. This is fine unless, say, your view uses pagination and/or arguments, so that as well as clearing /viewpath you also need to clear /viewpath?page=1, /viewpath/today, etc.

Attached is a patch that allows you to pass a wildcard in Rules (or elsewhere) so that you can use

viewpath*

and everything matching that will get deleted.


Comments

apemantus’s picture

Actually, this patch doesn't work, or rather it only works in a limited way.

At the moment, I'm dealing with the problem by using Rules to execute custom PHP that calls _boost_rmdir and so removes everything in a directory. It's a pretty blunt tool, but for now it works for my needs. If anybody can come up with a better way, though, I'm all ears...

bgm’s picture

Your patch makes sense. Can you provide more info on why it didn't work?

I'm also wondering whether this should be an option in Expire, something like "clear other pages of Views with pagers", so that other caches, such as Varnish, could also benefit from it. I'm not really sure how to implement that though, since the view cannot really know how many pages there are (unless we run an expensive SQL query), so I guess we can resort to adding it to boost_expire in the meantime.

apemantus’s picture

The issue for me (testing on Windows) was that glob("news*") matched "URLs — Boost files" like

news — news_.html
news?page=1 — news_page=1.html

But not paths like

news/tags/sport — news/tags/sport_.html
news/tags/business — news/tags/business_.html

i.e. it worked for pagination, not for arguments.
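
The behaviour described above can be reproduced with a small standalone script. This is only a sketch: the temporary directory and file names are hypothetical stand-ins for a Boost cache directory, and the pattern is built by hand rather than by the patch. It shows that the "*" in glob() stays within one directory level, so the news directory itself is matched but the files beneath it are not.

```php
<?php
// Hypothetical layout mimicking the Boost cache files listed above.
$root = sys_get_temp_dir() . '/boost_glob_demo_' . uniqid();
mkdir($root . '/news/tags', 0777, TRUE);
$files = array(
  'news_.html',
  'news_page=1.html',
  'news/tags/sport_.html',
  'news/tags/business_.html',
);
foreach ($files as $file) {
  touch($root . '/' . $file);
}

// glob() expands "*" within a single directory level only.
$matches = glob($root . '/news*');

// Strip the temporary root so only the cache-relative names remain.
$relative = array();
foreach ($matches as $path) {
  $relative[] = substr($path, strlen($root) + 1);
}
sort($relative);
print_r($relative);
```

Note that the directory entry "news" is returned as a match; calling unlink() on it would fail, which is one reason the files under news/tags survive.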

I ended up using a version of the patch to create a rule to delete pagination and then used _boost_rmdir to remove news/tags etc.

I think a version of this patch would probably be useful. I haven't tested it, but thinking about it now, I probably could have used the patch and done

news*
news/tags/*
news/authors/*
news/archives/*

This means that "wildcards" may work a bit differently than you'd expect (if you expected that news* would match news/tags), but it would help with clearing views via Expire + Rules + Boost.

I can't really think of a foolproof, universal way for Expire to know about all possible Views pages, unless you had an optional module that hooked into Views and logged the paths when they're created?

jenshk’s picture

FileSize
1.67 KB

I think this is what you are looking for.

drclaw’s picture

Title: Let boost_expire_cache take a wildcard » Support wildcards in boost_expire_cache
Status: Active » Needs review
FileSize
2.99 KB

I might suggest an alternate approach. hook_expire_cache() now supports wildcards using its own |wildcard syntax. It would make more sense to support that, since it's what people will likely be using.

The attached patch expands hook_expire_cache() to respect the wildcard settings supplied by the expire module. We use glob() for filename pattern matching to determine which files should be removed from the cache. Additionally, custom expire paths can include their own '*' wildcards which will be respected by glob().

Not sure if this is the best approach, but I think it's at least a step in the right direction.

beauregard’s picture

I tested the patch from #7. Here are my results.

Creating a new node or editing a node, when saving:
Notice: Undefined variable: filename in boost_expire_cache() (line 420 of /home/httpd/vhosts/lcdfernseher.ch/subdomains/exchtest/httpdocs/sites/all/modules/boost/boost.module).
This error appears up to 15 times on the page.

Wildcard-Tests
1. mylist|wildcard
mylist is a paged view.
It worked and deleted all cached view pages

2. pagewithview|wildcard
pagewithview is a page with an embedded view. This view is paged.
It worked and deleted all cached view pages

3. cars/|wildcard
cars is a directory and I wanted to delete all nodes which are in www.mydomain.com/cars/.
This had no effect. Is it not possible or must I pass it with another syntax?

4. cars/adm|wildcard
Should delete all caches of nodes starting with “adm” in www.mydomain.com/cars
This had no effect. Is it not possible or must I pass it with another syntax?

shortspoken’s picture

Hi. I also tested this and can confirm that it's working with the following pattern:

node_alias?date=*

Although I am getting this watchdog error message too:
Notice: Undefined variable: filename in boost_expire_cache() Line 420

beauregard’s picture

@shortspoken
Is "node_alias" the clean URL of your node?
How exactly was your pattern working? I thought that * does not work with Boost.

@drclaw/maintainers
Is there any plan to fix this code so that we could use it in production? Can I sponsor it?

shortspoken’s picture

Sorry for not being clear enough. I also wanted to expire the paging and date URLs of my nodes, as I use the Views (date) pager.
Not only "cleanurl" should be cleared, but also "cleanurl?page=1" and all other pages.
So I added these lines to the custom pages of the expire cache settings of my content type:

cleanurl?date=*
cleanurl?page=*

Note that the * always has to be at the end.

beauregard’s picture

OK, now that I have tested it, I understand. I had assumed that * should not be used. But I am confused about why both |wildcard and * can now be used, each behaving differently.
Example with a paged view:

mylist|wildcard
--> Correctly deletes all pages of the view

mylist*
--> Does not delete all pages of the view

mylist?page=*
--> Deletes all pages of the view except the first page (logical, because the first page is /mylist)

My points 3&4 above work with *.

@drclaw
It would be very helpful if you could briefly outline how this is now implemented with both types of wildcards. Is there a reason to use both?

drclaw’s picture

Hello All,

Thanks for testing out the patch and providing such detailed feedback! I've tested out the different scenarios you've all described and adjusted the logic in the patch.

Here's how the patch now works after the adjustment:

  • If you don't use "|wildcard", we only use the path to match a single cached page. So for the path cars, we'll only check for that one page's cache entry and delete it if it exists (technically we'll look for the file named "cars_.html").
  • If you do use "|wildcard" we'll remove any files or directories that start with the supplied path (e.g. for cars|wildcard we'll flush cars_.html, cars-audi-a4_.html, cars/, cars_preowned/ etc.). Technically what we're doing is concatenating a "*" to the path and running it through glob() and removing any file or directory matches.

@beauregard: To clarify about the two types of wildcards, I was just pointing out that since we're using the php glob() function, you could technically use any of the pattern matching characters it supports ("*" and "?"). However, this new patch is slightly different than the old one. Now, we only use glob() if a wildcard has been specified using the expire module "|wildcard" syntax. Paths like mylist?page=* won't work with this new patch. The only real use case for using a glob wildcard would be for some kind of advanced pattern matching. For instance you could provide a path like cars/audi*2014|wildcard to match paths like

cars/audi-a4-2014 
cars/audi-a3-2014
cars/audi-tt-2014-q3
etc...

but not

cars/audi-a4-2013 
cars/audi-a3-2012
cars/audi-tt-2015
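
As a rough illustration of the two modes described above, here is a minimal sketch. It is not the patch itself: demo_boost_matches() is a hypothetical helper, the temporary directory stands in for the Boost cache directory, and the "_.html" suffix is taken from the file naming mentioned earlier in this comment.

```php
<?php
/**
 * Hypothetical helper, not part of Boost: list cache entries for a path.
 * Assumes cached pages are stored as "<path>_.html" under $cache_root.
 */
function demo_boost_matches($cache_root, $path, $wildcard) {
  if (!$wildcard) {
    // Exact match: look for the single file belonging to this path.
    $file = $cache_root . '/' . $path . '_.html';
    return file_exists($file) ? array($file) : array();
  }
  // Wildcard: append "*" and let glob() expand the pattern.
  $matches = glob($cache_root . '/' . $path . '*', GLOB_NOSORT);
  return $matches ? $matches : array();
}

// Build a throwaway cache directory with the example paths from above.
$root = sys_get_temp_dir() . '/boost_wildcard_demo_' . uniqid();
mkdir($root . '/cars', 0777, TRUE);
$pages = array(
  'audi-a4-2014', 'audi-a3-2014', 'audi-tt-2014-q3',
  'audi-a4-2013', 'audi-a3-2012', 'audi-tt-2015',
);
foreach ($pages as $page) {
  touch($root . '/cars/' . $page . '_.html');
}

// Equivalent of entering "cars/audi*2014|wildcard".
$names = array_map('basename', demo_boost_matches($root, 'cars/audi*2014', TRUE));
sort($names);
print_r($names);
```

Only the three 2014 entries should be listed; without the wildcard flag the same path is treated as a literal file name and matches nothing.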

Hope this makes sense to everyone. Ask away if there are any questions (I'll try to respond quicker this time!)

drclaw

drclaw’s picture

FileSize
3.56 KB
2.22 KB

Sorry, small error in last patch :O

Use this one.

beauregard’s picture

Hi

Thanks for the update and explanations. I just installed and tested it. I still get a list of warnings when I update a node: Notice: Undefined variable: filename in boost_expire_cache() (line 419 of ..../sites/all/modules/boost/boost.module).

You don't have these warnings?

drclaw’s picture

I don't get that warning... but let me try on a fresh install and get back to you.

drclaw

beauregard’s picture

Hi

I asked a freelancer doing work for me to check this. He made a small change and now it works on my server without warnings. Please see the enclosed patch.

@shortspoken: Could you test it on your server too, because you get the warnings as I did?

best regards

garamani’s picture

Thanks beauregard. The patch in #17 works fine, but for the latest version, 7.x-1.0, there is a redundant closing curly bracket.

The last part of the patch should look like this:

/**
 * Implements hook_expire_cache (from the 'expire' module)
 */
function boost_expire_cache($urls, $wildcards, $object_type, $object) {
  global $base_root;

  foreach ($urls as $key => $url) {
    // Check if the URL to be flushed matches our base URL
    // base_root: http://www.example.org
    // url: http://www.example.org/node/123
    if (strpos($url, $base_root) === 0) {
      // Decode the url since it may have a query string that has been encoded
      $boost = boost_transform_url(urldecode($url));

      // We need the extension for the filename
      $boost['header_info'] = boost_get_header_info();
      $boost['matched_header_info'] = boost_match_header_attributes($boost['header_info']);

      // Issue #2135835 Cache may not be enabled for this type (html/xml/ajax)
      if (! $boost['matched_header_info']['enabled']) {
        continue;
      }

      // If wildcards are enabled, we'll need to create a wildcard pattern for
      // globbing
      if ($wildcards[$key]) {
        $pattern = (isset($boost['filename']) ? $boost['filename'] . '*.' . $boost['matched_header_info']['extension'] : NULL);
      }
      else {
        $pattern = (isset($boost['filename']) ? $boost['filename'] . '.' . $boost['matched_header_info']['extension'] : NULL);
      }
       
      // The files to remove
      $files = glob($pattern, GLOB_NOSORT); // no sort = better performance
      if ($files) {
        foreach ($files as $filename) {
          if (unlink($filename)) {
            boost_log('Removed !file from the boost cache.', array('!file' => $filename), WATCHDOG_DEBUG);
          }
          else {
            boost_log('Could not delete the cache for !url, file !file does not exist.', array('!url' => $url, '!file' => $filename), WATCHDOG_DEBUG);
          }
        }
      }
    }

    else {
      boost_log('Could not delete the cache for !url, file !file does not exist.', array('!url' => $url, '!file' => $filename), WATCHDOG_DEBUG);
    }
  }
}

Edit: The Patch (#17) works without the modification above!!

beauregard’s picture

Hi

I applied the patch from comment #17 to the latest version, 7.x-1.0, and I have had it running on 3 production sites for 3 weeks. It works without any problem.
I have now asked the freelancer who works for me, and he said that you added a new else statement:

   else {    // NEW CODE HERE
                boost_log('Could not delete the cache for !url, file !file does not exist.', array('!url' => $url, '!file' => $filename),  WATCHDOG_DEBUG);
   }

But this won't work, because $filename is not defined at this point in the code.

So the question is what happened: Did you apply my patch to the latest version and find it did not work? Are we not using the same versions? Why did you add this else statement?

best regards

Anonymous’s picture

@beauregard - it would probably help if you were specific about whom you are questioning.

There is a stable version of Boost and a development version. This issue is marked (by someone) as being for the development branch, so it is probable that people have been looking at different versions.

garamani’s picture

@beauregard
You're right, that was my mistake. The patch in #17 works perfectly and no modification is needed.

rroblik’s picture

Working fine !

Maybe put this patch in a new dev release ?

Thanks

  • bgm committed 1ca3ef4 on 7.x-1.x authored by drclaw
    Issue #1810936: Support wildcards in boost_expire_cache.
    
  • bgm committed 3b7d911 on 7.x-1.x authored by beauregard
    Issue #1810936: Support wildcards in boost_expire_cache.
    
bgm’s picture

Status: Needs review » Fixed

Applied to 7.x-1.x using #14 then a diff of #17 (in an attempt to commit with proper credit). Thanks for the patches and testing!

Can anyone do a last review before releasing the next version of Boost? I spent a bit too much time understanding the two patches, and probably not enough testing.

rroblik’s picture

Working fine for me :)

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

azinck’s picture

This fix isn't quite right. The patch in #14 was much better than what got committed, I believe. See here: #2710449: Wildcard matching for expiration doesn't descend into subdirectories
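
For readers wondering what removing directory matches as well as file matches could look like, here is a minimal sketch. It is not the committed code and not the patch from #14: both functions are hypothetical, and the temporary directory stands in for the Boost cache directory. It follows the behaviour described earlier in the thread for "|wildcard" (append "*", glob, and remove any file or directory that matches).

```php
<?php
/**
 * Hypothetical helper: delete a file, or a directory and everything below it.
 */
function demo_remove_recursive($path) {
  if (is_dir($path) && !is_link($path)) {
    // CHILD_FIRST yields directory contents before the directory itself.
    $items = new RecursiveIteratorIterator(
      new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS),
      RecursiveIteratorIterator::CHILD_FIRST
    );
    foreach ($items as $item) {
      if ($item->isDir() && !$item->isLink()) {
        rmdir($item->getPathname());
      }
      else {
        unlink($item->getPathname());
      }
    }
    return rmdir($path);
  }
  return unlink($path);
}

/**
 * Hypothetical helper: expire everything whose name starts with $path.
 */
function demo_expire_wildcard($cache_root, $path) {
  $removed = 0;
  $matches = glob($cache_root . '/' . $path . '*', GLOB_NOSORT);
  foreach ($matches ? $matches : array() as $match) {
    if (demo_remove_recursive($match)) {
      $removed++;
    }
  }
  return $removed;
}

// Throwaway cache directory for the demonstration.
$root = sys_get_temp_dir() . '/boost_recursive_demo_' . uniqid();
mkdir($root . '/news/tags', 0777, TRUE);
foreach (array('news_.html', 'news_page=1.html', 'news/tags/sport_.html', 'other_.html') as $file) {
  touch($root . '/' . $file);
}

// Counts top-level matches removed: the two files plus the news directory.
$removed = demo_expire_wildcard($root, 'news');
```

With this approach the cached pages under news/tags are removed along with the paginated pages, while unrelated entries such as other_.html are left alone.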