Hi,

I have noticed that as of 2.x the expire module switches to giving absolute URLs only instead of the hybrid API it had before (using the "Include base URL in expires") option. I'm the author of the Acquia Purge module and one of the important roles that module fulfills is determining what domain names need purging, as the Acquia Cloud platform provides this there's a logical reason for this to be handled by a *module that executes the purges*.

http://drupalcode.org/project/expire.git/blob_plain/refs/heads/7.x-2.x:/...

Now with absolute URLs being provided only as of 2.x I believe we are mixing up the role of what expire is and should be: *the module that does detection*, and which it does very solidly btw. Can you therefore please strongly reconsider providing paths - either optional or trough a second hook - to not break the Acquia Purge module with the upgrade?

Thank you for consideration,

Niels van Mourik
Acquia

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Spleshka’s picture

Status: Active » Needs work

Hey @nielsvm,

Thank you for you issue. Of course, I will check modules' integration and make some thoughts about this. If you have any ideas of how to do this - you are very welcome!

konradj’s picture

We have the same problem with the Varnish module, so pleas add the config back.

Possible fix:

Add to main config

  $form['format']['expire_include_base_url'] = array(
    '#type'          => 'checkbox',
    '#title'         => t('Include base URL in expires'),
    '#default_value' => variable_get('expire_include_base_url', EXPIRE_INCLUDE_BASE_URL),
    '#description'   => t('Include the base URL in expire requests. Compatible with Domain Access'),
  );

And change line 31 in expire.api.php

from
if ($absolute_urls_passed) {
to
if ($absolute_urls_passed || !variable_get('expire_include_base_url', EXPIRE_INCLUDE_BASE_URL)) {

wulff’s picture

A big +1 for Niels' suggestion. I ran into the same problem when setting up Expire with the Varnish module.

At the moment I'm using hook_expire_cache_alter() to set $absolute_urls_passed = TRUE. This works, but seems like a very ugly hack.

nielsvm’s picture

I'm honestly concerned about this sudden API break, I've been getting multiple issues both internally here at Acquia as d.o tickets. What happens is that 70% of all users don't pay attention to your warning and just install 7.x-2.x without looking. And there's no obvious technical way for me to detect that "upstream" (expire) has changed version _AND_ api without ugly hacks.

I don't like sounding like someone that just complains, here's what I would suggest:

  • expire 7.x-2.x: rename hook_expire_cache to something like hook_expire_cache_v2 so your downstream implemetors can maintain both.
  • expire 7.x-2.x: Provide $urls as object that contains the internal drupal path and the full url.
  • expire 7.x-2.x: I'm concerned about support for wildcard purging too, as I can't support that at all and will continue to require old-school hardcoded paths.

Looking forward to your thoughts,
Niels

Spleshka’s picture

Hi guys,

I'm trying to figure out the problem. @nielsvm, you suggester to provide both internal and full url for the paths. But expire module already does it: http://clip2net.com/s/6f2h0K. What else should be here?

If you need to fix this extremely quickly, you may contact me by email/skype.

wulff’s picture

That screenshot only shows absolute URLs.

I'll try to create an example of what happens when I use the Expire module without the hack mentioned in #3.

Spleshka’s picture

@wulff,

Take a look at the array keys.

wulff’s picture

Sorry, I'm blind – I hadn't noticed the array keys.

However, I'm not sure that's an ideal way to pass the paths around since other modules (e.g. the Varnish module) only look at the array values, cf. varnish_expire_cache().

wulff’s picture

O.k., I did some more digging. It looks like there's a patch for the Varnish module, which makes it accept absolute URLs: #2017097: Ban/purge requests to varnish are malformed..

Spleshka’s picture

Okey, thanks. So I think I should implement this setting back to the 2.x, to avoid a confusions.

wulff’s picture

I would really appreciate the possibility to choose between absolute and relative URLs, so that's a +1 from me :-)

konradj’s picture

A big +1 for add settings back, as I mentioned in #2.

nielsvm’s picture

Contrary to others here I wouldn't really appreciate the return of that setting, however, as other downstreams depend on it its a small bargain for me as I'll just hook_form_alter out the setting and overwrite $conf globally. Probably me doing that is a little dirty but I'm not expecting any issues with it unless any of you would think it could.

Otherwise, I've played further with 2.x and noticed the array keys now too, I'm quite happy with it, so a big +1 Spleshka! The only thing I don't really feel comfortable with it still is that the API's still changing, so I'm now having to do things like is_string(key($paths)) to detect version 2.x or 1.x which feels a little fragile.

Anyway, thanks!

Spleshka’s picture

So we have an opposite opinions. Please, find the solution that will be suitable for everyone and then I (or someone else) will create a patch to solve this issue.

bibo’s picture

Hi,

After some test with Varnish-module and expire-module), I understand why nielsvm is opposed to having that setting. The hostname-setting would always have to be turned off, or invidual Varnish flushes just wont work. I would be almost fine with having that setting again, just so we could get a 7.2-release out.. but it would mean that people can configure the cache so that page database-purges work, but page-Varnish-purges dont.

This is abit hard to debug, unless varnishlog and the inner logic of Varnish are well known. Simply put, this ban works:

$ varnishlog|grep "Rd ban"
    0 CLI          - Rd ban req.http.host ~ example.com && req.url ~ ^/some/path$

And this one created by expire doesnt:

$ varnishlog|grep "Rd ban"
    0 CLI          - Rd ban req.http.host ~ example.com && req.url ~ ^/http://example.com/some/path$

The only part of expire 7.2-x that clears invidual urls from Varnish correctly is "drush expire-url some/path", when used with internal path only. It expects an absolute url though, according to description!

More info in the related Varnish-issues that I just added, but I don't have high hopes that they will be fixed there anytime soon. If this was fixed, expire would AFAIK work as it is now: #2017097: Ban/purge requests to varnish are malformed.

If I understand the problem correctly, the best way to fix it on the side of expire-module would be Varnish-specific handling, so that people who use the page-cache provided by Varnish can't configure it wrong. Aa admin form alter or sniffing out the caching class during page request.

I'm marking this issue as critical, since I see it so that the 7.2-series wont be of use to anybody (expect people who use boost or database page-cache), before this is fixed.

nielsvm’s picture

Yeah and in general addition to what Bibo says, I think its better to leave the "domains" off to the actually purging plugins agnostic of the technique. That means that the Varnish module could deal with a path its own way than the Acquia Purge or Purge modules will do, all solving the "full http paths" thing the way they want.

With having both 'paths' and 'absolute URLs' through the array key's I'm personally happy with it though, because I can simply ignore the absolute URLs and solve the domains and full path generation my own way. So yeah, that setting also became a little bit unnecessary imho.

Thanks for all the brilliant work on this module though!

Niels

bibo’s picture

Status: Needs work » Needs review
FileSize
9.49 KB

I've now created a patch, which mostly follows the idea of konrad. That means reintroduction of the option to disable base_url entirely, as it was before.

However, there are several additions:

  1. Re-added "define('EXPIRE_INCLUDE_BASE_URL', TRUE);" with extensive comments in code and in the admin UI - to make both developers and end users understand that checking it is not compatible when Varnish or Acquia Purge are used as cache-page backends. I kept the default still on, because it has been TRUE for so many versions and changing it to FALSE cound cause confusion.
  2. The API file to modify is at includes/expire.api.inc, not expire.api.php (@konrad). The modifications were somewhat more complex than just adding an extra || !variable_get('expire_include_base_url', EXPIRE_INCLUDE_BASE_URL) When I tested it with the one liner, a node update resulted in only the internal ("node/1") url flush, and the alias would not get flushed. To get the alias too, I had to modify the convertToAbsoluteUrls()-function.
  3. I also added the $base_path-stripping, which also exists in the older expire-versions, but in a different place.

In addition to those points, the patch includes other new drush features it maybe shouldn't, from this cases' perspective. The expire.drush.inc changes can be ignored entirely if you wan't, but the additions they contain may prove useful too.

The drush integration changes are:

  1. "drush expire-comment 123" would result in fatal error if comment-module is disabled. I added "comment"-module to it's requisites to avoid unnecessary confusion.
  2. I modified "drush expire-path foo"-extension to do something additional (if Varnish module is used as a cache backend): flush the url from Varnish directly via _varnish_terminal_run(). This was added so that it would be possible to flush urls regardless of domain names. Varnish will normally always include the hostname, which is currently not possible to disable. This method also has custom auto-escaping, which Varnish-module itself lacks (for now), to avoid erronous commands send to Varnish.
  3. I created an entirely new extension "drush expire-regex ".*any_url", which allows base_url/hostname/domain-free cache flushes. It uses the aforementioned "direct purging". The documentation of it ("drush expire-regex --help") may be useful to people who are struggling with understanding Varnish flushes.

EDIT: Fixed some typos and formatting.

Spleshka’s picture

@bibo, thank you for your great, I really appreciate this!

But your patch contains several different improvement that should be commited definately separately. Also I have couple questions regardless the patch:

1. Why drush regexp should work only for Varnish?
2. Why should we include integration with Varnish into Expire? There are API that allows to process urls in other contribs. So for me function expire_flush_varnish_without_hostname() doesn't seem suitable in the expire module.

Spleshka’s picture

Status: Needs review » Needs work

I commited fix for drush comment.

So now we need first to have a clean patch with "include_base_url" configuration. Only after commiting this fix we may think about drush expire-regexp command.

Spleshka’s picture

Status: Needs work » Fixed

I hope that I found a solution that works for everyone, and I even already commited it. See http://drupalcode.org/project/expire.git/commit/1c569df for more info. To be short, I added setting 'expire_include_base_url' to the configuration. But made a workaround for internal paths and external urls in different way.

Thank you all guys for working on this issue. Now I am waiting for your tests and feedbacks regarding this fix. Then we can finally go stable!

@bibo, If you still want to have drush expire-regexp in the expire module, please, create separate issue for that.

bibo’s picture

Thanks Spleshka. I just tested the latest 2.x-dev, and the base_url skipping, aliases etc work fine. I mainly tested just re-saving a node and using varnishlog|grep "Rd ban".

There is something that could still be changed in the admin settings. The new/old option says now:

Include base URL in expires
Include the base URL in expire requests. Compatible with Domain Access.
Enabling this setting when Varnish or Acquia Purge modules are used as a cache backend is not recommended.
If you use internal expiration, this checkbox should be selected.

When Varnish is used as the page_cache, having "Include base URL" enabled, always results in a failed flush, both with Internal and External.
They just create slighly different purge-commands:

Base Url: ON & Internal => DOES NOT WORK

➜  contrib git:(develop) varnishlog|grep "Rd ban"
    0 CLI          - Rd ban req.http.host ~ local.dev && req.url ~ ^/http://local.dev/node/830227$
    0 CLI          - Rd ban req.http.host ~ local.dev && req.url ~ ^/http://local.dev/test-url$

Base Url: ON & External => DOES NOT WORK

➜  contrib git:(develop) varnishlog|grep "Rd ban"
    0 CLI          - Rd ban req.http.host ~ local.dev && req.url ~ ^/http://local.dev/node/830227$|^/http://local.dev/test-url$

Base Url: Off & Internal => WORKS OK
➜ contrib git:(develop) varnishlog|grep "Rd ban"

    0 CLI          - Rd ban req.http.host ~ local.dev && req.url ~ ^/node/830227$
    0 CLI          - Rd ban req.http.host ~ local.dev && req.url ~ ^/test-url$

Base Url: OFF & External => WORKS OK

➜  contrib git:(develop) varnishlog|grep "Rd ban"
    0 CLI          - Rd ban req.http.host ~ local.dev && req.url ~ ^/node/830227$|^/test-url$

I didn't test having the standard page_cache (Database). I suggest rewriting the info or at least removing: "If you use internal expiration, this checkbox should be selected".

1. Why drush regexp should work only for Varnish?

Good question. Varnish just works a bit differently than other caching backends, and TBH I am only interested in using the module together with Varnish.

2. Why should we include integration with Varnish into Expire? There are API that allows to process urls in other contribs. So for me function expire_flush_varnish_without_hostname() doesn't seem suitable in the expire module.

You're correct, a more logical place would be Varnish module, or possibly another contrib-module. I just find the Varnish-module itself is not progressing very fast, and for now don't plan to create another issue for that. I still find it problematic that Varnish always forces the hostname in the purges, which can only be avoided by calling it's internal function.

I hope that I found a solution that works for everyone, and I even already commited it. See http://drupalcode.org/project/expire.git/commit/1c569df for more info. To be short, I added setting 'expire_include_base_url' to the configuration. But made a workaround for internal paths and external urls in different way.

I like your workaround better than mine, since mine still kept some function-names as they were (which was misleading).

Thank you all guys for working on this issue. Now I am waiting for your tests and feedbacks regarding this fix. Then we can finally go stable!

Looks stable to me :) The admin UI text is a minor thing.

@bibo, If you still want to have drush expire-regexp in the expire module, please, create separate issue for that.

I probably should do it in the varnish issue queue, but for now I have to concentrate on other stuff.

Spleshka’s picture

Hey @bibo,

Thanks for your feedback. I see that you are right about this note: "If you use internal expiration, this checkbox should be selected". We should remove or update it. I somewhy forgot that it is possible to use different cache backends even for internal expiration.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.