Problem/Motivation

#2697795: [meta] Cache storage in database can fill the disk, especially via 404s points out that 404 responses are cached indefinitely.

When using the database cache backend, or another backend that does not limit storage space via LRU or similar, this can lead to cache filling in the case of either a large number of real 404s or a mis-behaving crawler.

Proposed resolution

Set a shorter TTL for 404 responses in page_cache module, so that they get expired after time rather than only via cache tags.

Consider whether to do this for all cache backends, or only those that don't support LRU or other space limitation - this would require a way for cache backends to indicate whether they do this though, which we don't have yet.

Remaining tasks

User interface changes

API changes

Data model changes

Comments

catch created an issue. See original summary.

catch’s picture

catch’s picture

Version: 8.2.x-dev » 8.1.x-dev
Status: Active » Needs review
Issue tags: +Performance, +scalability
FileSize
899 bytes

Here's a patch. Does not take much to do this, but I didn't add any configuration.

This could potentially happen in a patch release

Status: Needs review » Needs work

The last submitted patch, 3: 2699613.patch, failed testing.

Fabianx’s picture

+++ b/core/modules/page_cache/src/StackMiddleware/PageCache.php
@@ -312,6 +312,14 @@ protected function get(Request $request, $allow_invalid = FALSE) {
+      // Cache for an hour or the maximum age, which ever is the greater.
+      $expire = REQUEST_TIME + min($max_age, 3600);

I think this should be configurable at least via settings.php.

Also s/greater/lesser/?

catch’s picture

Also s/greater/lesser/?

Yeah I couldn't decide and changed it more than once ;)

Settings.php seems like an OK place to add the configuration yeah.

catch’s picture

catch’s picture

Status: Needs work » Needs review
FileSize
2.02 KB

Here it is with settings. Didn't look into the test failure yet.

Status: Needs review » Needs work

The last submitted patch, 8: 2699613.patch, failed testing.

alexpott’s picture

Status: Needs work » Needs review
FileSize
3.86 KB
3.66 KB

Fixed the test - the problem is with how $response->getMaxAge() works. I've also move the logic out of set() since I'm not sure it belongs there. Plus I think if the ttl is set to FALSE we shouldn't make any attempt to cache a 404. I also think we should debate whether or not making Drupal cache 404s is correct - since some form of cache busting always possible if we are.

I was also wondering whether 404s should be a cache redirect to a canonical 404 page - but maybe that would give us more problems.

catch’s picture

@alexpott so the use case for caching 404s at all is if you have a bad link somewhere (especially an image tag or similar), which isn't handled by the fast_404 handling - then the caching avoids having to build that page every time.

For me personally, I really dislike sites that redirect to a 404 page - if I've mis-typed a URL for example then look up at the address bar to see /404

catch’s picture

Status: Needs review » Needs work

Plus I think if the ttl is set to FALSE we shouldn't make any attempt to cache a 404.

Don't think that's right. If you're using page cache and no reverse proxy, you expect some page caching despite a 0 max_age.

Going from permament down to an hour is a much smaller change than going from permanent down to 0.

borisson_’s picture

I think this is related to #2472335: Support an independent (shorter) max age for 404/403 responses , while that had a different reasoning, I think the result of that issue is similar to the patch in here.

alexpott’s picture

It discussion with @catch I think we should consider calling this negative cache... see https://en.wikipedia.org/wiki/Negative_cache and http://www.squid-cache.org/Doc/config/negative_ttl/

Also I think rather then futzing stuff in PageCache we should consider setting an expires header in 404 generation.

alexpott’s picture

Also re the redirect idea - I didn't mean to do a page redirect. What I was trying to say is that with 404s we cache the same page over and over again - so should we try to only store one 404 page?

catch’s picture

What I was trying to say is that with 404s we cache the same page over and over again - so should we try to only store one 404 page?

Per #2699627: url.path cache context for breadcrumbs is unnecessarily granular the breadcrumbs can be different. Not that I think that's a good feature (could live without breadcrumbs altogether except on menchi katsu and kaki fry ;)) but currently we provide those if the path parents are valid paths - such as admin/config/development/abc

Similarly search_404 means different 404 pages for similar reasons.

So we'd either need to special case the page cache (but somehow allow search_404 to override the behaviour), or we'd need to generically support cache contexts - however that'll have significant overhead for cache retrieval including on non-404s.

I don't like using the expires header in page_cache much in general - it's almost deprecated at this point in favour of max age. I do think negative_cache is a good idea though - that's exactly what this is for, and we could then apply it to all url.path* cache contexts when there's a 404.

alexpott’s picture

Status: Needs work » Needs review
FileSize
3.77 KB
4.03 KB

After discussing with @catch here's an approach that renames the setting negative_cache_ttl and uses it for 403 and 404s.

catch’s picture

+++ b/core/modules/page_cache/src/StackMiddleware/PageCache.php
@@ -248,10 +249,30 @@ protected function fetch(Request $request, $type = self::MASTER_REQUEST, $catch
+        $max_age = $response->getMaxAge();

Why max age here but expires if it's not a 404?

Also should we be checking for ::isClientError()rather than specifically 403/404?

alexpott’s picture

  • Because @catch used in #8 :)
  • isClientError() makes a tonne of sense.
catch’s picture

Because @catch used in #8 :)

lol. I think I must have been projecting my dislike of the expires header, but we should probably use getExpires() so the only functional change is that we cache for an hour, or less if it was already going to be less.

alexpott’s picture

Actually I think it is a bug that we're not using getMaxAge everywhere here - because that falls back to return $this->getExpires()->format('U') - $this->getDate()->format('U');

alexpott’s picture

Discussed some more with @catch. The expires check in page cache is super confusing. By ignoring max-age PageCache is not a http compatible cache implementation and in fact it can be advantageous that it is not. It allows people to use a varnish implementation that does not support cache tags and have a small max-age whilst storing a permanently cache page in PageCache and clearing it using cache tags. Apparently the expires checking was implemented in #566494: Running cron in a shutdown function breaks the site?!?!

Therefore removing the max-age and not implementing an expires check for 4xx responses.

Wim Leers’s picture

The expires check in page cache is super confusing.

+1


  1. +++ b/core/modules/page_cache/src/StackMiddleware/PageCache.php
    @@ -248,10 +249,26 @@ protected function fetch(Request $request, $type = self::MASTER_REQUEST, $catch
    +      $settings_ttl = Settings::get('negative_cache_ttl', 3600);
    

    Adding a new setting… shouldn't this belong on \Drupal\system\Controller\Http4xxController::on404() in combination with #2352009: Bubbling of elements' max-age to the page's headers and the page cache?

    Of course, #2352009 will be hard to achieve due to BC. But I'd urge us to have a @todo here pointing to that issue, stating that this should be removed when we do that.

  2. +++ b/sites/default/default.settings.php
    @@ -420,6 +420,19 @@
    + * Cache TTL for client error (4xx) responses.
    

    Why TTL here, why not max-age?

alexpott’s picture

@Wim Leers / #24

  1. Because this is an implementation detail of PageCache not telling the outside world how to cache our 400s
  2. Because this is not about http max-age but about how PageCache behaves.

Basically page cache using max-age is separate and different issue.

catch’s picture

Yes page cache's use and non-use of max-age and use of expires is very confusing, but if we ignore the 'Expires' handling which was added for a very specific and unrepresentative use case and which is more or less undocumented (not to mention the expires header being near-deprecated in general), then we're left with it ignoring those values. I think we should leave all of this to #2352009: Bubbling of elements' max-age to the page's headers and the page cache, don't feel like it needs a @todo - that issue will completely change the implementation/functionality.

#23 looks like a good improvement on what I had in #10.

Wim Leers’s picture

#25: Excellent explanations! I'd like to see point 1 made more explicit then, because the setting you're adding in default.settings.php is not in any way explained to be Page Cache-specific. I'd like both the docs and the name to convey this. I think that'd make it non-confusing/non-ambiguous.

So, for the setting name:

+++ b/sites/default/default.settings.php
@@ -420,6 +420,19 @@
+# $settings['negative_cache_ttl'] = 3600;

s/negative_cache_ttl/page_cache_negative_cache_ttl/

or

s/negative_cache_ttl/page_cache_negative_ttl/

or

s/negative_cache_ttl/page_cache_4xx_ttl/

catch’s picture

The only thing with the rename is I think this is a concept we could use elsewhere.

For example render cache could set a shorter TTL for all cache entries written on 4* responses. I haven't opened an issue for that yet - not sure if it's a good idea, but don't think we'd want two different settings for it - more page_cache would end up piggybacking on the render cache setting (even if it's introduced here).

Wim Leers’s picture

But how do you know if something written to render cache (or any other cache for that matter) for a 4xx response is actually only relevant for 4xx responses? 4xx responses may just as well use cache/compute things that are relevant in other situations as well.

catch’s picture

We could add the negative_ttl to cache metadata from the route/url cache contexts on 4** where we know it's only going to be relevant there. This might help with the url cache context on form actions for example.

Even if we did it to all items, the worst that happens is the item expires and gets cached again - either by another 4** or permanently.

Wim Leers’s picture

Sounds interesting. I can't quite imagine how that'd work though.

Fine by me. But can we then at least put that rationale in the documentation (or the change record)?

catch’s picture

Added some extra docs, not sure if that's enough to clarify though. Also it's really at this point clarifying the existing design of the page cache rather than the behaviour added by the patch.

catch’s picture

alexpott’s picture

Issue tags: +Triaged D8 critical

Discussed with @catch, @effulgentsia, @xjm and @webchick. We decided to keep this critical since it is an avenue for DDOS and could result in data loss.

Berdir’s picture

  1. +++ b/core/modules/page_cache/src/StackMiddleware/PageCache.php
    @@ -244,14 +245,33 @@ protected function fetch(Request $request, $type = self::MASTER_REQUEST, $catch
    -    $date = $response->getExpires()->getTimestamp();
    -    $expire = ($date > time()) ? $date : Cache::PERMANENT;
    -    $this->set($request, $response, $expire, $tags);
    +    $expire = 0;
    +    // 403 and 404 responses can fill non-LRU cache backends and generally are
    +    // likely to have a low cache hit rate. So do not cache them permanently.
    

    Wondering if it would be worth it to still respect the Expires header, and only do the default 403/404 logic if no specific header is set.

    But probably not..

  2. +++ b/core/modules/page_cache/src/StackMiddleware/PageCache.php
    @@ -244,14 +245,33 @@ protected function fetch(Request $request, $type = self::MASTER_REQUEST, $catch
    +      // Cache for an hour by default. If the ttl is set to 0 then do not cache
    +      // the response
    +      $settings_ttl = Settings::get('negative_cache_ttl', 3600);
    

    Would it make sense to also respect this for the max-age header that we send out?

    I guess that could be a non-critial follow-up?

  3. +++ b/core/modules/page_cache/src/StackMiddleware/PageCache.php
    @@ -244,14 +245,33 @@ protected function fetch(Request $request, $type = self::MASTER_REQUEST, $catch
    +      if ($settings_ttl > 0) {
    +        $expire = REQUEST_TIME + $settings_ttl;
    +      }
    

    Not sure about our guideline for using or not using REQUEST_TIME.

    We started using $_SERVER['REQUEST_TIME'] in some cases, see e.g. \Drupal\Core\Cache\MemoryBackend::getRequestTime() and \Drupal\Core\Path\AliasManager::getRequestTime().

    AFAIK, that is to make unit testing easier, especially if we need to fake the current time. Doesn't apply here I think.

  4. +++ b/core/modules/page_cache/src/Tests/PageCacheTest.php
    @@ -375,6 +375,22 @@ function testPageCacheAnonymous403404() {
    +      $this->assertResponse($code);
    +      $this->assertEqual($this->drupalGetHeader('X-Drupal-Cache'), 'MISS');
    +    }
    

    Wondering how to test the default expiration time.

    Building the page cache cid for a given page is fairly easy, we could load the cache entry manually and check the expire timestamp.

Berdir’s picture

catch’s picture

I think only #3 and #4 aren't covered by the other issue.

I opened #2717207: Add a time service to provide consistent interaction with time()/microtime() and superglobals for #3. Agreed it doesn't matter here but the global constant won't be available to components in a subtree split and no point defining our own if PHP's got one. Would rather do that all at once and add a DrupalCS rule for it though, then deprecate the constant.

#4 ought to be doable.

alexpott’s picture

Here's the requested test.

Berdir’s picture

Status: Needs review » Reviewed & tested by the community

Thank you, I think this is good to go then?

Wim Leers’s picture

Status: Reviewed & tested by the community » Needs work

I think this looks good now, but there's too many nits left to be fixed on commit, so moving back to NW. Sorry.

  1. +++ b/core/modules/page_cache/src/StackMiddleware/PageCache.php
    @@ -244,14 +245,33 @@ protected function fetch(Request $request, $type = self::MASTER_REQUEST, $catch
    +    // via cache tags locally. They also ignore max age, since this allows the
    +    // local page cache to be treated differently from external proxies or CDNs.
    

    This is a bit misleading: some external proxies (e.g. Varnish) and some CDNs (e.g. Fastly and CloudFlare) also support cache tags. So they can use the same strategy.

  2. +++ b/core/modules/page_cache/src/StackMiddleware/PageCache.php
    @@ -244,14 +245,33 @@ protected function fetch(Request $request, $type = self::MASTER_REQUEST, $catch
    +      // Cache for an hour by default. If the ttl is set to 0 then do not cache
    

    Nit: s/ttl/TTL/

  3. +++ b/core/modules/page_cache/src/Tests/PageCacheTest.php
    @@ -374,6 +375,30 @@ function testPageCacheAnonymous403404() {
    +      // Ensure the expires date on the cache entry uses negative_cache_ttl.
    

    s/expires date/'expire' field on the cache item/

  4. +++ b/core/modules/page_cache/src/Tests/PageCacheTest.php
    @@ -374,6 +375,30 @@ function testPageCacheAnonymous403404() {
    +      $cache_entry = \Drupal::service('cache.render')->get($this->getUrl() . ':html');
    

    AFAIK we always call this $cache_item?

  5. +++ b/core/modules/page_cache/src/Tests/PageCacheTest.php
    @@ -374,6 +375,30 @@ function testPageCacheAnonymous403404() {
    +      // Account for any rounding errors by rounding the difference ot the
    

    s/ot/to/

  6. +++ b/core/modules/page_cache/src/Tests/PageCacheTest.php
    @@ -374,6 +375,30 @@ function testPageCacheAnonymous403404() {
    +      // nearest 10. As long as negative_cache_ttl is a multiple of 10 then this
    

    I don't understand this "multiple of 10" stuff?

  7. +++ b/core/modules/page_cache/src/Tests/PageCacheTest.php
    @@ -374,6 +375,30 @@ function testPageCacheAnonymous403404() {
    +      // Getting the 404 page twice you still result in a cache miss.
    

    s/you//

  8. +++ b/sites/default/default.settings.php
    @@ -420,6 +420,19 @@
    + * backends which do not support LRU can purge older entries. To disable caching
    + * set the value to 0. Currently applies only to page_cache module.
    

    s/To disable caching/To disable caching of client error responses/

mpdonadio’s picture

+++ b/core/modules/page_cache/src/StackMiddleware/PageCache.php
@@ -244,14 +245,33 @@ protected function fetch(Request $request, $type = self::MASTER_REQUEST, $catch
+      if ($settings_ttl > 0) {
+        $expire = REQUEST_TIME + $settings_ttl;
+      }
+    }

Wouldn't `$request->server->get('REQUEST_TIME')` make more sense here since it is available to avoid the global? Also a few lines down.

alexpott’s picture

Status: Needs work » Needs review
FileSize
5.19 KB
5.58 KB

re #40 / @Wim Leers (thanks for the review)

  1. Change the comment to not talk about reverse proxies / CDNs. Doesn't actually seem that relevant.
  2. fixed
  3. fixed
  4. fixed
  5. fixed
  6. Made more explicit and change commentary
  7. fixed
  8. fixed

re #41/ @mpdonadio (thanks for the review) - sure why not - we have the request.

I also improved a variable name.

catch’s picture

Assigned: Unassigned » effulgentsia
Status: Needs review » Reviewed & tested by the community

I'm not qualified to RTBC the whole patch since I worked on it, but I think I can RTBC the interdiff relative to #38 which was already RTBCed.

Assigning to @effulgentsia since ideally neither me nor Alex Pott commit this one and he said he'd try to look at it.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 42: 2699613-42.patch, failed testing.

catch’s picture

Status: Needs work » Reviewed & tested by the community

Re-test was fine, looks like a random fail.

effulgentsia’s picture

Assigned: effulgentsia » Unassigned
Status: Reviewed & tested by the community » Needs review
  1. +++ b/core/modules/page_cache/src/StackMiddleware/PageCache.php
    @@ -243,15 +244,33 @@ protected function fetch(Request $request, $type = self::MASTER_REQUEST, $catch
    +    // 403 and 404 responses can fill non-LRU cache backends and generally are
    

    Can 403 responses fill a cache? How?

  2. +++ b/sites/default/default.settings.php
    @@ -420,6 +420,20 @@
    + * Cache TTL for client error (4xx) responses.
    ...
    +# $settings['negative_cache_ttl'] = 3600;
    

    I read the comments in this issue, such as #15 and #28, but I'm not convinced this is a great name, because:

    1. Per #27, there's nothing in the name that scopes it to HTTP responses or renderings performed as part of an error response. What if we at some point decide to implement a negative cache for entity loads (i.e., cache that there is no entity for a given ID)? Should such caches and all other negative caches use the same TTL as for 4xx responses? A setting with this name would imply yes, but I'm not clear on why that would be desirable. For example, NSCD has a negative TTL setting, for which an hour would be way too long.
    2. If the intention behind the name is to imply a parallel with Squid and other web caches, then consider that Squid warns people that a non-0 negative_ttl violates HTTP spec. But the way we're using it in this patch, it does not violate spec, because of the 4xx-response cache tag that we're able to clear whenever something changes that might affect that response. So using a parallel name might actually create a more harmful implication in people's minds than a helpful association.
    3. Per my comment above, do we even want to use this setting for 403 responses? The title and IS of this issue is about 404. And the problem with respect to 404s for this issue isn't that they're negative responses, but that they're an unbounded set. But don't we also have other unbounded sets? For example, appending arbitrary query strings to URLs with 200 responses. So would a setting name that focuses more on unboundedness than on negativity be better?

Unassigning from myself pending feedback on the above.

catch’s picture

#1 - let's say you have a site with a million users, and anonymous users don't have permission to view user profiles. Someone could try to visit all 1m profiles via a script and fill your cache. Regular browsing and search engine crawling would not hit those profiles because they're not linked from anywhere.

You could argue that a site where anonymous users can see the profiles could also get the cache filled, but at least in that case the caching is likely to have a higher hit rate. So I think the comment is valid but interested what you think after the explanation.

#2 - I don't have a strong opinion on this, are you suggesting $unbounded_cache_ttl? Or we could do $client_error_cache_ttl (although that makes it sound like we're controlling the client like max_age which we're not).

catch’s picture

Assigned: Unassigned » effulgentsia
Status: Needs review » Reviewed & tested by the community

Back to Alex - could use concrete suggestions for something other than $negative_cache_ttl

Berdir’s picture

I think a more interesting scenario is the one from the meta issue. Migrating an existing site to Drupal 8, and possibly not migrating all features/content.

And then google wakes up and notices the big change and starts to checks all the old URL's. We have tens of thousands of requests due to that.

effulgentsia’s picture

Status: Reviewed & tested by the community » Needs review

Re #47.1: Yep, thanks for that example.

Re #47.2: My current thinking is:

  • If we think we'll want to scope this setting to only cache items (full responses in this issue plus fragments in #2714227: Use negative_ttl for render cache/url cache context on 404s) specific to 4xx conditions, then I suggest naming it 4xx_cache_ttl.
  • If we think we'll want to use the same setting for per-URL cache items generated for URLs that are valid and accessible, but include arbitrary query strings, then I suggest naming it unbounded_url_cache_ttl.
alexpott’s picture

I'm ok 4xx_cache_ttl - does what it says on the tin. However $4xx_cache_ttl does not work as a php variable so I guess we should go with 'cache_ttl_4xx'. I introduced the idea of negative_cache_ttl in #15. I still think 404 is a failure and this is a negative cache but the name is still meaningful.

In discussion with @dawhener we realised that we are using time(), REQUEST_TIME and $request->server->get('REQUEST_TIME'); - that's insane. So We've standardized on $request->server->get('REQUEST_TIME');

dawehner’s picture

+++ b/core/modules/page_cache/src/StackMiddleware/PageCache.php
@@ -243,15 +244,33 @@ protected function fetch(Request $request, $type = self::MASTER_REQUEST, $catch
+      $expire = ($date > time()) ? $date : Cache::PERMANENT;
...
+    if ($expire === Cache::PERMANENT || $expire > REQUEST_TIME) {

Things are quite inconsistent / unreadable here. Let's use one variable for a consistent time and not deal with relativity theory.

dawehner’s picture

Status: Needs review » Reviewed & tested by the community

Thank you alex!

mpdonadio’s picture

In discussion with @dawhener we realised that we are using time(), REQUEST_TIME and $request->server->get('REQUEST_TIME'); - that's insane. So We've standardized on $request->server->get('REQUEST_TIME');

Thank you. :)

catch’s picture

$cache_ttl_4xx is fine with me, and thanks for the request time standardization. See #2717207: Add a time service to provide consistent interaction with time()/microtime() and superglobals for more request time fun.

effulgentsia’s picture

Thanks. I'm pretty happy with this patch now, and am tempted to commit to 8.2.x. However,

+++ b/core/modules/page_cache/src/StackMiddleware/PageCache.php
@@ -243,15 +244,34 @@ protected function fetch(Request $request, $type = self::MASTER_REQUEST, $catch
-    $date = $response->getExpires()->getTimestamp();
-    $expire = ($date > time()) ? $date : Cache::PERMANENT;
+    if ($response->isClientError()) {
...      (nothing that checks getExpires())
+    }
+    else {
+      $date = $response->getExpires()->getTimestamp();
+      $expire = ($date > $request_time) ? $date : Cache::PERMANENT;
+    }

I'm nervous about committing this to 8.1. #26 is a reasonable comment, but are we sure we want to apply that reasoning to the production branch? Especially if there's a tag released between this getting committed and #2352009: Bubbling of elements' max-age to the page's headers and the page cache?

catch’s picture

#2352009: Bubbling of elements' max-age to the page's headers and the page cache I definitely wouldn't put into a patch release, possibly not even a minor without a configuration option.

This issue I can see either way for 8.1.x or not, but it'd be good to get it into 8.2.x in case anyone is testing that branch, even if it waits longer for 8.1.x

effulgentsia’s picture

Ticking credit boxes.

  • effulgentsia committed cb310b6 on 8.2.x
    Issue #2699613 by alexpott, catch, Wim Leers, Berdir, dawehner,...
effulgentsia’s picture

Status: Reviewed & tested by the community » Needs review

Pushed to 8.2.x.

Setting to NR for 8.1.x. Per #56, I think for that branch we might want to add the Expires header checking for the 4xx responses as well, for BC for the 0-2 production sites that might be relying on it. If we agree to do that, might make sense to add that to 8.2 as well to keep them in sync, unless #2352009: Bubbling of elements' max-age to the page's headers and the page cache lands before then.

catch’s picture

I don't think we should add any more support for the Expires header than we already have - the behaviour is completely undocumented.

alexpott’s picture

I agree with @catch - especially considering max-age is set later on (if configured) and preferred by browsers and reverse proxies.

alexpott’s picture

Ticked credit box for @effulgentsia - since his reviews have definitely influenced the direction of this patch.

effulgentsia’s picture

I don't think we should add any more support for the Expires header than we already have

Ok, but currently in 8.1, an Expires header can control the expiration of all page cache entries: whether 2xx or 4xx. Cherry-picking #59 to 8.1 would mean that a production site would see the Expires header no longer controlling the expiration of 4xx page cache entries. So this isn't about adding "any more" support. It's about whether we're ok with removing existing functionality in a patch release.

the behaviour is completely undocumented

Yes, but developers can read code. If as a developer, you've had a reason to want an expiration for a given page cache entry, you surely would have been able to step through the code and end up finding the PageCache::fetch() method, and seen not only the code for how $expire is determined, but even the code comment of Get the time expiration from the Expires header.

I agree with @catch - especially considering max-age is set later on (if configured) and preferred by browsers and reverse proxies.

Yes, a developer finding the above would surely be puzzled/annoyed that you need max-age for browsers and reverse proxies and Expires for Drupal's page_cache.module. And like with other things one gets puzzled/annoyed by, would eventually accept that that's how Drupal core works, and provide both headers in order to control both things. How does changing page_cache.module to no longer obey a shorter Expires header than $cache_ttl_4xx for 4xx responses in a patch release help those developers? Especially, when fixing page_cache.module to use the max-age header is not yet committed to 8.1, and per #57 probably won't be.

catch’s picture

OK to put it differently, I'm personally not going to work on adding a bc-layer for the Expires header for 8.1.x, I'd be open to moving this back to 8.2.x and marking it fixed there as an alternative.

If someone else wants to pick this up to add a bc-layer for Expires to 8.1.x then they should feel free to, but let's not then forward-port that to 8.2.x

alexpott’s picture

Version: 8.1.x-dev » 8.2.x-dev
Status: Needs review » Fixed

So let's do the simple thing and mark this fixed. I'm slightly concerned about 8.1.x sites being vulnerable to this but if it becomes an issue we can revisit the decision. I think one thing that speaks for ignoring the the expires header is what is the worst that can happen? Nothing it going to be broken.

effulgentsia’s picture

I think one thing that speaks for ignoring the the expires header is what is the worst that can happen? Nothing it going to be broken.

If an 8.1 site currently has some particular URL that returns a 4xx response with a short Expires header (e.g., <1 hour) for anonymous users, and that response then changes after that expiration time (e.g., maybe becomes not a 4xx due to some time-sensitive condition), with no corresponding clearing of the general 4xx-response cache tag or any other cache tag occurring (because if the condition is based on time, then you don't need a cache tag, right), then currently, such a site works as designed, whereas cherry picking #59 to 8.1.x would make such a URL return its old 4xx response for the global $cache_ttl_4xx duration, even if that's longer than the URL-specific time-based condition. This might be an edge case that 0 production sites currently have, but we don't know if that's the case. If such a site has this edge case, then for that site, this would appear as something broken.

Happy to leave this fixed for 8.2 though, and only revisit the BC requirement for 8.1 if we decide we need this in 8.1.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

effulgentsia’s picture

Assigned: effulgentsia » Unassigned
Issue tags: +8.2.0 release notes

Retroactively unassigning myself (I should have done that in #60).

Also tagging for a release notes mention.