Problem/Motivation
In #2158003-62: Remove Block Cache API in favor of blocks returning #cache with cache tags, "Forever" was introduced as an option for the Page cache maximum age setting on the system performance form. If the option is selected, cacheable pages will be delivered with a Cache-Control: public, max-age=31536000, i.e. every browser and intermediary proxy is instructed to cache the response for up to one year.
The following code comment introduced into drupal_serve_page_from_cache() points out the motivation behind this change:
// RFC 2616, section 14.21 says: 'To mark a response as "never expires," an
// origin server sends an Expires date approximately one year from the time
// the response is sent. HTTP/1.1 servers SHOULD NOT send Expires dates more
// than one year in the future.'
In another issue which I regrettably cannot find atm, it has been pointed out that Google suggests a maximum of one year for non-expirable content:
Set caching headers aggressively for all static resources. [...] Set Expires to a minimum of one month, and preferably up to one year, in the future. [...] Do not set it to more than one year in the future, as that violates the RFC guidelines. [...]
However the cited RFC-section as well as the Google recommendation are referring to the Expires header, which is a totally different beast than Cache-Control header. It is very important to understand that it is only safe to use the Expires header on static content (i.e. aggregated JavaScript with a fingerprint in the URL). As soon as it is necessary to add a Vary header on the response (e.g. Vary: Cookie), the Expires header needs to be set to a date in the past. This is because Expires is HTTP/1.0 and Vary was only introduced in HTTP/1.1.
I sincerely cannot think of a situation where anybody would like to instruct browsers/proxy caches to store HTML content for more than a couple of hours. It would be insane to deliver any markup with a Cache-Control: max-age of one year. I guess this option was introduced by accident.
Proposed resolution
Remove the option.
Remaining tasks
User interface changes
API changes
| Comment | File | Size | Author |
|---|---|---|---|
| #12 | Screen Shot 2019-10-31 at 12.44.52 PM.png | 381.24 KB | alberto56 |
| #1 | 2257807-never-ever-forever.diff | 1.84 KB | znerol |
Comments
Comment #1
znerol commentedComment #2
wim leersFirst: AWESOME patch name :D
Blocks are cached inside Drupal, and are hence not governed by HTTP-defined maximum cache durations. That's why Drupal has the notion of "cache permanently", which effectively means "forever". It is logically possible to cache a Drupal block forever — even though that would probably never happen in practice (e.g. when a module is installed, all caches are cleared). That's why "Forever" (corresponding to
CacheBackendInterface::PERMANENT) is an option for a block's "maximum age" setting.The reason pages also received this option is simple: because we wanted consistency in the available choices for the "maximum age" setting across blocks and pages.
You're right that this makes little sense for pages though. The "1 day" maximum age is probably more than sufficient, for the reasons you cite. Even more so, because for pages, Drupal explicitly supports HTTP revalidation (
If-Modified-SinceandIf-None-Match). So I'd be fine with removing this option from the UI.However, I think that the check to ensure the maximum age never exceeds one year is still necessary, because people might modify their config directly to set values higher than one year, which would then still lead to wrong headers. Thoughts?
Comment #3
znerol commentedThe RFC as well as Google page speed document refer to the
Expiresheader, however we're enforcing themax-agedirective of theCache-Controlheader. This just looks wrong.Also I cannot think of a use-case where it would be necessary to specify an insanely big default max-age value for rendered markup. This is certainly different for static assets but that piece of code never is run for those.
Therefore I'd prefer to remove this rather confusing check.
Comment #4
wim leersRE:
Expiresvs.Cache-Control: great point!No remaining objections, then. Let's get this in!
Comment #5
catchI think it's reasonable to set this up to a week - if you have a lot of pages + infrequent updates + cache tag support working how it should with varnish then why not cache for a long time. However yeah, not for a year. Committed/pushed to 8.x, thanks!
Comment #7
znerol commentedI'd rather recommend to just set the max-age to the value which should be propagated to the client (e.g. a couple of minutes) and instead increase the
beresp.ttlfrom withinvcl_fetch, but only on those responses with theX-Drupal-Cache-Tagsheader. This approach is robust insofar that responses lacking the cache-tags header for some reason will not be cached indefinitely. It also avoids problems with stale browser caches when users accidentally bypass Varnish (due to misconfiguration or changes in the setup, e.g. when SSL traffic is routed around the reverse proxy).Set max-age to the maximum you consider acceptable for caches you don't have control over (public proxies, browser cache). Increasing it over that limit is dangerous.
Comment #8
catchWell we also allow max-age to be overridden dynamically by different requests.
So for that approach I'd normally set Varnish to respect max-age from Drupal, but then rewrite the value for the responses it serves to be shorter. Otherwise beresp.ttl can't be affected by Drupal at all.
Comment #9
znerol commentedThink about the reason why it is currently desirable to enforce a
beresp.ttlfrom within Drupal (7). In my opinion this is due to the lack of an easy-to-use and reliable mechanism capable of selectively purging pages from the gateway cache when content changes (it can be challenging to set up the respective contrib modules on a complex site). However, with cache-tags this restriction is gone and with it also the necessity to specify differentberesp.ttlvalues in Varnish. In fact storing anything with e.g.beresp.ttl = 1yregardless of the max-age value returned by Drupal corresponds exactly to the behavior implemented by the internal page cache since #730060: Replace CACHE_TEMPORARY, cache_clear_all() and minimum cache lifetime with cache tags support.Comment #11
alberto56 commentedHere are a two examples:
Right now, in the Acquia article mentioned above, Acquia explicitly suggests setting the max-age to more than one day:
drush @site.env config-set -y system.performance cache.page.max_age 2764800So Acquia, at least, thinks there is a usecase for this. If you run the above command then visit /admin/config/development/performance, it will show "No Caching" as being selected. If you save, the config will be reverted to zero instead of 2764800.
Comment #12
alberto56 commentedHere is another usecase: the Purge module gives a warning if the TTL is less than a month, as in the enclosed image.