Problem/Motivation

In #2752961: No reliable method exists for clearing the Twig cache, we're evaluating how to reliably clear the Twig cache upon deploying to production in a multi-webhead environment. To achieve that, the proposed resolution involves rebuilding Drupal caches and invalidating all URLs with the rendered cache tag if there's e.g. Varnish and/or a CDN in front.

Since Drupal 8 caches a lot more things than Drupal 7 and cache tags now allow changes to show up immediately on a site, even with Varnish or a CDN in front (see the Purge API refactoring for D8), it's a shame we have to resort to such expensive operations when e.g. only a minor cosmetic change is made to a Twig template.

Proposed resolution

Upon deploying to production, we should find a less expensive way to only clear out what really needs to be cleared (e.g. render cache) and also make it easy for e.g. contrib to hook into Drupal and invalidate the rendered tag (or any other tag that would make sense and ideally be less 'global') when a cache rebuild has been initiated.

Remaining tasks

Discuss and evaluate options.

User interface changes

None.

API changes

None anticipated.

Data model changes

None.

Comments

anavarre created an issue. See original summary.

nielsvm’s picture

In my opinionated opinion, I see only one sustainable answer to the problem in question:

  • Removal of the "Clear all caches" button.
  • Removal of the drush cr command.
  • Consider removal of API functions like drupal_rebuild.

I know that what I'm proposing sounds (and perhaps smells) pretty radical, but the truth of the matter is that many Drupalists continue to clear all caches upon every single deployment and it isn't even exceptional to see this being a standard part of continuous integration scripts. However, in real life no two deployments are the same, and many different kinds of operations happen after a code deploy, for example:

  • Module installation.
  • Module uninstallation.
  • Changes to Twig template files.
  • New plugins became available.
  • Removal of plugins.
  • Features that changed (drush fra).
  • Config changes.

Once the ability to destroy your sit.... clear all caches has been taken away, the individual actions (and also Drush commands and UI functionality) that lead to changes or effects requiring specific cache clears should do exactly that: clear the relevant tags. Sometimes this will be rendered, but more often than not it will be narrower than that.

In a way the purge API is just an extension to core's caching system, and as long as we're culturally accepting of the fact that clearing all caches is a normal thing to do (it's on t-shirts!), we'll continue to see sites going down for tiny deployments like template changes. Obviously I understand that it is pragmatic to do more wide-sweeping clearances (e.g. with rendered) because it would otherwise be too complicated to accomplish, but that is still better than wiping out all caches at the same time.

One intermediate scenario that I could think of is allowing caches to be cleared as long as development.services.yml is loaded, but not otherwise.

Wim Leers’s picture

Issue tags: +Twig, +D8 cacheability

To achieve that, the proposed resolution involves rebuilding Drupal caches and invalidating all URLs with the rendered cache tag if there's e.g. Varnish and/or a CDN in front.

[…] it's a shame we have to resort to such expensive operations when e.g. only a minor cosmetic change is made to a Twig template.

We can quite easily solve this if we assign a cache tag to every theme hook, so that whenever the corresponding Twig template is modified (or added, in case it did not exist, or in case a more specific one is added), that cache tag is invalidated. But that's going to mean dozens more cache tags per HTML response. This is a hard requirement to do what you describe.
Adding the Twig and D8 cacheability issue tags.


but the truth of the matter is that many Drupalists continue to clear all caches upon every single deployment and it isn't even exceptional to see this being a standard part of continuous integration scripts

Indeed. But that's D9 material at best.

The problem is once again the same: dependencies, dependencies, dependencies. I've been talking about this for years: dependencies, dependencies, dependencies! Yes, it's annoying. But without it, we can't actually have a reliable system. There's a reason that modules and configuration specify dependencies. I was not involved with either of them. But on so many other levels, we're lacking dependency metadata.

Wim Leers’s picture

Title: Make the deployment process less expensive in a multi webheads environment » Deploying Twig template changes is too expensive: it requires all caches to be completely invalidated, as well as all reverse proxies

Clarifying issue title.

moshe weitzman’s picture

@wim - can you elaborate on the cache-tag-per-theme-hook idea? How would drupal learn that fivestar.whatever.twig has changed for a given deployment?

This problem isn't just limited to Twig files. When Fivestar.php changes its code, it can affect rendered output as well. Are we going to have cache tags for PHP files? I think most would say that this is too much. There is a risk here that curing this problem adds more complexity than it's worth. Most people don't mind a cache clear when deploying a new footer.twig, for example.

Wim Leers’s picture

#5: First: the Twig question.

We'd need a cache tag for the precise invocation + associated code. For example:

$field_name = 'foo';
…
'#theme' => 'links__entity__comment__' . $field_name,

(from \Drupal\comment\CommentLinkBuilder::buildCommentedEntityLinks())

This would need a twig:links__entity__comment__foo cache tag that would be invalidated whenever:

  1. there is a change in which Twig template is used for this, e.g. if a subtheme is overriding a base theme template
  2. the underlying Twig template is modified via code deployment, i.e. when $settings['deployment_identifier'] changes
  3. the underlying Twig template is modified via local development; this would only work when twig.config.auto_reload is enabled (or debug of course)

I think this strategy would work for individual template changes.
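As a rough illustration of the idea above, here is a minimal plain-PHP sketch that works out which twig:* tags a deployment would have to invalidate. It is not core code. The helper names and the "theme hook => template hash" bookkeeping are assumptions made for this example; the thread does not say how Drupal would actually detect a changed template.

```php
<?php

/**
 * Hypothetical helper: derive the cache tag for a theme hook.
 */
function twigCacheTag(string $themeHook): string {
  return 'twig:' . $themeHook;
}

/**
 * Hypothetical helper: list the tags to invalidate after a deployment.
 *
 * Both arguments map a theme hook name to a hash of the template that
 * renders it (an assumed bookkeeping format, not an existing Drupal API).
 */
function twigTagsToInvalidate(array $before, array $after): array {
  $tags = [];
  foreach (array_keys($before + $after) as $hook) {
    // A hook is affected if its template was added, removed, or changed.
    if (($before[$hook] ?? NULL) !== ($after[$hook] ?? NULL)) {
      $tags[] = twigCacheTag($hook);
    }
  }
  sort($tags);
  return $tags;
}
```

A deployment step could pass the resulting list to whatever invalidates tags in Drupal and in the reverse proxy. The sketch only covers templates whose content changed or that appeared or disappeared; it does not detect PHP changes that alter which theme hook is used.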

Wim Leers’s picture

#5 Now the general code question.

You're absolutely right, this would not be manageable. Not even remotely manageable. Because a code deployment change could lead for example to

'#theme' => 'links__entity__comment__' . $field_name,

being modified to

'#theme' => 'links__entity__comment__BLAH__' . $field_name,

Result: 💥 — there's no way for us to know that suddenly there is a different cache tag at play for this particular page. This is impossible.

Wim Leers’s picture

Are we going to have cache tags for PHP files? I think most would say that this is too much. There is a risk here that curing this problem adds more complexity than it's worth.

I could not agree more. Sorry for not making that more clear in #3.

Basically, this issue is about the following problem:

I have a codebase that has a single entry point (index.php), with many routes/URLs, and with a reverse proxy in front of it.

Therefore any code in the codebase could result in changes to the responses for one route/URL, or for all routes/URLs. I want to ensure that changes in my code that only affect responses for 1 or a few routes/URLs do not require all cached responses to be invalidated.

There are only two possible solutions:

  1. For every code deployment, also determine ahead of time which routes/URLs would need to be invalidated. Put this in a YAML/JSON/blah file and let deployment invalidate those responses in the reverse proxy.
  2. Ensure that you have dependency data for every response that tracks all data that is involved (we already have cache tags for that), but also all code that is involved (that's what I described in #3).

Clearly, both approaches are mind-bogglingly onerous. Which is why I think this is not feasible. And indeed a cache clear is going to be the only feasible solution complexity-wise.
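To make option 1 concrete, here is a minimal sketch of the mechanical half only: turning a hand-written manifest into the list of URLs a deployment script would ask the reverse proxy to invalidate. The JSON layout (a "paths" key) and the function name are assumptions for illustration. Producing a correct manifest for every deployment is the onerous part and is not addressed here.

```php
<?php

/**
 * Hypothetical helper: expand a deployment manifest into absolute URLs.
 *
 * Assumed manifest format: {"paths": ["/", "/node/1"]}.
 */
function urlsToPurge(string $manifestJson, string $baseUrl): array {
  // Throw on malformed JSON rather than silently purging nothing.
  $manifest = json_decode($manifestJson, TRUE, 512, JSON_THROW_ON_ERROR);
  $urls = [];
  foreach ($manifest['paths'] ?? [] as $path) {
    // Normalise slashes so base URL and path join with exactly one "/".
    $urls[] = rtrim($baseUrl, '/') . '/' . ltrim($path, '/');
  }
  return $urls;
}
```

How each URL is then invalidated (an HTTP PURGE or BAN request, or a CDN API call) depends on the proxy in use.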

anavarre’s picture

#3

if we assign a cache tag to every theme hook, so that whenever the corresponding Twig template is modified (or added, in case it did not exist, or in case a more specific one is added), that cache tag is invalidated. But that's going to mean dozens more cache tags per HTML response. This is a hard requirement to do what you describe.

Understood. With 16M HTTP headers across the stack (Apache, nginx, Varnish, and potentially an external CDN/reverse proxy) you can still sometimes get into 'too large HTTP headers' WSODs. So I agree that it'd be a hard requirement and would likely put users at an even greater risk of running into trouble, at least until #2241377: [meta] Profile/rationalise cache tags lands.

#8

And indeed a cache clear is going to be the only feasible solution complexity-wise.

Take this very plausible use case:

  • Drupal site powers an e-commerce solution
  • It's Black Friday and you have a very important discount to push to prod
  • Traffic is crazy high, but you've anticipated the spike in traffic by adding more web servers. So far so good.
  • Upon deployment, you need to rebuild caches and have no mechanism easily available to clear out outdated cache tags, thus you decide to flush Varnish cache
  • A couple minutes after, your site goes down because you have no Varnish caching available and no Drupal caches to protect you either
  • The most important day for your business now becomes an epic failure, and downtime puts you at risk of losing significant revenue and trust from your customers

I understand it's a very complex problem to solve and there's not an immediate answer to doing things safely and easily, but if we put ourselves in the customer's shoes, this is not acceptable and makes cache tags moot for critical business operations.

dawehner’s picture

It's not only that, you don't even know easily which Twig template changed. You'd need to know exactly the code before and after, which is so much more than what we usually do.

On top of that, this really just tackles template changes. There are all these other kinds of changes, arbitrary changes in PHP code. There is no system by which we can determine which code was involved in rendering a part of the page.

For me the solution is rather a different deployment/caching strategy, as in, use the old caches and then slowly build up the new ones.

Wim Leers’s picture

use the old caches and then slowly build up the new ones.

Yes. stale-while-revalidate: https://www.fastly.com/blog/stale-while-revalidate-stale-if-error-availa....

BTW, the way companies like Facebook solve this, is by making users stick to a certain server (or in their case, server rack). They deploy to production all the time. But they roll it out in batches. And they ensure their data model only changes in an additive way, with the code handling the case of field X not yet being defined (fall back to default value, or "edit X" link). That way, the code version does not need to be in sync with the data schema/config schema version anymore.
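The "additive only" rule described above can be illustrated with a small plain-PHP sketch: new code reads a field that older stored data may not have yet and falls back to a default instead of failing. The function and field names are hypothetical.

```php
<?php

/**
 * Hypothetical helper: read a field that older records may not define yet.
 */
function readField(array $record, string $field, $default) {
  // array_key_exists() keeps an explicitly stored NULL distinct from "missing".
  return array_key_exists($field, $record) ? $record[$field] : $default;
}

// A record written before the (hypothetical) "discount_label" field existed.
$oldRecord = ['title' => 'Blue shirt'];
// A record written after the field was added.
$newRecord = ['title' => 'Blue shirt', 'discount_label' => 'Black Friday'];
```

With this pattern, servers running the old and the new code can read the same data while a batched rollout is in progress.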

dawehner’s picture

Yes. stale-while-revalidate: https://www.fastly.com/blog/stale-while-revalidate-stale-if-error-availa....

This is super nice, thank you for the suggestion!

BTW, the way companies like Facebook solve this, is by making users stick to a certain server (or in their case, server rack)

I would have expected datacenter :P

Wim Leers’s picture

I would have expected datacenter :P

:D They're big, but not that big that each data center is only 1-2 percent of their users :P

anavarre’s picture

#11

Wow, I didn't know about the stale-while-revalidate HTTP Cache-Control extension!

https://tools.ietf.org/html/rfc5861 states that:

The stale-while-revalidate HTTP Cache-Control extension allows a cache to immediately return a stale response while it revalidates it in the background, thereby hiding latency (both in the network and on the server) from clients.

Looks very, very promising, and it would be the one thing that wouldn't turn me off when it comes to deploying and wiping caches like we're currently forced to do.

Unfortunately browser support is far from being a reality I'm afraid. To talk only about Chrome (since it has the most market share), they've stopped working on it ATM :-(

Wim Leers’s picture

#14: the point is that you can use stale-while-revalidate in Surrogate-Control to control your reverse proxies. This means you can deploy it today to make your infrastructure work more efficiently.

Also being able to use this all the way to the client would be wonderful, but is not essential. With the client, you can work with a low max-age, to ensure speedy updates for clients.
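A minimal sketch of the header split described above: a long lifetime plus stale-while-revalidate for the reverse proxy via Surrogate-Control, and a short max-age for clients via Cache-Control. The function name and the numbers are assumptions for illustration. Whether a proxy honours stale-while-revalidate in Surrogate-Control depends on the product, so check its documentation first.

```php
<?php

/**
 * Hypothetical helper: build caching headers for proxy and client.
 *
 * @param int $proxyMaxAge  Seconds the reverse proxy may treat the response as fresh.
 * @param int $staleWindow  Seconds it may keep serving it stale while revalidating.
 * @param int $clientMaxAge Seconds a browser may cache the response.
 */
function buildCachingHeaders(int $proxyMaxAge, int $staleWindow, int $clientMaxAge): array {
  return [
    // Read by the reverse proxy or CDN.
    'Surrogate-Control' => sprintf('max-age=%d, stale-while-revalidate=%d', $proxyMaxAge, $staleWindow),
    // Read by the client; kept low so updates reach users quickly.
    'Cache-Control' => sprintf('public, max-age=%d', $clientMaxAge),
  ];
}
```

For example, buildCachingHeaders(86400, 3600, 60) lets the proxy cache a page for a day, serve it stale for up to an hour while it refetches in the background, and limits browsers to one minute.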

bonus’s picture

Do we want to consider adding support for stale-while-revalidate or stale-if-error statements to Cache-Control headers in the response from core?

Wim Leers’s picture

We may want to consider adding support for it to the page_cache module. But let's not forget that on the wider web, pretty much only Fastly supports it out of the box.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.