Problem/Motivation

Twig caching produces a directory structure looking like the following:

sites/default/files/php/twig
  - 5915e35d823c3_container.html.twig_5ytYVUs_gdkx819cOy1xqJ2Z_
  - 5915e35d823c3_datetime-form.html.twig_F5e-roY2x9_yW4qxQPvi924S5
  ...

These file names consist of

  • The prefix, generated by uniqid() every time the available Twig extensions change (which is triggered by module install),
    and as such problematic on multi-webhead environments. This prefix is already stored in State, so it is shared across all webheads.
  • The twig filename
  • Protection provided by PHP file storage
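Putting those three parts together, the naming scheme can be sketched as follows. This is illustrative only: the real protection suffix is computed by core's PHP file storage, and the HMAC and $secret value below are assumptions, not core's actual algorithm.

```php
<?php

// Illustrative sketch of how a compiled-template filename such as
// "5915e35d823c3_container.html.twig_5ytYVUs..." is assembled.
// The real suffix is computed by core's PHP file storage; the HMAC
// below and the $secret value are assumptions for illustration.
$prefix = uniqid();                 // regenerated when the Twig extensions change
$template = 'container.html.twig';  // the source template's filename
$secret = 'site-hash-salt';         // hypothetical PhpStorage secret

$protection = substr(hash_hmac('sha256', $template, $secret), 0, 25);
echo $prefix . '_' . $template . '_' . $protection . "\n";
```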

The current method of clearing the Twig cache (as seen in drupal_flush_all_caches()) is this:

  // Wipe the Twig PHP Storage cache.
  PhpStorageFactory::get('twig')->deleteAll();

By default (see \Drupal\Core\PhpStorage\PhpStorageFactory::get()), it is stored in the public:// stream wrapper, specifically in public://php/:

    if (!isset($configuration['directory'])) {
      $configuration['directory'] = PublicStream::basePath() . '/php';
    }

So, by default, it is written to public://, which by definition is shared across all webheads.

However, on hosting environments that use multiple web heads (such as Acquia Cloud Enterprise), the public:// stream wrapper is backed by a high-latency distributed file system. This is fine performance-wise for downloadable files (because they can be cached by a reverse proxy such as Varnish), but not for files that are required tens if not hundreds or thousands of times per second, which is the case for compiled Twig templates. Even when those files are cached in PHP's OPcache, the file system is still checked to see whether the file has changed, resulting in many file system hits. If that file system has high latency, this leads to terrible performance.

So, Acquia Cloud Enterprise, like many other Drupal hosting platforms, does something like this in settings.php:

$settings['php_storage']['twig'] = array(
  'directory' => '/some/writable/local/directory',
);

The result is that on hosting platforms that do this, this code:

  // Wipe the Twig PHP Storage cache.
  PhpStorageFactory::get('twig')->deleteAll();

really only clears the local directory that stores the Twig cache, on the current webhead only.

The root cause of this problem is quite simple: drupal_flush_all_caches() assumes that the Twig PHP Storage Cache's directory is shared across web servers.

Proposed resolution

Either:

  1. use $settings['deployment_identifier'] to key compiled Twig templates by; having multi-webhead hosting platforms set this to the git commit hash would then solve it (#5 (catch) + #8 (Wim Leers) + #14 (Berdir))
  2. add a Twig::invalidate() API that drupal_flush_all_caches() would call, which clears the state variable we already have. This enforces a recalculation of the prefix across all webheads (#7 (dawehner) + #10 (msonnabaum)).
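For solution 1, a multi-webhead platform would do something like the following in settings.php. The DEPLOY_GIT_SHA environment variable name is hypothetical; each platform would use whatever mechanism exposes its current commit.

```php
// settings.php sketch for solution 1: key compiled Twig templates by the
// deployed git commit. 'DEPLOY_GIT_SHA' is a hypothetical variable name.
$settings['deployment_identifier'] = getenv('DEPLOY_GIT_SHA') ?: \Drupal::VERSION;
```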

Concerns for solution 1 are that nobody knows about $settings['deployment_identifier']; the general response to that is that you already need to know about that anyway (see #14).

Concerns for solution 2 are that it makes PHPStorage dependent on State (or Cache).
This is no longer a valid concern, as it already depends on it.

A concern that applies to both is that stale files can linger forever (#48 + #49 + #52). This is impossible to solve though; Drupal can't be responsible for deleting those stale files, because it may require a long-running process to delete potentially many thousands of files. Therefore, it must be up to each hosting platform to delete those files in a background job. #49 contains one possible approach (the one pioneered by Pantheon).

Remaining tasks

User interface changes

None.

API changes

None.

Data model changes

None.

Comments

msonnabaum created an issue. See original summary.

cilefen’s picture

dsouza_rohan’s picture

Wim Leers’s picture

catch’s picture

We should at least use $deployment_identifier to key twig templates by.

catch’s picture

Also, general note: I agree with splitting this out from the phpstorage issue - there are things like $deployment_identifier we can do that don't require changes to phpstorage itself.

dawehner’s picture

In order to invalidate information across multiple webheads one needs a way to share information in an easy way. Much like we are now using the cache backend for the container, we could use a state => cache entry which is used by the files:

  • Twig::invalidate() => generates a new random value for the twig hash, invalidates the cache, stored in state
  • On runtime the phpstorage uses this hash for its hash loading, potentially cached, so no DB query is required

Would that simply work?
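A minimal sketch of that idea (not core's actual API; class and key names are made up): a single shared store holds the current hash, invalidation rotates it, and every webhead derives its prefix from the shared value.

```php
<?php

// Sketch only: stand-in for Drupal's State API, shared by all webheads.
final class SharedState {
  private array $data = [];
  public function get(string $key): ?string { return $this->data[$key] ?? NULL; }
  public function set(string $key, string $value): void { $this->data[$key] = $value; }
}

final class TwigPrefix {
  public function __construct(private SharedState $state) {}

  // "Twig::invalidate()": rotate the shared hash; all webheads see it.
  public function invalidate(): void {
    $this->state->set('twig_hash', uniqid('', TRUE));
  }

  // At runtime, use the shared hash as the compiled-template prefix.
  public function current(): string {
    if ($this->state->get('twig_hash') === NULL) {
      $this->invalidate();
    }
    return $this->state->get('twig_hash');
  }
}
```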

Wim Leers’s picture

Why not just deployment identifier?

What you say can work too (the random hash would be cleared in a single shared location and would therefore affect all web heads), but what's the benefit?

IMO the benefit of a deployment identifier is that you also prevent race conditions when two web heads are not updated at the exact same time (eg a few seconds apart).

dawehner’s picture

Why not just deployment identifier?

Well, of course the deployment identifier works for people who are already aware of it, but the fact that people run into this problem shows that they don't know about it.
This issue is potentially about solving the problem in a different way.

What you say can work too (the random hash would be cleared in a single shared location and would therefore affect all web heads), but what's the benefit?

At the moment the invalidation just happens on the one webhead that triggers it. When we use cache/state we would share the invalidation between webheads automatically, so something like drush cr would actually work as expected.

msonnabaum’s picture

What dawehner proposes is very much what I had in mind. We need to think about this in terms of correctness first.

dawehner’s picture

Status: Active » Needs review
FileSize
2.52 KB

So here is a patch for it. I talked for a bit with @fabianx about it and he seemed to be in favour of it as well.

Status: Needs review » Needs work

The last submitted patch, 11: 2752961-11.patch, failed testing.

dawehner’s picture

Status: Needs work » Needs review
FileSize
3.09 KB
578 bytes

Let's quickly fix this failure.

Berdir’s picture

But the downside of that is that phpstorage now depends on cache/state?

We actually have a cache backend that uses phpstorage, if someone were to use that for bootstrap, that wouldn't end well :)

Yes, you have to know about deployment identifier. But if you do multi-head deployments, then you have to *know* stuff anyway. You have to know what to share, how to share, how to deploy/invalidate.

I think what most of those sites actually want for production, be it single or multi-head, is a phpstorage backend that isn't read only but ignores delete requests, because you know that it will not change unless you actually do a deployment. And then you can either use a deployment identifier or explicitly delete the files on deploy.

The problem that I see with both this patch and the deployment identifier approach is that we don't actually delete the old files. Over time, won't that become a problem?

dawehner’s picture

I see a potential problem in that indeed.

Over time, won't that become a problem?

Mh, in order to solve that properly we could introduce some prefix for the filenames which reflects the deployment identifier or so. By that we could clean up old files when they no longer match the current deployment identifier.

dawehner’s picture

Here is the alternative approach, which at least allows using the deployment identifier.

Status: Needs review » Needs work

The last submitted patch, 16: 2752961-16.patch, failed testing.

chr.fritsch’s picture

Status: Needs work » Needs review
FileSize
2.1 KB

I fixed the tests to bring back the attention to this issue.

Status: Needs review » Needs work

The last submitted patch, 18: no_reliable_method-2752961-18.patch, failed testing.

Berdir’s picture

Status: Needs work » Needs review

Failed with random fails before.

anavarre’s picture

Any reviewer? It's a pretty stubborn issue with multiple web heads.

dawehner’s picture

--- a/sites/default/default.settings.php
+++ b/sites/default/default.settings.php
@@ -295,7 +295,7 @@
  * custom code that changes the container, changing this identifier will also
  * allow the container to be invalidated as soon as code is deployed.
  */
-# $settings['deployment_identifier'] = \Drupal::VERSION;
+$settings['deployment_identifier'] = \Drupal::VERSION;
 
 /**

One thing I'm wondering ... does this mean we should have some default fallback for the deployment identifier to always contain the Drupal version? Maybe a dedicated method Settings::getDeploymentIdentifier() and use that by default?
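Such a helper might look like this. Note this is hypothetical: Settings::getDeploymentIdentifier() does not exist in core as of this discussion.

```php
// Hypothetical addition inside \Drupal\Core\Site\Settings: always fall
// back to the Drupal version when no deployment identifier is set.
public static function getDeploymentIdentifier() {
  return static::get('deployment_identifier', \Drupal::VERSION);
}
```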

catch’s picture

https://api.drupal.org/api/drupal/core!lib!Drupal!Core!DrupalKernel.php/... uses the version string and the deployment identifier.

dawehner’s picture

Right, given that, we should simply not comment out default.settings.php and call it a day?

Cottser’s picture

@dawehner in other words leave default.settings.php as is, correct?

dawehner’s picture

@dawehner in other words leave default.settings.php as is, correct?

Yeah I would think so?

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

xjm’s picture

Status: Needs review » Needs work
Issue tags: +Triaged D8 major

@alexpott, @cilefen, @catch, @effulgentsia, and I discussed this issue today and agreed that it is a major bug because it reduces scalability and can lead to invalid caches, fatals, etc. on multiple web heads.

I thought #2301163: Create a phpstorage backend that works with a local disk but coordinates across multiple webheads had been fixed and was going to reference that, but no such luck. This issue was spun off from that one. I started to move this to the theme system, but it looks like the fix is actually still for the PHP storage here.

Is there any way to provide some manner of test coverage here?

Also, sounds like the patch needs work to remove the default.settings.php hunk (or perhaps expand the documentation for the line instead of uncommenting it).

Thanks everyone!

bradjones1’s picture

Re-rolled.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

moshe weitzman’s picture

Yeah, I think some more documentation about the value of deployment_identifier would be nice.

I guess the site should increment it for each deployment? So, something like $settings['deployment_identifier'] = \Drupal::VERSION . '-deploy1'. The next time you deploy you write in 'deploy2', and so on. Seems like a hassle compared to the State solution.

P.S. @anavarre et al - Acquia could use its settings.inc to automatically set a unique deployment_identifier.

moshe weitzman’s picture

Is there some other cache backend that twig could/should use in a multi web-head situation?

dawehner’s picture

In a good environment one could use environment variables to get this deployment identifier, see #2501901: Allow configuring the environment via an environment variable as a "discussion" about leveraging those.

bkosborne’s picture

As someone prepping to launch a D8 site on multiple web heads, it seems like there are just two current options to resolve this:

1. After any twig template has changed and has been deployed, SSH into each web server and clear the Twig cache: \Drupal\Core\PhpStorage\PhpStorageFactory::get("twig")->deleteAll();
2. Apply this patch above, then establish a method where the deployment identifier variable is changed every time you deploy updates to Twig templates.

Is that correct?

webflo’s picture

That is correct.

bkosborne’s picture

Heads up for anyone on Acquia: This patch alone won't work. In Acquia's settings.php file that's included on all sites, it sets $settings['php_storage']['twig']['secret'] = $drupal_hash_salt. Not sure why it does that when that's already what core does in PhpStorageFactory. Anyway, you need to unset that secret in your settings.php file so that the logic in PhpStorageFactory runs.

MKorostoff’s picture

Hey guys, just a thought, but would it be possible to make Drupal configurable such that twig cache can reside in a shared location, much like other caches? Given the option, I'd be very happy to store twig cache in a memcache server that is shared amongst all of my apache servers. Thanks!

MKorostoff’s picture

Here's what I don't understand about this issue: I currently see in my project a directory called sites/default/files/php/twig. In this directory is a bunch of files that sure look like they're the twig cache. This is significant because this directory is shared among all my apache servers. So my question is: if this directory isn't the twig cache, what is it? Am I understanding this accurately if I say that these files are the "compiled" output of twig templating, but the actual cache is elsewhere?

Berdir’s picture

Nope, that is the cache, there is nothing else. If they are in a shared directory, then everything is working correctly. But the performance isn't great, because writing and accessing those files is not as fast as it could be. Being able to do that in a local directory on each server, especially something that's actually tmpfs, would be faster.

This issue is about supporting such scenarios better.

You can even do local folders already if you include a deployment step to clear that folder on every server on deployment. As long as you don't have a scenario where configuration changes could result in *changed* compiled twig templates. There's nothing in core that does that AFAIK, but it's technically possible.

MKorostoff’s picture

@Berdir, thanks for the reply. So here's what I don't get: currently, on Acquia hosting they are officially recommending to ssh into each individual server and clear twig cache. So if you have 15 apache servers, for the time being, that means opening and closing 15 ssh sessions. The thing is, every acquia production instance has a filesystem that is shared by all of the web servers. So my question is: when I ssh into a server to clear twig cache, what am I clearing? If I were only clearing the contents sites/default/files/php/twig, wouldn't I just have to do it once on any randomly selected web server?

Berdir’s picture

If it is not in a local directory, then that is enough, yes. In fact, a normal full cache clear like drush cr will already do that as well.

bkosborne’s picture

@MKorostoff Acquia does not store the twig cache files on the shared filesystem, instead they are on temp file stores for each web head. I believe I saw a comment in the Acquia config file somewhere where that's set saying it's related to performance.

MKorostoff’s picture

@bkosborne thanks!

deviantintegral’s picture

would it be possible to make Drupal configurable such that twig cache can reside in a shared location, much like other caches? Given the option, I'd be very happy to store twig cache in a memcache server that is shared amongst all of my apache servers.

I did some investigation into this a few months back, as I was thinking the same thing. I'll copy my notes here.

  • My initial thoughts about using memcache likely won't pan out. The PHP opcache doesn't work with eval(), and skipping the opcache is a pretty significant performance hit, especially when sites will have hundreds or thousands of templates. One (sketchy) idea I had was to create a stream wrapper, but that adds a ton of extra code just to load a template.
  • This twig PR makes it clear that upstream twig will never support anything other than the filesystem for templates. There's an interface now instead of being hardcoded, but the answer seems to be "you're doing it wrong" if you can't use the filesystem for compiled templates.

The options I see are, ordered from least to most complex:

  • If we end up resolving this by having a shared temporary directory, then we could fairly easily map templates to temp. Unless we decided to back that with RAM, then we'd be having some sort of persistent storage, in which case we might as well save the complexity and just use public://.
  • Precompile templates during a site build job. Symfony supports this, but Drupal doesn't. We'd have to include this as a part of all builds for all sites. We'd have to write a custom template loader as well. This is somewhat complex as it requires a bootstrapped Drupal to compile templates.
  • Compile at runtime, but without templates being shared across web servers. This could hurt performance after a deployment (each webhead would independently compile templates), and would only work if we could ensure temp is wiped out on all webheads after a deployment.
  • Store template compilations in memcache, and write a stream wrapper so compiled templates can be stored in the opcache. This is the most risky strategy - I'm not even sure if opcaches work with custom stream wrappers.

If I remember right, twig template files use the hash of the file contents in their names. In other words, there isn't much of a reason to clear template caches unless disk space is a concern. After a deployment, even if old templates are on the web nodes, any template changes should result in a different hash.

For this set of sites, we ended up writing and using https://www.drupal.org/project/twig_temp and not worrying about clearing templates as we were deploying with Docker, so filling up all the disk space was never an issue.

deviantintegral’s picture

Oh, and warming the template caches at build isn't an option either, even if you wrote a drush command for it, since #template can be used in render arrays to generate templates at runtime. You could warm all file-based templates, but dynamic templates would still be generated on a per-web-node basis.

Wim Leers’s picture

#42 is correct, Acquia Cloud Enterprise has multiple webheads, and to not hit the slow (because network latency) filesystem that is shared across webheads, it stores compiled Twig templates on the local (ephemeral) storage.

#44: Thanks so much for writing up your thought process! But I'm confused why some of the options you listed even made the list. Feedback below:

If we end up resolving this by having a shared temporary directory, then we could fairly easily map templates to temp. Unless we decided to back that with RAM, then we'd be having some sort of persistent storage, in which case we might as well save the complexity and just use public://.

We should not require the use of a shared temporary directory.

Precompile templates during a site build job. Symfony supports this, but Drupal doesn't. We'd have to include this as a part of all builds for all sites. We'd have to write a custom template loader as well. This is somewhat complex as it requires a bootstrapped Drupal to compile templates.

This is not possible, because in Drupal it's impossible to enumerate all Twig templates. This would be splendid though; it'd remove some deployment risks.

Compile at runtime, but without templates being shared across web servers. This could hurt performance after a deployment (each webhead would independently compile templates), and would only work if we could ensure temp is wiped out on all webheads after a deployment.

That is effectively what Acquia Cloud Enterprise does. Indeed the current work-around is to wipe temp on all webheads after deployment. However, an actual solution would be to do what this patch has been doing: ensure there is an identifier/hash/… that all webheads respect from a single source of truth (either State in the DB, or deployment_identifier in the code, see the discussion in #1 through #15).
Let's not forget this is first and foremost a synchronization/coordination problem.

Store template compilations in memcache, and write a stream wrapper so compiled templates can be stored in the opcache. This is the most risky strategy - I'm not even sure if opcaches work with custom stream wrappers.

Drupal can't require memcache. We could store it in the DB instead, but that'd just mean we'd be using the DB as a cache storage. Per \Drupal\Component\PhpStorage\PhpStorageInterface's docs, that'd be supported. But we'd be incurring network latency for every single file. The whole point is to avoid that latency.

If I remember right, twig template files use the hash of the file contents in their names. In other words, there isn't much of a reason to clear template caches unless disk space is a concern. After a deployment, even if old templates are on the web nodes, any template changes should result in a different hash.

… but the point of caching compiled Twig templates is that we don't need to hit the file system for every single template on every single request. You're right that we hash the contents, but we don't check if the original Twig template actually has been modified on every single request. And the compiled Twig templates are stored in PHP's OpCache, so we don't hit the file system at all.

The problem is that (as the IS says) after a deployment, drupal_flush_all_caches() is called, which calls PhpStorageFactory::get('twig')->deleteAll();, but that only affects the storage on the current machine. All other webheads still have the compiled Twig templates, and are blissfully unaware that Twig templates may have been updated.

So what we need, is some hash in State or the deployment_identifier in settings.php to be modified for every deployment, so that coordination across multiple webheads is possible.

(Getting back into this, so I hope I didn't make any mistakes/am not misremembering certain things.)

Wim Leers’s picture

Issue summary: View changes

Now, let's update the IS. Because it's currently biased towards Acquia Cloud Enterprise: it portrays the problem as something that affects all Drupal 8 sites/hosting, but that's not true.

Also, let's list the proposed solutions that can work, and the concerns raised against them.

Hopefully that gets this issue back on track again :)

Berdir’s picture

#46 looks correct to me. Only comment is that IMHO, if the infrastructure supports it, I would recommend some kind of deployment job that clears the folder on all servers over drupal providing a changing identifier.

Because with such an identifier, the old files will not be removed until you do a cache clear in drupal (and then only on one server), so there is a risk that you might get a considerably large amount of files on some servers, and that in turn can slow down accessing those files and writing new ones.

That apparently is a hard problem to solve (not sure why); platform.sh enterprise also does not support it yet.

Wim Leers’s picture

Issue summary: View changes

Good point, added that concern to the IS.

And funny enough, the Acquia Cloud ticket was updated less than an hour ago with information about an alternative solution, pioneered by Pantheon: https://github.com/pantheon-systems/drops-8/pull/182/files. It does this:

/**
 * Place Twig cache files in the Pantheon rolling temporary directory.
 * A fresh rolling temporary directory is provided on every code deploy,
 * guarenteeing that cached files that may be invalidated by code updates
 * are always fresh. Note that in the case of twig cache files, the
 * rendered results are also cached in the database, so a cache clear is
 * still necessary to see updated results after a code deploy.
 */
if (isset($_ENV['PANTHEON_ROLLING_TMP']) && isset($_ENV['PANTHEON_DEPLOYMENT_IDENTIFIER'])) {
  // Relocate the compiled twig files to <binding-dir>/tmp/ROLLING/twig.
  // The location of ROLLING will change with every deploy.
  $settings['php_storage']['twig']['directory'] = $_ENV['PANTHEON_ROLLING_TMP'];
  // Ensure that the compiled twig templates will be rebuilt whenever the
  // deployment identifier changes.  Note that a cache rebuild is also necessary.
  $settings['deployment_identifier'] = $_ENV['PANTHEON_DEPLOYMENT_IDENTIFIER'];
  $settings['php_storage']['twig']['secret'] = $_ENV['DRUPAL_HASH_SALT'] . $settings['deployment_identifier'];
}

As part of that approach, one could also delete the "old temp" on every deployment. It wouldn't introduce problems because it's not shared across webheads anyway, so there's no coordination problem. And it'd solve the "ever growing file system usage" problem described in #48 AFAICT.

Thoughts?

Does this make solution 2 in the IS (introducing Twig::invalidate(), which updates a hash/key/ID stored in State or Cache) more appealing again?

moshe weitzman’s picture

I support solution #2 for the reasons dawehner mentions in #9. It works for the most people as they would expect. I'm not overly bothered by the fact that invalidation relies on State.

pwolanin’s picture

minor point, related to other fixes we made in the Twig system: we don't necessarily want a random value to check against, but rather a unique value each time (e.g. microtime() or uniqid()), since a truly random value can (in theory) repeat.
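The distinction is that uniqid() is derived from the current time in microseconds, so successive calls (made at different times) move forward and cannot collide, whereas a value drawn purely from a random source can, in principle, repeat:

```php
<?php

// uniqid() encodes the current microtime, so two calls made at different
// times cannot produce the same value; a purely random token could.
$a = uniqid('', TRUE); // more_entropy = TRUE adds extra disambiguation
usleep(5);
$b = uniqid('', TRUE);
var_dump($a !== $b);
```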

greg.1.anderson’s picture

In the Pantheon solution, the `ROLLING` portion of the `PANTHEON_ROLLING_TMP` environment variable is simply the deployment identifier. As was previously mentioned, deleting the stale twig cache files is not critical, but not deleting them at all causes them to pile up over time. Pantheon solves this problem by deleting stale rolling temporary directories slowly over time in a backend scavenge process. The scavenger can look up the deployment identifier for each binding directory, so it knows to skip the directory that contains the active set of cache files. It also avoids deleting any directory that has been modified too recently, just for good measure. Drupal sites with very complex custom themes can have a very large number of twig cache files, which can sometimes take a significant amount of time to delete. Keeping scavenging out of the deployment process is therefore beneficial, as that's one less thing for deploys to have to wait for.

It seems to me that solving the multi-head cache scavenge problem in Drupal is not the most practical solution. The services that manage the different heads are better suited for cleanup; deleting a lot of twig files as part of a web request, similar to the way poor man's cron works for sites without cron, would be even worse than doing it at deploy time. If code to do this is added to core, it should be an opt-in service for folks to turn on. Since single-head already works, perhaps multi-head could be solved in documentation only.

@deviantintegral's https://www.drupal.org/project/twig_temp will suffice for some use cases as well.

Wim Leers’s picture

#50: I don't know PHPStorage well enough to know/predict what the consequences are of making it dependent on State. Hopefully @Berdir can chime in.

Since single-head already works, perhaps multi-head could be solved in documentation only.

+1!

Wim Leers’s picture

I just noticed that #2569119: Use Crypt::hashBase64(), not hash('crc32b') or sha1 for placeholder tokens made some changes to TwigEnvironment. The changes (and the code) are a bit confusing, but what matters is that there's now a twig_cache_prefix that is recalculated (and set to uniqid()) whenever:

  • the Twig extension hash changes (which can only happen when the container is rebuilt and \Drupal\Core\DependencyInjection\Compiler\TwigExtensionPass discovers new or updated twig.extension-tagged services during that container rebuild)
  • the twig_extension_hash_prefix State entry does not exist (either because it's a fresh install, or because this key-value pair was deleted)

(The accompanying change record: https://www.drupal.org/node/2833264)

This means that proposed solution 2, which includes the introduction of Twig::invalidate(), could be as simple as this:

public function invalidate() {
  $this->state->delete('twig_extension_hash_prefix');
}

Patch attached.

(I think this is the change that @pwolanin is referring to in #51?)
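With that method in place, drupal_flush_all_caches() could pair the existing local wipe with a shared invalidation. A sketch; the service wiring shown here is an assumption, not the patch's exact code:

```php
// Sketch: inside drupal_flush_all_caches().
// Wipe the local Twig PHP storage (current webhead only) ...
PhpStorageFactory::get('twig')->deleteAll();
// ... and delete the shared prefix in State, forcing every webhead to
// recalculate it and thus recompile templates.
\Drupal::service('twig')->invalidate();
```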

Wim Leers’s picture

FileSize
1.87 KB
3.16 KB

The twig_extension_hash_prefix key is a very confusing misnomer, because it contains more than just the extension hash prefix.

Let's at least move that to a class constant: that prevents bugs due to typos, and allows us to make it clearer.

Wim Leers’s picture

Issue summary: View changes

#48's concern that files can linger, which was confirmed and elaborated on by #52, has been added to the IS by me in #49. I attributed it as a problem for solution 1. But really, it's a problem for both solutions.

IS updated.

Wim Leers’s picture

The patch in #54/#55 implements solution 2. But we could actually have it implement both solutions. Essentially by changing

new TwigPhpStorageCache($cache, $this->twigCachePrefix)

to:

new TwigPhpStorageCache($cache, $this->twigCachePrefix . Settings::get('deployment_identifier'))

Because the current solution still requires drupal_flush_all_caches() to be called to get updated Twig templates. We shouldn't require that.

In other words:

  1. #54/#55 basically fixed the "Clear all caches" button at /admin/config/development/performance for multiwebhead setups
  2. but it still doesn't fix the "code deployment with updated Twig templates" problem

Also relying on the deployment identifier would fix that.

Thoughts?

Wim Leers’s picture

Also, per #56, we'll need documentation for multiwebhead hosting platforms, so it's clear what Drupal expects them to do/provide. And undoubtedly we want a change record to announce that to the world.

dawehner’s picture

  1. +++ b/core/lib/Drupal/Core/Template/TwigEnvironment.php
    @@ -63,7 +77,7 @@ public function __construct($root, CacheBackendInterface $cache, $twig_extension
         if ($options['cache'] === TRUE) {
    -      $current = $state->get('twig_extension_hash_prefix', ['twig_extension_hash' => '']);
    +      $current = $state->get(static::CACHE_PREFIX_METADATA_KEY, ['twig_extension_hash' => '']);
    

    So does that mean we have an uncached state query on every request in HEAD?

  2. +++ b/core/lib/Drupal/Core/Template/TwigEnvironment.php
    @@ -83,6 +97,15 @@ public function __construct($root, CacheBackendInterface $cache, $twig_extension
       /**
    +   * Invalidates all compiled Twig templates.
    +   *
    +   * @see \drupal_flush_all_caches
    +   */
    +  public function invalidate() {
    +    $this->state->delete(static::CACHE_PREFIX_METADATA_KEY);
    +  }
    +
    

    We should document here how this helps multi-webhead deployments, IMHO.

Wim Leers’s picture

#59:

  1. Yep, since #2569119: Use Crypt::hashBase64(), not hash('crc32b') or sha1 for placeholder tokens… Introduced by @pwolanin in #2569119-62: Use Crypt::hashBase64(), not hash('crc32b') or sha1 for placeholder tokens based on discussion with @alexpott apparently. Fabianx RTBC'd. Nobody thought of that. (Man, I really wish we had automated performance tests, that'd have caught this: #2308683: [meta] Create a 'Profiling' install profile, for testbot-powered simple profiling and easier local profiling.)
  2. Oh, yes, docs are still needed. For now, just asking for a review of the approach in #55 (which implements solution 2 in the IS), as well as thoughts about also implementing solution 1 as I proposed in #57.
dawehner’s picture

Sorry for not commenting on the actual fix. I think it solves the problem perfectly!

Crell’s picture

Hi, representative from a multi-webhead host here. :-)

If I'm following the proposal here properly, it's just a question of how we generate a unique value for each deploy/change to use, right? Either use a host-provided value or generate a random one on each cache clear (which would also happen after a deploy and thus pick up that use case, too)?

We (Platform.sh) have a git tree hash value we can trivially map into the settings array so option 1 would be fine with us. My concern is that it assumes that the templates will only ever change on a deploy; while in a logical world with most frameworks that's the case, it would preclude user-editable Twig files that get compiled like any other Twig file. I'm not sure if anyone is doing that yet, but it's conceptually something that I know we discussed on and off many times in the Panels/SCOTCH world. (Think, basically, Contemplate module but not terrible. Generate a Twig file based on user config, compile it, the compiled version gets used so there's no performance overhead and no PHP is stored in the DB so there's no security issue.)

That would imply option 2 is more robust, as it still leaves the door open for configuration-based changes to the compiled twig templates. Leveraging the deployment identifier as part of that seems fine though; in fact I'm going to go add that value to our standard project template now that I know it exists. :-)

The other question is cleanup. I agree that blocking a cache clear or delete on deleting many files, possibly over a network connection, is a bad idea. However, we shouldn't just punt it to hosts unless we give hosts a clear way to do so. If I read the current proposal correctly (and I may well not be), the prefix is just a file prefix; that means the cleanup process needs to "delete all files in $dir except those that begin with $value_in_the_state_system"? That seems error prone and inefficient, and the cleanup process would need access to Drupal anyway to look up that value. (Since, per above, commit hash alone is insufficient.)

Using the ID as a directory name rather than file prefix would help a little, but ideally we'd still allow an outside process to do the cleanup without needing to access Drupal systems. If we need that, then we may as well let Drupal do it itself in a cron task.
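As a rough illustration of why that cleanup is awkward, here is a minimal sketch of the "delete everything except the current prefix" step; TWIG_DIR and PREFIX are placeholder names, and PREFIX would still have to be fetched from Drupal's state system first:

```shell
# Hypothetical cleanup sketch: remove every entry in the compiled-template
# directory whose name does not start with the currently active prefix.
# TWIG_DIR and PREFIX are placeholders, not values Drupal exposes directly.
TWIG_DIR="${TWIG_DIR:-/tmp/twig-cleanup-demo}"
PREFIX="${PREFIX:-5915e35d823c3}"
mkdir -p "$TWIG_DIR"
find "$TWIG_DIR" -mindepth 1 -maxdepth 1 ! -name "${PREFIX}*" -exec rm -rf {} +
```

Even this small sketch shows the fragility being described: a wrong or stale PREFIX deletes files that are still in use.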

'course, now that I think about that, though... If the whole point is to let each web head have the cache on a *local* disk, is it really going to be that slow? Slow enough that we can't just do it inline on deploy and eat the extra second or two to rm a directory? (And for changes coming from config without deploy, meh, disk is cheap.)

I may be missing something here...

dawehner’s picture

Issue summary: View changes
Issue tags: -Needs change record

The more I think about it, the more I like Wim's solution. I updated the issue summary to explain the actual status quo and how the state solution solves the problem, especially because it doesn't worry about the cleanup step, for the reasons described in the issue summary. Not sure anyone would care about it, but in case someone does, there is always room for another issue; solving that here would just derail the solution we have.

Here is a change record: https://www.drupal.org/node/2879963
For a second I was wondering whether we should use \Twig_Environment::clearTemplateCache, but it turns out commit cacfb069b2e65d5487489677238142b464135b40 in Twig removed that capability; see https://github.com/twigphp/Twig/issues/1879 for context.

Berdir’s picture

Yeah, since we have the state key already, makes sense to build on top of that.

However, the patch completely removes deletion of the actual files; we can't rely on some infrastructure process existing by default. That's also going to be annoying for local debugging when you need to inspect a generated Twig file.

What about keeping the default delete as-is for single-head servers, but providing a subclass that deletes the state key instead?

dawehner’s picture

What about doing both for now? A multi-webhead project can still implement its own subclass that does something better, if it wants.

Crell’s picture

Question: How likely is a cache clear without redeploy to change a cached Twig file?

I noted above the potential for config-driven templates, but I don't know if anyone is doing that currently. Anyone else know?

Here's the challenge I see, as a host:

1) If a cache clear does not change compiled twig files, then the deploy ID is sufficient and totally easy. In which case the state ID is kinda irrelevant.

2) If cache clear does change compiled twig files and cache clears are frequent compared to deploys, then any cleanup script MUST leverage the state ID and thus MUST run from within Drupal.

3) If cache clear does change compiled twig files and cache clears are not frequent compared to deploys, then it's safe enough for hosts to ignore the extra files and fall back to option 1.

So then the two questions are:

1) Do cache clears change compiled twig files (viz, is that a thing we want to support)
2) Are cache clears sufficiently frequent compared to deploys that we care about them being separate events?

greg.1.anderson’s picture

Cleanup in cron would be fine. Providing a default cleanup algorithm that can be overridden would also be fine. If supporting downloading Twig templates or in-browser Twig editing et al. is desired, then managing the deployment identifier inside Drupal makes sense.

I am wary about doing cleanup at the instant that the twig files are invalidated. If another process on the same web head was accessing the twig cache at exactly the same time, you would thrash the cache. At the end of the day, the inefficiencies of this sort of thing are not going to be too great; however, in any cleanup scheme it would still behoove you to avoid deleting files that might still be in use (i.e. by a process still holding the old deployment identifier in a local or static variable).

greg.1.anderson’s picture

Posted before I saw #66. From my standpoint, those questions are still open. However, it seems that the cost of managing the state ID from within Drupal is not great, whereas NOT managing this ID in Drupal prevents these scenarios from working smoothly on multi-webhead hosts.

bkosborne’s picture

Just a note that if the proposed patch goes in, then ALL drupal 8 hosts (not just multi-head) would need to implement a solution to clear old files from the twig cache directory, right? That seems like a pretty big change.

dawehner’s picture

Just a note that if the proposed patch goes in, then ALL drupal 8 hosts (not just multi-head) would need to implement a solution to clear old files from the twig cache directory, right? That seems like a pretty big change.

Could we maybe continue to clear the files like now?

dawehner’s picture

@bkosborne
You are indeed making a great point, to be honest.

With #70 we would

a) Not make the behaviour worse for environments with one webhead.
b) Fix the deployment problem with multiple webheads.
c) Not make it worse for multi-webhead environments, i.e. potentially keeping files around. We can continue discussing that problem in a follow-up; it's an orthogonal problem IMHO. Multi-webhead environments could then also solve it independently of Drupal.

Crell’s picture

@dawehner Any thoughts on the questions at the end of #66? Speaking as a host those are the main questions I have that would drive how we'd do the cleanup process.

dawehner’s picture

1) Do cache clears change compiled twig files (viz, is that a thing we want to support)

It does when you change the underlying file. Cache clears are something people execute constantly in development environments.

2) Are cache clears sufficiently frequent compared to deploys that we care about them being separate events?

I would say not on production environments. Only deployments should clear caches like that, unless someone just installs modules as they want.

anavarre’s picture

Status: Needs review » Needs work

Spent some time carefully documenting a step-by-step procedure to reproduce.

In dev (or in any env with one web node only), make a simple change to page.html.twig such that you get:

<div id="page-wrapper" style="color:red">

Confirm you see the changes on the page. If not, rebuild caches / enable Twig debug mode.

Now that you've confirmed the change works, push to prod or in any env with 2 or more web nodes (if Twig debug mode is enabled, turn it off).

Without the patch in #71

1) Ensure your code is here on e.g. 2 different web heads:

d8@web-1:/$ grep 'page-wrapper' /var/www/html/d8.prod/docroot/core/themes/bartik/templates/page.html.twig
<div id="page-wrapper" style="color:red">

d8@web-2:/$ grep 'page-wrapper' /var/www/html/d8.prod/docroot/core/themes/bartik/templates/page.html.twig
<div id="page-wrapper" style="color:red">

2) Confirm the code isn't there on both web heads, in the compiled Twig files directory (exact location depends on the web host):

d8@web-1:$ grep -inr --color=auto 'page-wrapper' /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php
/mnt/tmp/d8/php_storage/twig/twig/58f21bd31253c_page.html.twig_u5wurMyx4yyLrcXBTHkEErVIO/pWkNtzcstkD07zUw7SE8v0CySwUyce3hvZiUSxLv48E.php:43:        echo "<div id=\"page-wrapper\">
/mnt/tmp/d8/php_storage/twig/twig/58f45d3b61a71_page.html.twig_u5wurMyx4yyLrcXBTHkEErVIO/SijtshVYiXj8U3Q_79u96r9rRE4r3lFmAmZ2qo0pZnM.php:43:        echo "<div id=\"page-wrapper\">
/mnt/tmp/d8/php_storage/twig/twig/58f961ac01847_page.html.twig_u5wurMyx4yyLrcXBTHkEErVIO/-gSuTWoFwK-TZsPGBT54C0prckemsk7VJ7V-bhVBxXA.php:43:        echo "<div id=\"page-wrapper\">

d8@web-2:$ grep -inr --color=auto 'page-wrapper' /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php
/mnt/tmp/d8/php_storage/twig/twig/58f21bd31253c_page.html.twig_u5wurMyx4yyLrcXBTHkEErVIO/pWkNtzcstkD07zUw7SE8v0CySwUyce3hvZiUSxLv48E.php:43:        echo "<div id=\"page-wrapper\">
/mnt/tmp/d8/php_storage/twig/twig/58f45d3b61a71_page.html.twig_u5wurMyx4yyLrcXBTHkEErVIO/SijtshVYiXj8U3Q_79u96r9rRE4r3lFmAmZ2qo0pZnM.php:43:        echo "<div id=\"page-wrapper\">
/mnt/tmp/d8/php_storage/twig/twig/58f961ac01847_page.html.twig_u5wurMyx4yyLrcXBTHkEErVIO/-gSuTWoFwK-TZsPGBT54C0prckemsk7VJ7V-bhVBxXA.php:43:        echo "<div id=\"page-wrapper\">

3) Invalidate the cached Twig template files only on one of these two webs. E.g.

d8@web-1:$ drush @d8.prod --uri=http://site.tld/ ev '\Drupal\Core\PhpStorage\PhpStorageFactory::get("twig")->deleteAll();'

4) Confirm the twig directory is empty for that specific web node only. Immediately after, visit the site and confirm the text is red. If not, then you've hit the web node for which the Twig compiled files haven't been rebuilt.

5) Rebuild caches on both web nodes.
6) Confirm the template modification is now showing up on the server on at least one of those web nodes.

d8@web-2:/$ grep -inr --color=auto 'page-wrapper' /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php
43:        echo "<div id=\"page-wrapper\" style=\"color:red\">

d8@web-1:/$ grep -inr --color=auto 'page-wrapper' /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php
grep: /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php: No such file or directory

My understanding is you need to manually invalidate the Twig cache for each web server in your stack for changes to show up for all.
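Sketched as a loop, that pre-patch situation looks like the following (host names and paths are hypothetical, taken from the transcript above):

```
# Without the patch: the wipe must run on every web head, because each
# keeps its own local compiled-template directory.
for HOST in web-1 web-2; do
  ssh "d8@${HOST}" \
    'drush @d8.prod --uri=http://site.tld/ ev "\Drupal\Core\PhpStorage\PhpStorageFactory::get(\"twig\")->deleteAll();"'
done
```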

With the patch in #71

Repeat all steps, and change the color to make sure you see a difference (e.g. green). You shouldn't have to do anything and changes should appear immediately without having to manually invalidate Twig caching.

Unfortunately, even after multiple hard refreshes in the browser and comparing as auth/anon, I'm still not seeing consistency in both web nodes. E.g.

d8@web-1:/$ grep -inr --color=auto 'page-wrapper' /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php         
43:        echo "<div id=\"page-wrapper\" style=\"color:green\">

d8@web-2:/$ grep -inr --color=auto 'page-wrapper' /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php         
43:        echo "<div id=\"page-wrapper\" style=\"color:red\">

So unless I'm missing something as to the expected behavior or there are steps I haven't done correctly, this is still not working. It's also why I was under the impression leveraging the deployment_identifier might be best. Could someone please clarify?

Setting back to Needs work for the time being.

dawehner’s picture

@anavarre
Is it just me, or do you have render caching involved on top of that? Maybe page_cache or others are still problematic?
Did you actually trigger the flushing of the caches / manual invalidation of the template? Do you mind describing how you actually deploy?

anavarre’s picture

Is it just me, or do you have render caching involved on top of that? Maybe page_cache or others are still problematic?

I'm testing with a site set up to have realistic production settings, that is with page_cache / dynamic_page_cache enabled and a load balancer with Varnish in front of it with max-age set to e.g. a day.

Did you actually trigger the flushing of the caches / manual invalidation of the template?

Usually, what works is to invalidate the Twig cache on each web node, and this is what I expect this issue can help solve. Imagine a setup with 10 web nodes: the command below would have to be run from each one.

$ drush @d8.prod --uri=http://site.tld/ ev '\Drupal\Core\PhpStorage\PhpStorageFactory::get("twig")->deleteAll();'

Do you mind describing how you actually deploy?

Deploying happens by pushing code via a UI; under the hood it creates a new tag and pulls that tag on each of the production webservers in the stack. There's no specific deploy 'hook' like clearing caches or anything fancy.

Please let me know if you need me to clarify anything further.

dawehner’s picture

Well, at least for now, the solution relies on having some form of deployment step. This deployment step would call \Drupal::service('twig')->invalidate(); on any one of the webheads.

On top of that, though, you still have the problem of page caching and all the other things. Conceptually, if you think about it, any kind of code deployment (whether it's Twig files or PHP code) could change your rendered output, so you might want to invalidate the 'rendered' tag.

So in other words, ensure your deployment does those two things.
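Under those assumptions, the deployment step could look like the following two drush calls, run once from any single web head (the 'twig' service's invalidate() comes from the patch discussed here; cache_tags.invalidator is core's cache tag invalidator service):

```
# Run once, from any one web head, after the new code is in place.
drush @d8.prod ev "\Drupal::service('twig')->invalidate();"
# Also invalidate output rendered with the old templates ('rendered' tag).
drush @d8.prod ev "\Drupal::service('cache_tags.invalidator')->invalidateTags(['rendered']);"
```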

anavarre’s picture

you might want to invalidate the 'rendered' tag.

This is what concerns me. IIUC, invalidating the rendered tag means invalidating each and every URL that has this tag in its cache tags list, e.g. any node, view, the frontpage, etc. The point of cache tags, as I understand it, is to make sure you don't have to rebuild caches anymore (unless truly required), and that only what needs to be invalidated gets invalidated.

If we're clearing out the rendered tag we're basically telling Drupal (and Varnish if we're using Purge and a purger) to flush out a ton of things.

Is this really the best approach? Am I not understanding something?

dawehner’s picture

Well, at least for now, cache tags are just about content and configuration, not about code. If you think about it, each piece of rendered output would need to know every template that was involved along the way.

This, though, would only cover the deployment of Twig files, not the deployment of any PHP file change. There is no sane way to determine which code is involved in generating which part of the page.

I think this discussion is totally out of scope here, though:

a) There is a concrete problem we need to solve: deployments leave stale templates behind, with NO way to solve it properly right now.
b) This problem would have to be solved regardless, beyond any additional magic.

moshe weitzman’s picture

@anavarre - before this fix, you could do a cache-rebuild and still get old templates if your rebuild happened on a different web head. With this change, assuming it works, a cache-rebuild will be enough.

The goal here isn't to avoid a cache-rebuild on each deploy. That's still needed, unless your code change is trivial (i.e. doesn't affect any rendered output).

anavarre’s picture

#80 - I understand, but I can't come to terms with doing this, because it means we'd always have to warm up caches, which can impact high-traffic sites. And Drupal 8 caches so much compared to D7 that it's a shame we can't only invalidate the cache that needs to be invalidated. That being said, I agree it'd already be an improvement compared to the current state of things, especially when you have to manually invalidate caches on 10 web heads.

Here's another round of tests with the patch in #71, still. On both live web heads I made sure to run drush cr and clear Varnish to make sure I'd start fresh.

d8@web-1:/$ grep -inr --color=auto 'page-wrapper' /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php                      
grep: /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php: No such file or directory

d8@web-2:/$ grep -inr --color=auto 'page-wrapper' /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php                      
grep: /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php: No such file or directory

Then, I queried each web node's frontpage to make sure I'd have the correct Twig template generated:

d8@web-1:/$ grep -inr --color=auto 'page-wrapper' /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php
43:        echo "<div id=\"page-wrapper\" style=\"color:red\">

d8@web-2:/$ grep -inr --color=auto 'page-wrapper' /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php
43:        echo "<div id=\"page-wrapper\" style=\"color:red\">

I then made a change locally (changed color from red to green) and pushed to dev. Then, deployed to prod and made sure to query both web nodes and hit the frontpage. Confirmed the color is still red at this point.

Now, I ran drush cr from one of those web nodes; with Drush aliases it shouldn't matter which one. Querying each web node's frontpage and grepping for page-wrapper does return the correct color (green). The filesystem, however, doesn't look like it has wiped outdated template files on the web node from which I didn't run drush cr. E.g.

d8@web-1:/$ clear ; grep -inr --color=auto 'page-wrapper' /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php
43:        echo "<div id=\"page-wrapper\" style=\"color:green\">

d8@web-2:/$ clear ; grep -inr --color=auto 'page-wrapper' /mnt/tmp/d8/php_storage/twig/twig/*page.html.twig*/*.php
/mnt/tmp/d8/php_storage/twig/twig/5948e1d4045af_page.html.twig_u5wurMyx4yyLrcXBTHkEErVIO/FLKvxGS1Jz2aDj8AxIDGKYrEPjV8-oKBmzHPvdZ5SSc.php:43:        echo "<div id=\"page-wrapper\" style=\"color:red\">
/mnt/tmp/d8/php_storage/twig/twig/5948e85c1d6ff_page.html.twig_u5wurMyx4yyLrcXBTHkEErVIO/iNQCJMtwf87WTPdDcWjxt28unskdMqg6DQSkgKLFTtk.php:43:        echo "<div id=\"page-wrapper\" style=\"color:green\">

Is this the expected behavior? Is Drupal smart enough in this instance to look up the newest template and not serve one or the other randomly? Also, while running drush cr with this patch might solve the issue for authenticated users, we'd still have to invalidate the rendered tag so that the Varnish cache gets invalidated too, IIUC?

Also, I did try to replace drush cr with drush @d8.prod ev "\Drupal::service('twig')->invalidate();" to try and resort to less expensive operations, but it didn't work in my testing.

moshe weitzman’s picture

Yes, old unused compiled twig files are left behind. Drupal and twig cache are smart enough to ignore old files. We discussed this a bunch in this issue and we are leaving this cleanup for hosts to figure out (for now). It would be fine to just wipe this dir on each web head every n months.

Also, while running a drush cr with this patch might solve the issue for authenticated users, we'd still have to invalidate the rendered tag so that Varnish cache gets invalidated too, IIUC?

Correct.

anavarre’s picture

Yes, I remember the cleanup was discussed in #52 but I was more wondering about Drupal picking up the correct template and I'm glad Drupal/Twig are smart enough to handle that just fine.

As to the rendered tag, would it be in scope for this issue to account for its invalidation? Because it won't happen via a cache rebuild, so I'm wondering what our options are here.

moshe weitzman’s picture

I see clearing of all cache bins in drupal_rebuild(). That would clear the 'rendered' tag, unless I am missing something. This gets run during drush cache-rebuild, just like it does for drupal's rebuild.php

anavarre’s picture

I see clearing of all cache bins in drupal_rebuild(). That would clear the 'rendered' tag, unless I am missing something. This gets run during drush cache-rebuild, just like it does for drupal's rebuild.php

I understand it'll clear Drupal caches but not the corresponding stored HTML responses in Varnish. So, with this patch, a typical deployment workflow would be:

  • Push to prod
  • drush cr
  • Find a way to clear out all stored Varnish HTML responses with the rendered tag

It's this last bit I'm struggling to see how we can best accommodate, if at all possible.

anavarre’s picture

Status: Needs work » Needs review

Discussed this with @dawehner in IRC and I'll follow his advice to create a follow-up to discuss the deployment process, which is not strictly in scope for this issue, thus the mismatched expectations. Also setting back to Needs review for this reason.

Here we go: #2888082: Make the deployment process less expensive in a multi webheads environment

Thanks for your input, @dawehner and @moshe weitzman - this does help with framing the discussion.

Berdir’s picture

drush cc render clears the rendered cache tag

Wim Leers’s picture

My last comment at #60 dates from just before Acquia Build Week. About time I respond here again!

#61 + #63: yay!

#62 + #64–#68: I agree with @greg.1.anderson that it makes sense to provide a default that can be overridden by hosting companies that have multi-webhead setups. Also agreed that doing cleanup immediately is a very bad idea, because it will block the response.

#69 + #70: no, we probably want to stick with the current cleanup behavior by default. So +1 for the change in #71. Although I'd like to see that moved to a separate cleanupDisk() method or something like that, because 100% of sites will want to have the $this->state->delete(…) line, but X% of sites will want to override PhpStorageFactory::get('twig')->deleteAll().
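A minimal sketch of that split, reusing the state-key constant from the patch quoted in #58; cleanupDisk() is the hypothetical method name suggested here, not existing core API:

```php
// Sketch only, not the actual patch. Every site bumps the state-based
// prefix; the disk wipe is isolated so multi-webhead hosts can override
// it (e.g. with a no-op plus an out-of-band garbage collector).
public function invalidate() {
  $this->state->delete(static::CACHE_PREFIX_METADATA_KEY);
  $this->cleanupDisk();
}

protected function cleanupDisk() {
  // Default single-webhead behavior: wipe compiled templates immediately.
  \Drupal\Core\PhpStorage\PhpStorageFactory::get('twig')->deleteAll();
}
```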

#74 + #75: yep, deploying Twig changes also requires Render Cache (fragments) + Dynamic Page Cache (skeletons for any user) + Page Cache (entire page for anon user) + Varnish (same) to be invalidated. @anavarre: are you sure that you were triggering drupal_flush_all_caches(), by e.g. using the "Clear all caches" button at /admin/config/development/performance or drush cache-rebuild aka drush cr? Because that'd also have cleared Render Cache + Dynamic Page Cache + Page Cache. It's never going to be automatic, because that'd require scanning the file system on every request for Twig changes, which is too costly.

#78 + #79 + #80 + #81 + #85 + #86: this is indeed out of scope here, for that we now have #2888082: Deploying Twig template changes is too expensive: it requires all caches to be completely invalidated, as well as all reverse proxies.

#81: thanks for confirming it works!

Wim Leers’s picture

I think what remains to be done here is:

  1. Test coverage!
  2. Ensure that TwigEnvironment::invalidate() can effectively be overridden by hosting providers. See @greg.1.anderson's reasoning in #52 & #67. And test coverage for this.
  3. Decide whether we also want to invalidate Twig based on $settings['deployment_identifier'], which would be trivial as I explained in #57. Crucially, this would remove the need for drush cr after deploying Twig changes!
    It removes the need for drush cr … but only if the output generated by the Twig template is not cached in Render Cache, nor in Dynamic Page Cache nor in Page Cache. That's a lot of ifs.
    We could ensure that those are also updated by keying cache entries for Render Cache/Dynamic Page Cache/Page Cache on the deployment identifier as well. (And we could even expose a Drupal-Deployment-Identifier response header that Varnish could key on, which would mean that Varnish would be instantaneously updated as well.)
    But all that … seems like it's making the scope of this change too large. Therefore I think it's probably better to defer that to a follow-up. Since this was my proposal and I'm now arguing against it, I think most will agree it should be a follow-up.

So I think that only leaves points 1+2.

Thoughts?