Problem/Motivation

#2752961: No reliable method exists for clearing the Twig cache solved a big problem. See the CR at https://www.drupal.org/node/2908461.

The remaining problem is now the deletion of stale (unused, irrelevant) compiled Twig templates on multi-webhead setups where each web server has its own compiled Twig template cache (because most shared file systems are slow).

The change record (https://www.drupal.org/node/2908461) says:

Note: the new approach invalidates files but doesn't delete files across multiple webheads. If you are running Drupal in a multi webhead environment, you might want to delete stale files from the storage manually. To read more about a potential solution read comments #102–#123 at #2752961-102: No reliable method exists for clearing the Twig cache, which contains a couple of suggestions for potential strategies.
In other words: newly deployed Twig templates will be picked up as expected across all webheads (as long as \Drupal::service('twig')->invalidate(); is called after code deployment). But if each webhead is doing its own (local) caching, then stale compiled Twig templates will remain there. So over time, this could accumulate to a significant size.

(Emphasis added.)

Proposed resolution

@greg.1.anderson wrote at #2752961-121: No reliable method exists for clearing the Twig cache:

If I am not mistaken, then the purpose of #112 is to implement the solution proposed in #7. In terms of converting a single-head solution into a double-head solution, what hosting providers need is:

1. The ability to put the twig cache files in a completely different directory each time they are cleared.
- When our scavenger deletes stale cache files, it can just delete entire directories at a time.
2. The ability to know which directory the current cache files are stored in.
- When scavenging stale cache files, we want to skip the directory that is active.
- We currently do this by forcing the directory calculation used in 1 so that we do not need to query Drupal.
- We could also potentially do this simply by examining the creation date of the cache folder, and presume that the newest is the active one.
3. The ability to stop Drupal from deleting the files in the twig cache
- If we have a scavenger, then we don't need to spend time during the web request to delete large directories.

Remaining tasks

TBD

User interface changes

None.

API changes

TBD

Data model changes

TBD

Comments

Wim Leers created an issue. See original summary.

webchick’s picture

Category: Feature request » Task
Priority: Normal » Major

This seems less like a feature to me, and more like necessary clean-up, unless I'm wrong? Otherwise, any Drupal 8 site in a multi-headed environment (which is probably between many and most of them, given D8's suitability for "ambitious digital experiences") is going to hit this, unless their hosting platform is somehow handling this for them behind the scenes (which introduces logic to said platform that's highly specialized to this one application, so definitely not a guarantee).

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

glennpratt’s picture

The current implementation seems to contain a race condition:

  public function invalidate() {
    PhpStorageFactory::get('twig')->deleteAll();
    $this->templateClasses = [];
    $this->loadedTemplates = [];
    $this->state->delete(static::CACHE_PREFIX_METADATA_KEY);
  }

This code starts deleting files without a lock and before changing the key. I believe rendering processes may have cached templates deleted while they are in the process of reading them or may regenerate them in the now expired key.

There is this rather old todo about the general problem beyond just Twig (thanks Wim for pointing this out):

 * @todo Add a global lock to ensure that caches are not primed in concurrent
 * requests.
 */
function drupal_flush_all_caches() {

I believe this race impacts any Drupal installation, not just installations using multiple webheads. Cache clear can be called by a parallel request or by CLI drush even on a single server.

Proposal

I'm proposing a lockless solution so that Twig cache invalidation does not require locking against rendering processes.

Avoid race conditions against rendering PHP requests by not deleting cached templates while invalidating the “twigCachePrefix”.

Change TwigEnvironment::invalidate() to not delete cached templates by removing this line:

PhpStorageFactory::get('twig')->deleteAll();

Allow asynchronous cleanup of old cached templates.

Change TwigEnvironment::invalidate() to set last invalidation time in state:

$this->state->set(static::CACHE_LAST_INVALIDATE_TIMESTAMP_KEY, time());

Add and implement PhpStorageInterface::->deleteAllOlderThan($time)

Create TwigEnvironment::cleanInvalidatedCacheTemplates():

public function cleanInvalidatedCacheTemplates() {
    $invalidateTime = $this->state->get(static::CACHE_LAST_INVALIDATE_TIMESTAMP_KEY);
    PhpStorageFactory::get('twig')->deleteAllOlderThan($invalidateTime);
}

Automatically cleanup old cached templates in cron.

Create (or add to existing) hook_cron() to call TwigEnvironment::cleanInvalidatedCacheTemplates().

Create or document a pattern for calling TwigEnvironment::cleanInvalidatedCacheTemplates() on every webnode for hosts that do not wish to store Twig cache on a shared filesystem.

lauriii’s picture

Thanks @glennpratt for the proposal, and sorry for the delay on the feedback in my end.

I'm wondering if we are concerned at all if there are sites that doesn't have cron configured and as consequence would end up having their disks filled with cache garbage.

From the point of view of a subsystem maintainer, I'm happy with the proposal in #4. Let's update the issue summary to include that. Would be also awesome if we could get a review from someone who has more in-depth knowledge about caching.

rjg’s picture

I think it could help if twig used cache tags as mentioned in https://www.drupal.org/project/drupal/issues/2975479 or if twig cache files were put in a completely different directory each time they are cleared as mentioned above, so that the base twig cache path was something like 'php/twig/{hash}/{filename}' rather than 'php/twig/{hash}_{filename}' and twig creates cache subfolders in there where are all the subfolder names are prefixed with the hash.

Another downside to the current core code is that PhpStorageFactory::get('twig')->deleteAll(); deletes everything, including active/valid twig cache.

Throwing this out for discussion -- to delete only old/invalid cache files, something like this could be added to \Drupal\Core\Template\TwigEnvironment, or used in a service decorator, or with getTwigCachePrefix() being a public method this could be tweaked to ependency inject the twig service (`\Drupal::service('twig')`) in a custom class or drush command:

  /**
   * Deletes invalidated Twig cache files.
   */
  public function deleteInvalidatedFiles() {
    if ($prefix = $this->getTwigCachePrefix()) {
      /** @var \Drupal\Component\PhpStorage\PhpStorageInterface $twig_storage */
      $twig_storage = PhpStorageFactory::get('twig');
      $file_paths = $twig_storage->listAll();
      $invalidated_file_paths = preg_grep('/^' . $prefix . '/', $file_paths, PREG_GREP_INVERT);
      foreach ($invalidated_file_paths as $invalidated_file_path) {
        $status = $twig_storage->delete($invalidated_file_path);
      }
    }
  }

It would be easier to check 'php/twig/{hash}/' for a match if the twig files were put in a new base folder each time the hash changed.

This could serve as an interim fix to alleviate the issue where twig cache files are not invalidated across load balanced web servers. Although that would require a deploy hook, or a scheduled/cron job run against each server on a routine basis. Or maybe offer it as an option to enable during core cron

Berdir’s picture

> or if twig cache files were put in a completely different directory each time they are cleared as mentioned above

You can do that yourself by configuring the php_storage directly, here's what I do:

$settings['php_storage']['twig']['directory'] = '/tmp/' . getenv('PLATFORM_PROJECT') . '/phpstorage/twig/' . getenv('PLATFORM_TREE_ID');

Until that issue does it somehow automatically, but I'm honestly not sure how that would work exactly. I'm doing the cleaning in a daily cleanup cronjob then that deletes all folders except the current TREE_ID, which is an automatic identifier generated by platform.sh.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Wim Leers’s picture

#4:

@glennpratt Like @lauriii, I feel bad for not having followed up sooner. We were super deep in JSON:API at the time you posted this and then I forgot about it, but that's no valid excuse :(

I of course am in favor of a lockless solution if one exists. Unlike @lauriii, I'm not entirely convinced by #4's proposal. I see two potential problems, the first of which is the biggest:

  1. Isn't it possible for all cron runs to be performed by one and the same web node, and hence we'd only end up deleting the Twig cache on that single web node, still resulting in the others from only ever accumulating more?


    Because that is the fundamental problem here: how to guarantee deletion across multiple web nodes — it is this same problem that is left to hosting providers in https://www.drupal.org/node/2908461.


    I'd love to be wrong on this one! :)
  2. Furthermore, this assumes generating file system listings is cheap (because we'll need to list all directory mtimes). I'm not sure this is always true. In fact, I suspect this was part of the original design. I'll need to dig in to the original issues that added \Drupal\Core\PhpStorage\PhpStorageFactory and friends to figure that out.

#6:

I see two concerns here:

  1. #2975479: Improve clearing the Twig cache has the same weakness as the solution currently in Drupal 8 core: it requires hosting providers to implement some Drupal-specific logic. This is exactly what @glennpratt wants to avoid.
  2. deleteInvalidatedFiles () triggers the same second weakness above.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Eric_A’s picture

Is this major task somehow a duplicate of the normal bug report #3032078: Multiple webheads can cause infinite growth of Twig cache? (Which has a patch.)

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

dungahk’s picture

@Eric_A That issue is different, I initially thought they were the same too, but this one is specifically about making sure inactive Twig compiled files are deleted across multiple webheads, that other issue did not fix this, one still needs to delete them separately.