Problem/Motivation

Plugin derivers are essentially a foreach loop for providing plugin definitions.
Typically they iterate over the result of some other service, creating a definition for each item.
If that list of items changes, the plugin definition cache must be cleared so that the deriver can accurately reflect the up-to-date list.

In practice, this has meant that another piece of code, like a hook implementation, will manually clear the plugin definition cache.

For example, \Drupal\system\Plugin\Derivative\SystemMenuBlock loops over all menu entities to provide menu blocks.
\Drupal\system\Entity\Menu::save() and \Drupal\system\Entity\Menu::delete() are then responsible for clearing the block plugin definition cache.

Proposed resolution

Plugin managers already support the use of cache tags for their definitions cache.
Allow the derivers themselves to affect the list of cache tags used but NOT cache contexts or max-age, see @Berdir's comment at #74.
This will remove the need for external code (like a hook) to clear the cache.

Remaining tasks

User interface changes

API changes

Data model changes

CommentFileSizeAuthor
#90 plugin-managers.png1.24 MBkristiaanvandeneynde
#76 drupal-core-3001284-v9.5.11.patch42.63 KBNigel Cunningham
#67 drupal-core-3001284-v9.5.4-menu.patch41.59 KBrkcreation
#66 3001284-nr-bot.txt160 bytesneeds-review-queue-bot
#64 drupal-core-3001284-v9.5.1.patch32.47 KBrkcreation
#62 3001284-62.patch48.62 KB_pratik_
#57 3001284-57.patch33.4 KBravi.shankar
#51 3001284-deriver-51.patch33.42 KBnginex
#49 drupal-3001284-48-deriver-cache-tags.patch34.85 KBgeek-merlin
#40 3001284-deriver-40.patch33.42 KBpenyaskito
#38 3001284-deriver-38-interdiff.txt2.8 KBtim.plunkett
#38 3001284-deriver-38.patch33.43 KBtim.plunkett
#27 18DD0023-0D14-4B1F-BFB5-0F238D183FFB.png195.62 KBtim.plunkett
#25 3001284-deriver-25-interdiff.txt691 bytestim.plunkett
#25 3001284-deriver-25.patch36.23 KBtim.plunkett
#22 3001284-deriver-22-interdiff.txt17.28 KBtim.plunkett
#22 3001284-deriver-22.patch36.35 KBtim.plunkett
#16 3001284-deriver-16.patch28.74 KBWim Leers
#16 interdiff.txt18.52 KBWim Leers
#14 3001284-deriver-14.patch25.59 KBtim.plunkett
#14 3001284-deriver-14-interdiff.txt4.77 KBtim.plunkett
#13 3001284-deriver-13.patch27.02 KBtim.plunkett
#13 3001284-deriver-13-interdiff.txt6.53 KBtim.plunkett
#12 3001284-deriver-12.patch20.49 KBtim.plunkett
#12 3001284-deriver-12-interdiff.txt3.12 KBtim.plunkett
#10 3001284-deriver-10-interdiff.txt5.35 KBtim.plunkett
#10 3001284-deriver-10.patch19.98 KBtim.plunkett
#8 3001284-deriver-8.patch18.9 KBtim.plunkett
#8 3001284-deriver-8-interdiff.txt2.16 KBtim.plunkett
#6 3001284-deriver-6.patch18.98 KBtim.plunkett
#6 3001284-deriver-6-interdiff.txt3.11 KBtim.plunkett
#2 3001284-deriver-2.patch18.91 KBtim.plunkett
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

tim.plunkett created an issue. See original summary.

tim.plunkett’s picture

Status: Active » Needs review
FileSize
18.91 KB
tim.plunkett’s picture

+++ b/core/lib/Drupal/Component/Plugin/Discovery/DerivativeDiscoveryDecorator.php
@@ -98,16 +99,7 @@ protected function getDerivatives(array $base_plugin_definitions) {
+        $plugin_definitions += $this->getDerivative($deriver, $base_plugin_id, $plugin_definition);

@@ -119,6 +111,23 @@ protected function getDerivatives(array $base_plugin_definitions) {
+  protected function getDerivative(DeriverInterface $deriver, $base_plugin_id, $plugin_definition) {

+++ b/core/lib/Drupal/Core/Plugin/Discovery/ContainerDerivativeDiscoveryDecorator.php
@@ -32,4 +37,13 @@ protected function getDeriver($base_plugin_id, $base_definition) {
+  protected function getDerivative(DeriverInterface $deriver, $base_plugin_id, $plugin_definition) {
+    $plugin_definitions = parent::getDerivative($deriver, $base_plugin_id, $plugin_definition);
+    $this->addCacheableDependency($deriver);
+    return $plugin_definitions;
+  }

This whole dance is needed because RefinableCacheableDependencyInterface is in Drupal\Core, which is not allowed to be used in Drupal\Component code like DerivativeDiscoveryDecorator.
Needs to be cleaned up a bit.

tim.plunkett’s picture

Title: Allow plugin derivers to specify cacheability metadata for their definitions » Allow plugin derivers to specify cache tags for their definitions

It's really just cache tags, not the rest of it

Status: Needs review » Needs work

The last submitted patch, 2: 3001284-deriver-2.patch, failed testing. View results

tim.plunkett’s picture

Status: Needs work » Needs review
FileSize
3.11 KB
18.98 KB

Status: Needs review » Needs work

The last submitted patch, 6: 3001284-deriver-6.patch, failed testing. View results

tim.plunkett’s picture

Status: Needs work » Needs review
FileSize
2.16 KB
18.9 KB

NO IDEA why the order of the plugin definition +=ing matters. Something for another day.

Status: Needs review » Needs work

The last submitted patch, 8: 3001284-deriver-8.patch, failed testing. View results

tim.plunkett’s picture

Status: Needs work » Needs review
FileSize
19.98 KB
5.35 KB

More fixes.
The fact that tests (and possibly runtime code!?) depend on the order that unsorted plugin definitions are returned is very frightening.

Status: Needs review » Needs work

The last submitted patch, 10: 3001284-deriver-10.patch, failed testing. View results

tim.plunkett’s picture

Status: Needs work » Needs review
FileSize
3.12 KB
20.49 KB

I guess we fail patches for missing docblocks now.

tim.plunkett’s picture

Last set of calls to clearCachedDefinitions that correspond to a deriver.
There's still one in \Drupal\language\Form\NegotiationConfigureForm that I don't understand.
And the one in views_invalidate_cache() that is called in several places.

tim.plunkett’s picture

DerivativeDiscoveryDecorator has two codepaths to fix. So new approach here. (This is all because of Drupal\Core vs Drupal\Component).

Wim Leers’s picture

Status: Needs review » Needs work
Issue tags: +D8 cacheability

Allow the derivers themselves to affect the list of cache tags used.

This sounds sensible.

This will remove the need for external code (like a hook) to clear the cache.

👏


  1. +++ b/core/lib/Drupal/Core/Entity/EntityFieldManager.php
    @@ -624,4 +627,11 @@ public function getExtraFields($entity_type_id, $bundle) {
    +  public function getCacheTags() {
    +    return ['entity_field_info'];
    +  }
    

    👍

    Let's call this in \Drupal\Core\Entity\EntityFieldManager::clearCachedFieldDefinitions() instead of hardcoding the same array of cache tags there.

    And there are two more places in this class where this cache tag shows up, both of those in combination with the 'entity_types' cache tag, which is the cache tag for EntityTypeManager.

    This makes me think we actually want a structure/approach similar to that in Entity:

      /**
       * {@inheritdoc}
       */
      public function getCacheTagsToInvalidate() {
        if ($this->isNew()) {
          return [];
        }
        return [$this->entityTypeId . ':' . $this->id()];
      }
    
      /**
       * {@inheritdoc}
       */
      public function getCacheTags() {
        if ($this->cacheTags) {
          return Cache::mergeTags($this->getCacheTagsToInvalidate(), $this->cacheTags);
        }
        return $this->getCacheTagsToInvalidate();
      }
    

    In this case, that'd mean EntityFieldManager::getCacheTagsToInvalidate() returns ['entity_field_info'] and EntityTypeManager::getCacheTagsToInvalidate() returns ['entity_types'].

    And because EntityFieldManager depends on EntityTypeManager, it ought to also merge EntityTypeManager's cacheability.

  2. +++ b/core/lib/Drupal/Core/Plugin/DefaultPluginManager.php
    @@ -23,9 +24,10 @@
    +class DefaultPluginManager extends PluginManagerBase implements PluginManagerInterface, CachedDiscoveryInterface, RefinableCacheableDependencyInterface {
    

    Implementing the RefinableCDI interface means that anybody can modify the cacheability of plugin managers based on this class.

    Is that desirable? AFAICT it isn't. We don't want arbitrary cacheability to be injected unless it's absolutely necessary to have that capability.

  3. +++ b/core/lib/Drupal/Core/Plugin/DefaultPluginManager.php
    @@ -282,6 +284,11 @@ protected function getFactory() {
    +    // Add the discovery as a dependency after retrieving the definitions so
    +    // that derivers have the opportunity to run and add their own dependencies.
    +    $this->addCacheableDependency($this->getDiscovery());
    

    This makes it look like all we want is for this to inherit the cacheability of the discovery.

    If that's the case, we can stick to CDI and use \Drupal\Core\Cache\CacheableDependencyTrait.

  4. +++ b/core/lib/Drupal/Core/Plugin/Discovery/ContainerDerivativeDiscoveryDecorator.php
    @@ -32,4 +37,11 @@ protected function getDeriver($base_plugin_id, $base_definition) {
    +    $this->addCacheableDependency($deriver);
    

    Same remark here. Because it's this object managing its own cacheability, rather than external object manipulating this object's cacheability, CacheableDependencyInterface suffices.

  5. +++ b/core/modules/block_content/src/Plugin/Block/BlockContentBlock.php
    @@ -164,9 +164,7 @@ public function blockForm($form, FormStateInterface $form_state) {
       public function blockSubmit($form, FormStateInterface $form_state) {
    -    // Invalidate the block cache to update custom block-based derivatives.
         $this->configuration['view_mode'] = $form_state->getValue('view_mode');
    -    $this->blockManager->clearCachedDefinitions();
       }
     +++ b/core/modules/system/src/Entity/Menu.php
    @@ -98,10 +98,6 @@ public static function preDelete(EntityStorageInterface $storage, array $entitie
    -    // Invalidate the block cache to update menu-based derivatives.
    -    if (\Drupal::moduleHandler()->moduleExists('block')) {
    -      \Drupal::service('plugin.manager.block')->clearCachedDefinitions();
    
    @@ -111,11 +107,6 @@ public function save() {
    -    // Invalidate the block cache to update menu-based derivatives.
    -    if (\Drupal::moduleHandler()->moduleExists('block')) {
    -      \Drupal::service('plugin.manager.block')->clearCachedDefinitions();
    -    }
    
    +++ b/core/modules/views/src/Plugin/views/display/Block.php
    @@ -377,9 +376,6 @@ public function remove() {
    -    if ($this->blockManager instanceof CachedDiscoveryInterface) {
    -      $this->blockManager->clearCachedDefinitions();
    -    }
    

    🤘

  6. +++ b/core/modules/block_content/src/Plugin/Derivative/BlockContent.php
    @@ -43,6 +47,8 @@ public static function create(ContainerInterface $container, $base_plugin_id) {
    +    $this->addCacheTags($this->blockContentStorage->getEntityType()->getListCacheTags());
    +++ b/core/modules/system/src/Plugin/Derivative/SystemMenuBlock.php
    @@ -44,6 +48,8 @@ public static function create(ContainerInterface $container, $base_plugin_id) {
    +    $this->addCacheTags($this->menuStorage->getEntityType()->getListCacheTags());
    
    +++ b/core/modules/views/src/Plugin/Derivative/ViewsBlock.php
    @@ -73,6 +77,8 @@ public function getDerivativeDefinition($derivative_id, $base_plugin_definition)
    +    $this->addCacheTags($this->viewStorage->getEntityType()->getListCacheTags());
    
    +++ b/core/modules/views/src/Plugin/Derivative/ViewsExposedFilterBlock.php
    @@ -72,6 +76,8 @@ public function getDerivativeDefinition($derivative_id, $base_plugin_definition)
    +    $this->addCacheTags($this->viewStorage->getEntityType()->getListCacheTags());
    

    Same thing. But here, it's also doing it on every call to getDerivativeDefinitions(). We can do it in the constructor instead.

  7. +++ b/core/modules/layout_builder/src/Plugin/Derivative/ExtraFieldBlockDeriver.php
    @@ -77,6 +80,8 @@ public static function create(ContainerInterface $container, $base_plugin_id) {
    +    $this->addCacheableDependency($this->entityFieldManager);
    +++ b/core/modules/layout_builder/src/Plugin/Derivative/FieldBlockDeriver.php
    @@ -86,6 +89,8 @@ public static function create(ContainerInterface $container, $base_plugin_id) {
    +    $this->addCacheableDependency($this->entityFieldManager);
    

    Again :)

  8. =

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
18.52 KB
28.74 KB
tim.plunkett’s picture

Thanks for the insight, and also the fixes!

+++ b/core/lib/Drupal/Core/Plugin/Discovery/ContainerDerivativeDiscoveryDecorator.php
@@ -41,7 +42,9 @@ protected function getDeriver($base_plugin_id, $base_definition) {
-    $this->addCacheableDependency($deriver);
+    $existing_cacheability = CacheableMetadata::createFromObject($this);
+    $additional_cacheability = CacheableMetadata::createFromObject($deriver);
+    $this->setCacheability($existing_cacheability->merge($additional_cacheability));

This is way less intuitive, which is why I mistakenly went for the RefinableCDI, because the oneliner made sense to me. Maybe CacheabilityDependencyTrait could have a protected version of addCacheableDependency that does this dance?

Wim Leers’s picture

This is way less intuitive, which is why […]

Yep, I totally get that!

Maybe CacheabilityDependencyTrait could have a protected version of addCacheableDependency that does this dance?

I'd support that :)

Status: Needs review » Needs work

The last submitted patch, 16: 3001284-deriver-16.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

Wim Leers’s picture

Prophecy\Exception\Doubler\MethodNotFoundException: Method `Double\EntityTypeManagerInterface\P262::getCacheTags()` 

Makes sense: core's entity type manager implements this additional interface, but that's not mandated by the entity type manager interface. So we'll need to either modify the interface (which breaks BC, so nope, despite it being an interface that probably nobody ever will implement from scratch…), or add an extra conditional check.

tim.plunkett’s picture

Assigned: Unassigned » tim.plunkett

Going to revisit this

tim.plunkett’s picture

Assigned: tim.plunkett » Unassigned
Status: Needs work » Needs review
FileSize
36.35 KB
17.28 KB

Okay this patch does several things.
1) Fixes the test failures by checking the interface on entityTypeManager first
2) Moves the super helpful public methods in RefinableCacheableDependencyTrait to be protected in CacheableDependencyTrait; now RefinableCacheableDependencyTrait is only to make those protected methods public
3) Restores the calls to all the RefinableCacheableDependencyTrait methods that were in my last attempt.

Wim Leers’s picture

Status: Needs review » Reviewed & tested by the community

2) Moves the super helpful public methods in RefinableCacheableDependencyTrait to be protected in CacheableDependencyTrait; now RefinableCacheableDependencyTrait is only to make those protected methods public

👏

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 22: 3001284-deriver-22.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

tim.plunkett’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
36.23 KB
691 bytes

Interesting that it passed before. +1 for strict phpcs though!

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 25: 3001284-deriver-25.patch, failed testing. View results

tim.plunkett’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
195.62 KB

???
Screenshot

catch’s picture

+++ b/core/modules/block_content/tests/src/Kernel/BlockContentDeletionTest.php
@@ -55,6 +55,13 @@ public function testDeletingBlockContentShouldClearPluginCache() {
+
+    // While the persistent cache is cleared, the static cache must be manually
+    // reset within the same request.
+    $property = new \ReflectionProperty($block_manager, 'definitions');
+    $property->setAccessible(TRUE);
+    $property->setValue($block_manager, NULL);
+

This is the main thing that sticks out in the patch. Should we look at taking a similar approach to #1596472: Replace hard coded static cache of entities with cache backends for plugin managers? The entity cache version doesn't need to deal with cache tags, but in principle it could work for that too. Not suggesting trying to do that here, but we could open a follow-up and link back. This is only affecting tests in the patch but it does mean the static cache is out of sync for real requests past the block deletion too.

tim.plunkett’s picture

Opened #3013659: Replace hard coded static cache of plugin definitions with cache backends including a description of why this isn't really a problem yet, but should be fixed anyway.

jibran’s picture

Issue tags: +Needs change record

I think contrib can benefit with this as well so let's have a change notice.

tim.plunkett’s picture

catch’s picture

Other question is whether we should leave the manual clearing in - then we wouldn't need the test changes at all, and the follow-up could just remove the workaround. As it is, the test changes are masking an actual bug in core and we've have to explicitly remember to remove it.

Or if #3013659: Replace hard coded static cache of plugin definitions with cache backends is simple to implement might be better to do that first.

catch’s picture

Status: Reviewed & tested by the community » Needs review
tim.plunkett’s picture

We hit the same problem in #2860346: Reset plugin discovery when a module/theme is installed and I had forgotten.

Wim Leers’s picture

catch’s picture

I don't think the tests should be changed by this patch because it's introducing a regression then adapting the tests to work around it (albeit a small regression that people are unlikely to notice). That means either leaving the hook implementations in, or doing #3013659: Replace hard coded static cache of plugin definitions with cache backends.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

tim.plunkett’s picture

Rerolling without the hacks to the tests.
Should fail a bunch.

Status: Needs review » Needs work

The last submitted patch, 38: 3001284-deriver-38.patch, failed testing. View results

penyaskito’s picture

Rerolled, the changes at core/modules/views/src/Plugin/views/display/Block.php already happened. The rest of the patch was auto-merged.

Same tests as #38 should fail.

Status: Needs review » Needs work

The last submitted patch, 40: 3001284-deriver-40.patch, failed testing. View results

geek-merlin’s picture

Found an identical issue and marked that one as dup. Note that it contains patch and discussion, so there may be valuable stuff to merge in there.

Wim Leers’s picture

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

geek-merlin’s picture

This is also needed for the 2.x version of role_toggle, for MenuLinkContentDeriver cache contexts.

geek-merlin’s picture

geek-merlin’s picture

It looks like i found a good-enough other way.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Status: Needs review » Needs work

The last submitted patch, 49: drupal-3001284-48-deriver-cache-tags.patch, failed testing. View results

nginex’s picture

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

geek-merlin’s picture

I opened #3268681: Allow plugin derivers to specify cache contexts for their definitions and wonder why this was not included here. So feel free to suck that in here or leave it separate.

Also did some housekeeping, crosslinked #2563979: Add a @CacheableMetadata annotation and set #2633878: Finalize cacheability for plugins as umbrella.

geek-merlin’s picture

Title: Allow plugin derivers to specify cache tags for their definitions » Allow plugin derivers to specify cacheability (tags, contexts, max-age) for their definitions

OMFG, the patch already contains cache tags, contexts and max-age, so updating title and dup-ing the other issue.

ravi.shankar’s picture

Added reroll of patch #51 on Drupal 9.4.x.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

rkcreation’s picture

Hi,

Patch #57 has worked for me, but in 9.5.x it seems lot of work of that patch was already done, and patch does not apply anymore.
Could someone reroll the patch please ?

Thanks for your work !

geek-merlin’s picture

Issue tags: +Needs reroll

Summoning good spirits.

_pratik_’s picture

Status: Needs work » Needs review
FileSize
48.62 KB

Reroll for 9.5.x
Thanks

ravi.shankar’s picture

Status: Needs review » Needs work

Patch #62 still not getting applied.

rkcreation’s picture

This one works for me :-)

rpayanm’s picture

Status: Needs work » Needs review
needs-review-queue-bot’s picture

Status: Needs review » Needs work
FileSize
160 bytes

The Needs Review Queue Bot tested this issue. It either no longer applies to Drupal core, or fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".

Apart from a re-roll or rebase, this issue may need more work to address feedback in the issue or MR comments. To progress an issue, incorporate this feedback as part of the process of updating the issue. This helps other contributors to know what is outstanding.

Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

rkcreation’s picture

Re-rolled for core 9.5.4.
I also add implementation of CachedDiscoveryInterface and RefinableCacheableDependencyInterface for MenuLinkManager, because I've experienced trouble with menu derivatives.
I think it might be MenuTreeStorage responsibility to propagate plugins' cache tags instead of this solution, but I've no really advised opinion on this and no time to explore this for now...

Berdir’s picture

For the record, we're moving in the opposite direction, to have as few cache tags in plugin definitions as possible. See #3335768: Manually clear cache keys from plugin managers with finite variations instead of using cache tags and https://www.drupal.org/node/3344524 for the reason. Cache tags have a cost, plugin definitions usually only have few or just a single cache key and while it requires a bit of extra code, it's far more efficient to invalidate them in the rare cases when they do change as opposed to checking if the plugin definitions are still valid on every single cache get.

-1 from me on supporting this.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Berdir’s picture

Status: Needs work » Postponed (maintainer needs more info)

The issue mentioned above has been committed now, as has the changes to the chained fast backend that will from now on cause it to validate cache tags in the discovery bin on every request.

Several of the affected cases here are in consideration to be refactored to not use derivates anymore. Personally, I've always considered those to be an anti-pattern, while there are use case that are valid as there are sometimes really different block definitions (views for example, with different argument contexts), things like block_content, layout builder, menu shouldn't be separate block plugin derivates, but a single one with configuration to select the block or menu that should be displayed. See #2940755: block_content block derivatives do not scale to thousands of block_content entities for scalability issues that the current architecture causes. Also #3365551: Add the notion of a 'configured layout builder block' to solve a number of content-editor and performance pain points for layout builder.

And with those changes, those would no longer be plugins and the need for invalidation of the block plugin cache would go away, instead there will either be a runtime check in the block plugin or we'll query block config instances and delete those.

The workarounds that are needed if this is not present are a cache clear call through an existing API, which is a single line of code, that's typically needed in 2-3 places at most.

I discussed a bit more with @catch in slack, he also added that this kind of encourages these "bad" patterns.

Setting to postponed (needs more info) for now.

andypost’s picture

kristiaanvandeneynde’s picture

Re #70:

Several of the affected cases here are in consideration to be refactored to not use derivates anymore. Personally, I've always considered those to be an anti-pattern, while there are use case that are valid

Another example is Group, plugins like group_node:CONTENT_TYPE need to use derivatives because the plugin ID is saved to the DB so that query access can be applied per content type by virtue of some complicated joins that also check for the plugin ID. Using a config form entry to select what content type to serve would make this impossible. There's more reasons why Group needs derivatives, but I won't bore you with the details.

Adding this to make it clear we shouldn't be moving away from derivatives up to a point where they are no longer an option as that would put Group (and other modules perhaps) in a tough spot.

Having said that, I never used cache tags to clear these derivatives in the first place. I resorted (and can keep resorting) to:

/**
 * Implements hook_ENTITY_TYPE_insert().
 */
function gnode_node_type_insert(NodeTypeInterface $node_type) {
  \Drupal::service('group_relation_type.manager')->clearCachedDefinitions();
}

Right?

catch’s picture

@kristiaanvandeneynde yes that will continue to work fine. We definitely can't deprecate using derivatives, we just need to stop using them in some specific cases. I would hope there are a lot less group types than fields or custom blocks.

Berdir’s picture

Exactly, the sole purpose of this issue is to replace clearCachedDefinitions() calls with cache tags. when you can invalidate cache tags you can always also call that API method. For an IMHO marginally better DX. It's a task, not a bugfix.

That said, the issue title also mentions contexts and age, which this issue does not yet implement and I also struggle to see use cases for that. Plugins that exist or don't exist or are different in some cases seem like a very bad idea.

Nigel Cunningham’s picture

The patch also needs more work. It removes BlockContent::invalidateBlockPluginCache without modifying the invocations of that method.

Nigel Cunningham’s picture

Here's an updated version of #67 that addresses the BlockContent issue I mentioned in the previous comment.

Wim Leers’s picture

Title: Allow plugin derivers to specify cacheability (tags, contexts, max-age) for their definitions » Allow plugin derivers to specify cache tags tags for their definitions
Issue summary: View changes

@Berdir in #74: good point. Re-scoped. (The scope was expanded in #56.)

But reading your #68, @kristiaanvandeneynde's #72 and @catch's #73, I'm wondering if we should instead close this issue? 🤔 #3043330: Reduce the number of field blocks created for entities (possibly to zero) already landed, and #2940755: block_content block derivatives do not scale to thousands of block_content entities and #3365551: Add the notion of a 'configured layout builder block' to solve a number of content-editor and performance pain points would further reduce the need for this, for the one use case (Layout Builder) where this is a significant performance issue?

kristiaanvandeneynde’s picture

I wouldn't necessarily read my comment as "nope, we don't need this".

I'm trying to get rid of caches we need to manually clear in my modules in favor of those that support cacheable metadata (mainly cache tags). For the Group example I made, if I were to tag my group_node deriver with node_type_list, I wouldn't have to manually clear my definitions anymore. So on that front it is useful.

FWIW I'd actually like to see most of these manual cache clears go away in core too as it's asking for trouble when some outside code inevitably changes the data without knowing that they should manually clear a cache somewhere. There will probably be a few cases where we can't use tags (such as the internal property of MemoryCache), but any "cache property" on a class that can be replaced by a MemoryCache, should be IMO. Same for persistent caches that are currently not using tags, but could be (such as is the case here).

Or at least where it makes sense.

Berdir’s picture

> But reading your #68, @kristiaanvandeneynde's #72 and @catch's #73, I'm wondering if we should instead close this issue?

Yes, that is my suggestion. There is an API to clear caches, if you can invalidate a tag you can also invalidate the plugin cache. Per #68, we specifically worked on removing as many cache tags as possible in plugin definitions because every plugin definition with cache tags results in runtime costs on every single request (if the cache tag isn't used anywhere else as well).

kristiaanvandeneynde’s picture

There is an API to clear caches, if you can invalidate a tag you can also invalidate the plugin cache

This is implying we always manually clear cache tags, though. When saving an entity this is far from the case as they have an individual cache tag, the default list cache tags and a potentially high amount of custom (list) cache tags.

To repeat the Group example, I'd need to implement group_node_type_(insert/update/delete) to clear a cache where, if we were able to add the node_type_list cache tag on the discovered collection, I would have to do nothing at all. But in that case, I wonder if we'd want to add said list cache tag in the deriver or the plugin definition or base plugin class instead.

Regardless of whether we close this issue, a big downside is that if I were to have, say, group_node, group_menu, group_whatever, I'd have to constantly clear the plugin definition cache whenever one of these groupable entities is saved. Which is insane, yet I see no way around this as we don't have "subcollections", only one big pile of all definitions.

Per #68, we specifically worked on removing as many cache tags as possible in plugin definitions

Do you have an issue link on this so I can read up? Might change my mind when I see the discussion surrounding that.

TL;DR: Derivers can lead to poor cacheability either way (see Group example), so I'm on the fence.

kristiaanvandeneynde’s picture

Come to think of it, if we were to split up the cache for a plugin manager to the following:

  • One entry for the 'regular" definitions, tagged with probably nothing
  • One entry for each deriver's definitions, tagged with the list tag of what it depended on
  • One aggregate entry containing all definitions, tagged with all the things

Then it would work a little more like the render cache.

Say you have a deriver that depends on node types, one that depends on menus and one that depends on vocabularies. Then, with the above, if a new vocabulary was cleared, we'd still get cache hits for the common plugins, the node type ones and the menu ones. So the aggregate would return a cache miss, we'd find all but the taxonomy base definitions as cache hits and be able to quickly store a new aggregate.

With this system, it would not be so bad to add cache tags to the derivers as they wouldn't flush the whole list of plugin definitions, but rather a subset.

joachim’s picture

> There is an API to clear caches, if you can invalidate a tag you can also invalidate the plugin cache.

Yes, but isn't the point of this issue that module B can define some derivative plugins that depend on module A's things, and module A doesn't know about it.

So when you change module A's data, module A doesn't know that the plugins need to be updated. What it does do is invalidate its cache tags.

Wim Leers’s picture

#82++

👀

Berdir’s picture

> So when you change module A's data, module A doesn't know that the plugins need to be updated. What it does do is invalidate its cache tags.

Do you have a specific example for this?

If module B defines something based on data from module A, then it can listen to those changing and invalidate the plugin definitions. almost any derivatives that I've seen that can change are entities, and those have standardized hooks.

tim.plunkett’s picture

I don't remember which hooks we had or if they were "standardized" yet, but 6 years ago when I opened this, I said:

Allow the derivers themselves to affect the list of cache tags used.
This will remove the need for external code (like a hook) to clear the cache.

So I don't know that this has any practical application anymore if we're okay with needing hooks forever

Berdir’s picture

It's not my decision to make, but my *opinion* is that a slightly improved DX (see changes in block_content as a good example) is not worth the performance cost that this would definitely cause.

The block_content changes also shows a considerable performance regression because I suspect this patch predates non-reusable blocks and with cache tags, we currently don't have a way to only invalidate on reusable blocks (maybe non-reusable blocks shouldn't invalidate the list cache tag? but that's a separate issue)

joachim’s picture

> If module B defines something based on data from module A, then it can listen to those changing and invalidate the plugin definitions. almost any derivatives that I've seen that can change are entities, and those have standardized hooks.

We can do that.

But I thought the benefit of cache tags was that instead of listening to changes and then clearing caches, we more simply declare what things our cache depends on, and then invalidation happens automatically.

John Pitcairn’s picture

^^ This. Being able to simply declare what entity list tags you depend on is a lot more transparent than having to discover which ugly procedural hooks you need to implement in a separate file (update and insert, what?), and what you have to do in those to then invalidate the entire plugin cache.

The DX improvement is more than minor IMO.

Berdir’s picture

> and what you have to do in those to then invalidate the entire plugin cache.

You always invalidate the entire plugin cache. #81 is not what is implemented here and I have no idea how that would work. Plugins in many cases require us to know all the plugins of a given type, as you often pick them from a list of all the plugins. Yes, sometimes you just load a specific plugin but the discovery API would require major IMHO BC breaking changes to support anything like that.

Cache tags are designed to be fast to invalidate an unlimited amount of cache entries, with the downside of having a runtime cost to check if cache tags are still valid on every single cache get. Plugin caches are in almost all cases a single or a finite amount of cache entries (e.g. one per language). That is not the use case that cache tags were designed to solve.

This didn't actually do test runs in a long time, a MR would show those problems in the new performance tests.

If you're concerned about the amount of derivative plugin definitions and frequent invalidations of them then you very likely should not be using derivatives but block configuration or so. See #70.

And as mentioned in #86, the the patch currently would invalidate block plugin definitions every single time that non-reusable blocks in e.g. layout builder would be edited, another performance regression.

kristiaanvandeneynde’s picture

FileSize
1.24 MB

Cache tags are designed to be fast to invalidate an unlimited amount of cache entries, with the downside of having a runtime cost to check if cache tags are still valid on every single cache get. Plugin caches are in almost all cases a single or a finite amount of cache entries (e.g. one per language). That is not the use case that cache tags were designed to solve.

This part I understand and can appreciate. Looking at the plugin managers in core, we usually set only one cache or one cache per language, so at best you have as many cache entries as there are managers and at worst you multiply that by the available languages.


However, we have almost 50 plugin managers in core (see screenshot for part of the list). Out of those, 3 (entity_types, local_task, breakpoint) already set cache tags on the cache entry, 4 vary by language, and then there's Migrate and Views also multiplying their cache entries by plugin type. The ones that already set cache tags

At one point you've got to question the DX behind having to manually clear one out of somewhere between 70 and 100 cache entries, as opposed to not having to do anything because cache tags take care of things for you.

Furthermore, the cache tags we'd probably be setting are (bundle) list cache tags. While not guaranteed, it's not unthinkable that many pages will contain at least one element that is also tagged with said cache tags and, as both you and the docs mention, that does not incur an extra performance hit.

See this part of the default plugin manager:

 * @param array $cache_tags
   *   (optional) When providing a list of cache tags, the cached plugin
   *   definitions are tagged with the provided cache tags. These cache tags can
   *   then be used to clear the corresponding cached plugin definitions. Note
   *   that this should be used with care! For clearing all cached plugin
   *   definitions of a plugin manager, call that plugin manager's
   *   clearCachedDefinitions() method. Only use cache tags when cached plugin
   *   definitions should be cleared along with other, related cache entries.
   */
  public function setCacheBackend(CacheBackendInterface $cache_backend, $cache_key, array $cache_tags = []) {

So I'd say we don't close this just yet but rather wait for someone to pour the patch into an MR and see what performance tests have to say?

kristiaanvandeneynde’s picture

Re your mention of #70 I'm still not sure where to go if derivers are considered a "bad" pattern. In Group, you have plugins for every node type, menu, taxonomy, etc. How would we achieve that if not for derivers?

joachim’s picture

If the problem with cache tags is performance and definitions, what about changing how a particular type's plugins are cached?

We currently cache plugin definitions as a single cache entry in the 'discovery' cache bin. That works fine for a relatively small set of plugins. But for a larger set, we could define a cache bin specifically for that plugin type, and store one cache entry per plugin definition and then each one gets its own invalidation.

Berdir’s picture

> Furthermore, the cache tags we'd probably be setting are (bundle) list cache tags.

You might, but the primary use case that's implement in this issue is for block_content content entities and their list tag.

> While not guaranteed, it's not unthinkable that many pages will contain at least one element that is also tagged with said cache tags and, as both you and the docs mention, that does not incur an extra performance hit.

I assume you mean the list cache tag of the bundle entity type (e.g. node_type_list) in this case and not the per-bundle list cache tags of the main entity type (e.g. node_list:article). I would argue that those are very, very rarely used on pages, because page content usually doesn't need to react on having new/removed bundles? rendered entities rely on field and formatter configuration, but those are different cache tags. And even if a cache tag is used on some/many pages, it won't be used on many other things like API, Drush, .. where your plugins will be loaded and used as well.

> At one point you've got to question the DX behind having to manually clear one out of somewhere between 70 and 100 cache entries, as opposed to not having to do anything because cache tags take care of things for you.

I don't get the argument here. You need to clear the specific plugin type(s) that you provide derivates for. It is not relevant if there are 10, 70 or 500 others? I would assume that cases where you expose something to more than 1/2/3 different plugin types are very rare?

> Out of those, 3 (entity_types, local_task, breakpoint) already set cache tags on the cache entry

Yes, that's what's left after #3335768: Manually clear cache keys from plugin managers with finite variations instead of using cache tags, it was quite a few more before that. The remaining ones are special, local task for example caches things based on the current page, there are many cache ids and we need a cache tag, the other have similar reasons, as does the library cache for example.

> Re your mention of #70 I'm still not sure where to go if derivers are considered a "bad" pattern. In Group, you have plugins for every node type, menu, taxonomy, etc. How would we achieve that if not for derivers?

I wouldn't. I think your use case is valid and and most sites have a manage amount of these things that rarely change (menu might be a bit tricky, we do have a site with 250 menus, already starts to get tricky there, but there are far worse performance issues there about so many menus and menu links).

I'm talking about cases like the two issues mentioned in #77. There are sites with hundreds and thousands of reusable block content entities, that causes havoc for the block plugin definitions. (And tens of thousands of non-reusable ones).

I'm not suggesting to remove derivers completely, just that you should consider if derivers are the right tool for your use case, because they do not scale and it's often not necessary. per-menu block derivates in core for example IMHO could easily be replaced with a configuration element where you select the menu, I don't think UX would be worse than now.

> We currently cache plugin definitions as a single cache entry in the 'discovery' cache bin. That works fine for a relatively small set of plugins. But for a larger set, we could define a cache bin specifically for that plugin type, and store one cache entry per plugin definition and then each one gets its own invalidation.

We can't discover single items, so invalidating single items also can't work. And it would result in even more cache tags, because then we'd definitely need cache tags for these entries. As mentioned, an existing pattern for this would be views data, where we have a single global cache and per-table caches because it's a pretty safe assumption that most pages will only need one/a few tables of views data. But invalidation still invalidates everything.

The only feasible solution for large sets with the current plugin architecture is to avoid having large set in the first place.

But: this issue is not about large sets, and it's not about removing the ability to have derivatives. Those issues are just related/being mentioned because this issue changes a few specific implementations that shouldn't use derivates in the first place. But that's not the reason why I'm against doing this. The reason for that is #3335768: Manually clear cache keys from plugin managers with finite variations instead of using cache tags and the related changes around fast chained cache backends.

kristiaanvandeneynde’s picture

Re #93: Thanks for clarifying. For a second I was under the impression that all derivers were considered "bad". I fully agree on blocks being too giddy on the things they create derivatives for (such as your menu block example).

The only feasible solution for large sets with the current plugin architecture is to avoid having large set in the first place.

Yeah, hence my comment in #81. A bit besides the question here, but something that could solve the "all plugins are thrown into one big pile" issue.

I've read up on #3335768: Manually clear cache keys from plugin managers with finite variations instead of using cache tags and I can see why we did that. We were adding arbitrary cache tags that nothing else was using and probably never would use. So I'm 100% on board with these cache tags being removed there. What I'm looking for here is for a deriver to say: "I'm gonna spit out a bunch of derivatives based on node types, so the collection should be cached with the node type list tag".

We don't need to use that for blocks, or any other place where using derivatives is already questionable. But for those places where derivers do make sense, such as in Group, it would be a nice QOL improvement to be able to specify (list) cache tags. Even though that may end up flushing the full list of plugin definitions whenever a new node type, menu, etc. is introduced. But that we can tackle in a separate issue which focuses more on what I said in #81.

Would the above paragraph be considered on topic for this issue? Because if it is, then maybe we can narrow down the MR to facilitating just that.

kristiaanvandeneynde’s picture

Re:

I don't get the argument here. You need to clear the specific plugin type(s) that you provide derivates for. It is not relevant if there are 10, 70 or 500 others

Yeah, but I'd rather just add node_type_list as a cache tag than have to manually call a cache clear in node_type_insert, node_type_update, node_type_delete and repeat that process for any other deriver that relies on menus, taxonomies, etc. I'd end up with a dozen hook implementations where a few simple cache tags could have achieved the same result.

catch’s picture

If/when we do #3366083: [META] Hooks via attributes on service methods (hux style) then node_type_insert, node_type_update, node_type_delete could all be implemented in a single method with three attributes (or one attribute with three hook names) - assuming they do the same thing.

If bundles is the main use case here, I do wonder whether a single 'bundle_list' cache tag that covers every bundle on every entity type is worth looking at, then we wouldn't get cache tag checksum overload if it's re-used. Would be a bit less granular but bundles being acted on is fairly unusual.

kristiaanvandeneynde’s picture

If bundles is the main use case here, I do wonder whether a single 'bundle_list' cache tag that covers every bundle on every entity type is worth looking at, then we wouldn't get cache tag checksum overload if it's re-used. Would be a bit less granular but bundles being acted on is fairly unusual.

That would seem reasonable to me. Can then add said cache tag in my plugin manager and document it as such:
"Many plugins that allow you to put a specific entity type into a group will often have a derivative for each bundle of said entity type. As such, we invalidate the plugin definitions cache whenever a bundle is updated."

The only major downside being that if you're just using GroupNode, you'd also flush the cache whenever you edit a taxonomy, group type, etc. But you're right: How often does that happen?

Somehow this feels like asking "what's the worst that could happen?", to which the answer is often not what you expected :P

Berdir’s picture

We already have a entity_bundles cache tag, invalidated in \Drupal\Core\Entity\EntityTypeBundleInfo::clearCachedBundles(). So if you want to use that in your plugin manager, you could. But not everything you list are bundles (menus are not) and things are bundles that you might be aware of (webforms for example, which is I think the major reason why #3043330: Reduce the number of field blocks created for entities (possibly to zero) caused such issues for some sites, as having dozens/hundreds of webforms isn't unheard of, compared to other bundles types).

Ther's also hook_entity_bundle_create and delete, but not update. But there is hook_entity_update/insert/delete, where you can check the entity type bundle_of information, and you could then only invalidate if that's actually an integrated entity type, if you have that information.

kristiaanvandeneynde’s picture

Okay, so maybe a bundle cache tag isn't great. Which leads us back to #94: Could we allow a deriver to set a cache tag, but not necessarily implement that for things like blocks. At least, that way, Group and other modules could use it.

These cache tags wouldn't be unique, like the ones from #3335768: Manually clear cache keys from plugin managers with finite variations instead of using cache tags were. If you clear node_type_list, the /node/add page among other things also need rebuilding. The only downside is that you'd now check for the node_type_list tag in the database on pages where you previously might not have.

To me it feels like better DX vs one extra query to the cache tags table. Or am I oversimplifying this too much?

kristiaanvandeneynde’s picture

I mean:

  • I know my definitions depend on node types.
  • I know that when the list of node types changes, so do my plugin definitions.
  • I know there's a cache tag that I can use for this.

To me, the DX screams: Use the cache tag!

Now I understand that not using a cache tag means one query fewer, provided I manually clear the cache on certain events. But isn't this why we have cache tags in the first place? To make it so we don't have to worry about manually clearing caches.

I also get the nuance that we can't predict all the CIDs like we can here and that that's where cache tags truly shine. For instance: it's impossible to know the CIDs of all the views that might depend on the list of nodes without running some extra, likely expensive, code.

I don't know, I just feel like this is a step backwards in terms of DX. Hux-type hooks, while awesome, won't help here either because the manager does not know beforehand which entity types' hooks it needs to subscribe to. I can't predict the entity types people might want to write a Group plugin for.

Berdir’s picture

Now I understand that not using a cache tag means one query fewer, provided I manually clear the cache on certain events. But isn't this why we have cache tags in the first place? To make it so we don't have to worry about manually clearing caches.

Not sure what else I can say that I didn't already say here.

Yes, cache tags are great, but they come with a cost, and that needs to be considered. Not with every basic cache tag you add to a render array, that usually doesn't result in a measurable difference (also because there often are already unique cache tags on them anyway), but when deciding whether or not we should add such a fundamental API? Definitely. Plugin caches are usually using the fast chained backend, so that means no extra DB queries apart from the per-bin timestamp check, so every plugin that that ends up having at least one cache tag means we suddenly have database queries where before we didn't.

*If* we have this API in place, then specific contrib/custom deriver implementations are in most cases not going to carefully consider the implications, they'll that this is supported and use it.So it's on us, in this issue, to decide about supporting that.

That's why we have #2241377: [meta] Profile/rationalise cache tags open for a decade now, to consider questions like this (although with the performance testing that we have in place now, we might actually be able to just close that?)

If you clear node_type_list, the /node/add page among other things also need rebuilding. The only downside is that you'd now check for the node_type_list tag in the database on pages where you previously might not have.

Yes, but that's exactly the point? Cache tags on a specific page (and likely many specific pages then) is the only relevant metric. Cache tag retrieval is only a static cache, so how many unique cache tags you have on a specific page is exactly what matters. Loading the checksum for node_type_list on /node/add is perfectly fine, doing so on basically every single page is a different story.

That's also why the long list of plugin caches you posted in #90 is for me an argument against doing this. That just means more possible cases that could result in additional, unexpected cache tag lookups.

Hux-type hooks, while awesome, won't help here either because the manager does not know beforehand which entity types' hooks it needs to subscribe to. I can't predict the entity types people might want to write a Group plugin for.

I think hux/event style things were mentioned because you could have a single method to be notified on insert/update/delete events, likely for multiple specific entity types too, while currently you'd need several functions and either per-entity-type hooks or all with runtime checks.

joachim’s picture

A couple of thoughts:

1. Part of the problem here is that we treat both plugins and blocks as hammers to use on everything. Plugins are designed to be swappable pieces of functionality, but they end up being used for anything we need a 'thing' there could be more than one of. Similarly blocks - we've extended them with content blocks, and then re-purposed them to work inside layouts, which means that instead of having maybe 20-50 blocks on a big site, we have hundreds because now there are blocks that are only used on one page.

We've pushed both of these systems beyond what they were designed to do.

2. Cache invalidation is weird and not well documented. But basically, the cost is that every cache read needs to check the tags to see if anything has been invalidated. I've always wondered why it's this way round rather than what I would naively have done, which is to delete cache entries when the invalidating change is made. In this particular case, would that kind of cache cleaning strategy work bettter? Could it be made swappable, or is tag invalidation baked too deeply into the cache system?

catch’s picture

@joachim

Cache invalidation is weird and not well documented. But basically, the cost is that every cache read needs to check the tags to see if anything has been invalidated. I've always wondered why it's this way round rather than what I would naively have done, which is to delete cache entries when the invalidating change is made. In this particular case, would that kind of cache cleaning strategy work bettter? Could it be made swappable, or is tag invalidation baked too deeply into the cache system?

It's entirely swappable, but trying to directly delete cache items is almost never the correct implementation, and it's why so many other cache implementations (including Symfony's and Laravel's) outside Drupal fail to implement cache tags, or when they do, they do so in a way that doesn't scale to the kind of situations Drupal's implementation allows for (like hundreds of thousands of cached rendered entities with their own cache tags).

1. When caching in a relational database, it's possible to delete cache items on tag invalidation, but we would need to run a DELETE query, and deleting a lot of rows is extremely expensive and slow. Doing this on cache_render is a non-starter.

2. Core's other main cache backend is APCu, and it's not possible to introspect all APCu entries to check if they have cache tags. In the chained fast case we just blow away the 'fast' cache whenever there's an invalidation/write anyway because it's backed by the persistent backend, that's why there are no cache tags checks on chained fast hits. But e.g. memcache has the same issue, so the checksum approach is necessary for a lot of contrib cache backends too.

So it would be possible to write a no-checksum cache tag supporting database backend, and then change the discovery cache bin(s) to use that database backend. But as soon as that cache tag is used in a render array or anything that's stored in a cache bin using the checksum backend, we'd still have to query it anyway, and when it gets invalidated, it'd need to be both deleted from one type of cache backend, and incremented in the checksum implementation, so now two things have to happen on invalidation instead of one.

There's a good high-level write up here https://mglaman.dev/blog/drupal-cache-tags-all-regardless-your-backend

@berdir

I think hux/event style things were mentioned because you could have a single method to be notified on insert/update/delete events, likely for multiple specific entity types too, while currently you'd need several functions and either per-entity-type hooks or all with runtime checks.

Yes exactly.

joachim’s picture

> 1. When caching in a relational database, it's possible to delete cache items on tag invalidation, but we would need to run a DELETE query, and deleting a lot of rows is extremely expensive and slow. Doing this on cache_render is a non-starter.

In our case, the cache invalidation happens when a user edits a config entity. It's an admin task and it's fine for that to take some time.

So this ties in to another feature I've been thinking about -- deleting dependent entities or cleaning up dangling references. There are various contrib modules that specifically do this, and they all hit the problem of scaling -- when you save an entity form, you can't go off and delete or update 1000s of entities during the save hook.

Core needs a way to be able to say 'Hey this create/update/delete API call you're making requires some follow-on tasks, we're taking you to a Batch API page while we run them'. I don't know how we'd do that, as batch API is tied closely to forms, whereas we'd want something at the API level.

catch’s picture

In our case, the cache invalidation happens when a user edits a config entity. It's an admin task and it's fine for that to take some time.

It's not necessarily an issue that it's slow for the person triggering the cache invalidation, it's that slow queries can lock up the database in general when there's lots of contention.

Let's say a DELETE query on the cache_render table takes 2s. During those two seconds, requests are coming in, and they're trying to write cache items to it as well - either unrelated ones, or ones that were just invalidated within the two seconds. Then on top of this, someone saves a node which invalidates the node:list cache tag and this triggers another 2s delete query on the cache_render table, but is blocked on the first query due to row locking.

It's the sort of thing that can cause site outages. Exactly how bad it is depends on what the indexes are like, but given we'd have to store all the tags in one column, it would require a LIKE query so it's not going to be perfectly indexed whatever we do. This is the kind of issue we avoid even having to think about by using the checksums.

Core needs a way to be able to say 'Hey this create/update/delete API call you're making requires some follow-on tasks, we're taking you to a Batch API page while we run them'. I don't know how we'd do that, as batch API is tied closely to forms, whereas we'd want something at the API level.

#1189464: Add an 'instant' queue runner and #1797526: Make batch a wrapper around the queue system would be steps towards this - i.e. we should really use the queue API for things like this, but then we also don't want to wait up to three hours for cron to run (and maybe two days for it to clear out the queue), or necessarily to block someone's use of the website for 20 minutes while the batch runs. If we could add items to queues, and then have a non-blocking queue runner clear stuff out of the queues either end of request or via a small AJAX indicator in the toolbar or similar, we'd be able to defer a lot of these longer jobs that way, and massive sites with dedicated/frequent queue runners could rely on those too. Bit off-topic for here though, and it doesn't help with the cache invalidation issue for the reasons given above.

kristiaanvandeneynde’s picture

Right, I've spent some time reviewing the last few comments and I fully understand where the hesitation is coming from.

However, it seems like we're shying away from implementing a QOL feature due to a drawback from another feature (cache tag invalidation count lookups). So let me counter this by saying: "What if we made that drawback go away instead?"

The current implementation is that for every cache item we check its cache tags and, for those that we did not retrieve the invalidation count for yet, we fetch the invalidation counts from the DB. This leads to multiple of the same queries to the DB and this query count would increase if we allow derivers to add cache tags to the discovery.

Counterpoint: What if we build a simple system that allows us to either specify or identify commonly used cache tags and, at the very first call to ::calculateChecksum(), we load ALL of them with one query. This would both improve the performance of Drupal as it stands and make the current issue less of a hassle because we could (but don't have to) add whatever cache tag our deriver specifies to the "common" list.

I feel like this wouldn't necessarily have to be a smart system that tries to automagically deduce which tags belong to the list, but could rather be a very rudimentary system where both core and contrib can specify tags to add to the list. We could then even have the entity type manager make sure that all entity type's list tags are added to said list, for instance, but that's up for discussion.

This would be a simple alternative to #3358701: Add a CacheTagsAggregator that's actually quite easy to implement and test with our new performance tests. We could start with looking at grafana for the tags that are being retrieved during our performance tests and add all of them (aside from node:ID etc.) to this list.

catch’s picture

Counterpoint: What if we build a simple system that allows us to either specify or identify commonly used cache tags and, at the very first call to ::calculateChecksum(), we load ALL of them with one query. This would both improve the performance of Drupal as it stands and make the current issue less of a hassle because we could (but don't have to) add whatever cache tag our deriver specifies to the "common" list.

I feel like this wouldn't necessarily have to be a smart system that tries to automagically deduce which tags belong to the list, but could rather be a very rudimentary system where both core and contrib can specify tags to add to the list. We could then even have the entity type manager make sure that all entity type's list tags are added to said list, for instance, but that's up for discussion.

This seems worth a go, it would be a bit like a (simpler) version of the path alias multiple load/cache - generate the list of things that we think might be loaded, and then load the actual things in one query.

kristiaanvandeneynde’s picture