Cache tags grow endlessly [#3097393]

Problem/Motivation

Drupal core ships with cache tags support - awesome.
To ensure that the tags for an item were not invalidated, a CacheTagsChecksumInterface service is (eventually) consulted on cache read, and D8 core provides a database-backed implementation of that service.

My issue is with the cache tags sub-system internals. As it is implemented, the list of cache tags on the platform will grow endlessly due to the DatabaseCacheTagsChecksum implementation.

Consider the following scenario with highly volatile custom entities: add 100k instances, delete them, then add 100k new ones.

The system will end up with 200k cache tags in the table, 100k of which will never be used again. They will just stay there, clutter the database, and cause overall slow-downs. Imagine when this process continues for a while...

Even after a full cache clear (that happens only rarely) all tags are still kept there.

In my case the cache tags query has the highest throughput and is the biggest time consumer of any DB query in the whole system, even though it's fast on average. I have around 40-50k valid entities in the system and around 120-130k cache tags in the table.

I think this problem affects only the DB implementation, as Memcache and Redis (if they have implementations of the interface) will scale in O(1) rather than O(log N) in the amount of data in the system. On top of that, they have robust garbage collection mechanisms in case of memory pressure. SQL databases have none of that.

Proposed resolution

1. Create a new interface Drupal\Core\Cache\CacheTagsPurgeInterface that has one method, purge().
2. Have cache_tags.invalidator and cache_tags.invalidator.checksum implement the interface and its method. In cache_tags.invalidator::purge(), loop through all the collected invalidators and call purge() on those that implement the interface. In cache_tags.invalidator.checksum::purge(), truncate the cachetags table.
3. In drupal_flush_all_caches(), call cache_tags.invalidator::purge() right before flushing cache bins. (If a cache tag is invalidated between the tags being purged and the cache bins being flushed, it will be included in the checksum of any new cache items, which remain valid because the tag was written before the cache item was created.)
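
The proposed resolution can be sketched in plain PHP roughly as follows. This is a minimal standalone model, not the code from the merge request: the interface name mirrors the one proposed above, while InMemoryChecksum, NoStorageInvalidator, TagInvalidator, and the array standing in for the cachetags table are hypothetical stand-ins for the real services.

```php
<?php

// Mirrors the proposed interface; declared locally so the sketch runs standalone.
interface CacheTagsPurgeInterface {
  public function purge(): void;
}

// Hypothetical stand-in for the checksum service. An array replaces the
// cachetags table: tag => invalidation count.
final class InMemoryChecksum implements CacheTagsPurgeInterface {
  private array $invalidations = [];

  public function invalidateTags(array $tags): void {
    foreach ($tags as $tag) {
      $this->invalidations[$tag] = ($this->invalidations[$tag] ?? 0) + 1;
    }
  }

  // Sum of invalidation counts; tags without a row count as 0.
  public function getCurrentChecksum(array $tags): int {
    $sum = 0;
    foreach ($tags as $tag) {
      $sum += $this->invalidations[$tag] ?? 0;
    }
    return $sum;
  }

  public function rowCount(): int {
    return count($this->invalidations);
  }

  // Stands in for truncating the cachetags table.
  public function purge(): void {
    $this->invalidations = [];
  }
}

// Hypothetical invalidator that keeps no purgeable data of its own.
final class NoStorageInvalidator {
  public function invalidateTags(array $tags): void {}
}

// Hypothetical stand-in for the collecting invalidator service.
final class TagInvalidator implements CacheTagsPurgeInterface {
  private array $invalidators = [];

  public function addInvalidator(object $invalidator): void {
    $this->invalidators[] = $invalidator;
  }

  public function invalidateTags(array $tags): void {
    foreach ($this->invalidators as $invalidator) {
      $invalidator->invalidateTags($tags);
    }
  }

  // Only invalidators that implement the purge interface are purged.
  public function purge(): void {
    foreach ($this->invalidators as $invalidator) {
      if ($invalidator instanceof CacheTagsPurgeInterface) {
        $invalidator->purge();
      }
    }
  }
}

$checksum = new InMemoryChecksum();
$invalidator = new TagInvalidator();
$invalidator->addInvalidator($checksum);
$invalidator->addInvalidator(new NoStorageInvalidator());

$invalidator->invalidateTags(['node:1', 'node:2']);
$before = $checksum->getCurrentChecksum(['node:1', 'node:2']);

$invalidator->purge();
$after = $checksum->getCurrentChecksum(['node:1', 'node:2']);
```

In this model the purge leaves the storage empty and every checksum reads as 0 again, which is why the purge has to be paired with emptying the cache bins.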

Remaining tasks

User interface changes

None.

API changes

cache_tags.invalidator and cache_tags.invalidator.checksum services now implement a new interface Drupal\Core\Cache\CacheTagsPurgeInterface and its purge() method. This method is used by the cache tag checksum service to delete its data.

If alternative implementations of the cache tag checksum service also need to explicitly purge their data, they can implement this interface.

Data model changes

Release notes snippet

Issue fork drupal-3097393

3097393-cache-tags-grows: MR !11875

Comments

Comment #1

28 November 2019 at 16:09

ndobromirov created an issue. See original summary.

Comment #2

ndobromirov commented 28 November 2019 at 16:17

Issue summary:

View changes

Comment #3

wim leers

Ghent 🇧🇪🇪🇺

commented 4 December 2019 at 22:56

Status:

Active

» Postponed (maintainer needs more info)

> Consider the following scenario with highly volatile custom entities: add 100k instances, delete them, then add 100k new ones.
>
> The system will end up with 200k cache tags in the table, 100k of which will never be used again. They will just stay there, clutter the database, and cause overall slow-downs. Imagine when this process continues for a while...

This is indeed expected behavior.

You're doing something pretty atypical: creating 100K entities, deleting them, then recreating them.

Is this a custom entity type? Are entities of this type always as ephemeral as in your description?

Comment #4

gapple

he/they

English

commented 5 December 2019 at 06:51

My use case is a migration that regularly consumes content from an external source, and after a period the content expires and the nodes are deleted. It's probably (only) up to a few hundred nodes per day in my case, but there's currently 1.5m cachetags on the site due to resetting the migration frequently in the past.

Comment #5

ndobromirov commented 5 December 2019 at 08:58

Status:

Postponed (maintainer needs more info)

» Needs review

In our case the problem is not as extreme as the example in the description, but there are still ~70k unused cache tags in the cachetags table.

It is content from an external system that is synced as local custom entities.
200-300 editors manage content from that system and from time to time delete content.
The deleted items are propagated to Drupal daily and get deleted there as well.
The deletes we do are only the delta of the change (not delete all, add all), usually ranging from 50-150 deleted things daily out of ~80k total.

We had to clear some redirects related to that content, due to URL problems (all those deleted redirect cache tags are still there), etc.

Long story short - over about a year we've accumulated ~60k+ dead cache tags in total.
They are a mix of: our custom entities, redirects, terms, nodes etc.

This query started showing up as a top DB consumer in NewRelic, so I dug in a bit :).

> This is indeed expected behavior

I suspected so, as I came to the same conclusion going through the code's documentation.

The issue was opened with the aim of documenting that it's the intended behavior, and hopefully of getting a direction on how to clean up the dead tags in a safe way.

The easy things I can think of doing are:
- An update hook that truncates the table as a one-off.
- A Drush command that does essentially the same + a full cache clear upon invocation. We could have that run once per month, so data does not pile up too much.
- Another direction is to move to a Memcache / Redis implementation of the checksum service (seems there is one in both modules). As they have native LRUs internally, sooner or later cache tags will get reclaimed.

Personally, it seems like strange behavior to just pile up the data endlessly, without any process / tool to reclaim those resources other than truncating the table manually from SQL.

Comment #6

ndobromirov commented 5 December 2019 at 08:59

Status:

Needs review

» Active

Wrong status...
There is still nothing to review.

Comment #7

jcisio commented 9 January 2020 at 11:33

In one of our projects, which has a lot of imports, there are 1.7 million entries in the cachetags table. A simple query UPDATE {cachetags} SET invalidations=invalidations + 1 takes 10 seconds. Luckily, Drupal 8.8, with transaction support for cachetags invalidation, partly helps with that, but a large table is always a problem.

Comment #8

9 January 2020 at 11:33

Version:

8.9.x-dev

» 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Comment #9

9 January 2020 at 11:33

Version:

9.1.x-dev

» 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Comment #10

9 January 2020 at 11:33

Version:

9.2.x-dev

» 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #11

toamit commented 27 May 2021 at 01:16

I just released the Cache Utility contributed module, which truncates the cachetags and cache_* tables via the UI, Drush, and curl commands. With this module I can set a cron job to periodically purge the cachetags table to keep database sizes sane.

Comment #12

jweowu commented 17 September 2021 at 02:48

Could someone give a summary of the logic behind this behaviour?

In particular, why cachetags is excluded from a full cache purge.

If the purpose of the cachetags table is to detect whether or not it would be valid to obtain and use a given cache entry, but the entire cache has been purged (meaning that there are no cache entries to obtain, valid or otherwise), then why do we need that old cachetags data to stick around?

The only thing that occurs to me offhand is that, in the middle of the process of purging all of the caches one by one, some of the caches may be rebuilt before the last of the caches is emptied, and furthermore some of the new entries in the rebuilt caches might be invalidated before the remaining cache purges are completed -- in which case you'd want to retain the information about the invalidations which occurred since the purge began.

If that's why, is there any reason not to use a REQUEST_TIME timestamp instead of a sequential count (the integer invalidations column) in the cachetags table? I may well be missing something as I've only started looking at this, but at present I don't understand why the specific number of invalidations might be needed.

The only code which calls getTagInvalidationCounts() is calculateChecksum() so my firm impression is that invalidations just needs to be different to all previously-used values for the purpose of generating a different checksum, in which case a timestamp would (a) achieve that equally well; (b) eliminate any need to do +1 calculations when updating; and (c) it would allow us to say "We started the cache purge at time T and have finished that purge, so now delete all cachetags rows with a timestamp < T, because all of those rows were made redundant by the purge."

Comment #13

jweowu commented 17 September 2021 at 03:31

Thinking more clearly, a fairly-obvious answer to my question is that a given item may be cached and then invalidated multiple times during the same request, or by different requests happening at the same REQUEST_TIME, which does mean you need something like a sequence with write-locking for enforcing uniqueness.

A timestamp could perhaps be an additional column in that case, rather than a substitute for the count? The ability to flush old rows still seems important.

Comment #14

loopy1492 commented 8 November 2021 at 18:57

We also have 70,000+ cachetags due to weekly content import. Is there anything we can run on cron to clear this out?

Comment #15

toamit commented 8 November 2021 at 22:12

@loopy1492 See a new module for this purpose noted in #11

Comment #16

8 November 2021 at 22:12

Version:

9.3.x-dev

» 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #17

8 November 2021 at 22:12

Version:

9.4.x-dev

» 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #18

acbramley commented 28 July 2022 at 22:52

We are seeing these issues as well when investigating problems with Redis. In our case, there are over 800k webform_submission cachetags sitting in Redis. Most of these submissions no longer exist as we have logic to purge these submissions every 2 weeks and archive them into an S3 bucket. Since these cache entries never expire, this over time will just continue to grow.

Comment #19

28 July 2022 at 22:52

Version:

9.5.x-dev

» 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #20

hkirsman commented 28 December 2022 at 12:26

We found 8 million cachetags entries. We are sure that about 7.2 million of them no longer have a corresponding entity.

Yet to figure out if that's the cause of performance issues on the site.

Comment #21

wim leers

Ghent 🇧🇪🇪🇺

commented 28 December 2022 at 13:43

7.2 million entities have disappeared?!

What entity type is this? Is it perhaps one without storage?

Comment #22

hkirsman commented 28 December 2022 at 14:31

It's a custom feature. On the Drupal side we create queue items (which are a custom entity) for search indexing when saving a node. As we use Elasticsearch, we don't want to make saving a batch of nodes slow, e.g. save 100 nodes, send 100 requests to Elasticsearch.

Didn't know there would be leftovers after deleting nodes / entities.

Wondering if this would be an OK fix for now:

In .install file:

function my_module_update_8003() {
  \Drupal::database()->delete('cachetags')
    ->condition('tag', 'elastic_request:%', 'LIKE')
    ->execute();
}

And in that custom entity class overwrite postDelete(). Didn't see postDelete() doing anything other than invalidateTagsOnDelete. I would have also called parent::postDelete() if it would not be somehow re-adding that entry to DB.

class ElasticRequest extends ContentEntityBase implements ElasticRequestInterface {

  public static function postDelete(EntityStorageInterface $storage, array $entities) {
    // Delete elastic_request cache tag when deleting elastic_request entity.
    // @todo Remove postDelete() after https://www.drupal.org/project/drupal/issues/3097393 is fixed.
    foreach (array_keys($entities) as $entity_id) {
      \Drupal::database()->delete('cachetags')
        ->condition('tag', 'elastic_request:' . $entity_id)
        ->execute();
    }
  }

  ...

Comment #23

28 December 2022 at 14:32

Version:

10.1.x-dev

» 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #24

samlerner commented 4 August 2023 at 18:45

Could someone describe what the potential negative impact is to clear the entire cachetags table?

Or, how do you determine what tags are no longer in use? Check each one to see if it references deleted stuff? Could we use that to run a cleanup command on a regular basis?

Comment #25

gapple

he/they

English

commented 5 August 2023 at 04:02

As I understand, there is no problem if cachetags are cleared at the same time as a full cache flush - a new cachetag entry will be added to the database as needed when a tag is next invalidated. If done automatically when flushing the cache, it would just make the operation take longer.

If tags are cleared without clearing cache items, there is a specific, probably unlikely, case where a cached item could have the same checksum and be out-of-date but still served.
The checksum is a sum of the invalidations on the item's tags, so if:
- Item saved with: A:1,B:0 (checksum 1)
- tags are cleared, so invalidations are reset to 0
- B is invalidated
- the cache item is valid against the new checksum of A:0,B:1 despite being out-of-date

The checksum is checked for equality, so a lower checksum from the database will still invalidate the cache item (e.g. an item with two tags' checksum is 12, but the cachetags in the database have both been reset to 0 - since 0 != 12 the item will be invalid).
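
The arithmetic in the two cases above can be checked with a small standalone PHP model. This is only a sketch under the assumption stated in the comment (the checksum is the sum of the invalidation counts of the item's tags, compared for equality); tag_checksum() is a hypothetical helper, not a core function.

```php
<?php

// Hypothetical helper: sum of invalidation counts for the given tags.
// Tags without a row count as 0.
function tag_checksum(array $counts, array $tags): int {
  $sum = 0;
  foreach ($tags as $tag) {
    $sum += $counts[$tag] ?? 0;
  }
  return $sum;
}

// Case 1: the item is saved with A:1, B:0, so its stored checksum is 1.
$tags = ['A', 'B'];
$counts = ['A' => 1, 'B' => 0];
$stored = tag_checksum($counts, $tags);

// Tags are cleared without clearing cache items, then B is invalidated.
$counts = [];
$counts['B'] = ($counts['B'] ?? 0) + 1;

// The stored checksum matches again although the item is out of date.
$stale_item_still_valid = ($stored === tag_checksum($counts, $tags));

// Case 2: the stored checksum is 12 and both tags are reset to 0.
$stored_high = tag_checksum(['A' => 5, 'B' => 7], $tags);
$item_valid_after_reset = ($stored_high === tag_checksum([], $tags));
```

The first comparison is the unlikely collision; the second shows that a reset alone is detected, because the comparison is for equality rather than "greater than".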

Comment #26

jweowu commented 6 August 2023 at 01:17

IIUC (see #12) the main problem will be that you cannot (reliably) clear all cachetags "at the same time" as clearing the caches.

Even trusting that all cache backends will reliably behave the same way, clearing caches invokes hooks so that modules can react and clear their data, and I assume there's no guarantee that along the way some of that arbitrary code doesn't cause something to be cached and invalidated.

If all of your caches were in the database and you knew where they all lived and you executed a single transaction which (only) truncated all the cache tables, deleted any other data which was required, and deleted the cachetags then I imagine that would be fine; but in practice I don't think "clearing all caches" is nearly so clean a process.

You can't naively purge cachetags before clearing other caches, because cache entries may be needed in the process of clearing caches.

You can't naively purge cachetags after clearing other caches, because in the process of clearing caches you may have acquired and invalidated new cache entries.

(n.b. This is my speculation -- I don't have a deep understanding of these processes so I might be wrong, but thinking about the issue had led me to those conclusions.)

I expect we need a way to mark the pre-existing cachetags prior to the full purge, so that entries which are unchanged following the purge can be recognised and removed.

Comment #27

eduardo morales alberti

Spanish

Spain, 🇪🇺

commented 20 September 2023 at 13:55

Some questions here relate to volatile custom entities. All entities' cache tags are invalidated on post delete, which makes sense if the entities are used in cached places, as nodes are. But if those cache tags are not used anywhere, it is better to override the postDelete method to avoid adding new entries to the database.

The postDelete method in EntityBase.php:

  public static function postDelete(EntityStorageInterface $storage, array $entities) {
    static::invalidateTagsOnDelete($storage->getEntityType(), $entities);
  }

An override of postDelete that does not call invalidateTagsOnDelete() will not invalidate the tags on delete.

Comment #28

amavi commented 4 October 2023 at 18:33

Hi :)

I have used Drupal with OVH for 5 years. Everything is very nice with Drupal, but complex, as you know... But I have had ONE problem for 5 years, always the same, and for 5 years I have been looking for a solution to it...

My SQL database keeps getting bigger... With PHP 8 I need to clear the cache every day, or my SQL data grows to more than 8 GB... and I keep getting blocked by my hosting... :(

Can someone explain a solution to me, please? I have been trying for 5 years...

I am on Drupal 9.5

Thx. :)

Comment #29

catch

he/him

English

commented 4 October 2023 at 19:22

Replying to #26: Cache tags can pretty reliably be cleared after bins are emptied. If a new cache item is created and not invalidated, its cache tag checksum would be 0, which will match the cache tag blank slate.

If it's created and then immediately invalidated, its cache tag checksum will be 1 or more, which will not match 0, and then it would be invalidated when next retrieved.

The only case that is not covered is if it's invalidated, has a checksum of e.g. 1, the cache tags are wiped, it is not requested, then they're invalidated in such a way that the checksum matches again and it's only requested again then. This is such an extreme edge case/race condition that it can probably be ignored.

A workaround would be a cache tag last garbage collection timestamp to compare against and write that alongside the cache items. Then anything created before it would also be wiped but that adds an extra state or k/v request on every request.
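
One possible reading of that timestamp workaround, as a standalone PHP sketch: each cache item records the garbage collection marker that was current when it was written, and an item is only accepted if both the marker and the checksum still match. The item array layout, the integer markers, and item_is_valid() are all hypothetical and only illustrate the idea.

```php
<?php

// Hypothetical validity check: the item must have been written since the last
// tag garbage collection AND its stored checksum must equal the current one.
function item_is_valid(array $item, array $counts, int $lastGc): bool {
  if ($item['gc'] !== $lastGc) {
    return false;
  }
  $sum = 0;
  foreach ($item['tags'] as $tag) {
    $sum += $counts[$tag] ?? 0;
  }
  return $item['checksum'] === $sum;
}

$lastGc = 1000;
$counts = ['node:1' => 1];
$item = ['tags' => ['node:1'], 'checksum' => 1, 'gc' => $lastGc];
$valid_before_gc = item_is_valid($item, $counts, $lastGc);

// Garbage collection: wipe the tag counts and record a new marker.
$counts = [];
$lastGc = 2000;

// The tag is invalidated again, so the bare checksum coincidentally matches.
$counts['node:1'] = 1;
$checksum_alone_matches = ($item['checksum'] === ($counts['node:1'] ?? 0));
$valid_after_gc = item_is_valid($item, $counts, $lastGc);
```

The cost mentioned above shows up in the sketch as the extra $lastGc value that has to be available on every lookup.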

Comment #30

jweowu commented 5 October 2023 at 02:55

> This is such an extreme edge case/race condition that it can probably be ignored.

I can only assume that wasn't the conclusion when this was implemented, though -- it doesn't seem plausible that the question of clearing cachetags along with other caches never came up in discussion at the time.

> A workaround would be a cache tag last garbage collection timestamp to compare against

Yeah, I was also pondering that approach in #12 and #13. I think it's a pretty reasonable idea.

An alternative would be to have a pre-purge process which copied the cachetags table to a temporary table, and then post-purge deleted all cachetags matching an entry in the temp table. That has its own noteworthy costs, but they're isolated to the time of the purge. If the table was really gargantuan, though, it might not be great (and at the point in time when a fix is deployed, some sites are inevitably going to have tables fitting that description). Offhand I think I'd lean towards the timestamp column.

Comment #31

jweowu commented 5 October 2023 at 03:08

Amavi: Is that on account of the cachetags table specifically? If a normal cache rebuild fixes things for you -- if temporarily -- then it's definitely not about cachetags (as the entire reason for the present issue is that cache rebuilds do not purge the cachetags table).

Your database will have many different cache tables, and I suspect your problem is something different to this issue. You should start by confirming which specific cache table(s) are getting so large, and then you can look for or post an issue related to that.

Comment #32

catch

he/him

English

commented 5 October 2023 at 06:56

#636454: Cache tag support is the original issue. I worked on it at the time, haven't reread it for ages, but it would not surprise me if purging just didn't get discussed or got deferred to a follow-up that never happened. It was more or less the first API addition in Drupal 8 and years before a release so that particular problem was very abstract for a very long time - no real sites used it for ages.

Comment #33

jweowu commented 5 October 2023 at 10:28

A more-palatable variant of the temporary table suggestion has occurred to me. I don't know how practical it is, but in principle I think this avoids the down sides of the other approaches mentioned.

In essence, while the cache rebuild is taking place, new cache invalidations get written to a temporary table. Then, after the cache rebuild, the cachetags table is truncated, and the rows of the temporary table are inserted.

In order for that to work, cache lookups need to know about the temporary table, something like:

if (a cache rebuild is in progress) {
  check the temporary table for cache validity
  if (the temporary table contained a row for that cache id) {
    return result;
  }
  else {
    // nothing about this ID in the temporary table
    check the regular cachetags table
    return result
  }
}
else {
  // not currently rebuilding the cache
  check the regular cachetags table
  return result
}

And similarly, when invalidating a cache entry the new invalidations value written to the temporary table would be an increment of the value in the temporary table if a row existed there already, and otherwise an increment of the row from the original cachetags table.

The lookups during cache rebuilds could be on a join of the two tables, rather than two separate look-ups.

No timestamp column needed, and no wholesale copying of cachetags; and outside of cache rebuilds the behaviour can be much the way it is at present.
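
As a rough standalone illustration of the idea described in this comment, the PHP sketch below uses two arrays in place of the cachetags table and the temporary table. OverlayTagCounts and its method names are hypothetical, and the sketch ignores concurrency and simultaneous rebuilds entirely.

```php
<?php

// Hypothetical model: $base plays the cachetags table, $overlay the temporary
// table that receives invalidations while a cache rebuild is in progress.
final class OverlayTagCounts {
  private array $base = [];
  private array $overlay = [];
  private bool $rebuilding = false;

  public function startRebuild(): void {
    $this->overlay = [];
    $this->rebuilding = true;
  }

  // During a rebuild the overlay wins; otherwise fall back to the base table.
  public function currentCount(string $tag): int {
    if ($this->rebuilding && array_key_exists($tag, $this->overlay)) {
      return $this->overlay[$tag];
    }
    return $this->base[$tag] ?? 0;
  }

  // The new value is always an increment of the currently visible value.
  public function invalidate(string $tag): void {
    $next = $this->currentCount($tag) + 1;
    if ($this->rebuilding) {
      $this->overlay[$tag] = $next;
    }
    else {
      $this->base[$tag] = $next;
    }
  }

  // Truncate the base table, then insert the rows written during the rebuild.
  public function finishRebuild(): void {
    $this->base = $this->overlay;
    $this->overlay = [];
    $this->rebuilding = false;
  }

  public function baseRowCount(): int {
    return count($this->base);
  }
}

$counts = new OverlayTagCounts();
$counts->invalidate('node:1');
$counts->invalidate('node:1');
$counts->invalidate('node:2');

$counts->startRebuild();
$counts->invalidate('node:1');
$during_touched = $counts->currentCount('node:1');
$during_untouched = $counts->currentCount('node:2');
$counts->finishRebuild();
```

After the rebuild only the tag invalidated during the rebuild keeps a row, with a count that continues from its old value, so the table shrinks without any counter moving backwards for entries written during the rebuild.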

Is that practical? I won't be surprised if I'm missing something, and I haven't thought through the ramifications of multiple simultaneous cache rebuilds (if that's currently permitted to happen), but it seemed worth suggesting.

Comment #34

catch

he/him

English

commented 5 October 2023 at 10:43

I think we should just add cache tag purging as part of a drupal_flush_all_caches() step, immediately after emptying the bins, and document + open a follow-up for the potential race condition where a cache item is both set and invalidated but then not requested until after a further invalidation. The chances of that happening are minuscule, but the potential issues deriving from storing timestamps or creating temporary tables will affect everyone.

Also I think it's worth looking at starting and ending a database transaction in drupal_flush_all_caches() in case that's viable.

Comment #35

raduciobanu commented 28 November 2023 at 01:27

Got the same issue but on a larger scale on a site with lots of webform submissions, around 3.5M+ and for each submission there's an entry in the cachetags table, currently sitting at a bit more than 4M rows.

Comment #36

mariaioann commented 24 January 2024 at 07:20

I have the same problem with Message entities. I am creating messages as notifications for a large number of users, but they are purged when they are 30 days old. Currently, we have 400K messages, but the cachetags table has 23M entries for messages, as it includes all deleted messages as well.
Is it safe to delete the relevant cache tags in a message postDelete hook?
And is it safe to batch delete all cache tags of non-existing messages as a one-off cleaning action?
What I have not understood is why don't we delete an entity's related cache tag when the entity gets deleted in general?

Comment #37

catch

he/him

English

commented 24 January 2024 at 09:44

> What I have not understood is why don't we delete an entity's cache tag when the entity gets deleted in general?

Cache tag storage and implementation is swappable, so there's no inherent concept of a cache tag existing as a row in a database with a counter. It would for example be possible (but very slow) to store the cache tags with the cache items, and query all cache items when a cache tag is invalidated, and have no dedicated cache tag storage at all. Because of this, there's no concept of the tag as a thing existing in itself, the checksum implementation that core uses introduces the counter system on top of string tags, but consumers of the cache API, like the entity system, don't need to know about it - they just get/set/delete cache items and invalidate tags.

If you have a high traffic site with a lot of users/content, you should strongly consider using https://www.drupal.org/project/redis, which won't run into this problem because it evicts items when it runs out of memory. This is a good idea for lots of reasons other than a large cache tags table.

Comment #38

wim leers

Ghent 🇧🇪🇪🇺

commented 24 January 2024 at 14:56

@MariaIoann

Nothing prevents you from saying "my entity does not need cache tags". See \Drupal\Core\Entity\EntityInterface::getCacheTagsToInvalidate() and \Drupal\Core\Cache\CacheableDependencyInterface::getCacheTags(). Message entities the way that you describe them appear very ephemeral, so it makes sense to me that they would not use/need cache tags.
See \Drupal\Core\Datetime\Entity\DateFormat::getCacheTagsToInvalidate() for another example.
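
A minimal standalone PHP sketch of that opt-out follows. It does not use Drupal's classes: SketchEntity, EphemeralMessage, and tags_invalidated_on_delete() are hypothetical stand-ins, and only the method name getCacheTagsToInvalidate() is taken from the comment above. Real entity deletion may invalidate further tags, such as list tags; this models the per-entity tag only.

```php
<?php

// Hypothetical stand-in for an entity base class (not Drupal's EntityBase).
class SketchEntity {
  protected string $type;
  protected int $id;

  public function __construct(string $type, int $id) {
    $this->type = $type;
    $this->id = $id;
  }

  // Default: one tag per entity, in the "type:id" form seen in this thread.
  public function getCacheTagsToInvalidate(): array {
    return [$this->type . ':' . $this->id];
  }
}

// An ephemeral entity type that declares it needs no per-entity cache tag.
class EphemeralMessage extends SketchEntity {
  public function getCacheTagsToInvalidate(): array {
    return [];
  }
}

// Collects the tags that deleting the given entities would invalidate.
function tags_invalidated_on_delete(array $entities): array {
  $tags = [];
  foreach ($entities as $entity) {
    $tags = array_merge($tags, $entity->getCacheTagsToInvalidate());
  }
  return array_values(array_unique($tags));
}

$node_tags = tags_invalidated_on_delete([
  new SketchEntity('node', 1),
  new SketchEntity('node', 2),
]);
$message_tags = tags_invalidated_on_delete([
  new EphemeralMessage('message', 1),
  new EphemeralMessage('message', 2),
]);
```

With no tags to invalidate, deleting the ephemeral entities in this model would add no per-entity rows to a checksum table.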

Comment #39

catch

he/him

English

commented 16 April 2025 at 15:39

Probably the simplest solution here would be:

1. Add a CacheTagsChecksumPurgableInterface with a ::purge() method and implement it in the database backend.

2. Call this method on all cache_tags_invalidator services that implement the interface, as late as possible in drupal_flush_all_caches(), immediately after cache bins are emptied.

3. Open a follow-up for the potential race condition described in #29.

Comment #40

catch

he/him

English

commented 18 April 2025 at 00:10

Status:

Active

» Needs review

Implemented #39.

Comment #41

18 April 2025 at 00:10

catch opened merge request !11875

Comment #42

catch

he/him

English

commented 18 April 2025 at 00:15

When implementing this there's actually not really a race condition as described in #29.

If we implemented cache tag purging outside of drupal_flush_all_caches() then it would be possible for cache entries to exist with an 'incidentally matching' cache tag checksum. However, when we purge immediately after emptying the cache bins, there should be no cache entries created before the tags are purged, everything gets reset at the same time. There would have to be entries written literally during the purging itself for there to be a problem. This is about as likely as an entry being written in one cache bin that's just been emptied based on cached information in a table that's just about to be emptied - i.e. no worse than it is now.

So given that, what's in the MR might be enough.

Comment #43

godotislate

he/him

commented 18 April 2025 at 01:51

Status:

Needs review

» Needs work

A few comments on the MR.

Comment #44

amavi commented 21 April 2025 at 14:47

After 5 years of this problem, I have done this:

-write some Rules in my SQL for purge all CASH each hours.

Comment #45

godotislate

he/him

commented 29 April 2025 at 16:07

Status:

Needs work

» Needs review

Pushed a commit to address the issue with the test by moving the assertions to the DatabaseBackendTest subclass.

I think this showed that purge() needs to call reset(), so added that.

Comment #46

berdir

German

Switzerland

commented 29 April 2025 at 20:54

Status:

Needs review

» Needs work

Posted a review.

> -write some Rules in my SQL for purge all CASH each hours.

I'd recommend not purging all your CASH every few hours, that sounds like a very costly thing to do. (Sorry, could not resist).

More serious: If you have a large enough site that you are struggling with the size of the database cache backend and have to flush it so frequently, you should really consider using a backend with a fixed size like redis/memcache. The database backend is a basic implementation for small to medium sites. Also, this is about cache tags only, which should grow far, far less than actual cache backends.

Comment #47

godotislate

he/him

commented 29 April 2025 at 22:57

Getting test failures now. Will look again later.

Comment #48

nicxvan commented 30 April 2025 at 03:02

Title:

Cache tags grows endlessly

» Cache tags grow endlessly

Comment #49

godotislate

he/him

commented 30 April 2025 at 03:32

OK, one of the test failures is Drupal\FunctionalTests\Bootstrap\UncaughtExceptionTest::testLostDatabaseConnection, which is apparently a new intermittent failure that seems to happen a lot, even after re-run.

The other test failure was one I thought I had fixed by calling reset() in purge(), but apparently because I moved the test to a different class and its own method, the cache gets are succeeding after purge because the checksums are 0 which match the 0 returned from the table because it's empty.

Comment #50

acbramley commented 30 April 2025 at 04:20

Random failure being tracked here #3521851: [random test failure] Drupal\FunctionalTests\Bootstrap\UncaughtExceptionTest::testLostDatabaseConnection

Comment #51

catch

he/him

English

commented 30 April 2025 at 06:22

> but apparently because I moved the test to a different class and its own method, the cache gets are succeeding after purge because the checksums are 0 which match the 0 returned from the table because it's empty.

Ah yeah this was one of the reasons I tried to re-use the existing test. But if we invalidate tags before the purge it should be OK?

Comment #52

berdir

German

Switzerland

commented 30 April 2025 at 06:52

I'm not sure if we even need to test cache bins at all there. Possibly for the reset() edge case, but yes, then I'd add an invalidate first.

Comment #53

godotislate

he/him

commented 30 April 2025 at 14:29

Status:

Needs work

» Needs review

Tests are fixed.

Comment #54

berdir

German

Switzerland

commented 30 April 2025 at 14:52

What if we interact directly with the cache tag checksum service only and assert what we want to know with that? Instead of indirectly going through cache entries? That feels convoluted and is IMHO tested elsewhere.

Something like this:

// Invalidate the tag through the general invalidator service.

// Assert the current value.
$this->assertEquals(1, $checksum_invalidator->getCurrentChecksum([$tag]));

// Purge through the general invalidator service.

// Assert the current value.
$this->assertEquals(0, $checksum_invalidator->getCurrentChecksum([$tag]));

We can keep the database query.

And we can still do the purge through the cache_tags.invalidator service to make sure the loop there is correct.
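As a rough illustration of that flow outside of Drupal, here is a self-contained sketch. All class and interface names (`TagInvalidator`, `TagPurger`, `CountingChecksum`, `LoggingInvalidator`, `CollectingInvalidator`) are hypothetical simplifications that only mirror the roles discussed in this issue; the real services do considerably more.

```php
<?php

declare(strict_types=1);

// Hypothetical, simplified interfaces; the names only mirror the discussion.
interface TagInvalidator {
  public function invalidateTags(array $tags): void;
}

interface TagPurger {
  public function purge(): void;
}

/** Counts invalidations per tag and supports purging. */
final class CountingChecksum implements TagInvalidator, TagPurger {
  private array $counts = [];

  public function invalidateTags(array $tags): void {
    foreach ($tags as $tag) {
      $this->counts[$tag] = ($this->counts[$tag] ?? 0) + 1;
    }
  }

  public function getCurrentChecksum(array $tags): int {
    return array_sum(array_map(fn (string $tag): int => $this->counts[$tag] ?? 0, $tags));
  }

  public function purge(): void {
    $this->counts = [];
  }
}

/** An invalidator that does not support purging. */
final class LoggingInvalidator implements TagInvalidator {
  public array $seen = [];

  public function invalidateTags(array $tags): void {
    $this->seen = array_merge($this->seen, $tags);
  }
}

/** Forwards calls to the collected invalidators. */
final class CollectingInvalidator implements TagInvalidator, TagPurger {
  /** @param TagInvalidator[] $invalidators */
  public function __construct(private array $invalidators) {}

  public function invalidateTags(array $tags): void {
    foreach ($this->invalidators as $invalidator) {
      $invalidator->invalidateTags($tags);
    }
  }

  public function purge(): void {
    // Only purge the invalidators that opt in via the purge interface.
    foreach ($this->invalidators as $invalidator) {
      if ($invalidator instanceof TagPurger) {
        $invalidator->purge();
      }
    }
  }
}

$checksum = new CountingChecksum();
$logger = new LoggingInvalidator();
$general = new CollectingInvalidator([$checksum, $logger]);

// Invalidate through the general invalidator, then read the checksum.
$general->invalidateTags(['node:1']);
$before = $checksum->getCurrentChecksum(['node:1']);

// Purge through the general invalidator, then read the checksum again.
$general->purge();
$after = $checksum->getCurrentChecksum(['node:1']);

var_dump($before, $after);
```

Asserting on `$before` and `$after` covers both the purge itself and the dispatch loop, without involving any cache bin.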

Comment #55

godotislate

he/him

commented 30 April 2025 at 15:37

I think all MR comments addressed.

Comment #56

berdir

German

Switzerland

commented 1 May 2025 at 11:18

Status:	Needs review	» Needs work

Left a suggestion on how to handle that in a way that I think is easier to understand.

Comment #57

godotislate

he/him

commented 1 May 2025 at 13:11

Status:	Needs work	» Needs review

Made the change per the suggestion. Also changed back to invalidating multiple tags. I'm not sure it makes any difference, but it was trivial to do just in case.

Comment #58

berdir

German

Switzerland

commented 1 May 2025 at 13:35

Status:	Needs review	» Reviewed & tested by the community
Issue tags:		+Needs change record

Looks good to me. This needs a change record. I'm not aware of a contrib implementation that would actually need to implement this, but who knows.

Comment #59

godotislate

he/him

commented 5 May 2025 at 17:10

Status:	Reviewed & tested by the community	» Needs review
Issue tags:	-Needs change record

Added change record.

Also, after a conversation with @catch on Slack, moved the `cachetags` table truncation before clearing the cache bins, because having entries in the cachetags table doesn't hurt anything, but there could be an issue if the cache bin tables have entries without corresponding checksums in cachetags. Pushing back to NR for that.

Comment #60

catch

he/him

English

commented 5 May 2025 at 21:36

Examples for #59. Let's assume we have one cache item and one cache invalidation for node:1, and our one cache item is tagged with node:1.

If we clear cache bins before tags, then the following can happen:

1. Cache bins are cleared.

2. Cache miss happens, cache item is written with node:1 having 1 invalidation - because cache tag invalidations haven't been purged yet.

3. Cache invalidations are purged.

4. If node:1 is invalidated before our cache item is requested, then we're back to 1 invalidation again, and the item could be considered valid even though node:1 changed after it was written.

But if we purge cache tags before bins:

1. Cache tags are purged.

2. Any cache-tagged cache items written with a non-zero invalidation count immediately become invalid, because tag invalidations were reset.

3. Any new cache items get written as if there are no invalidations, or with one invalidation if one happens during this (short) window.

4. But then all the cache bins are emptied anyway, so any new items, regardless of how many cache tag invalidations there are, will be valid.

Given that, purging tags first feels like it should be 100% correct as soon as the full cache clear is complete.
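The two orderings can be played through with a toy model. This is a sketch under stated assumptions, not core code: `TinyCache` is a hypothetical class with a single bin and one tag per item, and an item is treated as valid when the invalidation count stored at write time equals the current count for its tag.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical model: one cache bin plus per-tag invalidation counts.
 */
final class TinyCache {
  private array $counts = [];
  private array $items = [];

  public function invalidateTag(string $tag): void {
    $this->counts[$tag] = ($this->counts[$tag] ?? 0) + 1;
  }

  public function purgeTags(): void {
    $this->counts = [];
  }

  public function clearBin(): void {
    $this->items = [];
  }

  /** Stores the item together with the tag's current invalidation count. */
  public function set(string $cid, string $data, string $tag): void {
    $this->items[$cid] = [
      'data' => $data,
      'tag' => $tag,
      'checksum' => $this->counts[$tag] ?? 0,
    ];
  }

  /** Returns the data, or NULL on a miss or a checksum mismatch. */
  public function get(string $cid): ?string {
    $item = $this->items[$cid] ?? NULL;
    if ($item === NULL || $item['checksum'] !== ($this->counts[$item['tag']] ?? 0)) {
      return NULL;
    }
    return $item['data'];
  }
}

// Order A: bins first, then tags.
$a = new TinyCache();
$a->invalidateTag('node:1');             // Starting state: one invalidation.
$a->clearBin();                          // 1. Cache bins are cleared.
$a->set('page', 'old title', 'node:1');  // 2. Item written with checksum 1.
$a->purgeTags();                         // 3. Invalidations are purged.
$a->invalidateTag('node:1');             // 4. node:1 changes; count is 1 again.
$staleHit = $a->get('page');

// Order B: tags first, then bins.
$b = new TinyCache();
$b->invalidateTag('node:1');             // Starting state: one invalidation.
$b->purgeTags();                         // 1. Cache tags are purged.
$b->set('page', 'old title', 'node:1');  // 3. Item written during the window.
$b->clearBin();                          // 4. Cache bins are emptied.
$b->invalidateTag('node:1');             // node:1 changes afterwards.
$afterClear = $b->get('page');

var_dump($staleHit, $afterClear);
```

In order A the stale item is still returned, because its stored count of 1 matches the count after the purge and the new invalidation. In order B nothing written during the window survives the bin clear, so the lookup is a miss.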

Comment #61

berdir

German

Switzerland

commented 6 May 2025 at 11:21

Status:	Needs review	» Reviewed & tested by the community

The order change makes sense, back to RTBC.

Comment #62

andypost

he/him

Russian

commented 6 May 2025 at 12:48

RTBC++ Nice to see it solved via new interface!

Comment #63

alexpott

he/they

English

🇪🇺🌍

commented 13 May 2025 at 12:48

Status:	Reviewed & tested by the community	» Needs work
Issue tags:		+Needs issue summary update

The issue summary is out-of-date with the final state of the patch. It could do with being updated to match. The proposed resolution and other sections are all out-of-date.

The MR looks good. Once the issue summary has been updated, this can be set back to RTBC and I will prioritise it.

Comment #64

godotislate

he/him

commented 13 May 2025 at 13:08

Issue summary:	View changes
Status:	Needs work	» Reviewed & tested by the community
Issue tags:	-Needs issue summary update

IS updated.

Comment #65

godotislate

he/him

commented 13 May 2025 at 13:08

Issue summary:	View changes

Comment #66

13 May 2025 at 14:44

alexpott committed 3df1b8d2 on 11.2.x

Issue #3097393 by godotislate, catch, berdir, jweowu, ndobromirov, wim...

Comment #67

13 May 2025 at 14:44

alexpott committed 2999d457 on 11.x

Issue #3097393 by godotislate, catch, berdir, jweowu, ndobromirov, wim...

Comment #68

alexpott

he/they

English

🇪🇺🌍

commented 13 May 2025 at 14:45

Status:	Reviewed & tested by the community	» Fixed

Committed and pushed 2999d4574c4 to 11.x and 3df1b8d2897 to 11.2.x. Thanks!

Comment #69

alexpott

he/they

English

🇪🇺🌍

commented 14 May 2025 at 09:43

Version:	11.x-dev	» 11.2.x-dev

Comment #70

28 May 2025 at 09:44

Status:	Fixed	» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.