Problem/Motivation

All our render cache entries (pages, entities, blocks, views, and so on) are cached forever by default.

That's great, except when you have a lot of data plus a lot of variations and a cache backend that doesn't support an eviction strategy like LRU, such as the default database backend. Then your cache table grows and grows, and it's suddenly not so great anymore :)

You could argue that sites with a lot of data should use Redis/Memcache rather than the database, but medium-sized sites without many users can eventually run into problems too. It also matters when copying the database around, since few people remember to exclude the cache tables.

Proposed resolution

Either the render cache service or the database backend (per-bin) should support an upper limit for the max-age, e.g. a day or a week. It would probably be great to vary that slightly for every write, e.g. +/- 5%, so that not all cache entries expire at the same time. A sketch of the idea follows.
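A minimal sketch of that idea (not the eventual patch; the function name, the jitter fraction, and the use of time() instead of REQUEST_TIME are illustrative):

function clamp_expire($expire, $max_lifetime, $jitter = 0.05) {
  // Vary the lifetime by +/- $jitter so that not all entries written at the
  // same moment expire at the same time.
  $delta = (int) ($max_lifetime * $jitter);
  $cap = time() + mt_rand($max_lifetime - $delta, $max_lifetime + $delta);
  // -1 is CacheBackendInterface::CACHE_PERMANENT.
  return ($expire == -1 || $expire > $cap) ? $cap : $expire;
}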

Remaining tasks

User interface changes

API changes

Data model changes

Files:
- #170: 2526150-bounded_database_cache_bins__size-170.patch (3.36 KB, Wim Leers)
- #170: 2526150-bounded_database_cache_bins__count-170.patch (2.45 KB, Wim Leers)
- #170: 2526150-slushi_cache-170.patch (4.98 KB, Wim Leers)
- #128: interdiff.txt (1.86 KB, catch)
- #128: 2526150.patch (13.08 KB, catch)
- #125: 2526150.patch (13.08 KB, catch)
- #123: interdiff.txt (3.85 KB, catch)
- #123: 2526150.patch (3.85 KB, catch)
- #119: 2526150-119.patch (12.75 KB, alexpott)
- #119: 117-119-interdiff.txt (5.34 KB, alexpott)
- #117: interdiff.txt (2.06 KB, catch)
- #117: 2526150.patch (12.95 KB, catch)
- #116: interdiff.txt (1.98 KB, catch)
- #116: 2526150.patch (12.87 KB, catch)
- #115: 2526150.patch (10.89 KB, catch)
- #115: interdiff.txt (8.26 KB, catch)
- #112: interdiff.txt (2.13 KB, catch)
- #112: 2526150.patch (10.44 KB, catch)
- #108: 2526150.patch (10.18 KB, catch)
- #108: interdiff.txt (617 bytes, catch)
- #107: interdiff.txt (6.36 KB, catch)
- #107: 2526150.patch (10.14 KB, catch)
- #104: 2526150.patch (10.13 KB, catch)
- #104: interdiff.txt (5.6 KB, catch)
- #102: 2526150.patch (8.28 KB, catch)
- #100: 2526150.patch (6.56 KB, catch)
- #89: 2526150.patch (6.54 KB, catch)
- #87: 2526150.patch (6.54 KB, catch)
- #82: 2526150-81.patch (8.26 KB, catch)
- #79: 2526150.patch (1.5 KB, catch)
- #64: interdiff.txt (1.79 KB, Denchev)
- #64: database_cache_backend-2526150-63.patch (8.77 KB, Denchev; PASSED: 102,457 passes)
- #36: interdiff-33-36.txt (1.2 KB, hussainweb)
- #25: core-render_cache_max_age-2526150-25.patch (5.96 KB, Denchev; PASSED: 100,675 passes)

Comments

dawehner’s picture

I think making it per bin makes sense, as there might be other bins where this is useful.
Also, having this as part of the cache configuration helps you get an overview inside settings.php when needed.

Fabianx’s picture

I think if we do that, we should also have some pseudo-LRU:

if ($refresh_cache_time && $age_remaining < $refresh_cache_time) {
  // Set the cache entry again: treat it like a cache miss, but only
  // re-cache the existing data, do not rebuild it.
}

Proposed parameters are per bin:

- max_age: 86400
- max_age_jitter_percent: 5
- refresh_cache_time: 3600 // Do not let frequently accessed data expire, re-cache it instead.

< berdir> Fabianx-screen: that would be interesting, but then we need two max-ages.. since some items might have a "hard" max-age
< Fabianx-screen> berdir: Yes, but the cache backend can distinguish with a simple flag:
< Fabianx-screen> $data->is_default_max_age => TRUE / FALSE,

Berdir’s picture

I'm not a huge fan of introducing that complexity here. It would require storage changes in all the cache tables for one thing...

Wim Leers’s picture

Either the render cache service or the database backend (per-bin) should support an upper limit for the max-age. E.g. a day or a week.

I don't think this actually fixes the problem.

We only check whether a cache item is expired when retrieving that cache item. This means that any render cache item that is accessed infrequently (or never again) also never gets removed.

The only way I think we can solve this is by evicting expired cache items on cron, i.e. implementing LRU anyway.

Wim Leers’s picture

So Berdir pointed out that we already do what I just described, in \Drupal\Core\Cache\DatabaseBackend::garbageCollection(), called by system_cron(). Ok :)

Fabianx’s picture

#5: system_cron() runs:

  foreach (Cache::getBins() as $cache_backend) {
    $cache_backend->garbageCollection();
  }

Is that not sufficient?

Wim Leers’s picture

Given that, +1 to the proposed solution in principle:

Either the render cache service or the database backend (per-bin) should support an upper limit for the max-age. E.g. a day or a week. It would probably be great to vary that slightly for every write, e.g. +/- 5% so not all cache entries expire at the same time.

… but I still have some concerns:

  1. We want to prefer keeping render cache items of "bigger page components" cached, choosing to evict "smaller page components" from the cache instead, because the smaller ones, when combined, result in the bigger ones being re-created in the cache anyway.
  2. In the future we'll probably want to send the Age header. We're able to base that on the render cache items' created timestamps, taking the maximum value of those. This change would mean the Age could never exceed a week, for no good reason other than sites with huge amounts of content getting into trouble.
  3. I wonder if we can choose not to impose a maximum max-age, but instead specify a maximum number of cache items for a certain cache bin, or even a maximum bin size (see http://stackoverflow.com/questions/9620198/how-to-get-the-sizes-of-the-t...). Given those restrictions, we'd then continue to do the same things as we do today in DatabaseBackend::garbageCollection(), but on top of that also delete the oldest cache items (by looking at the created timestamp), say, the oldest 1% (see the sketch below).

I don't think we can implement 1. I do think 2 is a useful thing to take into account, but it's not a solution. But 3 seems like a viable alternative solution?
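For reference on point 3, a sketch of how one could inspect cache bin sizes on MySQL (the query is illustrative; it assumes the default cache_* table naming and read access to information_schema):

$query = "SELECT TABLE_NAME AS name,"
  . " ROUND((DATA_LENGTH + INDEX_LENGTH) / 1048576, 1) AS size_mb,"
  . " TABLE_ROWS AS approx_rows"
  . " FROM information_schema.TABLES"
  . " WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME LIKE 'cache%'"
  . " ORDER BY DATA_LENGTH + INDEX_LENGTH DESC";
foreach (\Drupal::database()->query($query) as $row) {
  // E.g. "cache_render: 1300.0 MB (~2500000 rows)".
  printf("%s: %s MB (~%d rows)\n", $row->name, $row->size_mb, $row->approx_rows);
}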


I do think the already proposed resolution works, but then my second point is a concern. I wonder what you guys think of point 3 as an alternative solution? Just trying to make sure we end up with the best possible solution :)

Finally: what do other CMSes/web frameworks do for this?

dawehner’s picture

At least TYPO3 just defines a default lifetime of an hour; see http://wiki.typo3.org/Caching_framework

Fabianx’s picture

#8:

I don't think it belongs in render cache at all.

It is purely a database cache backend implementation detail and should have no effect on the extra metadata that we store.

1. I think we should at least move cache.page to its own cache_page bin again.

2. For the age header we should not depend on internal data of the cache backend (that is an implementation detail and not part of the interface), but rather provide our own in #cache. (similar with downstream-ttl and ensuring proper max-age is working correctly).

3. I still think that automatically refreshing still-active cache items is the sanest way to get pseudo-LRU in the database backend, configurable via settings for only that backend (so other backends that are likely LRU already can support it, but don't need to).

That would also take care of #2, as the about-to-expire cache item would just be refreshed (and retain its age). It is also similar in principle to how If-Modified-Since works, where the backend returns a 304 when the data still matches the cache and the reverse proxy just serves its cached item (in this case, the cache tags take care of that).

And I don't think it is complex to implement.

It is one Settings call in __construct(), five lines of code in get(), and five lines of code in set(), plus test coverage.

---

Thinking about 1. again - keep smaller fragments for longer, keep larger fragments shorter.

That kind of policy is very opposite to LRU.

I am not sure; it is something we should think about, but I think it is out of scope here.

Wim Leers’s picture

Thinking about 1. again - keep smaller fragments for longer, keep larger fragments shorter.

I said the opposite.

but it is out of scope here I think.

It is not, because it's related.

Everything I wrote in #8 was about this:

Just trying to make sure we end up with the best possible solution :)

i.e. making sure we take all things into account.

Fabianx’s picture

Okay, I actually would have thought the smaller fragments change less as they usually have less variation :p.

#11: The problem is that 3) is not LRU and makes refreshing impossible: the oldest active cache items are very likely the ones created first on a cold cache, but those are very likely the front page and other highly frequently used things that an LRU policy would not expire.

So we would do the exact opposite, while the goal is to expire things that have not been accessed since day X. (LRU)

But LRU is very costly, because you need a write for each cache_get().

The nice thing about the refresh-pseudo-LRU is that you only need a refresh when you are near the set expiration time.

e.g. for my own sites based on DB I would probably use:

expire=1 week (+/- 10% jitter)
working set=1 day

Everything that was accessed within 1 day would be kept permanently, which is the working set of the site.

So old GA-crawled URLs would expire, but my front page would not.

Edit:

This would also in general favor larger fragments (as they would get accessed more often) and expire the smaller fragments automatically.

e.g. if my front page does not change and all that is accessed is my front page, then all the block and entity caches would expire after one week.

If my front page then changes, obviously there is a larger cache miss, but that is the trade-off to make and it would happen with any other LRU system. (Interesting observation: LRU has problems with hierarchical cache items; I need to do some research on whether there is anything in the literature on that.)

However if there are uncached pages then obviously the most active blocks would still be refreshed (unless they depend on url / route obviously).

So I think the trade-off in general is okay.

---

Edit 2:

Putting a maximum on the number of items only works when there is pressure to evict, which again makes this more complex.

Given our pseudo-LRU it would work like this: it would check the configured number of items and only expire expired items until the threshold is reached again.

In this case, however, I would not restrict the working set, but rather only the number of expired items to keep.

So it is orthogonal to the approach and could help keep some expired items around; however, I don't think it is necessary.

Denchev’s picture

Status: Active » Needs review
FileSize
978 bytes
FAILED: [[SimpleTest]]: [PHP 5.5 MySQL] 32,317 pass(es), 780 fail(s), and 16 exception(s). View

In this approach a user can define in settings.php a max age for the appropriate bin, as the example below shows.
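For illustration (the key structure is refined in later patches; this matches the example quoted from default.settings.php further down in this thread):

$settings['cache_database']['render'] = 3600;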

Status: Needs review » Needs work

The last submitted patch, 13: core-render_cache_max_age-2526150-13.patch, failed testing.

Denchev’s picture

Status: Needs work » Needs review
FileSize
1.15 KB
PASSED: [[SimpleTest]]: [PHP 5.5 MySQL] 98,655 pass(es). View
1.19 KB

Oops, forgot to change some stuff from earlier. Here is the new one.

Berdir’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Ok, we also need some tests, for example in DatabaseBackendUnitTest. Set the setting, then write something and check the expiration, possibly directly in the database.

Thinking about it, we should probably not call out to Settings directly in this class. Instead, let's add an optional array $configuration = [] to the constructor, store it in $this->configuration, and pass in only the settings for the specific cache bin from DatabaseBackendFactory.
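A rough sketch of that suggestion (names follow the comment above; this is not the final patch):

// In DatabaseBackend: accept an optional per-bin configuration array.
public function __construct(Connection $connection, CacheTagsChecksumInterface $checksum_provider, $bin, array $configuration = []) {
  // ... existing assignments ...
  $this->configuration = $configuration;
}

// In DatabaseBackendFactory::get(): pass only this bin's settings.
public function get($bin) {
  $settings = Settings::get('cache_database', []);
  $configuration = isset($settings[$bin]) ? $settings[$bin] : [];
  return new DatabaseBackend($this->connection, $this->checksumProvider, $bin, $configuration);
}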

Denchev’s picture

Status: Needs work » Needs review
FileSize
5.17 KB
FAILED: [[SimpleTest]]: [PHP 5.5 MySQL] Failed to run tests: PHP Fatal error encountered during run_tests.sh. See review log for details.. View
4.79 KB

Status: Needs review » Needs work

The last submitted patch, 17: core-render_cache_max_age-2526150-17.patch, failed testing.

Berdir’s picture

  1. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -10,6 +10,7 @@
     use Drupal\Core\Database\Connection;
     use Drupal\Core\Database\SchemaObjectExistsException;
    +use Drupal\Core\Site\Settings;
     
    
    @@ -150,6 +161,11 @@ protected function prepareItem($cache, $allow_invalid) {
    +    // Get defined settings from settings.php if there are any and set the cache.
    +    $settings = (isset($this->configuration)) ? $this->configuration : Settings::get('cache_database', []);
    +    if (isset($settings[$this->bin]) && ($expire == Cache::PERMANENT || $expire > $settings[$this->bin])) {
    +      $expire = $settings[$this->bin];
    +    }
    

    No fallback to Settings here.

    DatabaseBackendFactory needs to load them from Settings and then inject them.

    $configuration is not meant to be keyed by $this->bin. DatabaseBackendFactory just passes the configuration for that bin to it. But we should add a key inside, to allow for additional settings later.

    So: if (isset($this->configuration['max_expire']) ...

  2. +++ b/core/modules/system/src/Tests/Cache/DatabaseBackendUnitTest.php
    @@ -29,8 +29,8 @@ class DatabaseBackendUnitTest extends GenericCacheBackendUnitTestBase {
        */
    -  protected function createCacheBackend($bin) {
    -    return new DatabaseBackend($this->container->get('database'), $this->container->get('cache_tags.invalidator.checksum'), $bin);
    +  protected function createCacheBackend($bin, $configuration) {
    +    return new DatabaseBackend($this->container->get('database'), $this->container->get('cache_tags.invalidator.checksum'), $bin, $configuration);
    

    Don't change the arguments here.

    Just call new DatabaseBackend directly in the new test.

The new test looks like a good start, but test with a few different values. Check with an expiration that's one lower and one higher.

Denchev’s picture

Status: Needs work » Needs review
FileSize
5.31 KB
PASSED: [[SimpleTest]]: [PHP 5.5 MySQL] 98,961 pass(es). View
6.78 KB

1) Moved the loading of settings to the DatabaseBackendFactory constructor; we now pass in the value for the bin if available.
Also changed to

if (isset($this->configuration['max_expire']) ...

2) Removed the changes to this part. Now calling DatabaseBackend directly in the tests, and added two more tests as suggested.

Berdir’s picture

Status: Needs review » Needs work
  1. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -42,6 +42,13 @@ class DatabaseBackend implements CacheBackendInterface {
       /**
    +   * The maximum cache expiration time.
    +   *
    +   * @var array
    +   */
    +  protected $configuration;
    
    @@ -50,14 +57,17 @@ class DatabaseBackend implements CacheBackendInterface {
    +   * @param array $configuration
    +   *   The maximum cache expiration time.
    
    +++ b/core/lib/Drupal/Core/Cache/DatabaseBackendFactory.php
    @@ -26,6 +27,13 @@ class DatabaseBackendFactory implements CacheFactoryInterface {
    +   * The maximum cache expiration time.
    +   *
    +   * @var array
    +   */
    +  protected $configuration;
    

    This should explain a bit better what exactly is in there.

    Something like "Per-bin database backend configuration."

  2. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackendFactory.php
    @@ -36,6 +44,7 @@ class DatabaseBackendFactory implements CacheFactoryInterface {
         $this->checksumProvider = $checksum_provider;
    +    $this->configuration = Settings::get('cache_database', []);
    

    We also need to actually document this somewhere; often that's done in default.settings.php, see the other keys there.

    Also, instead of settings, we could also use container parameters. The difference is that they would be configured in local.yml and passed in to the service using %some_key.

    Cache backends are currently configured in settings; that's why I suggested this. But I'm also fine with a container parameter.

  3. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackendFactory.php
    @@ -48,7 +57,8 @@ function __construct(Connection $connection, CacheTagsChecksumInterface $checksu
    +    $configuration['max_expire'] = isset($this->configuration[$bin]) ? (int)$this->configuration[$bin] : null;
    

    Missing space after (int)

Wim Leers’s picture

Issue tags: +D8 cacheability
Denchev’s picture

Status: Needs work » Needs review
FileSize
5.96 KB
PASSED: [[SimpleTest]]: [PHP 5.5 MySQL] 99,791 pass(es). View
2.11 KB

1. Updated the explanation.
2. Added an explanation in default.settings.php. Not exactly sure what information is needed for this kind of setting, but here is my first try.
Since I added the explanation in default.settings.php and had already added the setting with Settings::get('cache_database', []), I left it like that. If container parameters are better, I can change it...
3. Fixed.

Berdir’s picture

Status: Needs review » Needs work
+++ b/sites/default/default.settings.php
@@ -691,3 +691,16 @@
+ * For example:
+ * @code ¶
+ * settings['cache_database']['render'] = 3600;
+ * @endcode
+ */
+# $settings['cache_database'][''] = 0;

Instead of a code example, just make the commented out code that example.

And we might even want to enable it by default and set it to a week or so.

To provide a data point here: we recently launched a small local news site. They didn't import their existing content, so they started with a practically empty database.

However, cache_render grew to 1.3 GB within days, as far as I could see mostly due to 404 pages requested by Google and other bots. That never goes away without a manual full cache clear.

Denchev’s picture

Status: Needs work » Needs review
FileSize
5.96 KB
PASSED: [[SimpleTest]]: [PHP 5.5 MySQL] 100,675 pass(es). View
3.13 KB
borisson_’s picture

Issue tags: -Needs tests
FileSize
2.41 KB
5.39 KB
PASSED: [[SimpleTest]]: [PHP 5.5 MySQL] 100,696 pass(es). View

Removing the "needs tests" tag, because it already has tests. I also changed the DatabaseBackend to make sure that anything passed in $configuration is an actual array.

Also changed all array definitions to the short syntax.

olli’s picture

Closed #1947852: Database cache backend garbage collection does not work with permanent entries as a duplicate.

  1. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -150,6 +160,10 @@ protected function prepareItem($cache, $allow_invalid) {
       public function set($cid, $data, $expire = Cache::PERMANENT, array $tags = array()) {
    +    // Check if maximum expiration settings for cache bin is defined
    +    if (isset($this->configuration['max_expire']) && ($expire == Cache::PERMANENT || $expire > $this->configuration['max_expire'])) {
    +      $expire = REQUEST_TIME + $this->configuration['max_expire'];
    +    }
    

    $expire is a timestamp and configuration['max_expire'] is a duration, so I think we need to add REQUEST_TIME to $expire > $this->configuration['max_expire']?

  2. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackendFactory.php
    @@ -36,6 +44,7 @@ class DatabaseBackendFactory implements CacheFactoryInterface {
    +    $this->configuration = Settings::get('cache_database', []);
    
    @@ -48,7 +57,8 @@ function __construct(Connection $connection, CacheTagsChecksumInterface $checksu
    +    $configuration['max_expire'] = isset($this->configuration[$bin]) ? (int) $this->configuration[$bin] : null;
    +    return new DatabaseBackend($this->connection, $this->checksumProvider, $bin, $configuration);
    

    So I would add $settings['cache_database']['default'] = 86400; to settings.php? How about something like $settings['cache_database']['default']['max_expire'] = ...?

  3. +++ b/core/modules/system/src/Tests/Cache/DatabaseBackendUnitTest.php
    @@ -53,4 +53,23 @@ public function testSetGet() {
    +    $backend->set('test_cache1', 'foo');
    ...
    +    $backend->set('test_cache1', 'foo');
    ...
    +    $backend->set('test_cache1', 'foo');
    

    One of these could use a $expire param greater than the 'max_expire' setting.

#24:

cache_render grew to 1.3GB within days

Should we add a default?

borisson_’s picture

FileSize
1016 bytes
6.33 KB
PASSED: [[SimpleTest]]: [PHP 5.5 MySQL] 100,858 pass(es). View

#27

  1. I think the tests prove that the code works, so I don't think this is needed.
  2. @Denchev had already added this, but it looks like I had trouble applying his patch to settings.php locally. I added that code back to settings.php.
    No interdiff sadly.
  3. Added a new test that does exactly that.
  4. See 2., a default is added.

Denchev’s picture

Assigned: Unassigned » Denchev
Denchev’s picture

Status: Needs review » Needs work

@olli
1. You are right.
The tests have some flaws; I will update them and look at your other suggestions.

Denchev’s picture

Status: Needs work » Needs review
FileSize
5.61 KB
6.13 KB
PASSED: [[SimpleTest]]: [PHP 5.5 MySQL] 100,931 pass(es). View

1) Fixed the set() method.
2) The default value and bin are now in settings and the factory class.
3) Fixed the tests.

Berdir’s picture

Status: Needs review » Needs work
  1. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -150,6 +160,10 @@ protected function prepareItem($cache, $allow_invalid) {
    +    // Check if maximum expiration settings for cache bin is defined
    +    if (isset($this->configuration['max_expire']) && ($expire == Cache::PERMANENT || $expire > $this->configuration['max_expire'] + REQUEST_TIME)) {
    +      $expire = $this->configuration['max_expire'] + REQUEST_TIME;
    +    }
    

    Coding style: comments need to end with a ".". Could also be a bit clearer:

    // If set, enforce an upper limit for the expiration to allow for garbage collection.

  2. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackendFactory.php
    @@ -36,6 +44,9 @@ class DatabaseBackendFactory implements CacheFactoryInterface {
    +    // Set cache_database settings from settings.php and if none is set use
    +    // maximum expire for render bin for one week.
    +    $this->configuration = Settings::get('cache_database', ['render' => ['max_expire' => 86400 * 7]]);
    

    Suggestion:

     // Load settings; default to a maximum expiration of one week for the render cache bin.

  3. +++ b/core/modules/system/src/Tests/Cache/DatabaseBackendUnitTest.php
    @@ -53,4 +53,23 @@ public function testSetGet() {
    +    $configuration = ['max_expire' => 2800];
    +    $backend = new DatabaseBackend($this->container->get('database'), $this->container->get('cache_tags.invalidator.checksum'), 'render', $configuration);
    +    $backend->set('test_cache3', 'foo', REQUEST_TIME + 2801);
    +    $cached = $backend->get('test_cache3');
    +    $this->assertIdentical($cached->expire, (string) (REQUEST_TIME + 2800), 'Maximum cache expire time is not exceeded.');
    

     $configuration and the backend are always the same now, so you can just re-use them?

     Also, add another test for setMultiple(). That's a completely separate implementation right now that needs the same check.

  4. +++ b/sites/default/default.settings.php
    @@ -691,3 +691,11 @@
    + * Use this setting to set the maximum cache expiration time for a specific
    + * cache bin in the database.
    

     I would write this to allow for additional settings later. Make it a list (* - max_expire: ...) and also explain a bit what this is for: some cache bins can get very big, and persistent cache entries are never removed from them. Setting this allows them to be garbage-collected after the configured time, with the downside that they will have to be rebuilt if requested again.

hussainweb’s picture

Status: Needs work » Needs review
FileSize
5.55 KB
7.18 KB
FAILED: [[SimpleTest]]: [PHP 5.5 MySQL] 100,933 pass(es), 2 fail(s), and 0 exception(s). View

I have made the changes specified in #31 and fixed some more nitpicks I found. I added a test as pointed out in point 3, but I am not sure whether it works the same way as the earlier test. From what I read of the code, I expect this to fail.

Status: Needs review » Needs work

The last submitted patch, 33: 2526150-33.patch, failed testing.

Berdir’s picture

Yes, I expected it to fail; now we need to fix the test failure :)

hussainweb’s picture

Status: Needs work » Needs review
FileSize
1.2 KB
8.14 KB
FAILED: [[SimpleTest]]: [PHP 5.5 MySQL] 100,926 pass(es), 1 fail(s), and 0 exception(s). View

Attempting a fix. General observation: the code and tests assume that the max_expire setting is defined in settings.php or overridden by the factory. What if both fail to set it? E.g., if settings.php just defines some other (future) element in the 'render' array (but not max_expire), then the default won't be set, which means max_expire won't be set at all. In that case, there would be no upper limit at all. Is this what is intended?

hussainweb’s picture

FileSize
1.61 KB
9.24 KB
FAILED: [[SimpleTest]]: [PHP 5.5 MySQL] 100,938 pass(es), 2 fail(s), and 0 exception(s). View

Adding another test to see the effect for the scenario I described in #36.

The last submitted patch, 36: 2526150-36.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 37: 2526150-37.patch, failed testing.

hussainweb’s picture

Status: Needs work » Needs review
FileSize
1.23 KB
9.29 KB
PASSED: [[SimpleTest]]: [PHP 5.5 MySQL] 100,937 pass(es). View

Fixing failures.

olli’s picture

  1. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -150,6 +160,12 @@ protected function prepareItem($cache, $allow_invalid) {
    +      $expire = $this->configuration['max_expire'] + REQUEST_TIME;
    

    What do you think about renaming 'max_expire' to 'max_lifetime'?

  2. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -205,11 +221,21 @@ public function setMultiple(array $items) {
         foreach ($items as $cid => $item) {
    +      // Check if maximum expiration settings for cache bin is defined. If set,
    +      // enforce an upper limit for the expiration to allow for garbage
    +      // collection.
    +      $max_expire = isset($this->configuration['max_expire']) ? $this->configuration['max_expire'] + REQUEST_TIME : Cache::PERMANENT;
    

    $max_expire assignment can be moved outside the loop.

  3. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackendFactory.php
    @@ -36,6 +44,9 @@ class DatabaseBackendFactory implements CacheFactoryInterface {
    +    $this->configuration = Settings::get('cache_database', ['render' => ['max_expire' => 86400 * 7]]);
    
    +++ b/sites/default/default.settings.php
    @@ -691,3 +691,17 @@
    +$settings['cache_database']['render'] = [
    +  'max_expire' => 86400 * 7,
    +];
    

    Why do we need both? Isn't it enough to set it in default.settings.php?

Berdir’s picture

3. Good question, I'm not sure. Agreed that having both is a bit confusing. The common pattern is to have the example in default.settings.php commented out; we could do that?

Wim Leers’s picture

#41.1 makes a lot of sense to me.

+++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
@@ -150,6 +160,12 @@ protected function prepareItem($cache, $allow_invalid) {
+    if (isset($this->configuration['max_expire']) && ($expire == Cache::PERMANENT || $expire > $this->configuration['max_expire'] + REQUEST_TIME)) {

@@ -205,11 +221,21 @@ public function setMultiple(array $items) {
+      if ($max_expire != Cache::PERMANENT && ($item['expire'] == Cache::PERMANENT || $item['expire'] > $max_expire)) {

Let's use strict equality (===).

hussainweb’s picture

FileSize
8.55 KB
PASSED: [[SimpleTest]]: [PHP 5.5 MySQL] 101,560 pass(es). View
hussainweb’s picture

FileSize
2.11 KB
8.69 KB
FAILED: [[SimpleTest]]: [PHP 5.5 MySQL] 101,561 pass(es), 4 fail(s), and 0 exception(s). View

Fixing for comments in #41.

For point 3, I think we have the default in the constructor to make sure the value is there in case it is removed from settings.php. As @Berdir pointed out, we could comment it out.

For point 1, I am not sure whether max_lifetime makes more sense than max_expire, but I have renamed it.

hussainweb’s picture

FileSize
704 bytes
8.72 KB
FAILED: [[SimpleTest]]: [PHP 5.5 MySQL] 101,552 pass(es), 4 fail(s), and 0 exception(s). View

The patch in #45 wasn't complete (it missed default.settings.php), but the interdiff was correct. I had missed the comment in #43 and am fixing that now.

olli’s picture

FileSize
8.71 KB
FAILED: [[SimpleTest]]: [PHP 5.5 MySQL] 101,574 pass(es), 4 fail(s), and 0 exception(s). View
1.36 KB

Sorry, I meant just renaming the setting 'max_expire'; the variable $max_expire is fine.

Wim Leers’s picture

Title: Render cache (or database backend per-bin?) needs support an upper limit for max-age » Database cache backend needs an optional max lifetime
olli’s picture

#42 So change the line in default.settings.php to

# $settings['cache_database']['render']['max_lifetime'] = 86400 * 7;

?

I think that setting (commented or not) should be before

# if (file_exists(__DIR__ . '/settings.local.php')) {
#   include __DIR__ . '/settings.local.php';
# }

, right?

The last submitted patch, 45: 2526150-45.patch, failed testing.

Wim Leers’s picture

  1. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -42,6 +42,13 @@ class DatabaseBackend implements CacheBackendInterface {
       /**
    +   * Per-bin database backend configuration.
    +   *
    +   * @var array
    +   */
    +  protected $configuration;
    
    +++ b/sites/default/default.settings.php
    @@ -691,3 +691,17 @@
    +/**
    + * Per-bin database backend configuration.
    + *
    + * Configure settings for database backend. Available settings are:
    + * - max_expire: Specify the maximum expiry for an individual cache item in the
    + *   selected bin. Some cache bins can get very big and persistent cache entries
    + *   are never removed. Setting this value will allow them to be garbage
    + *   collected after the configured time, and will need to be rebuilt if the
    + *   item is requested again.
    + */
    +$settings['cache_database']['render'] = [
    +  'max_lifetime' => 86400 * 7,
    +];
    

    So this is an array, just in case additional settings are added in the future, right?

    Are we sure that's a good idea?

  2. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -191,14 +201,24 @@ public function setMultiple(array $items) {
    +    // Check if maximum expiration settings for cache bin is defined. If set,
    +    // enforce an upper limit for the expiration to allow for garbage
    +    // collection.
    ...
    +      // If maximum expiration is set, enforce an upper limit on the expiry.
    

    These comments are effectively saying the same thing.

    I think we can keep just the first comment.

    And I think this would be clearer:
     Check if a 'max_lifetime' setting is set for the current cache bin. If set, the lifetime of a cache item is limited to the specified 'max_lifetime', thus preventing unused cache items from lingering forever.

  3. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -191,14 +201,24 @@ public function setMultiple(array $items) {
    +    $max_expire = isset($this->configuration['max_lifetime']) ? $this->configuration['max_lifetime'] + REQUEST_TIME : Cache::PERMANENT;
    

    s/$max_expire/$expire/

  4. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackendFactory.php
    @@ -36,6 +44,9 @@ class DatabaseBackendFactory implements CacheFactoryInterface {
    +    // Load settings and if not set, default to maximum expiration of the render
    

    s/expiration/lifetime/

  5. +++ b/core/modules/system/src/Tests/Cache/DatabaseBackendUnitTest.php
    @@ -53,4 +53,78 @@ public function testSetGet() {
    +    // Empty configuration would force DatabaseBackend to select permanent
    +    // lifetime for cache objects.
    

    No configuration (the default) allows cache items to live forever.

  6. +++ b/core/modules/system/src/Tests/Cache/DatabaseBackendUnitTest.php
    @@ -53,4 +53,78 @@ public function testSetGet() {
    +    // Test set as well.
    

    s/set/set()/

The last submitted patch, 46: 2526150-46.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 47: 2526150-47.patch, failed testing.

olli’s picture

Related to #51.1, there is one issue that would add a similar per-bin setting: #1944740: Do not garbage collect as soon as items expire.

Wim Leers’s picture

#54: that's a good argument! :) However, that issue is going with a very different approach to creating a per-bin setting. Which one is preferable?

Fabianx’s picture

#54 #55:

I do think those two issues are orthogonal actually:

- This issue ensures that CACHE_PERMANENT or > expire is mapped to a max-expire value
- That issue ensures that expired items are not directly removed

While the other issue would potentially prolong the time that items from this issue are kept in the cache, I do not think it would help with the original problem.

Also staleness currently is not yet (widely?) used in core.

Train of unrelated thoughts

It would be very useful to return an old value and then set up a queue or after-request item to re-populate the data, but that's not done currently.

Also likely would want to use a max-age=0 (or -2) then anyway, to avoid having an expired item be stored in another cache item.

Anyway.

TL;DR: This issue should continue as is.

Berdir’s picture

I think we had ideas for something like four more possible settings in this issue alone; see the jitter/refresh in #3, for example, which would make sure that not all cache items expire at the same time and that we keep caches that are actively used. I think it's good to focus on just this single feature in this issue; contrib can add all kinds of additional features in alternative implementations, and I'm not sure the others are core-worthy.

@Fabianx: The rebuild in a queue behavior would probably more likely be implemented using $allow_invalid = TRUE and not in the backend, since you actually need to know what you have to rebuild and the backend doesn't.

Fabianx’s picture

#57: I agree.

I also agree on that, and that was my idea. I was just saying that adding even more support for $allow_invalid = TRUE is likely not that helpful for core itself, as there is no(t much) usage of that feature in core right now.

Wim Leers’s picture

Oh, yes, I was not asking to bring more settings in! I was only making sure we made the right design choice for this setting: A) will we likely add another setting (yes), B) the other issue uses a different setting structure, which one do we want (this one).

So I think that effectively just leaves #51 to be addressed?

borisson_’s picture

Status: Needs work » Needs review
FileSize
6.19 KB
8.66 KB
PASSED: [[SimpleTest]]: [PHP 5.5 MySQL] 101,576 pass(es). View

Fixed remarks from #51 and the test failures introduced in #45.

Wim Leers’s picture

Looks great!

I'd like to see one more person's thoughts on max_lifetime vs. max_expire. It was olli's idea, I +1'd it, but I wonder if some people are not in favor.

Berdir’s picture

  1. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -191,14 +201,23 @@ public function setMultiple(array $items) {
         foreach ($items as $cid => $item) {
           $item += array(
    -        'expire' => CacheBackendInterface::CACHE_PERMANENT,
    +        'expire' => $expire,
             'tags' => array(),
           );
     
    +      if ($expire !== Cache::PERMANENT && ($item['expire'] === Cache::PERMANENT || $item['expire'] > $expire)) {
    +        $item['expire'] = $expire;
    +      }
    

    That double check (in case no expire was set) is a bit confusing, we set the default and then set the value again.

    The thing is that in that loop, we actually build two arrays. So we just put expire in there to get it out again. I tried for quite some time to come up with a better check here but wasn't successful.

    What we should do anyway is move at least part of the comment down to the main logic and explain what it is doing.

  2. +++ b/sites/default/default.settings.php
    @@ -678,6 +678,21 @@
    + $settings['cache_database']['render'] = [
    +   'max_lifetime' => 86400 * 7,
    + ];
    

     Based on the above, I think we agreed to comment this out by default?

Wim Leers’s picture

#62.1: the first comment currently covers both. But let's have the first one say "set the correct default value: the maximum lifetime", and let the second give the details of the overriding that are currently in the first comment.

Denchev’s picture

FileSize
8.77 KB
PASSED: [[SimpleTest]]: [PHP 5.5 MySQL] 102,457 pass(es). View
1.79 KB

Removed the debug code.
#62.2: commented out.
And added some comments for #62.1.

Berdir’s picture

Looks good to me, +1 to max_lifetime.

Keeping the first comment makes sense to me: it explains the overall concept, and the new one explains just those specific lines.

Wim Leers’s picture

Status: Needs review » Reviewed & tested by the community

Alright, thanks for the many rerolls! :)

catch’s picture

Status: Reviewed & tested by the community » Needs work
  1. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -191,14 +201,25 @@ public function setMultiple(array $items) {
    +    $expire = isset($this->configuration['max_lifetime']) ? $this->configuration['max_lifetime'] + REQUEST_TIME : Cache::PERMANENT;
    

    When reading this it looks like it will be CACHE_PERMANENT unless configured.

  2. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackendFactory.php
    @@ -36,6 +44,9 @@ class DatabaseBackendFactory implements CacheFactoryInterface {
    +    $this->configuration = Settings::get('cache_database', ['render' => ['max_lifetime' => 86400 * 7]]);
    

    But for cache.render we're always setting it - should we reference that in the comment above?

    If I was seeing the 7-day expiry, I'd probably check the backend first, then settings.php, then only after that find this in the factory if I was lucky.

    Also reading this, I wonder whether we couldn't put the default configuration into the container definition for cache.render, then it'd be kept closer to where the bin is defined - the factory doesn't know anything about specific bins in HEAD.

hussainweb’s picture

@catch: There is a small catch in point 2 that you mentioned. It looks like we are always setting 'max_lifetime' even if it's not set in settings.php, but that's not true: if settings.php sets some other keys in that array but not max_lifetime, the value will not be set. That is where the CACHE_PERMANENT fallback you pointed out in point 1 comes in.

catch’s picture

@hussainweb - yes, it's possible to not set it; I just think the factory is a strange place to put a default setting that most sites will end up using.

Berdir’s picture

@catch: Putting it in the definition (which is something we already do for the backend configuration) wouldn't really make sense for configuration that's specific to one backend, IMHO. Or maybe if we name it database_configuration or something like that?

Wim Leers’s picture

I'd also very much prefer to have this in the container definition for a specific cache bin. But if I remember correctly from an IRC conversation, that was considered impossible. I wish I had written down on this issue why.

Berdir’s picture

  cache.bootstrap:
    class: Drupal\Core\Cache\CacheBackendInterface
    tags:
      - { name: cache.bin, default_backend: cache.backend.chainedfast }
    factory: cache_factory:get
    arguments: [bootstrap]

I think we could extend it to also support other settings in that tag; I'm not sure how complex that would be.

But as I said, that would make it a generic cache concept and not something that's specific to the database backend.

The last submitted patch, 64: database_cache_backend-2526150-63.patch, failed testing.

catch’s picture

catch’s picture

Coming back to this:

- I really think it should be per-bin. For example, we'd want cache_bootstrap to stay with CACHE_PERMANENT, but maybe bring cache_render down to 12 hours.

- It's not really database-specific; it's valuable for any cache backend that doesn't support LRU.

This means we need a term for non-LRU cache backends; I don't have one, though.

ndobromirov’s picture

There you have it: NonLruCacheInterface :).

catch’s picture

Well there's that, but LRU is normally used as shorthand for 'storage size capped'.

APCu has a capped storage size, but this doesn't mean you can just throw items at it. Once it's full, the behaviour depends on the TTL, but the default is just to empty itself out entirely.

The way we deal with that is to not recommend APCu for any unbounded cache bins; instead it's restricted to bins where we expect the total size of cache items to stay within a reasonable limit, so that it never gets full at all.

catch’s picture

Version: 8.0.x-dev » 8.1.x-dev
Priority: Major » Critical
Status: Needs work » Needs review
FileSize
1.5 KB

OK, here's a suggestion that makes fewer assumptions about storage: a 'configurable permanent TTL'.

If it's set, then the cache storage can set 'permanent' cache entries to that length or higher, up to permanent again.

Here's an untested draft with the interface, and the DatabaseBackend implementing it.

If the concept seems fine, we still need to sort out the configuration key for the bins, what it gets set to, and add tests.

I'm also going to bump this to critical, since the meta was triaged as critical and I don't think that meta gets fixed without this in some form.

Status: Needs review » Needs work

The last submitted patch, 79: 2526150.patch, failed testing.

Fabianx’s picture

+++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
@@ -194,8 +199,10 @@ protected function doSetMultiple(array $items) {
+        $item['expire'] = $this->maximumPermanentTtl;

minimum vs. maximum mixup ...

----

I re-read my comments again:

1. I think this is a good case for a decorator class, e.g. BoundedDatabaseBackend, plus a trait. Then every cache backend can implement this by implementing the interface and using the trait.

2. I still think that:

- a) Random Jitter is a good idea (it is horrible if all items expire at the same time)

- b) A mechanism for refreshing cache items that are accessed before the TTL expires is added (pseudo-LRU)

As the setting is per bin, a site owner can e.g. choose to use expire + refresh for cache_page to ensure that larger, frequently accessed items stay in the cache, while smaller items that are not accessed at all expire (as Wim Leers wanted).

On the other hand, someone who wants to cache smaller items could just use a larger TTL for cache_render and a smaller one for cache_page.

To give my code some perspective.

Jitter is as simple as:

    foreach ($items as $cid => $item) {
+      if (!isset($item['expire']) || $item['expire'] === CacheBackendInterface::CACHE_PERMANENT) {
+        $item['expire'] = REQUEST_TIME + $this->addTtlJitter($this->minimumPermanentTtl);
+      }
       $item += array(
-        'expire' => CacheBackendInterface::CACHE_PERMANENT,
         'tags' => array(),
       );
// ...

protected function addTtlJitter($ttl) {
  // Vary the TTL by the configured jitter fraction so that not all items
  // expire at the same time.
  $diff = $ttl * $this->ttlJitter;
  return rand($ttl - $diff, $ttl + $diff);
}

While refreshTime would work like:

protected function prepareItem($cache, $allow_invalid) {
  // ...

+  if ($this instanceof TtlAwareCacheBackendInterface && $cache->ttl === CacheBackendInterface::CACHE_PERMANENT && $cache->expire - $this->refreshTtl <= REQUEST_TIME) {
+    // The permanent item is about to expire: refresh it in the cache.
+    // This could also just update $expire.
+    $this->setMultiple([$cache]);
+  }

  return $cache;
}

I see the problem, though, that we have nowhere to save that the item should be cached permanently, as opposed to an item that truly is time-dependent. However, I think we should save the TTL in any case, in addition to the expire timestamp. That would solve this for all cache backends.

E.g. a practical example: the homepage should never expire and is accessed several times per hour, and permanent items are cached for a week, e.g.:

  - Set minimumExpireTtl = 1 week
  - Set jitter = 10% = 0.10 (so that not all items expire at the same time)
  - Set refreshTime = 1 week - 1 day = 6 days (working set: 1 day)

So every item that was originally cached CACHE_PERMANENT and is accessed within the delta of minimumExpireTtl - refreshTime will be refreshed.

But it also happens only when the item is close to expiring.

And I do think this is superior to just expiring all content 7 days after the last cache clear, which amounts to a cache_clear_all (for cache_render, e.g.).

And we all know that cache_clear_all can be problematic for even moderately frequently visited sites.

catch’s picture

Status: Needs work » Needs review
FileSize
8.26 KB

I think this is a good case for a decorator class: e.g. BoundedDatabaseBackend + using a trait. Then every cache backend can implement that by implementing the interface + using the trait.

I struggle to see many people using this except the database backend, so not sure that's useful in practice. If the idea is to keep a reference implementation of the database backend then it seems good for that.

Random Jitter is a good idea (it is horrible if all items expire at the same time)

I also think that's a good idea, but it's 100% implementation - except for the new methods mentioning the 'minimum' permanent TTL so it can vary upwards. I think we should add that before RTBC, though, yes.

The refresh idea is a good one, and I've done similar last-minute-update logic in custom code before, but it's going to require an additional column in the database schema and an update. Also we'd have to do that in a way that it doesn't blow up before the update runs which is not going to be straightforward - the column will be missing on any ->get() call.

Patch update that fixes the minor issues above, and doesn't explode doing drush cr.

If this passes, still needs tests, still needs jitter, and then we need to figure out whether to try the refresh or not here or move that to a follow-up.

Also not sure what to do about the change to CacheBackendFactoryInterface. We can't change the actual interface - as you can see from the patch I had to update the various factories including ones that don't need this.

We could allow the individual factories to add the optional parameter to the method, but then there's no interface to communicate what the param means.

Or possibly a new interface and something like addMethodCall..

Status: Needs review » Needs work

The last submitted patch, 82: 2526150-81.patch, failed testing.

Fabianx’s picture

I think addMethodCall is the way to go via the 'calls' property.

I see that it would be nicer to add as a parameter, but changing the interface is a no-go obviously.

And adding getWithTtl() feels just wrong.

Maybe we could use a 'tag', which is like a named parameter and a compiler pass to add the tag information as a method call?

--

On the other hand could we not introduce a new FactoryInterface that all core factories that want to support this would implement?

catch’s picture

I think an entirely new factory interface might be the way to go. End of the day here but will have a look tomorrow.

catch’s picture

Status: Needs work » Needs review

Small update:

- added the new factory interface
- switched to an $options array for the service argument and new factory interface. That would potentially allow us to make things like the refresh and jitter configurable if we wanted to.

If that looks OK, will start looking at some of the rest from #81.

catch’s picture

alexpott’s picture

Status: Needs review » Needs work
+++ b/core/core.services.yml
@@ -207,7 +207,7 @@ services:
+    arguments: [default, ['maximum_permanent_ttl', 86400]]

Shouldn't this be minimum_permanent_ttl?

catch’s picture

Status: Needs work » Needs review
FileSize
6.54 KB

Yes it should.

alexpott’s picture

Issue tags: +Needs tests

The setting of the expire time to the minimum ttl looks like it can be tested.

Fabianx’s picture

#88, #89: But is it not in fact a maximum?

e.g. PERMANENT == unlimited, while this is the maximum TTL that is allowed to be set.

Though that is wrong, too, as you could set a higher value...

I am still confused by the wording.

Agree with the tests.

Wim Leers’s picture

I'm also confused. I'd also expect this to be called "maximum". But, really, it's not even maximum or minimum, it's just mapping PERMANENT to some configurable number. Perhaps a clearer name would therefore be overridePermanentMaxAge.

I'd also go with max-age rather than TTL because Cache-Control: max-age and because https://www.drupal.org/developing/api/8/cache/max-age.

catch’s picture

Definitely needs tests - should be easy to add to the database backend tests.

On $minimum_permanent_ttl - I don't like the name either, clearly it's confusing even me.

In practice:

- it should be configurable via the container, as in the patch.
- backends should set the ttl no lower than the configured ttl, but for example if we use jitter, then they're going to set it higher than that (so it's not a maximum)
- backends could do other things like whitelist particular cids as permanent
- they could also ignore it and use permanent.

What about $overridePermanentTtl?

I'm not at all sure about max-age - we already use expire in the database backend, which is confusing enough, and this isn't going to be used for the Cache-Control header anywhere. When it's applied to the cache_render bin, that may not be obvious at all.

Wim Leers’s picture

TTL === max-age, no?

catch’s picture

Yes, it's the same thing; I just associate it more with the Cache-Control header, whereas TTL I think of more with APCu or similar. I guess it's a case of whether the association with the header is a good thing or not.

catch’s picture

Or $override_permanent ;)

Wim Leers’s picture

   * @param int $expire
   *   One of the following values:
   *   - CacheBackendInterface::CACHE_PERMANENT: Indicates that the item should
   *     not be removed unless it is deleted explicitly.
   *   - A Unix timestamp: Indicates that the item will be considered invalid
   *     after this time, i.e. it will not be returned by get() unless
   *     $allow_invalid has been set to TRUE. When the item has expired, it may
   *     be permanently deleted by the garbage collector at any time.

Unfortunately the cache back-end interface is designed for absolute timestamps, so it's neither TTL nor max-age. Otherwise just following its existing terminology would've been perfect…

catch’s picture

Well except the argument is a ttl - we convert it to an expiry in the backend. Similarly CACHE_PERMANENT is treated as a ttl - it really means never as opposed to just being a far future expiry.

Wim Leers’s picture

Similarly CACHE_PERMANENT is treated as a ttl

It's not, it's an absolute time: "infinity". But PHP doesn't support infinity, so we map it to -1.

Well except the argument is a ttl - we convert it to an expiry in the backend.

True. But that's my point: we wouldn't need to discuss "TTL" vs. "max-age" if we could just have it be the same kind of thing as what we have, which is absolute numbers: timestamps.

So, we simply can't be consistent with what we have. If you then prefer TTL, works for me.

catch’s picture

It's not, it's an absolute time: "infinity". But PHP doesn't support infinity, so we map it to -1.

Is now + infinity different to just infinity?

Added tests. Also added some basic jitter to the database backend.

Not doing anything with the wording yet - I'm not tied to ttl at all, but not sure we have a better option yet.

No interdiff - will start posting them from this point though since this is more or less minimally viable now and the patch is getting bigger.

Status: Needs review » Needs work

The last submitted patch, 100: 2526150.patch, failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
8.28 KB

Arggh missing new files.

Wim Leers’s picture

Arggh missing new files.

Newb! :P

  1. +++ b/core/core.services.yml
    @@ -207,7 +207,7 @@ services:
    +    arguments: [default, ['maximum_permanent_ttl', 86400]]
    
    +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -42,6 +42,11 @@ class DatabaseBackend implements CacheBackendInterface {
    +  protected $minimumPermanentTtl = CacheBackendInterface::CACHE_PERMANENT;
    

    Still inconsistent.

  2. +++ b/core/lib/Drupal/Core/Cache/ChainedFastBackendFactory.php
    @@ -68,21 +68,16 @@ public function __construct(Settings $settings = NULL, $consistent_service_name
    +        // @todo: check interface.
    

    Do you mean an assert($service instanceof SomeInterface)?

  3. +++ b/core/lib/Drupal/Core/Cache/ConfigurablePermanentTtlInterface.php
    @@ -0,0 +1,21 @@
    + * collection. The actual TTL used may be the same or higher than configured.
    

    This last sentence is meant to communicate that there may be jitter? Not very clear to me.

    Also, why >=, and not <=? Documenting the rationale for that would be good I think?

  4. +++ b/core/lib/Drupal/Core/Cache/ConfigurablePermanentTtlInterface.php
    @@ -0,0 +1,21 @@
    +  function setMinimumPermanentTtl($ttl);
    +}
    

    Missing newline between these two.

  5. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -240,6 +252,14 @@ protected function doSetMultiple(array $items) {
    +   * Calculate permanent expiry.
    

    Missing @return.

  6. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -240,6 +252,14 @@ protected function doSetMultiple(array $items) {
    +    return rand($minimum, $minimum * 1.5) + REQUEST_TIME;
    

    Shouldn't this 1.5 be a constant on the class as well, to make it clear what it's for?

  7. +++ b/core/modules/system/src/Tests/Cache/DatabaseBackendUnitTest.php
    @@ -51,6 +52,21 @@ public function testSetGet() {
         $this->assertIdentical($cached_value_short, $backend->get($cid_short)->data, "Backend contains the correct value for short, non-ASCII cache id.");
    +
    +  }
    

    Extraneous \n.

catch’s picture

Should cover everything in #104.

Status: Needs review » Needs work

The last submitted patch, 104: 2526150.patch, failed testing.

Fabianx’s picture

I think my vote is for override_permanent_maxage or override_permanent_ttl.

That says much clearer that this is an override for permanent cache items.

catch’s picture

Status: Needs work » Needs review
FileSize
10.14 KB
6.36 KB
-    $consistent = $this->container->get($this-consistentServiceName);

+    $consistent = $this->container->get($this->consistentServiceName);

That was the test failure.

Switched to $override_permanent_ttl for now.

catch’s picture

Need to use the constant in the test also.

Opened a follow-up to discuss the refresh - that's not necessary to fix the critical for me, and having to do a schema update to cache tables feels very 8.2.x-only. #2705993: Add auto-refresh/quasi-LRU to database backend

catch’s picture

Adds two new interfaces and updates the container, so tagging for rc target triage.

catch’s picture

Title: Database cache backend needs an optional max lifetime » Make CACHE_PERMANENT configurable in the database cache
Fabianx’s picture

+++ b/core/lib/Drupal/Core/Cache/ChainedFastBackendFactory.php
@@ -68,26 +68,34 @@ public function __construct(Settings $settings = NULL, $consistent_service_name
+    $consistent = $this->container->get($this->consistentServiceName);
+    if ($consistent instanceof ConfigurableCacheBackendFactoryInterface) {
+      $consistent_backend = $consistent->get($bin, $options);
+    }
+    else {
+      $consistent_backend = $consistent->get($bin);
+    }

I think this would be much easier to read with a helper function.


$backend = $this->getCacheBackendWithOptions($name, $options);

The rest looks great to me.

catch’s picture

Yes that's better.

Fabianx’s picture

Status: Needs review » Reviewed & tested by the community

RTBC - Looks great to me - except perhaps test coverage might need to be extended a little?

Wim Leers’s picture

Status: Reviewed & tested by the community » Needs work
  1. +++ b/core/core.services.yml
    @@ -207,31 +207,31 @@ services:
    +    arguments: [default, ['override_permanent_ttl', 86400]]
    ...
    +    arguments: [entity, ['override_permanent_ttl', 86400]]
    ...
    +    arguments: [menu, ['override_permanent_ttl', 86400]]
    ...
    +    arguments: [render, ['override_permanent_ttl', 86400]]
    ...
    +    arguments: [data, ['override_permanent_ttl', 86400]]
    

    Why not make this 86400 a container parameter itself, so you only have to override that one parameter on your sites if you need a different TTL? (See the sketch after this list.)

  2. +++ b/core/lib/Drupal/Core/Cache/ChainedFastBackendFactory.php
    @@ -68,26 +68,46 @@ public function __construct(Settings $settings = NULL, $consistent_service_name
    +   * Get a cache backend with options if it supports it.
    

    s/Get/Gets/

  3. +++ b/core/lib/Drupal/Core/Cache/ChainedFastBackendFactory.php
    @@ -68,26 +68,46 @@ public function __construct(Settings $settings = NULL, $consistent_service_name
    +   * @return
    

    Missing type.

  4. +++ b/core/lib/Drupal/Core/Cache/ConfigurableCacheFactoryInterface.php
    @@ -0,0 +1,23 @@
    +interface ConfigurableCacheFactoryInterface {
    

    Why not let this extend \Drupal\Core\Cache\CacheFactoryInterface?

  5. +++ b/core/lib/Drupal/Core/Cache/ConfigurablePermanentTtlInterface.php
    @@ -0,0 +1,26 @@
    + * configured ttl could potentially be set very low for some use cases.
    

    s/ttl/TTL/

  6. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -42,6 +49,11 @@ class DatabaseBackend implements CacheBackendInterface {
    +   * The overridden permanent TTL.
    

    "override" in variable name vs "overridden" in documentation.

    Note this is not even the overridden value; this is literally the value to use. It just might be overridden.

    So, I think $permanentTtl would be a better name.

  7. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -240,6 +259,16 @@ protected function doSetMultiple(array $items) {
    +   * Calculate permanent expiry.
    

    s/Calculate/Calculates/

  8. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -240,6 +259,16 @@ protected function doSetMultiple(array $items) {
    +   * @return an expires timestamp based on the override TTL with added 'jitter'.
    

    Wrong doxygen.

catch’s picture

Status: Needs work » Needs review
FileSize
8.26 KB
10.89 KB

#1 done.

Minor points also done.

#6 done, and I wish we'd thought of that about 20 comments ago; much better.

#4: I'm not sure it's clearer, since they both have a single method with the same name but different arguments. But PHP allows it, so why not; I don't think it's less clear either, just a bit neutral on it.

@Fabianx on tests: I thought about testing that the expiry prevents a valid cache entry from being returned, but we already have tests for that, and it's tricky to test with REQUEST_TIME without manipulating the value, which would then invalidate the test here. Could potentially test ChainedFast explicitly though.

catch’s picture

Added explicit testing of the ChainedFastBackend when both the fast and consistent backends have permanent overridden.

catch’s picture

That test passed but didn't test anything very interesting; this is a better one.

Fabianx’s picture

Assigned: Denchev » catch
Status: Needs review » Reviewed & tested by the community

That is enough on tests for me and all feedback has been addressed.

I think this should be ready now for core committer review.

alexpott’s picture

+++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
@@ -240,6 +259,17 @@ protected function doSetMultiple(array $items) {
+    return rand($minimum, $minimum * self::JITTER_MULTIPLIER) + REQUEST_TIME;

From https://secure.php.net/manual/en/function.rand.php

Note: On some platforms (such as Windows), getrandmax() is only 32767. If you require a range larger than 32767, specifying min and max will allow you to create a range larger than this, or consider using mt_rand() instead.

I think we need to use mt_rand() here. Patch attached does this.

Patch includes a few coding standards fixes too.
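For a quick sanity check of the platform limits mentioned above (plain PHP, illustrative only):

var_dump(getrandmax());    // Can be as low as 32767, e.g. on Windows.
var_dump(mt_getrandmax()); // Typically 2147483647.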

Fabianx’s picture

RTBC on the interdiff

jibran’s picture

Seems ready to me as well but I think we should improve some documentation.

  1. +++ b/core/lib/Drupal/Core/Cache/ConfigurablePermanentTtlInterface.php
    @@ -0,0 +1,27 @@
    + * This allows cache backends with restrictions on storage space to set a
    + * shorter TTL for 'permanent' cache items, so that they get expired by garbage
    + * collection. The actual TTL used may be the same or higher than configured,
    + * to allow 'jitter' to be applied so that items set immediately after a cache
    + * clear do not all expire at the same time. Cache backends should assign jitter
    + * higher rather than lower to avoid the risk of cache stampedes, since the
    + * configured TTL could potentially be set very low for some use cases.
    

    I must say, this is some awesome documentation. Can we add some of it above JITTER_MULTIPLIER, or at least add an @see?

  2. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -235,6 +254,17 @@ protected function doSetMultiple(array $items) {
    +    return mt_rand($minimum, $minimum * self::JITTER_MULTIPLIER) + REQUEST_TIME;
    

    We could also explain this line, as per #119.
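For instance, an illustration of what that line does (assuming JITTER_MULTIPLIER is 1.5 and a one-day configured TTL; the values are illustrative):

$minimum = 86400;
// The expiry lands at a random point between 1 and 1.5 days from now,
// so items written at the same time do not all expire together.
$expire = mt_rand($minimum, (int) ($minimum * 1.5)) + REQUEST_TIME;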

olli’s picture

Status: Reviewed & tested by the community » Needs work
  1. +++ b/core/core.services.yml
    @@ -207,31 +208,31 @@ services:
    +    arguments: [default, ['override_permanent_ttl', '%cache.configuration.override_permanent_ttl%']]
    

    ', ' -> ': '?

  2. +++ b/core/core.services.yml
    @@ -207,31 +208,31 @@ services:
         factory: cache_factory:get
    -    arguments: [render]
    +    arguments: [render, ['override_permanent_ttl', '%cache.configuration.override_permanent_ttl%']]
    

    This doesn't work with code that uses cache_factory directly like render_cache.

  3. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -37,6 +44,11 @@ class DatabaseBackend implements CacheBackendInterface {
       /**
    +   * The permanent TTL.
    +   */
    +  protected $permanentTtl = CacheBackendInterface::CACHE_PERMANENT;
    

    missing @var?

  4. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackendFactory.php
    @@ -34,16 +34,14 @@ function __construct(Connection $connection, CacheTagsChecksumInterface $checksu
    +      $class->setOverridePermanentTtl($options['override_permanent_ttl']);
    

    ->setPermanentTtl(

catch’s picture

Status: Needs work » Needs review
FileSize
3.85 KB
3.85 KB

Addressing #121 and #122.

However, #122.2 is a good spot, and I'm not sure what to do about that yet.

Status: Needs review » Needs work

The last submitted patch, 123: 2526150.patch, failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
13.08 KB

Rebased.

Status: Needs review » Needs work

The last submitted patch, 125: 2526150.patch, failed testing.

The last submitted patch, 125: 2526150.patch, failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
13.08 KB
1.86 KB
catch’s picture

To be honest, the more I work on this patch, the less I like it. For a site that doesn't get many 404s, it could degrade performance rather than improve it, and the back and forth here shows that the concept is quite hard to grasp. There are also other options here, such as setting an arbitrary limit on cache items and pruning in garbageCollection if it goes over - I like that option less than this one though.

I'd be tempted to bump #2699613: Set a shorter TTL for 404 responses in page_cache module and #2699627: url.path cache context for breadcrumbs is unnecessarily granular to critical and mark this postponed on those. They'll benefit sites running any cache backend and if we can get them tested on a site like @casey's that's suffering from this issue, it'd then allow us to see whether this issue is really critical by itself.

catch’s picture

Priority: Critical » Major
Status: Needs review » Postponed

Doing that for now.

Berdir’s picture

Status: Postponed » Active

That happened, and this is still a problem. 404s are just one example; you can also just use arbitrary query arguments to fill up the cache.

xjm’s picture

Issue tags: -rc target triage
Fabianx’s picture

What might make more sense is to add a "lurker" cron job, which expires cache entries that are actually expired per the cache tags.

Though I am not sure if our structure allows for that, given we only have a checksum in the DB backend. However, for serialized render arrays at least we could do it, as we have the cache tag information there.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

geerlingguy’s picture

I just ran into this after launching a new D8 site that uses the database backend (currently memcached or redis isn't an option); see #2814015: cache_render table becomes extremely large over time.

We don't have anything out of the ordinary, but there are a couple of views exposed filters in blocks, and those blocks have 100,000+ rows in the cache_render table (one per filter set and URL permutation—the block only appears on one route, but the path can have filter arguments like ?abc-xyz), which never expire unless I run drush cr manually (then they just fill up again until the next cache clear).

We were dealing with a database around 10-15 MB total before launching the prod site; now that there's live traffic, the database balloons out to 10+ GB every day until we clear the cache. Backups and full database copies take 30+ minutes instead of 30s...

I believe I have seen maybe 5-10% of all Drupal sites I've ever worked with using a cache backend other than the database, and I see no reason that will change with Drupal 8—this could be a huge problem for anyone trying to run a small-to-medium sized Drupal 8 site on shared hosting with even pretty basic sites.

Berdir’s picture

If a single day is enough to generate such an amount of cache records, this issue alone will not solve it for you unless you plan to set the expiration to just hours.

Having that many variations sounds like you want to prevent those blocks from being cached at all, because they probably vary per user or similar, and caching them implies more overhead than not caching. Try just disabling the cache in the views settings; that should bubble up. Page cache will still cache the pages, and for anonymous users, Dynamic Page Cache plus optionally BigPipe will allow caching the rest of the page and just injecting those blocks, while BigPipe also lets you serve the rest of the page while those views are still being built.

Berdir’s picture

That said, I agree that this is a critical scalability issue and I also still think that we should try to get a minimal* fix in and document that nicely somewhere, possibly even have a sane default in core.

* I still believe that jitter and the like aren't necessary for a minimal solution, as that distribution IMHO occurs naturally on most sites: requests and cache writes are spread out over time anyway. And sites big enough for that to become relevant should probably look into alternative cache backends anyway.

With the rise of container-based hosting like Pantheon, Platform.sh, Acquia and so on, which AFAIK all offer Redis/Memcache, I do think that using such a backend will be more common than it used to be.

geerlingguy’s picture

Closed #2814015: cache_render table becomes extremely large over time as a duplicate.

And yes, more common... but at least 96% of the sites I've ever dealt with are not on Pantheon, Platform.sh, Acquia, etc. There's a silent majority of Drupal sites out there running on Hostgator, Dreamhost, preconfigured DigitalOcean VPSes, etc., and this problem (along with some others) doesn't help silence the mantra of "Drupal 8 is not for shared hosting"...

I agree with a minimal fix for now—the jitter doesn't seem (to me) like it would be a major issue/blocker.

Fabianx’s picture

#138: I do agree that some kind of expiration would be nice.

However please be aware that there is a working contrib backend by larowlan already:

https://github.com/larowlan/slushi_cache

I still think that using jitter is simple and can be achieved easily (and I'm not sure what catch's last patch is missing anyway, except reviews and tests?) and we should:

- a) make simple changes
- b) ensure it will work for more complex cases, too.

Core has a higher responsibility in that regard.

--

To the issue above:

Likely it is indeed best to max-age=0 the view result with that many permutations (are they all valid?) and have the whole block be placeholder-ed automatically.
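For reference, opting such a render array out of caching is a one-liner; a minimal sketch, assuming $build is the view's or block's render array:

// A max-age of 0 bubbles up, so the variations never enter cache_render.
$build['#cache']['max-age'] = 0;
// With Dynamic Page Cache enabled, max-age 0 also triggers
// auto-placeholdering, so the rest of the page stays cacheable.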

sajiniantony’s picture

I have upgraded the site to 8.2; it shows the issue with the size of the cache table, and also a deadlock error. Any help is appreciated.

xmacinfo’s picture

With the rise of container-based hosting like Pantheon, Platform.sh, Acquia and so on, which AFAIK all offer Redis/Memcache, I do think that using such a backend will be more common than it used to be.

This is a bold statement!

100% of the Drupal sites I am working with are not on those platforms.

catch’s picture

We don't have anything out of the ordinary, but there are a couple views exposed filters in blocks, and those blocks have 100,000+ rows in the cache_render table (one per filter set and url permutation—the block only appears on one route, but the path can have filter arguments like ?abc-xyz)

Do you have 10s/100s of thousands of requests changing those filters then? I wonder a bit whether caching with views exposed filters wouldn't benefit more from some flood control or similar, to stop entries getting in there in the first place. We'd have to do the flood control when rendering the view and then set max_age=0 when flood control kicks in.
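A hypothetical sketch of that flood-control idea using core's flood service (the event name and limits are invented for illustration, not an existing views API):

$flood = \Drupal::flood();
if ($flood->isAllowed('views_exposed_render', 1000, 3600)) {
  // Under the threshold: record this permutation and cache as usual.
  $flood->register('views_exposed_render', 3600);
}
else {
  // Threshold exceeded: stop new permutations entering the render cache.
  $build['#cache']['max-age'] = 0;
}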

Wim Leers’s picture

Backups and full database copies take 30+ minutes instead of 30s...

Well, that is a huge flaw in your backup system: you should never ever back up the cache_* tables.

But of course we should still solve the problem described in this issue.

catch’s picture

Assigned: catch » Unassigned

Unassigning me since I'm not actively working on this at the moment.

Wim Leers’s picture

AFAICT we have agreement this is a good idea.

It seems the last point of disagreement is the inclusion of jitter or not?

Wim Leers’s picture

Title: Make CACHE_PERMANENT configurable in the database cache » Make CACHE_PERMANENT configurable in the database cache — currently allows unlimited growth: cache DB tables of gigabytes!
Related issues: +#2828639: [BUGFIX] getRelated is using wrong cache tags

This problem also came up in #2828639: [BUGFIX] getRelated is using wrong cache tags.

And I know that some Acquia customers have run into this very problem too.

larowlan’s picture

FWIW Slushi cache solves this via contrib in the meantime.

malcomio’s picture

I observed a similar issue on shared hosting, which did not seem to be addressed by installing Slushi cache.

Some shared hosting providers will shut sites down with minimal warning if database sizes grow too large - as geerlingguy says in #138, issues like this are likely to push more small sites away from Drupal.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Jelle_S’s picture

We experienced this when a Google bot was crawling our faceted search; the cache_render table was growing to a couple of GB within minutes. We had to put our site in maintenance mode and exclude the faceted search in robots.txt to deal with it. The site is up and running again now, with a cronjob clearing the cache every 6 hours just to be sure... Not exactly ideal. We thought about putting Varnish in front of it, but since the Varnish bin is shared with other websites it might compromise those if it grows this fast...

catch’s picture

@malcomio Slushi cache should work fine for shared hosting, what makes you think it wouldn't? You'd have to configure the TTL time quite low if you're getting crawled though.

malcomio’s picture

@catch on the side project site where I experienced this issue, it still occurred after installing Slushi cache, although looking at settings.php now, I think that it may not have the correct configuration. I'll check and if I'm still having problems, I'll create an issue on the module.

haasontwerp’s picture

Subscribing; experiencing this for a site I've built. Slushi cache won't help (or I'm probably configuring it wrong). The DB without cache is about 75 MB; it grows to 250+ MB in one day. And then the hosting provider just shuts down database user access, so I get an understandably angry client on the phone, and I'm getting tired of manually clearing the cache x times a day. A bigger database won't fix this; it will just take a little longer to fill up the DB to its max size.

Wim Leers’s picture

@Jelle_S: Faceted search is the sort of module that should take responsibility for not caching every single variation: it's a module that allows insane numbers of variations, so of course it does not make sense to cache them all.

anavarre’s picture

Just registering a real-life example with a 8.2 site. PHP 5.6, core modules + Honeypot and a custom multimedia-related module, no memcache or alternate cache backend.

+--------------------------+--------------+------------+--------+-------------+---------------------+
| Tables in abcdefgh_db    | Size on Disk | Size in DB | Rows   | Index in DB | Index-to-data ratio |
+--------------------------+--------------+------------+--------+-------------+---------------------+
| cache_render             | 8312.00 MB   | 7978.88 MB | 220276 | 49.84 MB    | 0.01                |
| cache_dynamic_page_cache | 2924.00 MB   | 2706.72 MB | 43703  | 24.00 MB    | 0.01                |
+--------------------------+--------------+------------+--------+-------------+---------------------+

Rebuilt caches / Truncated tables

+--------------------------+--------------+------------+--------+-------------+---------------------+
| Tables in abcdefgh_db    | Size on Disk | Size in DB | Rows   | Index in DB | Index-to-data ratio |
+--------------------------+--------------+------------+--------+-------------+---------------------+
| cache_render             | 176.00 MB    | 156.70 MB  | 7444   | 2.55 MB     | 0.02                |
| cache_dynamic_page_cache | 96.00 MB     | 78.52 MB   | 1406   | 1.52 MB     | 0.02                |
+--------------------------+--------------+------------+--------+-------------+---------------------+

4 days later:

+--------------------------+--------------+------------+--------+-------------+---------------------+
| Tables in abcdefgh_db    | Size on Disk | Size in DB | Rows   | Index in DB | Index-to-data ratio |
+--------------------------+--------------+------------+--------+-------------+---------------------+
| cache_render             | 5120.00 MB   | 4689.84 MB | 116175 | 33.98 MB    | 0.01                |
| cache_dynamic_page_cache | 2348.00 MB   | 2207.66 MB | 39439  | 18.86 MB    | 0.01                |
+--------------------------+--------------+------------+--------+-------------+---------------------+
Wim Leers’s picture

The Local Actions and Local Tasks blocks also vary by URL, and they're enabled for all users by default, even though most sites only need those blocks for authenticated users (because it's possible that they're used on a wiki-like site that allows anonymous users to use them).

Those two blocks in particular can cause enormous amounts of cache entries. Which further confirms the need for this issue to be solved.

anavarre’s picture

Version: 8.3.x-dev » 8.4.x-dev
anavarre’s picture

#156 should the default for Local Actions and Local Tasks be to not show up for anonymous users, then? If yes, perhaps that should be a follow-up issue.

geerlingguy’s picture

I just noticed one site (the one that originally surfaced this issue for me) hit a full disk on the database server (100 GB) from the cache_render table expanding again. A drush cr fixed it for now, but this happened because for a few weeks, no code deployments were done (thus no cache rebuilds took place).

The entire database (besides cache_render) is under 1 GB :-/

mariagwyn’s picture

We have set Local Actions and Local Tasks blocks on the site mentioned by Aurelian above to 'authenticated' only. I am monitoring to see if this actually fixes the issue.

Berdir’s picture

No, it doesn't "fix" it. Those blocks result in a lot of useless cache entries, but the entries are practically empty and hardly take up any space. Any page that has those small blocks also has internal/dynamic page cache entries, which are a lot bigger. It helps a little bit, and we're working on a few other things that each also help, but only so much.

The only really reliable solution is using a cache backend with a fixed size, like Redis or Memcache. A max-age limit as proposed here would help at least small to medium-sized sites not grow too large over time. And while this is not yet fixed, I recommend everyone who can't use Redis/Memcache to use https://www.drupal.org/project/slushi_cache.

What I still don't understand is why we can't just move forward with one simple setting as initially proposed; that's not perfect, but it's simple and fairly reliable. I think that sites that grow so quickly that this isn't enough really shouldn't use the database backend anyway. I can work on updating the patch for that if we can agree on going forward with it.

catch’s picture

I'm OK with this going in without jitter, we can just move that to a follow-up and it's only an implementation detail in the database cache backend.

Wim Leers’s picture

The only really reliable solution is using a cache backend with a fixed size, like Redis or Memcache.

This. I already wrote this in #8.3, almost two years ago.

Caching with unbounded storage makes no sense. In practice, every cache has a limited size. A database isn't designed to store caches, but if it is, then we should limit the number of items we store.

What I still don't understand is why we can't just move forward with one simple setting as initially proposed; that's not perfect, but it's simple and fairly reliable. I think that sites that grow so quickly that this isn't enough really shouldn't use the database backend anyway. I can work on updating the patch for that if we can agree on going forward with it.

+1. If you update the patch, I promise swift reviews. Let's finally get this done. This is biting many D8 sites.

lukas.fischer’s picture

+1 for #163

hkirsman’s picture

+1. Just discovered my fairly small site has a database of 15 GB when it should really be just 50 MB.

xpersonas’s picture

Agreed. +1 for #163

phanosd’s picture

Amazing that this issue has actually been running for so many years.

+1. Just discovered that my brand new, flashy, high-traffic-optimized website of 42.9 MB on Friday night grew to over 1 GB over the weekend. Thanks, cache_render.

Do we actually have a practical, long-term way to solve this?

mariagwyn’s picture

Can someone clarify which of these patches may be workable at this point and what needs to be fixed to get a working solution? Happy to apply a patch and test, but the existing patches are old.

Wim Leers’s picture

Assigned: Unassigned » Wim Leers

No movement in a month :( I promised swift reviews in #163 of a reroll. So I'm going to turn it around: I'll do a reroll, hopefully swift reviews will follow!

Wim Leers’s picture

To do what I said, the "reroll of the simple approach", I had to first find the simple approach. I went back all the way to the beginning.

  • The patch that I originally RTBC'd (on August 12, 2015) in #66 seems to be close to what we want. It was un-RTBC'd because there was discussion about A) it being specific to the database cache backend, and B) making it a per-bin setting.
  • To address A, @catch rolled #79, as a PoC. He made it an actual working patch in #82 and #87.
  • Then @Fabianx started talking in #81 about "bounded DB backend", jitter and pseudo-LRU.
  • @catch added jitter support in #104.
  • @catch continued to refine the patch in subsequent comments, but it remained fundamentally the same. It was RTBC'd in #118 by @Fabianx.
  • @olli spotted a critical flaw that all of us missed, in #122.2. All prior patches fail for cases where code is using the cache factory directly!
  • Then in #129 @catch comes to the realization that he doesn't like this approach/patch. He thinks postponing this on #2699613: Set a shorter TTL for 404 responses in page_cache module and #2699627: url.path cache context for breadcrumbs is unnecessarily granular would be good,
    because those would already solve the problem for many sites.
  • Since then, both #2699613 and #2699627 landed, and the problems still exist. We see this on Acquia Cloud. And @Berdir also said it in #131.
  • @Berdir in #137 and again in #161 asks why we can't solve this with a simple setting. But of course, @olli spotted that critical flaw. So I started looking at how slushi_cache solves this. Turns out it solves this in a very simple way: by accepting a single "permanent TTL" setting instead of one per bin (Slushi Cache calls this the "melt time" 👏), it avoids the problem that @olli pointed out. And it is a simple decorator of the existing code. We could literally adopt Slushi cache as-is IMO. Or we could merge its code into DatabaseBackend.
    It does NOT meet two requirements that @catch posed though: A) per-bin configurability, B) being generic, instead of specific to the DatabaseBackend. But that's the thing, this problem is pretty much specific to the DatabaseBackend!
  • However, #152 and #153 show that they don't fix the entire problem, though they do reduce its impact. The problem is that we don't know whether that will be enough.

So, if we want "the simple approach", I think we basically want to move Slushi Cache into core? 2526150-slushi_cache-170.patch does this, with two added features: per-bin overrides (but in a way that won't break as @olli pointed out for earlier patches) + allowing you to specify 0 or -1 to opt out (that was a flaw in Slushi Cache).
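For illustration, usage in settings.php might look like this (the setting names here are assumptions for the sketch, not the patch's literal API):

// Global "melt time": permanent items expire after a day at most.
$settings['database_cache_max_lifetime'] = 86400;
// Per-bin override; 0 or -1 opts a bin out of the TTL override entirely.
$settings['database_cache_max_lifetime_bins']['render'] = 21600;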


However … it looks like @Berdir and I are thinking something similar: the root cause is much simpler. @Berdir in #161:

The only really reliable solution is using a cache backend with a fxed size like redis or memcache. A max age size as proposed here would help at least small to medium sized sites to not grow too large over time.

I in #163:

Caching with unbounded storage makes no sense. In practice, every cache has a limited size. A database isn't designed to store caches, but if it is, then we should limit the number of items we store.

So here's a provocative thought… what if we do limit the amount of space that database tables can consume? LRU requires bookkeeping, so that's off the table. If you want LRU, use Redis or Memcached. But we can easily achieve fixed-size DB cache bins:

  1. Add a serial field to DatabaseBackend::schemaDefinition()
  2. Do something like DELETE FROM cache_data WHERE fixed_size_id <= ((SELECT MAX(fixed_size_id) FROM cache_data) - 1000)

So here's a patch implementing that. Including an upgrade path even. 2526150-bounded_database_cache_bins__count-170.patch
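For illustration, a minimal sketch of the pruning as it could look in DatabaseBackend::garbageCollection() (the 'fixed_size_id' column and 1000-row budget match the pseudo-query above; fetching MAX() separately avoids MySQL's restriction on a DELETE subquerying its own target table):

public function garbageCollection() {
  try {
    // The serial column increases monotonically, so it is a cheap proxy
    // for insertion order: keep only the most recent 1000 rows.
    $max_id = $this->connection->query('SELECT MAX(fixed_size_id) FROM {' . $this->bin . '}')->fetchField();
    if ($max_id > 1000) {
      $this->connection->delete($this->bin)
        ->condition('fixed_size_id', $max_id - 1000, '<=')
        ->execute();
    }
  }
  catch (\Exception $e) {
    // If the bin's table does not exist yet, there is nothing to prune.
  }
}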

But then I wondered whether we could do even better: what if we could actually assign size limitations not by count but by space?
Turns out that MySQL at least allows this: https://stackoverflow.com/a/18238810 — you can query information_schema. And then if we construct the right query, we can get the number of megabytes consumed by a given table: https://stackoverflow.com/a/9620273. What remains then, of course, is removing enough rows to make it fit within that size again. We can't know the size of an individual record, so we'd have to assume they're all sized the same, and then remove the percentage of rows that would make the table fit within the budget again. Before you get too excited: this would then be MySQL-specific, which would be … sad. Anyway, a very very rough PoC for this is attached.
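Very roughly, and MySQL-only, the size check could look like this (table name hard-coded and prefixes ignored for brevity; the 100 MB budget is an arbitrary illustration):

$size_mb = $this->connection->query("
  SELECT (data_length + index_length) / 1024 / 1024
  FROM information_schema.TABLES
  WHERE table_schema = DATABASE() AND table_name = 'cache_render'
")->fetchField();
if ($size_mb > 100) {
  // Assume all records are roughly the same size and delete the oldest
  // fraction of rows that brings the table back under budget.
}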

Wim Leers’s picture

So, #170 contains 3 approaches, all fairly simple:

  1. slushi_cache: moves https://www.drupal.org/project/slushi_cache into core, with minimal adjustments — this is probably the "simple" approach that was being asked for in #137 and #161
  2. count: makes the DB cache bins bounded (fixed size), by row count
  3. size: makes the DB cache bins bounded (fixed size), by disk space

Options 1 and 2 are database engine-agnostic. Option 3 is database engine-specific, and therefore an unlikely choice.

Option 1 does not protect against surges in activity, e.g. bots crawling your site or malicious users attacking your site; option 2 does.

Wim Leers’s picture

And finally, a provocative related question: #2888838: Remove render caching?.

Wim Leers’s picture

Priority: Major » Critical

And I just realized this was made major instead of critical in #130 because it was thought that two other issues would help (see #129), but they did not. Therefore the original criticality still applies. Back to critical.

Wim Leers’s picture

Title: Make CACHE_PERMANENT configurable in the database cache — currently allows unlimited growth: cache DB tables of gigabytes! » Database cache bins allow unlimited growth: cache DB tables of gigabytes!

The current title implies one particular solution. As of #170, there is a very different option. So, retitling for clarity & simplicity.

The last submitted patch, 170: 2526150-slushi_cache-170.patch, failed testing. View results

catch’s picture

So here's a provocative thought… what if we do limit the amount of space that database tables can consume? LRU requires bookkeeping, so that's off the table. If you want LRU, use Redis or Memcached. But we can easily achieve fixed-size DB cache bins:

I was thinking about this issue this morning, and had a similar but different thought:

- have a per-bin settings-configurable number of rows
- when the size of the bin exceeds the number of rows, find the median created timestamp per-bin and delete all items created before that timestamp (i.e. about 50% of the table).
- do this on cron so there's no cost with cache sets.

This will clear some space without resulting in a complete cache clear; tables will be able to go over the limit for up to three hours.
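A minimal sketch of that cron-time pruning for a single bin (the 10,000-row limit is an assumed per-bin setting; cache_render is just an example):

$count = $this->connection->query('SELECT COUNT(*) FROM {cache_render}')->fetchField();
if ($count > 10000) {
  // The median 'created' timestamp is the value halfway through the
  // table when ordered by creation time.
  $median = $this->connection->queryRange('SELECT created FROM {cache_render} ORDER BY created', (int) ($count / 2), 1)->fetchField();
  // Delete everything created before the median, i.e. about 50% of the rows.
  $this->connection->delete('cache_render')
    ->condition('created', $median, '<')
    ->execute();
}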

My concern with both that approach and the one suggested by Wim is that it could lead to thrashing, but again memcache is the real answer to that.

Haven't reviewed the patches yet.

effulgentsia’s picture

For what it's worth, I like #176 a lot!

My concern with both that approach and the one suggested by Wim is that it could lead to thrashing

Really? What's a scenario in which #176 would result in thrashing? It's true that frequently used items could get purged shortly after their last use, but never shortly after their first use.

Wim Leers’s picture

#176 is a more advanced (and complicated) version of 2526150-bounded_database_cache_bins__count-170.patch. Why do we want to do stuff with median timestamp etc? Why not just look at the row count versus allowed row count, and nothing else?

I'm probably missing something :)

tedfordgif’s picture

@effulgentsia, I think @catch is saying that large sets of cache items could get cleared at once, and then regenerated at once (see jitter). It would probably depend on the cron interval, content, and access patterns.

The median timestamp approach in #176 appears intended to better preserve recently-created cache entries, but could still allow a large cache table size, and doesn't really approximate LRU. For example, if you get traffic that maxes the cache out during each cron interval, you cycle between ~50% and 100% of the peak cache size, assuming evenly-distributed timestamps. The cache hit ratio and % cleared might be unpredictable if the requests during a cron interval are front- or back-loaded. If you're getting the kind of traffic that makes these concerns valid, though, maybe you're already using another cache backend.

The simple row-count approach provides a more predictable upper bound on DB size, which is the main issue here.

@catch, Wim's row-count approach only runs on cron, as you suggested.

catch’s picture

Really? What's a scenario in which #176 would result in thrashing? It's true that frequently used items could get purged shortly after their last use, but never shortly after their first use.

A bad but relatively common scenario I have in my head is something like this:

Cron runs every ten minutes.

There are 2 requests per second to unique pages from a crawler.

Mean of 20 render cache items per page.

This means:

1,200 unique pages visited in ten minutes.

24,000 cache items created in ten minutes.

Cache item limit of 10,000.

Whether you go down to 10,000 or 5,000 items, you're going to be deleting tens of thousands of rows per hour then recreating them again.

If we take a Drupal.org node with 300 comments by, say, 50 unique users, there could easily be 400-500 individual render cache items per page on a cold render cache, so you could hit a high row limit from a much smaller number of pages.

@tedfordgif: thanks for answering Wim's #178; that's exactly the design and its drawbacks. This wasn't a competitor to Wim's suggestions; it was just what I was mulling over this morning, and I haven't really thought it through beyond that (hence why I didn't post anything).