Problem/Motivation

Views data has one cache item covering all database tables plus one per table, both built from views hooks. This makes it a good candidate for the cache_discovery bin because the data set is always going to be finite.

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

Introduced terminology

API changes

Data model changes

Release notes snippet

Issue fork drupal-3587797


Comments

catch created an issue. See original summary.

catch’s picture

Status: Active » Needs review

smustgrave’s picture

Status: Needs review » Needs work

Started a new pipeline to run tests, and the JavaScript tests seem to be consistently failing.

yujiman85’s picture

Assigned: Unassigned » yujiman85
yujiman85’s picture

Assigned: yujiman85 » Unassigned
Status: Needs work » Needs review
smustgrave’s picture

Status: Needs review » Reviewed & tested by the community

Test updates seem good

alexpott’s picture

Version: main » 11.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

Committed d4014a7 and pushed to main. Thanks!

Needs a new MR for 11.x

  • alexpott committed d4014a7d on main
    task: #3587797 Move views data cache to cache_discovery
    
    By: catch
    By:...
berdir’s picture

It is finite, but can be very large. I'm concerned this will fill apcu on large sites and cause a regression due to more frequent resets

catch’s picture

I think that's another reason to do #3586418: Set explicit ttl in the apcu backend and allow a default to be set for chained fast but not necessarily to avoid making this change.

berdir’s picture

The problem with apcu is that when it is full, it just resets (expunge counter). So I'm not sure how much the ttl will help.

We saw that on a few larger sites when we added the \Drupal\monitoring\Plugin\monitoring\SensorPlugin\ApcuSharedMemoryExpungesSensorPlugin sensor: APCu just continuously emptied the whole cache and started over from the beginning, which considerably hurts performance. It stopped, or now happens very rarely, once we adjusted the maximum size to fit the project.

The core status check does warn at 75%/90% full, but it doesn't check expunges; if you're over 90%, it's quite likely that you're getting frequent expunges, and you might not actually see it reach 90%. Blackfire APM reports on this as well. It's not trivial to monitor for this, as you need to account for how long it's been running. I think APCu complains if you have more than about one expunge per hour.
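As a minimal sketch of the expunge monitoring described above (assuming the APCu extension is loaded, and the "one per hour" threshold is taken from the comment, not from any official guidance):

```php
<?php

// Read APCu's expunge counter and normalize it by uptime. Frequent
// expunges mean the whole shared memory segment is being wiped and
// refilled, which is the regression scenario discussed above.
$info = apcu_cache_info(TRUE);
$uptime_hours = max(1, (time() - $info['start_time']) / 3600);
$expunges_per_hour = $info['expunges'] / $uptime_hours;

if ($expunges_per_hour > 1) {
  // More than roughly one full-cache reset per hour suggests
  // apc.shm_size is too small for this site's working set.
  printf("APCu expunges/hour: %.2f - consider raising apc.shm_size\n",
    $expunges_per_hour);
}
```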

The full bin should be used rarely to never at runtime. What if we just add the smaller per-table entries to apcu but not the big one?

catch’s picture

The per-table entries idea is a good one. We'll need to inject two cache bins, but we can do that.
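A rough sketch of what injecting two bins could look like (this is illustrative, not the committed patch; the class and bin choices are assumptions):

```php
<?php

use Drupal\Core\Cache\CacheBackendInterface;

// Hypothetical ViewsData-style service: the large "all tables" entry
// stays in a database-only bin, while the small per-table entries go
// to a ChainedFast (APCu-backed) bin as proposed in #12.
class ViewsDataSketch {

  public function __construct(
    // E.g. the "default" bin: database only, holds the big entry.
    protected CacheBackendInterface $persistentCache,
    // E.g. a ChainedFast bin such as "discovery": APCu + database,
    // holds only the small per-table entries.
    protected CacheBackendInterface $fastCache,
  ) {}

  public function getAll(): array {
    // The full array is large, so it never touches APCu.
    if ($cached = $this->persistentCache->get('views_data')) {
      return $cached->data;
    }
    $data = $this->buildAll();
    $this->persistentCache->set('views_data', $data);
    return $data;
  }

  public function get(string $table): array {
    // Per-table entries are small and hot, so they can live in APCu.
    $cid = 'views_data:' . $table;
    if ($cached = $this->fastCache->get($cid)) {
      return $cached->data;
    }
    $data = $this->getAll()[$table] ?? [];
    $this->fastCache->set($cid, $data);
    return $data;
  }

  protected function buildAll(): array {
    // Would invoke hook_views_data() / hook_views_data_alter() here.
    return [];
  }

}
```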

  • alexpott committed dcb51286 on main
    Revert "task: #3587797 Move views data cache to cache_discovery"
    
    This...
alexpott’s picture

Version: 11.x-dev » main
Status: Patch (to be ported) » Needs work

Discussed with @catch - #12 is a great idea. Thanks @berdir

berdir’s picture

(cross post)

I put together a pretty basic script. At first the numbers were surprisingly low, then I remembered that I'm using igbinary, which is significantly better on this kind of data structure because it's heavily repetitive.

The script only works on Redis and currently assumes that igbinary is used:

<?php

use Drupal\Core\StringTranslation\ByteSizeMarkup;
use Drupal\redis\RedisPrefixTrait;

class PrefixHelper {
  use RedisPrefixTrait;

  public function getPrefixHelper() {
    return $this->getPrefix();
  }
}

$client = \Drupal::service('redis.factory')->getClient();

$prefix = (new PrefixHelper())->getPrefixHelper();

$keys = $client->keys($prefix . ':default:views_data*');

$full_entries = [];
$table_entries = [];
foreach ($keys as $key) {
  $hash = $client->hgetall($key);
  $uncompressed = !empty($hash['gz']) ? gzuncompress($hash['data']) : $hash['data'];

  $unserialized = \Drupal::service('serialization.igbinary')->decode($uncompressed);
  $php_serialize = serialize($unserialized);

  if (substr_count($hash['cid'], ':') === 1) {
    $full_entries[] = [
      'raw' => strlen($hash['data']),
      'uncompressed' => strlen($uncompressed),
      'serialized' => strlen($php_serialize),
    ];
  }
  else {
    $table_entries[] = [
      'raw' => strlen($hash['data']),
      'uncompressed' => strlen($uncompressed),
      'serialized' => strlen($php_serialize),
    ];
  }
}

echo "full entries:" . count($full_entries) . "\n";
echo "full entries size raw: " . ByteSizeMarkup::create(array_sum(array_column($full_entries, 'raw'))) . "\n";
echo "full entries size igbinary: " . ByteSizeMarkup::create(array_sum(array_column($full_entries, 'uncompressed'))) . "\n";
echo "full entries size php serialize: " . ByteSizeMarkup::create(array_sum(array_column($full_entries, 'serialized'))) . "\n";
echo "table entries:" . count($table_entries) . "\n";
echo "table entries size raw: " . ByteSizeMarkup::create(array_sum(array_column($table_entries, 'raw'))) . "\n";
echo "table entries size igbinary: " . ByteSizeMarkup::create(array_sum(array_column($table_entries, 'uncompressed'))) . "\n";
echo "table entries size php serialize: " . ByteSizeMarkup::create(array_sum(array_column($table_entries, 'serialized'))) . "\n";

The output on a site with 4 languages:

full entries:4
full entries size raw: 495.49 KB
full entries size igbinary: 2.04 MB
full entries size php serialize: 13.76 MB
table entries:41
table entries size raw: 59.82 KB
table entries size igbinary: 146.53 KB
table entries size php serialize: 582.42 KB

(disclaimer: when I tested, one language was actually not in the cache, so it reported only 3 full entries; I manually visited it to fill that in)

With just 4 languages, on default PHP serialize, the 4 views_data cache entries use up almost 14 MB, and there are only ~10 table entries per language. (A second site had 145 entries and 1.6 MB in Redis, so I assume Redis had been evicting some infrequently used caches.)

The table entries make up only about 1/20th of the total here; it's more on the other site, but still a relatively small and manageable portion.

There are sites with 40 or 100 languages, so you can multiply that accordingly.

I think we should only store the table entries in bootstrap.

catch’s picture

Replied to #12 in the issue summary of #3586418: Set explicit ttl in the apcu backend and allow a default to be set for chained fast - short version: as long as the docs are right, that issue should help.

But with or without that issue, only storing the tables in chained fast is a great idea.


table entries size igbinary: 146.53 KB
full entries size php serialize: 582.42 KB
Should this last line say table?

And is it smaller because most of the table entries are never used during runtime?

Discussion reminded me of #3454277: Allow fields to be opted out of views data.

berdir’s picture

Should this last line say table?

Yes, it should. You can also see the typo in the PHP script; I'll edit that.

And is it smaller because most of the table entries are never used during runtime?

Yes, exactly. Due to the generic integration, views_data contains definitions for whole entity types that don't have any views. Paragraphs would be a typical example: you have lots of bundles and fields, but most sites don't use any views on paragraphs (er, paragraphs views integration is still completely broken). Opt-out is tricky, as whether or not they are used typically depends a lot on the specific site; there _are_ sites with paragraph views.

It might be more feasible to switch to an approach where we build the views data per entity type and remove the global cache entry completely, but BC around that sounds very challenging. Entity types that aren't used would then never be built, at least not at runtime (we'd still need to know which entity types you can add views for, e.g. for the add-view selector).

catch’s picture

Building by entity type is hard because views data is a giant array of tables and random other non-table things that doesn't inherently know about entity types.

In #3380145: ViewsData should not cache by language I was trying to figure out a way not to cache by language, which would dramatically reduce the total cache size if not the individual entries, but I got a bit stuck there.

I think it would be reasonable for views data to explicitly build in the concept of entities - instead of converting to and from database tables and back again, which would also help with #3395848: Entity Query views backend, but not sure where to even start with what that might look like.

One vague idea I had was adding hook_views_data_table() as a new hook, and then allowing the main views data entry to contain only a stub for the table. When it's requested, we'd then call hook_views_data_table($table_name) to get the data for that table. We'd still need to support hook_views_data_alter(), but that could be passed an array with one table in it (and alter implementations would need to check for the existence of array keys, which they really ought to do anyway).

hook_views_data_table() then would need an entity/field implementation that can identify the entity type and field name from the table name, either via reverse-engineering it or if we added minimal metadata to the main entry which we then expand to the full definition.
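A rough sketch of the hypothetical hook described above (nothing here is an agreed API; the hook name, the module name, and the consuming flow are all placeholders from the idea in #18):

```php
<?php

/**
 * Implements the hypothetical hook_views_data_table().
 *
 * Would be called only when views data for $table_name is actually
 * requested, expanding a stub left in the main views data entry.
 */
function mymodule_views_data_table(string $table_name): array {
  if ($table_name !== 'mymodule_example') {
    return [];
  }
  return [
    'mymodule_example' => [
      'table' => [
        'group' => t('Example'),
        'base' => ['field' => 'id', 'title' => t('Example')],
      ],
      'id' => [
        'title' => t('ID'),
        'field' => ['id' => 'numeric'],
      ],
    ],
  ];
}

// On the consuming side, ViewsData::get($table) would then roughly:
// 1. Look up $table in the main entry; if it's a stub, invoke
//    hook_views_data_table($table) across modules.
// 2. Pass the resulting one-table array through hook_views_data_alter(),
//    so alter implementations must check that array keys exist before
//    touching them.
```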