Problem
On every web request, DrupalKernel::getCachedContainerDefinition() loads the compiled container definition from the database via the cache.container service.
Loading the container is pretty slow: it not only involves fetching 500KB+ of serialized data, but also deserializing it into PHP arrays.
This database query happens during early bootstrap, before the main service container is available, before ChainedFastBackend is initialized, and before APCu-backed cache bins are accessible.
This means the container loading cannot benefit from APCu even though the container definition is ideal for APCu caching (large, rarely changes, read on every request).
The cache.container service was deliberately excluded from ChainedFastBackend in #2248767: Use fast, local cache back-end (APCu, if available) for low-write caches (bootstrap, discovery, and config) because of the chicken-and-egg problem: APCu cache services aren't available before the container that defines them is loaded.
Note: sites using Redis (#3007374: Document bootstrap_container_definition in README) or Memcache (#2882755: Move container to memcache) can override bootstrap_container_definition in settings.php to use those backends, but this still requires a network round-trip to the cache server on every request.
Solution
So I did an experiment (the serious kind) and added an "APCu fast path" in getCachedContainerDefinition() that calls the raw APCu functions directly. This bypasses Drupal's cache services, which I think is okay because those services aren't available this early in bootstrap anyway.
I’m getting back up to speed with the caching layer, so to be on the safe side, I modeled the design after ChainedFastBackend.
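To make the read path concrete, here is a minimal sketch of a ChainedFastBackend-style lookup. It is not the code from the issue fork: SketchStore, ArrayStore, sketch_load_container_definition() and the ":last_write" key suffix are hypothetical names invented for illustration, and the array-backed stores stand in for APCu (fast) and the database (persistent) so the example runs without the APCu extension.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical minimal key/value store, used only for this sketch.
 */
interface SketchStore {
  public function get(string $key): mixed;
  public function set(string $key, mixed $value): void;
}

/**
 * Array-backed stand-in for both APCu and the database cache bin.
 */
final class ArrayStore implements SketchStore {
  private array $items = [];

  public function get(string $key): mixed {
    return $this->items[$key] ?? NULL;
  }

  public function set(string $key, mixed $value): void {
    $this->items[$key] = $value;
  }
}

/**
 * Loads the container definition, preferring the fast store when it is fresh.
 *
 * Assumption: a fast entry is trusted only if the persistent store still has
 * a last-write timestamp and the entry is not older than that timestamp.
 */
function sketch_load_container_definition(SketchStore $fast, SketchStore $persistent, string $key): ?array {
  $last_write = $persistent->get($key . ':last_write');

  if ($last_write !== NULL) {
    $item = $fast->get($key);
    // Reject corrupted entries and entries older than the last write.
    if (is_array($item) && isset($item['created'], $item['data'])
      && is_array($item['data']) && $item['created'] >= $last_write) {
      return $item['data'];
    }
  }

  // Fast-path miss: fall back to the persistent store.
  $definition = $persistent->get($key);
  if (!is_array($definition)) {
    return NULL;
  }

  // Warm the fast store only when a timestamp exists to validate against.
  if ($last_write !== NULL) {
    $fast->set($key, ['created' => microtime(TRUE), 'data' => $definition]);
  }
  return $definition;
}
```

The persistent store is still consulted for the timestamp on every call, so the saving in this sketch comes from not fetching and unserializing the large definition, not from avoiding the round trip altogether.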
Benchmarks
HTTP TTFB improvement (Umami, 200 requests per scenario, APCu-only change):
Anonymous pages:
| Page | Baseline p50 | APCu p50 | Baseline p95 | APCu p95 | Baseline mean | APCu mean |
|---|---|---|---|---|---|---|
| Front | 23.0ms | 20.2ms (-12%) | 38.6ms | 26.3ms (-32%) | 25.5ms | 20.8ms (-18%) |
| Article | 27.9ms | 22.4ms (-20%) | 41.4ms | 35.1ms (-15%) | 29.2ms | 24.1ms (-17%) |
| Recipes | 21.7ms | 19.9ms (-8%) | 32.4ms | 25.3ms (-22%) | 23.1ms | 20.4ms (-12%) |
Authenticated pages (admin user):
| Page | Baseline p50 | APCu p50 | Baseline p95 | APCu p95 | Baseline mean | APCu mean |
|---|---|---|---|---|---|---|
| Front | 38.9ms | 34.3ms (-12%) | 50.7ms | 40.7ms (-20%) | 40.4ms | 35.1ms (-13%) |
| Article | 37.7ms | 34.8ms (-8%) | 51.6ms | 43.3ms (-16%) | 39.5ms | 35.8ms (-9%) |
| Recipes | 38.8ms | 34.6ms (-11%) | 48.7ms | 41.6ms (-15%) | 40.5ms | 35.7ms (-12%) |
These are HTTP TTFB measurements through DDEV's proxy (TLS + nginx + FPM) on my localhost. I'm seeing a consistent p50 improvement across all scenarios: 8-12% in five of the six, and 20% for the anonymous Article page. The p95 gains are larger still (15-32%). It feels important enough to evaluate more closely.
Disclaimer: I used Claude Code throughout, though much of the work was mine and I reviewed everything carefully.
Some extra details
- Test coverage also follows the patterns from ChainedFastBackendTest, but I added additional tests for various edge cases: corrupted APCu entries, APCu-unavailable fallback, cold start, and explicit invalidation. ChainedFastBackend's unit tests cover 5 scenarios; ours cover 11. We could consider porting some of my new tests to ChainedFastBackend, but I'm not sure that is needed.
- Added an in-memory APCu emulator in DrupalKernelApcuTestKernel to validate the flow without a real APCu extension (CLI lacks APCu in this environment). A second test kernel (DrupalKernelNoApcuTestKernel) overrides getContainerApcuKey() to return NULL, simulating environments without APCu.
Edge cases
I tried to cover all the following edge cases:
| Scenario | Behavior |
|---|---|
| CLI (Drush, tests) | APCu skipped (apcu_enabled() returns FALSE). Falls to DB. |
| Installation | InstallerKernel uses allow_dumping=FALSE. No caching. |
| Update.php | UpdateKernel::cacheDrupalContainer() returns FALSE. No caching. |
| Container rebuild mid-request | APCu cleared in invalidateContainer(). DB fallback. |
| APCu memory full | apcu_store() returns FALSE silently. DB fallback. |
| Multisite | Site path in APCu key prevents collisions. |
| Multi-server deploy | Cache key includes VERSIONS_HASH. Key changes on deploy. |
| Multi-server drush cr | Timestamp in DB is removed by deleteAll(). Other servers detect missing timestamp and skip APCu on next request. |
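The key-related rows of the table (CLI, Multisite, Multi-server deploy, Container rebuild) can be illustrated with a small sketch. The function names, the "drupal.container." prefix and the hashing scheme below are assumptions made for illustration, not the format used in the issue fork. Only apcu_enabled() and apcu_delete() are real APCu functions.

```php
<?php

declare(strict_types=1);

/**
 * Reports whether APCu is loaded and enabled for the current SAPI.
 */
function sketch_apcu_available(): bool {
  return function_exists('apcu_enabled') && apcu_enabled();
}

/**
 * Builds a per-site, per-deployment key, or NULL when APCu must be skipped.
 *
 * Hypothetical helper: the prefix and hash layout are assumptions.
 */
function sketch_container_apcu_key(string $site_path, string $versions_hash, bool $apcu_available): ?string {
  if (!$apcu_available) {
    // For example CLI, where the caller then falls back to the database.
    return NULL;
  }
  // The site path separates multisite installs; the versions hash changes
  // the key on deploy so stale entries are never read.
  return 'drupal.container.' . hash('sha256', $site_path . ':' . $versions_hash);
}

/**
 * Drops the local fast-path entry, for example during a container rebuild.
 */
function sketch_invalidate_container(?string $key): void {
  if ($key !== NULL && sketch_apcu_available()) {
    apcu_delete($key);
  }
}
```

Returning NULL instead of a key gives callers a single check for "skip APCu", which mirrors how the second test kernel described above simulates an environment without APCu.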
Related issues
- #2497243: Replace Symfony container with a Drupal one, stored in cache: Original decision to store container in DB cache (2015). Multi-webhead compatibility was the primary reason.
- #3488989: Replace Drupal container with a Symfony one, stored in opcache: Active proposal to return to Symfony's compiled PHP container.
- #2868446: Caching the Container: Proposal for per-webhead PHP file from DB definition.
- #2248767: Use fast, local cache back-end (APCu, if available) for low-write caches (bootstrap, discovery, and config): Established ChainedFast for bootstrap/config/discovery. Container deliberately excluded.
Issue fork drupal-3583040
Comments
Comment #3
dries commented
Tests are passing, so moving this to "Needs review".
Comment #4
dries commented
Addressed both points:
- testReturnsNullWhenBothEmpty, testApcuHitWithEqualTimestamp, and testWriteThenReadRoundTrip
- StubCacheBackend with MemoryBackend. Eliminates the need for CID tracking and the custom test backend, as @longwave suggested.
Less is more.
Comment #5
catch
Couple of things:
1. Because we need to check the persistent backend for the invalidation timestamp, we don't save a database round trip. It does mean a much smaller entry to get from the database so it's possibly worth it despite this.
A possible solution would be to use cache_bootstrap for the invalidation timestamp and cache this somewhere for re-use in the real cache_bootstrap chained fast service.
2. Would it be possible to use the chained fast backend wired into the bootstrap container instead of hardcoding apcu calls in the kernel?
Comment #6
berdir
If everything in ChainedFast is fully injected then I think it would work to set up the bootstrap container definition for that. Doing it with a different backend is tricky though since that isn't supported, it has to be the same bin.
I guess we assumed that we still need the persistent bin lookup and that's why it was skipped, but I was wondering about that before. redis also optimizes the persistent backend lookup to be a simpler direct check without per-bin deletion lookups, see: https://www.md-systems.ch/en/blog/2025-01-26/redis-startup-performance-i....
redis also supports https://relay.so/, that essentially includes its own ChainedFast-like implementation with shared invalidation, see https://relay.so/docs/1.x/introduction. The default/recommended configuration for that includes the container bin, so this is already possible there.
Comment #7
dries commented
Refactored based on feedback from @catch and @berdir. The result is both cleaner code and less code. That said, I'm not 100% sure I did it correctly. The new ChainedFastBackend version is a tiny bit slower (~1-2ms on my laptop) due to the extra indirection. The overall performance gain is still significant though.