Problem/Motivation
In #2926309: Random fail due to APCu not being able to allocate memory, we found out that we had disabled apcu_shared_prefix as a stop-gap for random test failures, fixed the cause of those failures by disabling garbage collection, and then left apcu_shared_prefix disabled, which resulted in random test failures again.
Re-enabling it brings the number of APCu entries on the test infrastructure down from 5M to 1M, and to 300k when using APCu file backend.
Opening this follow-up to discuss the issue further, since 1M or 300k is still a lot, and traditional multi-site configurations, or sites with a /lot/ of config entities and translations, could still run into some version of it.
Comments
Comment #2
andypost: Also, it may affect shared hosting.
Maybe better to update the test runners to manage prefixes?
Comment #3
alexpott: Just to note that purging the cache in test teardown will require an extra request to the web server. The teardown method runs via the CLI and has no access to the APCu memory used by the web server.
Comment #4
mpdonadio: Can it be in the __destruct() for the test base? Or the backend class?
Comment #5
alexpott: @mpdonadio as per #3, that's not going to work. We need to trigger something on the web server. Doing something in the PHP CLI process is not going to work, and it can't happen on every request.
Comment #6
catch: Yes, we'd need a route that checks we're in a test (possibly one that is only added conditionally when inside a test?) and flushes the APCu prefix when requested.
Another thought:
The shared APCu prefix makes loads of sense for FileCache, but does it make any sense at all for config? I can't see how it ever could with UUIDs. Should we look at making this more granular, so that only shareable items get shared, and then when we flush a prefix only install-specific items get flushed?
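To make the granularity idea concrete, here is a minimal, language-neutral sketch in Python. It is a model only, not Drupal or APCu code: the bin names, the `SHAREABLE_BINS` set, the prefix strings, and both helper functions are hypothetical, and a plain dict stands in for the web server's APCu memory. It assumes that only FileCache-style data is identical across installs.

```python
# Hypothetical model of the suggestion above; none of these names are Drupal APIs.

# Assumption: only FileCache-style data is identical across installs.
SHAREABLE_BINS = {"file_cache"}


def cache_key(bin_name: str, cid: str, shared_prefix: str, install_prefix: str) -> str:
    """Build a key that uses the shared prefix only for shareable bins."""
    prefix = shared_prefix if bin_name in SHAREABLE_BINS else install_prefix
    return f"{prefix}:{bin_name}:{cid}"


def flush_install(store: dict, install_prefix: str) -> int:
    """Delete every entry owned by one install; return how many were removed."""
    marker = install_prefix + ":"  # trailing colon avoids matching "test_a2" for "test_a"
    doomed = [key for key in store if key.startswith(marker)]
    for key in doomed:
        del store[key]
    return len(doomed)


def demo() -> dict:
    """Two installs write the same items, then one install is flushed."""
    store = {}  # stands in for the web server's APCu memory
    for install in ("test_a", "test_b"):
        store[cache_key("file_cache", "core/lib/Foo.php", "shared", install)] = "parsed file"
        store[cache_key("config", "system.site", "shared", install)] = f"config of {install}"
    flush_install(store, "test_a")
    return store
```

In this model both installs reuse a single FileCache entry, while each config entry lives under its own install prefix, so flushing one install leaves the shared entry and the other install's data untouched. Whether config can safely be moved out of the shared prefix in Drupal itself is exactly the open question here.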
Comment #7
Anonymous (not verified): We did a bit of research (#148 - #172). TL;DR:
APCUIterator led to a hangup (#150), and drupal_flush_all_caches() did not help either (#167).
apcu_store() with ttl (#171). Max number of entries: 68k (ttl 10s), 120k (ttl 200s).
Comment #10
wim leers: @catch FYI: #2984232-10: APCu class loading does not automatically distinguish between sites in a multisite that has per-site code.
Comment #21
catch: This is less of an issue now that we're on GitLab, because we split tests between multiple test runners, compared to when all the tests ran on one big runner. However, there could still be something going on here. Downgrading to normal though.