Problem
In #1577902: [META] Remove all usages of drupal_static() & drupal_static_reset() there are lots of different ways of using in-memory caching, mostly via a protected property on the service class.
However, this raises issues such as how to flush the cache, which is mostly used in tests.
A common pattern emerging is to add a flushCaches() method on the interface. However,
- This is exposing the underlying implementation detail on the interface.
- Storing 'state' on stateless services should be discouraged.
Proposed Solution
One suggestion is to use the memory backend cache, and inject it into these services.
This removes the need to expose a flushCaches() method. We can also flush caches in tests, or inject a null cache backend.
| Comment | File | Size | Author |
|---|
Issue fork drupal-3047289
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
- 3047289-standardize-how-we
changes, plain diff MR !13158
Comments
Comment #2
kim.pepperComment #3
catchQuoting @claudiu.cristea from the other issue.
With the entity cache, there's also #1199866: Add an in-memory LRU cache so to me it made sense to keep that separate from anything else we might add - for the same reasons we have different persistent cache bins.
I think we could add another service that can be used more generically, rather than one per-static-cache type.
Comment #4
kim.pepper@catch so you are in favour of using an in-memory cache backend over using properties on the service class?
Comment #5
catch@kim.pepper for several thing definitely, all the time I'm not sure yet. For the entity static cache we really needed to be able to reset it (outside of a specific entity operation) in order to reclaim memory, and/or add the ability to swap it out with an LRU cache for long running processes. The BookManager cache given it's caching what is essentially entity data in the end seems like a good candidate.
There was also an issue where a plugin cache was making use of cache tags, but the cache tags would only be cleared for a memory cache implementation, not the protected property. I can't find that issue now but it'd be a good example to reference.
The other reason to do it in more places would be to concentrate state handling on services into fewer places rather than having it spread around or generally for consistently.
Question is though does this apply to everything that's a protected property for caching purposes and I'm not sure about that yet.
Comment #6
mxr576I bumped to this issue several times. If a property is being used for caching it is almost impossible to clear it "manually" within a page request if it is needed - unless you know that there is a method called resetCache() (or something like that) on the service. (There is no way to automatically discover and call _all_ these methods to flush all static caches.)
You may say and you are also right that a service itself should handle the invalidation of cached entries in its own static cache, but sometimes an external change requires from a static cache entry to be invalidated.
This is what the Drupal\Core\Cache\MemoryCache\MemoryCache tries to address by allowing tag based invalidation on its own internal cache property. We are levering this class on the Apigee Edge module very much, I even created a factory to create and retrieve context specific instances of memory caches.
Comment #7
kim.pepperOK, so the general guidelines I'm hearing are:
...use a Memory cache backend.
Else...
use properties on the service class.
Is that right?
Comment #8
catch@kim.pepper I think that's a good plan for now and we can keep reviewing if we find more issues.
Comment #9
claudiu.cristeaComment #10
catchIt's no worse than using a generic cache bin like cache_data, but yes needs to be prefixed with the module name or _CLASS_ or similar.
Comment #11
andypostI vote to avoid using class properties for cache (see forumManager as its sleep() magic) - I bet we have a lot of similar issues with serialization all over core.
Next point is about bringing more state to service will cause lot more issues towards using php-pm and similar runners
Comment #12
claudiu.cristeaIn this case I prefer the solution proposed by @mxr576 in #6. It's cleaner and it allows the code to ensure that it runs in its own space. Here's a proposed patch. Take it as a PoC, I will add tests only if there's a minimal agreement.
Comment #13
claudiu.cristeaOops!
Comment #15
claudiu.cristeaComment #16
catchI only meant that if we used a single service for multiple unrelated caches, then we'd need to use a prefix. The patch doesn't seem to do any more separation than the existing service definitions which will all instantiate their own class instance anyway?
Comment #17
claudiu.cristea@catch, that’s correct. Using a factory is just a way to standardize the multi-instance solution. I see it safer than relying on CIDs
Comment #18
kim.pepperMemory cache backend CR for reference https://www.drupal.org/node/2973262
Comment #19
claudiu.cristeaBased on a disscusion I had with @catch on Slack, there was a decision not to create a factory for memory cache services. @catch, could you, please, confirm?
Comment #20
claudiu.cristeaComment #21
catchCurrent entries for memory cache services look like this:
I don't really see how it could be made less complex than that, it's literally two lines.
The MemoryCache doesn't write to persistent storage, nor use statics, and because of this unlike database and similar caches it doesn't need a bin argument.
Comment #22
yogeshmpawarComment #23
yogeshmpawarResolved two coding standard issues & added an interdiff.
Comment #24
mxr576Tried to explain again to my colleague why we do not have a clear policy regarding usage of
$this->containerand\Drupal::service()in functional tests so I bumped into this issue again. The main problem with some Drupal services is that they are not stateless as they should be thanks for these "cache properties", as @andypost also highlighted:At this moment, I do not see a better way to resolve this issue than introducing this memory cache service - or maybe it should be called "non-persistent cache (NPC)" that must be used by services instead of cache properties that want to store something for the current page request only.
If somehow we could enforce the usage of this new service instead of cache properties... If all services would use this new NPC service as storage then we could replace the original implementation of this service in tests with some kind of distributed, non-persistent storage implementation. Why? Because in functional tests it regularly happens that everything is working as it should be, the Drupal instance that runs in the browser invalidates caches as it should, but (in-memory) cached data still available in the Drupal instance that belongs to the test (retrieved from
$this->container) and therefore a test fails. This is the reason why you have to call\Drupal::service()sometime.I checked and PHP does have a distributed, shared-memory extension that we could use for this purpose in tests (instead database that would be much slower, redis or something else): https://www.php.net/manual/en/shmop.examples-basic.php although it is not perfect yet: https://bugs.php.net/bug.php?id=65987.
Comment #27
hardik_patel_12 commentedRe-rolling against 9.1.x-dev.
Comment #33
bradjones1Per #21 I think the factory is no longer part of the solution, this should be instantiated directly?
And further, this issue is thus policy-only, as there is already a memory cache class and it doesn't need a factory.
Comment #34
geek-merlin#11 (don't make services stateful) is getting much more urgent these days.
Comment #36
jweowu commentedRegarding (1), I don't think the amount of data should ever make a difference to the accessibility of the data.
Regarding (2), we cannot predict whether or not contrib or custom code might have a reason to inspect or manipulate or reset any given cache, so we shouldn't try -- it should simply be possible, as standard, and in a standard way.
I am therefore against the use of cache properties and/or plain
staticvariables. A key benefit ofdrupal_static()was having a standard API which enabled other code to interact with these caches as necessary. Regressing to an approach which hides that data away is a step backwards, as is the removal of a standard approach to working with such caches generally.drupal_static()was simple, functional, and extremely standard across core and contrib code. I think that it should be replaced with something fitting that same description.Comment #37
Ratan Priya commentedRe-rolled patch #27 for 11.x-dev.
Comment #38
catch@jweowu yeah I don't think it's about the amount of data, but it should only be used for actual caches that will definitely get refreshed automatically.
Comment #39
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #40
donquixote commentedIf the main concern is the reset problem, we can keep the object properties.
All we need is a mechanism for the reset, that does not pollute the public interface.
The service with the runtime cache property subscribes to reset events, like this:
https://git.drupalcode.org/issue/drupal-3368812/-/commit/27b046020ac6f91...
(link might die)
There are different ways to implement this, which can coexist:
These cache properties are special, because they only need reset events after the object has been created.
E.g. symfony event dispatcher needs to "wake up" all the subscribed services. The runtime dispatcher only needs to message those services that have subscribed at runtime. This is why it is much easier to create and work with.
The reset dispatcher does not need to send any data, it only has to "ping" the subscriber, which already has the data source as a dependency.
-
Comment #41
kristiaanvandeneyndeFollowing this issue, but my 2c regarding the factory:
Sidenotes:
From the in-code documentation:
Comment #42
larowlanLooks like there's patches/MRs here so removing the [policy] prefix from the title.
Comment #43
donquixote commentedRe myself in #40:
We don't need to invent a new event mechanism. We can use the regular Drupal/symfony event dispatcher we use elsewhere, just use
->addListener()instead of the usual subscription mechanism.We only need to subscribe a service that is already instantiated.
Comment #44
chi commentedIf we were using the Symfony core, we would already have such a mechanism.
Anyway, I suppose it's not a big deal to enable this service pass in Drupal kernel.
\Symfony\Component\HttpKernel\DependencyInjection\ResettableServicePass
Comment #45
donquixote commentedWe want to avoid that a service gets instantiated only to reset its internal cache property.
I did a little experiment and it seems this works.
The ServicesResetter does skip lazy services.
The ResettableServicePass adds IGNORE_ON_UNINITIALIZED_REFERENCE to the service references which it adds to the IteratorArgument.
This seems to do the magic.
What I don't see is how we could use this to reset different services for different events.
It's all or nothing.
Comment #46
kristiaanvandeneyndeSmall disclaimer that #41 and other comments mentioning MemoryCache are potentially no longer up-to-date as we now have full support for using MemoryCache. See this CR: https://www.drupal.org/node/3409455
With that said, I don't see why we can't use MemoryCache for what used to be cached in properties. The CacheTagsInvalidator service collects all cache bins and services that tag themselves as invalidators. We only need to focus on the first part here.
Any cache bin collected is instantiated because we use service_collector rather than service_id_collector for the CacheTagsInvalidator. So if we would create a dedicated bin for every class that used to have a "property cache", we would indeed have to instantiate a bunch of services.
However, I would argue we could have a "default" cache for memory like we do for the DB and encourage everyone to use that cache for their simple property caching. Then there's no overhead whatsoever because there will always be something using said memory cache and therefore the service will already be instantiated.
Re #45
Would that not apply only to static properties being used as an internal cache? Most of these internal caches I've seen use regular properties, meaning the service is instantiated either way.
Having said all of that, the kernel.reset reset tag does look really cool. Thanks for bringing that to our attention.
Comment #47
donquixote commentedE.g. in drupal_flush_all_caches() we call `\Drupal::theme()->resetActiveTheme();`, which does get an instance of ThemeManager, only to set `$this->activeTheme = NULL;`.
But if at the time this is called, ThemeManager had not been instantiated yet, `$this->activeTheme` would already be NULL / uninitialized.
The same would happen if we use regular event subscription to call a ->reset() method to reset an object property.
I think most of the solutions proposed here avoid this problem in one or another way.
Which is good :)
There is overhead in reading and writing these cache items, which is always going to be more php operations than with a property cache.
All of it is just basic php operations. So it will only become relevant for caches where we read or write many small values.
Comment #48
kristiaanvandeneyndeNow imagine if ThemeManager had a CacheBackendInterface dependency instead. And let's also assume that this would be a "default" MemoryCache which is used by many services, not unlike the "default" DatabaseBackend.
The point is that it then no longer matters if ThemeManager is instantiated or not because, in order to flush its caches, we would invalidate a certain cache tag. That would go over all cache bins, which are already instantiated, and then clear the data. At no point would ThemeManager be instantiated by our actions. Nor any other service for that matter.
True, but that's a tradeoff we should be willing to make as the benefits of this approach far outweigh the drawbacks. We'd get the above plus cache tag, context and max-age support. It would also mean that all of the classes using this cache would immediately benefit from any optimizations we can realize in the caching layer itself.
And as you said:
Fully agree. We can still opt not to use a MemoryCache on a case by case basis in those edge cases where it really does make a difference.
Comment #49
catchYes this makes sense and should work great. We just need a convention/pattern for cache IDs.
We're talking one or two extra function calls when reading/writing items to the cache, a bit more if/when we have LRU support from #1199866: Add an in-memory LRU cache but that would only need to apply to unbounded stuff like entities, so the overhead should be minimal compared to everything else we do.
Comment #52
berdirWe have the factory now already and in #3035288: Deprecate theme_get_setting() I suggested adding a generic memory cache bin instead of more one-off ones. We decided to pull that back out into a separate issue, so lets try to do this here. If necessary, we can spin off a child issue, but I hope we can just pull it through here.
I also reviewed the current cases of MemoryCache services:
* entity.memory_cache is now an LRU cache
* cache.asset_memory is part of a memory chain, it's non public, so while technically you could inject it into something, we can possibly just remove it?
* node.view_all_nodes_memory_cache, kept this because the service stores non-namespaced things and does a deleteAll(). could use a cache tag and prefix the ids, but seems out of scope?
* system.module_admin_links_memory_cache, replaced and deprecated.
Comment #53
berdirReverted the node service. It's causing problems during install and kernel tests because the type hint there is on MemoryCacheInterface, but during kernel tests ,we replace the cache factory, so we only get MemoryBackend's out, always. would require non-trivial changes to deal with that.
Comment #54
nicxvan commentedThere is still a type error:
Comment #55
berdirI fixed that by changing the type there, or we end up not using this at all, which has it's own complications.
Unsure what's in scope for this and how to approach it exactly. I did think for a second about changing KernelTestBase, but that sounds like a dedicated issue that is likely not trivial and could also break stuff.
Comment #56
kristiaanvandeneyndeMR looks good to me, although we currently alias
for autowiring, but people might expect that to point to the new cache.memory from the MR.
Also @catch had a point with:
So far, in core, we have:
I know Group has cache.group_memberships_memory
So there's no clear naming pattern, but we should ideally have one to indicate the different between regular caches and memory caches. Keep in mind that memory caches aren't real caches because they return the same object over and over again rather than new ones on every get.
Comment #57
berdirThe MemoryCacheInterface is certainly weird and wrong. I think we should require that caches are explicitly specified with an attribute when using autowire. Changing it seems harder than just deprecating it. Lets do that and see if that causes any issues with autowired injections?
IMHO caches should be prefixed with cache, especially if they're meant to be an API. Not sure if we should touch entity.memory_cache and if it's worth standardizing that. Now that it's an Lru cache, I think there's very little to no reason to interact with that directly and it's not really an API. Caches can't be non-public though, or they are not shared between services.
You said not a "real" cache, I think that's not a very meaningful description. It is a real cache when used deliberately, and IMHO a memory cache behaving like that is only useful when it's meant to be a drop-replacement for a persistent cache, like we do in kernel tests for example. You shouldn't deliberately use cache.static for in-memory-caching. Looking at core, the only use seems to be PermissionsHashGenerator and we can just switch that over and get a tiny improvement out of that. Therefore, I'm considering to deprecate cache.static here. I think memory cache is a much better name than static cache, static cache as a concept feels like a leftover of drupal_static() and static properties in functions, which is something we very rarely do these days and want to remove completely.
I want to work on a change record and possibly update some api docs as well
Comment #58
kristiaanvandeneynde+1, we should probably not alias cache interfaces for autowiring as too many things can implement the same interface.
I agree, yet it's the distinction we had to make when we made MemoryCache more widely available and why they are tagged as cache.bin.memory rather than cache.bin. For all I care and for how we use them, they are caches and they replace static property caching perfectly. It's just that they don't fully behave as the other caches in core and, for that reason alone, we have to keep the distinction clear to avoid confusion and mistakes when people try to use them.
Tentative +1, could be a pain in the ass for tests.
Comment #59
catchIn kernel tests we're using the test-only memory cache (MemoryBackend) we added before any of this that does serialize()/unserialize(), this is because otherwise the behaviour of setting/getting an item from the cache is different to if it's in the database or apcu etc. which would not be good for kernel tests.
For the actual memory cache we don't want to serialize/unserialize, but this means it isn't a 1-1 drop in replacement for persistent caches - obviously you can swap them around if you realise something does/doesn't need persistent caching but it does behave differently, which is why we added MemoryCacheInterface.
#2973286: Clean up MemoryCache and MemoryBackend naming issues is open to clean all of that up.
cache.backend.memoryshould not be using MemoryBackend, that's only reference by cache.backend.memory - we could probably clean that up here - make cache.backend.memory.memory an alias and deprecate it, so we're not using the testing code in runtime.Comment #60
berdirRe #59: Yes, that's exactly what I meant, MemoryBackend is only meant for tests where we want to simulate a persistent cache and have the same behavior.
I deprecated cache.static and MemoryCacheInterface, lets see what this does.
I'm not sure. If we'd rename it then other cache bins that specify it as a default backend would also use the MemoryCache then. Annoying but might be safer to deprecate and remove and then rename and deprecate memory.memory if we care?
For now, I just deprecated it without renames.
Still pondering about the way KernelTestBase alters the cache backend and thinking about a way to improve that. Maybe a custom factory that does either MemoryBackend or MemoryCache depending on whether it's cache.bin or cache.bin.memory.
Comment #61
nicxvan commentedThe deprecation is being hit a ton causing failures.
Comment #62
berdirYeah, the main issue is that cache.static is a tagged service, so it's injected into the cache invalidator service. Removing that for testing purposes, but that does mean that cache tag invalidations wouldn't work anymore there.
Comment #63
berdirI don't see how we can explicitly deprecate cache.static annd cache.backend.memory. the tag is needed or it's not actually a memory cache and we need those settings to be injected.
I also checked core.api.php cache documentation, it doesn't mention memory caches yet, I added a bit for the memory cache bin and also mentioned default_backend and cache.bin.memory.
Only a handful of contrib projects use cache.static, several of which were in the meantime refactored to use something else (old jsonapi and group versions. Most are just generated references in ide/test helper s.
I think we should just move forward with the change record and then remove it.
I also changed KernelTestBase so we only alter defined cache bins and switch them over to cache.backend.memory instead of replacing the whole factory. Then memory caches work and can still be typed.
Comment #64
catchCould we not add the cache.static deprecation to the ignore list?
Comment #65
nicxvan commentedYeah let's use the ignore list then, that's what it's for right?
Comment #66
berdirI thought we don't ignore own deprecations but I see there are quite a few in the list right now.
Comment #67
catchOverall we shouldn't be ignoring deprecations because we still have usages of something - we used to do that, then we'd discover we weren't actually ready to remove it.
But cases like this where the mere existence of the deprecated thing can end up triggering a deprecation message, there's not really another way around it.
Comment #68
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #69
berdirComment #70
catchWe should probably get rid of
cache.field_config_memoryin field module here too, which I just added in a different issue..Comment #71
kim.pepperDeprecated
cache.field_config_memoryand replaced withcache.memory.Do we need to deprecate? or can we just remove it?
Comment #72
berdirI think we can just remove because it just got added and was not part of a release yet.
Comment #73
kim.pepperOK removed.
Comment #74
catchI left one comment on the MR about the docs where I persuaded myself it's about where we want it now and we can consider making it stricter later. I can't see anything else to complain about at all, this will make a lot of issues easier to work on.
One thing we might want to do later, trying to think ahead to frankenphp et al, is have an request-safe vs. non-request-safe version of the service (e.g. static caches that can persist between requests vs ones that can't) but that can be its own issue and discussion, we're very far away from actually needing that still.
Comment #75
alexpottcache.static shouldn't change to database cache with this though. That feels wrong.
Comment #76
berdirRestored the missing tag, thanks for catching this. That's the whole reason we have to add the ignored deprecation, not sure why I missed that.
Comment #77
catchLet's go with that.
Feel slightly dodgy about a deprecation on every drush cr that we have to ignore, but it is probably harmless. @alexpott briefly mentioned trying to use aliases instead in slack, but if we figure that out, we could switch to that later on.
Comment #78
kristiaanvandeneyndeLooks good to me, a bit annoyed that we're now stuck with cache.backend.memory.memory for all eternity because at one point we used to have "cache.backend.memory" and got rid of it.
Comment #79
catchI think we could eventually rename cache.backend.memory.memory to something else, maybe once there's been an intervening release without cache.backend.memory, or even deprecate it and only add it back during tests.
Comment #80
alexpottFrom @catch
From @jweowu
Have we got the bit about a convention/pattern for cache IDs in the CR and docs? I can't see it. Because I'm concerned about every thing to the same memory cache and getting some very unexpected results. It feels like we could do something smart here and somehow use the service ID as a prefix when we inject the cache.memory service into another service. Like some sort of very thin wrapper. This kinda of automatic namespacing could solve some very tricky bugs when this is widely adopted by contrib. Say what we like about
drupal_static(__FUNCTION__, FALSE)- it handled namespacing really really well.Comment #81
geek-merlin>... cache namespacing wrapper
+1 for that. Symfony has it too.
But i see no reason to not add it in a followup. Or do i miss sth?
Comment #82
berdirI added a note to the CR, I'm not sure it really needs more docs or even enforcement. We standardized on common and shared cache bins like default, data, bootstrap, discovery early on in D8 instead of per-module bins like we used to have before. I don't think I ever had an issue with a conflict in any of these between different modules. This is the same API, so I think this is a well established pattern and people are aware of this. drupal_static() didn't enforce it, and it would result in different behavior for aan injected service vs \Drupal::cache() vs using the cache factory directly. I'm not sure that's a good thing.
Comment #83
alexpottFWIW here's a link to Symfony's namespaced caches https://symfony.com/blog/new-in-symfony-7-3-namespaced-caches neat and a little inspiring here...
@berdir yes but... moving to using a cache for static properties is going to make clashes much much more likely, think property names and service names.- they are very likely to clash. And while nothing was enforced with drupal_statiic convention meant that the use of __FUNCTION__ lead to automatic namespacing.
@geek-merlin I think you are right we can do this as a follow-up. I think we could:
Happy for this to go back to RTBC - once we've adding docs about namespacing the keys something with a recommended convention - I'll open the follow-up to see if we can build this into the container automatically.
Comment #84
catchThe namespacing that Symfony does isn't specifically to avoid ID clashes, although it might do that too.
The reason they offer it is because they don't implement cache tags, it allows a simpler equivalent of Drupal 7-style cache prefix clearing where the prefix is defined in advance. So even if we add it we'd need to be very clear we're not offering the same thing, cache tags provide the use case (and many more) for the feature they actually offer.
Comment #85
claudiu.cristeaAlternative to CID namespacing is to create separate bins with a factory. Services that need a memory cache will just fabricate a new bin. I've suggested this in #17
Comment #86
berdirThe Symfony API maps more to our cache contexts than cache tags (they have cache tags too, I don't if they scale as well as our checksum based logic). They say invalidation, but there is no invalidation going on there. The clarified that in the docs that are linked in the comment. The idea is that you'd dynamically calculate a namespace based on the current user or it's changed date or something that. So you don't invalidate, you just vary cache keys, which is what we call cache contexts.
Symfony does not enforce/automate namespacing, it just provides the API. They say "This is handled transparently" in the blog post, but it's only transparent *after* you manually and explicitly get your namespaced cache. I disagree that we should do this automatically and enforce it. It is IMHO valid to share memory caches between services and it would be impossible with this. Maybe with an explicit notation, I'm open to discussing that, providing an API for it to make it more convenient, but I'm against enforcing it.
Comment #87
catchYes let's open a follow-up to discuss. I think @alexpott is right that the process of conversion from properties to memory cache IDs has a higher chance of collisions than normal cache API usage, but we can document it. Magically namespacing feels very complicated even if it was desirable, which it might not be.
Moving back to RTBC (again :P).
Comment #88
alexpottCommitted 69c9b19 and pushed to 11.x. Thanks!
Comment #91
alexpott@berdir can you provide an example where this is true. From my perspective this would be a code smell and either point to one service over-reaching it's responsibilities or missing API.
Comment #92
berdirAgreed, there aren't many good cases for that. The theme_get_settings() issue specifically struggled with the drupal_static() usage in ThemeInstaller. We have a neat solution now using cache tags, without that it's IMHO a grey zone and a shared cache between two services in the same component might be the lesser evil than having a public API that also exposes internals. Some languages have concepts for this (like protected in Java isn't just for the same class and child classes but classes in the same package). PHP doesn't.
drupal_static() comes with drupal_static_reset(), We have \Drupal\Core\Test\RefreshVariablesTrait::refreshVariables(), which specifically calls out to some methods that we keep out of interfaces, as they are not meant to be called by anyone else. More grey zones. Should a component like cache factory know about the testing system, or should the testing system know about cache factory internal static caches.