Problem/Motivation
When loading large numbers of entities, especially during migrations and updates, it's quite easy to run out of PHP memory.
Migrate works around this by trying to clear the entity static cache, but this also results in a persistent cache clear.
Static caching is hard coded in storage classes which prevents swapping it out unless you change the whole class.
Proposed resolution
Add a 'static cache' service, which does not serialize at all, but which conforms to CacheBackendInterface.
Rework entity storage to rely on this service instead of a raw class property.
This will allow for two things:
- once this issue has landed, migrate will be able to reset the static cache via the service, therefore not affecting the persistent entity cache.
- in a follow-up, we'll be able to add a simple Least Recently Used implementation, allowing the static cache to have a fixed maximum number of items.
- beyond this, there is the possibility to go even further:
- have a cache backend limited by the actual size of the items in it (will require estimating memory usage of entities)
- add a cache backend, probably in contrib, using PHP weak references http://php.net/manual/en/intro.weakref.php which theoretically will allow cached entities to be removed by garbage collection.
Remaining tasks
User interface changes
API changes
Data model changes
Spinning this out of #1302378: Use multiple specialized entity controllers.
The base EntityController class currently handles keeping it's own 'static' cache internal to the class, or always fetching from the database all the time.
In Drupal 7, this means that http://drupal.org/project/entitycache needs to provide it's own controller class to provide persistent caching for entities.
We can make this a lot more flexible by just replacing the custom caching within the Entity controller class, with a configurable cache controller key instead of the current 'static cache' key. There's no need for entity-specific cache controllers, since any class implementing CacheBackendInterface will do fine.
To allow for 'static' caching, I've added a new ArrayBackend which simply stores cache items in a php array. Apart from expires which didn't seem worth working on until #730060: Replace CACHE_TEMPORARY, cache_clear_all() and minimum cache lifetime with cache tags support is in, this (untested) backend supports everything the database backend does, including tags etc., meaning it should be useful for unit testing eventually.
For no static caching at all, the NullBackend works fine.
To actually implement entitycache in D8 (whether core or contrib), you'd need a backend that proxies the MemoryBackend + whichever one is implemented for the cache_entity bin, since we're placing responsibility for both in a single class doing things this way. There's some inconsistencies with resetCache() etc. that would need to be sorted out for that.
Completely untested patch that will likely blow up, but just to show what it looks like.
One feature gets dropped here, which is fetching from cache when $conditions is passed to entity_load(). That's already a bit wonky as it is, and the $conditions parameter is already on its way out and documented as deprecated in 7.x per #1184272: Remove deprecated $conditions support from entity controller.
| Comment | File | Size | Author |
|---|---|---|---|
| #172 | interdiff-171.txt | 493 bytes | amateescu |
| #172 | 1596472-171.patch | 37.44 KB | amateescu |
| #169 | 1596472-169.patch | 36.96 KB | amateescu |
| #167 | interdiff-167.txt | 1.49 KB | amateescu |
| #167 | 1596472-167.patch | 37.73 KB | amateescu |
Comments
Comment #3
catchWithout the syntax error..
Comment #5
pounardSounds like a good improvement, the ArrayCache implementation is a good idea!
class DatabaseBackend implements CacheBackendInterface {Wrong name, right? Should beArrayBackend?'cache controller' => 'Drupal\Core\Cache\MemoryBackend',Sounds wrong too, is itArrayBackendorMemoryBackend? I'm fine with both, but you have to agree with yourself about this one!Comment #6
pounardIf you don't react before I'm getting back home, I'll try to fix your patch.
Comment #7
sunI think this needs more architectural planning.
The 'static cache' flag in the entity (crud) controller is vastly more intelligent than a completely separate memory cache backend.
It is true that we could attempt to split and separate that backend, but we don't want to lose the static caching within the crud controller entirely.
This patch adds "cache controller", but actually means "static cache controller".
Let's also bear in mind that there's a cost of instantiating one more object (including config + DI) for every single entity type.
Comment #8
pounardWe can chain backends if we want both static and a real cache, sounds like an easy win: we could get rid of this awful loading logic that do both, and rely on a unique object with a clean interface that would do both at the same time. It's a pattern we could also re-use at various places in Drupal too.
Instanciating a single object, you won't notice it on your runtime, it's not significant, even with 10 or 20 different entity controllers, this won't hurt a bit your execution time, it's really nothing compared to what does a single Views, yet most people continue to put up to 10 Views on the same page.
In Drupal 7 the static cache is quite dumb, all you have is a cacheGet() and cacheSet() internal methods, which acts like a basic key/value store (if we omit the deprecated conditions array). If it is smarter than this, please enlighten me because I probably missed it.
Comment #9
catchHmm I thought I'd fixed the silly naming issues already but apparently only in my head rather than the actual patch. I think we should go for ArrayBackend since then there's no chance of confusing it with an actual static, or APC or something, MemoryBackend sounds a bit more clever than it is.
@sun: the intention here is to allow the cache controller to do either no caching, static caching, persistent caching, or static + persistent caching (via chaining the two together), and also to start the process of pushing small bits of the entity code back into core/lib where they're generically useful. I'm not sure how viable that will actually be - often static and persistent caches need to be treated differently, but when I looked at offering the potential for both, at least within the controllers the access points are the same.
Comment #10
pounardSome other ideas arround the idea:
- We could, in the future, share cache chains to entity controllers that needs the same chain (we should then prepare some kind of mass deregistration mecanism in the array cache for static entries to act upon controller destructor)
- This pattern, with a clean cache chain definition mecanism, could be used in a lot of other places where the code actually emulate this behavior, for example a lot of info hooks
Using it massively would remove a lot of duplicated code everywhere, it might even be useful in conjunction with the actual DrupalCacheArray implementation that could evolve to use this base.
Using the chain of command (for saving), chain of responsability (for loading) and proxy patterns (basically using a chain towards the objects behind) for this seems really clean in term of architecture because the final object only rely on the cache interface, without worrying about what's behind, it's actually maintainer's even somehow.
Comment #11
fagoI like the idea of having a dedicated cache system, which has a single consistent interface to developers. Having a NullCacheBackend + a static cache backend that proxies another permanent backend makes much sense too me too.
I don't like having the EntityStorageController directly talking to the cache system though. Imho, we loose some flexibility here.
I'd prefer the approach taken by #1302378: Use multiple specialized entity controllers: Have a dedicated entity cache controller class, which is invoked by entity_load() and handles everything related to caches - see #1302378-4: Use multiple specialized entity controllers (comment #4).
Reasons:
Comment #12
fagoComment #13
catchThe patch already has a dedicated entity cache controller, it's just it only requires that to match the core cache interface. Custom entity-type-specific controllers would still be possible of course. For a controller that does persistent caching that's the most likely to need something custom. I'm not sure from your post what you think that's needed that's not in the cache interface already. Apart from the conditions stuff but I'd prefer to press on with the issue that removes it. Almost did that before starting this one but wanted to get this out for discussion.
Comment #14
pounardI think that the conditions (or properties later) loading wouldn't be affected that much by replacing the internal cacheGet() with a backend directly: you still control what happens in your load() function and you are not forced to cache entities by id, you can do something else instead. You can still add a layer for frequently used queries in specific controllers.
But, to be honest, I think that trying to implement such layer would be more than difficult, the load() method may still be optimized later for very specific cases if needed. I even have a hunch that if the controller has direct access to its own cache backend, it would be easier to implement such frequently used queries caching: the controller knows its own schema and fields.
The approach in #1302378: Use multiple specialized entity controllers suppose that there is an upper layer that needs to combine itself the cache and "main" controller, which means that the entry point of the API is in another piece of code (in the exemple in procedural code), I think this would be prone to errors: some developers will continue to hit the controllers directly (which makes a lot of sense) instead of calling entity_load(). I don't think this a good idea until there is a front controller for all entity operations.
I think that in the end, both provide the same feature and generalize caching method, but I prefer catch's method: it allows the entity "main" controller to implement its own load/save etc.. methods and have a finer control over its own cache if needed, while most of the others would probably continue to use the generic one.
Regarding specific entity type cache configuration, weither it's a cache controller or a chain set inside the main controller, I think both solutions are equivalent in term of configuration, especially if we can create the cache chain outside and inject it when building the controller instance: instead of creating a cache controller from configuration, we create a cache chain from configuration, doesn't seem too different in the end, it just moves some responsabilities.
And, latest note, I think that even if the cache controller is the way to go, this isntance can still benefit from using the cache chain catch is proposing, it's just a matter of moving it into another objet.
Comment #15
fagoThat gets the point. I've aimed for moving that cache-control into a separate class, such that it stays separated from the storage class.
However, I just realized this thinking is deprecated, as we won't have "pluggable" storage controller anyways (sry, I'm still used to that thinking). As discussed in the entity-interface introduction issue, they already implement the per entity-type storage logic, so any simple storage-pluggability (=without repeating entity-type specific customizations) would have to happen on a lower level. In that light, having the cache-control in the storage-class makes sense to me too. :)
Great, I'm about to look at that as well :)
@patch: Having a permanent cache renders hook_entity_load() useless. If possible I'd prefer to drop it in the long road anyways, an entity really shouldn't be altered by page request. If it needs to be changed, re-save it with new property values.
Comment #16
pounardThere's a point I agree with is that we shouldn't alter this kind of stuff on every load, it can get quite CPU intensive if you have a lot of modules playing with this. But, in the other end, having the hook_entity_load() can be useful for some modules to set business data which are not properties, but that needs to be on the entity and cached along with it. Concrete use case: I could load data from business tables and set them on a node, but I don't want to store this data in actual node properties because I don't want to duplicate it and handle the full synchronization between my source business data and node properties. If I can still attach arbitrary data on the object I can then just handle nice synchronisation by clearing a few cache on my business table updates. Right now, I actually achieve this using the virtual_field module, that provides a null field storage backend, but defining some virtual fields allows me to actually store precomputed data in field cache along with my commerce products, line items and stuff, without worrying about a complete field storage layer.
EDIT: Sorry, a bit off topic, and that's an akward and quite ugly way of doing this, but in my case, it's mainly for denormalizing some data for performances. I don't know where the property system is going to right and if it's going to help me with those kind of use cases.
What matters in here, is that by caching entities, we cache the hook_entity_load() result with it, and that's why I really like the idea, because if we do this I can then remove my need for hacks such as virtual field etc.
Comment #17
fagoYeah, if there remains a hook_entity_load() it has to be cacheable. Altering stuff or changing stuff on a per request base there is a bad idea, and we should stop supporting that.
The discussion of removing it probably belongs more into #1497374: Switch from Field-based storage to Entity-based storage though, as it depends on whether we want to support a "modular entity storage". However, even with a centralized entity storage I agree that it might be worthwhile to keep it for easily adding non-queryable stuff as you've depict in your use-case.
Comment #18
pounardYes, I honestly think those kind of non queryable stuff are really ugly, but sometime you just don't really have other straightforward choices. I don't know what will be final API for properties, but properties could solve that somehow.
Comment #19
fubhy commentedThe ArrayCache implementation is a great idea indeed. However, I want to propose some changes and ideas that came up today while we were discussing the Entity Property API and waiting for fago to write down the initial interface definitions :).
The CacheChainInterface
Since we have multiple caching layers we might end up with various combinations of them (static cache only, static + persistent cache, persistent cache only, etc.). We could prevent that from happening if we had a way to chain cache backends in some way. This could be implemented in a generic fashion as a CacheChainInterface that would extend the CacheBackendInterface. This interface would extend the constructor definition for the CacheBackendInterface with an additional argument: An array of cache backends that the should be queried sequentially in order to retrieve an object from the cache.
The CacheChain class would iterate over the CacheBackend implementations that it has attached to it and forward the get / getMultiple queries to them. Once we have a full result for a query we exit the loop and, if there are any cache backends that could not return a result for the query before, we use the set() method on them to fill them with everything that they don't have cached yet. The next time the same query hits the CacheChain the first backend in the chain will now have element(s) that are subject to the query and return them right away.
This way we would be able to easily stack caching layers on top of each other and also replace single layers independently.
Comment #20
catch@fubhy that sounds just like http://drupal.org/sandbox/catch/1153584 which I've never got around to implementing. I agree that'd be a good idea but I think we should add a separate issue for it. I also want to break the ArrayCache backend into a separate patch probably since that's useful for unit testing (and we also need to look at re-organising the cache tests so they're more re-usable with alternate cache backends).
Comment #21
fubhy commentedOh great. I will take a look at your sandbox then. I think this can work really well with the PropertyContainerInterface that is part of the 3rd iteration of the Entity Property API for defining the tag-properties of the cache items (Note: the PropertyContainerInterface is generic and Entity agnostic so it would be completely generic).
Anyways... I agree this should be in a separate issue. I will create one later today and maybe already provide some code.
Sorry for off-topic :)
Comment #22
pounardNot that off-topic, the cache chain interface is the idea behing this issue, and IMHO it should be done before we can actually continue this issue. It is a good reusable pattern that could be used at many places.
Comment #23
catch@fubhy, the sandbox is empty, I really didn't do anything with it.
I'll try to make the array backend issue soon and upload a separate patch, new issue for cache chain sounds good too, then marking this postponed on them also sounds good :)
Comment #24
catchOpened #1637478: Add a PHP array cache backend.
Comment #25
pounardOpened and implemented and unit tested #1651206: Added a cache chain implementation.
Comment #26
alan d. commentedA small changes here could have significant ramifications (for better or worse), so I am adding a tag as a reminder to check performance.
I nearly committed a very basic EntityCache controller for the Countries module before doing some simple tests to discover that this caused a 20 fold decrease in speed. On the plus side of this testing, reading in an array of Country entities from the database was 3 times faster than including iso.inc and doing a single drupal_alter() on the countries list. So performance gains could be significant here :)
Comment #27
catchI just committed #1651206: Added a cache chain implementation.
Comment #28
fubhy commentedAwesome. Now we can get this rolling again. I worked on this a bit during DCon Munich already but was not really satisfied with the outcome because it was still hard-wired with the storage controller. My wish would be to make the caching layer independent of the storage system. I am not entirely sure how we would approach that yet, though. I do have a crazy suggestion though which I have been playing around with in my head lately but it would be a bit drastic to say the least. Will post more about that soon.
Comment #29
moshe weitzman commentedAnyone up for reviving this? Next steps?
Comment #30
podarok#3 woops
a fast look to all the changes already in repository says this implementation very different against repo one so not possible for a fast reroll
Comment #31
fagoRelated #1497374: Switch from Field-based storage to Entity-based storage.
Comment #32
xjmComment #33
berdirTagging.
Comment #34
berdirWeird, I could swear the the tags list was empty when I added those tags...
Comment #35
fagoSo exactly how does this related to #597236: Add entity caching to core ?
Given #1497374: Switch from Field-based storage to Entity-based storage and #1969728: Implement Field API "field types" as TypedData Plugins we really need to replace field cache with entity cache.
Comment #36
effulgentsia commentedBoth of those are major, so raising this one to major as well.
Comment #37
fubhy commentedHow about adding the caching layer as a decorator around the storage controller? We would then basically have a CacheController (which wraps the storage controller) and implements the StorageControllerInterface. Said CacheController would get a cache backend injected and try to retrieve entries from it and upon cache miss forward to the decorated storage controller.
Edit: This would help us to decouple storage from caching and and thereby remove the need for every storage controller to implement caching separately.
Comment #38
fagoIndeed, this sounds clean and is straight-forward!
Comment #39
damiankloip commentedGoing to work on this.
Comment #40
damiankloip commentedActually, not right now!
Comment #41
berdirTagging.
Comment #42
jhedstromThe issue summary needs updating if there is remaining work here.
Comment #43
berdirJust like I said years ago, I don't think it is realistic to replace the special "static" cache, it's more complicated than that. We do have persistent entity caching now, so I don't think this is still major.
Comment #44
andypostComment #45
catch#2501117: Add static caching for PermissionsHashGenerator::generate() added a cache.static backend which could be used for this.
Comment #47
catchThings have changed a lot, starting again from scratch.
It's actually very straightforward doing this in EntityStorageBase, here's a proof of concept that doesn't explode on my local.
Comment #49
berdirmissing closing } ?
Comment #50
catchI think that's just a diff artifact.
But I did forget what I called the method apparently.
Comment #51
catchOh ouch, apparently my eyes are a diff artifact.
Comment #54
berdir\Drupal\Core\Entity\EntityTypeManager::clearCachedDefinitions():
$this->handlers = [];
And some more cases as well. That invalidates the static cache (and injected definition) of those handlers. That doesn't work anymore now.
Comment #55
catchYeah had to be too good to be true.
Fixed subclasses of EntityStorageBase.
Implemented __destruct() for #54 which I'm not happy about but see if this gets closer to a green run.
Comment #56
berdirI think this should read:
$this->staticCacheBackend = $cache_backend ?: \Drupal::service('cache.static')
(right now you never assign if the backend is actually injected)
Also, I was just thinking..
a) $cache_backend should probably be $static_cache_backend?
b) I know that default_backend is overridden by setting a global default cache backend. Means you have to explicitly specify fast chained backends when using e.g. redis as default backend. Also means that redis will be used for cache.static in that case, oops... I think default_backend should override the default, need to open an issue about that.
Comment #58
catchWas going to say it's too early for #56, but with 34 fails that's perfect timing to tidy things up. Addresses both points for now.
Comment #60
catchFixed ConfigStorageBase and sub-classes as well as a couple of unit tests. Uploading to make sure this didn't introduce any silly regressions in web tests. Otherwise it's mostly updating unit tests from here to get to green.
Comment #62
catchComment #64
catchRebased, not looking at test failures yet with this one.
Comment #66
catchThat __destruct() doesn't help, trying without.
Comment #68
catchComment #70
berdirThis is a fun test fail. static isn't supposed to create a table. are kernel tests overriding something somewhere? looking into it.
Comment #71
berdirOk, that is easy, we just need to override that services as that test defaults to database backend.
Here's why the views test is failing and why this change is a performance problem and behavior change:
MemoryBackend::set():
The memory backend on purpose serializes the data to behave the same as non-memory backends and break references. That means this is an overhead and it also breaks the current behavior that two load calls for the same entity get the same object. You can argue whether that is good or bad, but we can't just change it. That's going to cause nasty bugs and side effects in code that currently works.
Specifically, what happens here is that we load the same node multiple times, then switch to the requested language. And then we save all of them. But that means that saving the second language will not have the changes of the first x, so saving that overrides the changes that the first language saved.
Comment #72
berdirComment #74
catchNot a real patch, but here's a new interface for static caches, and an implementation extending memorybackend without the serialization. It'll need a new service definition then all the code/tests updating to use it, but theoretically should get us to green here after that.
Comment #75
catchThis will fail but uploading to have a new base to work from.
Comment #78
catchComment #80
catchinterdiff against #74.
Comment #82
catchComment #84
catchComment #85
catchSorry for noise, shoulda done a patch testing issue.
Comment #88
catchComment #90
catch@Berdir pointed out in irc that $this->container is risky in tests since it can get out of sync. Switching to \Drupal fixes it.
Comment #92
catchNot sure what happened with the patch upload.
Comment #93
mile23This issue looks a lot like #2260187: [PP-1] Deprecate drupal_static() and drupal_static_reset()
Where are they out of sync? Is it in a test base class or is it in the entity system?
Comment #94
catch@Mile23 that issue is trying to replace drupal_static() uses, this issue is a very specific approach for dealing with class properties, it makes most sense when read in conjunction with #1199866: Add an in-memory LRU cache and #375494: Restrict the number of nodes held in the node_load() static cache. It does offer some of the same features (like ability to reset the cache from outside the class, in this case via calls methods on the static cache service), but the use case is very different otherwise, and the node static cache issue predates even the introduction of drupal_static() by several years.
The container goes out of sync in the test - i.e. the return of \Drupal::container() and $this->container are not the same.
Comment #95
mile23Could it be the same problem as this? #2641518-38: Replace deprecated usage of entity_create('config_test') with a direct call to ConfigTest::create()
Your patch modifies
EntityReferenceIntegrationTestwhich usesconfig_test, and #2641518: Replace deprecated usage of entity_create('config_test') with a direct call to ConfigTest::create() addressesconfig_test, so maybeconfig_testneeds some love.ConfigTest::create()ends up being\Drupal::entityManager()->getStorage()->getEntityTypeFromClass(ConfigTest::class). So one of the proscribed ways of instantiating an entity relies on\Drupaland the injected container being in sync, but that behavior doesn't work under some circumstances, which we can't seem to figure out.So I'll wager that the thing that's out of sync is within the entity API.
Here's a patch that fixes
EntityReferenceIntegrationTestby havinggetTestEntities()use$this->containerto build the fixture entities.Follow-up: Why does this patch fail if you switch to using the
Entity::create()pattern ingetTestEntities(), or use\Drupalin the test?Comment #97
catchRe-uploading the passing patch from #92, I don't want this to be blocked on $this->container going out of sync with the tested site.
Comment #98
mile23Ugh. #96 passed locally.
Comment #99
berdirshould we make this entity specific, so that you can have a LRU implementation for entities but maybe not for other things?
what database? :)
I know, copied, but does make even less sense here than in the parent, should we probably drop teh array nique and sort calls?
and this tag should then be entity type specific?, maybe a property that is defined in __construct(), so you could override it?
duplicated?
we don't need to duplicate the fallback logic here?
Comment #100
catch#1. Yes agreed. Nearly did that before. Renamed the service to entity.static_cache
#2: yeah why not, also less to do so will help remove a bit of overhead compared to just sticking stuff in a property.
#3 done.
#4 removed
#5 removed.
Comment #102
catchRebased.
Comment #104
catchMissed a service rename.
Comment #105
claudiu.cristeaDid some cleanup, added some @todos.
However, I'm not too happy with this change, especially, with the fact that $this->container gets out-of-sync. I suspect that the problem might be in this patch. Probably entity cache gets out-of-sync?
Comment #107
claudiu.cristeaOuch!
Comment #110
eclipsegc commentedbah wrong issue... sorry ignore my post.
Comment #111
berdirReroll, quite a few conflicts but mostly just on array syntax.
Comment #114
andypostComment #115
catchHere's a re-roll. Entity and cache unit tests are green.
Comment #117
catchWhen git status is cleaner /
Patches are greener
Comment #119
catchThat'll teach me to write jokes instead of checking the diff.
Comment #120
pounardMeanwhile, the oups was funny :)
Comment #122
JvE commentedHow does this allow me to clear a node from the static cache without also clearing it from the persistent cache?
Or better yet, load nodes without them being cached anywhere at all?
Use case: iterating over many nodes without running out of memory.
Comment #123
catch@JvE so there's two ways this should help:
1. The service itself can be used to clear the cache the same way that cache bins can be cleared. This is similar to what migrate in Drupal 7 used to do with drupal_static_reset().
2. Once this is in, we can work on #1199866: Add an in-memory LRU cache too, so there's no need for explicit cache clearing at all.
Comment #124
JvE commented1. So like
\Drupal::service('entity.static_cache')->reset();or->deleteAll();or->removeBin();?It would be nice to have a more granular approach available.
Maybe change
\Drupal\Core\Entity\ContentEntityStorageBase::buildCacheId()from protected to public? Then we'd be able to reset specific$cids.2. Neat, hadn't seen the LRU ticket yet.
Comment #125
catchFixing the test failure. The test is getting a config entity in the testing site, changing the config entity in the child site, then fetching it in the parent site again - so I'm just clearing the cache before doing the second fetch in the test.
Comment #127
catchRe-rolled.
Comment #128
claudiu.cristeaFew minor issues.
About the terminology. We used to use the term "static cache" because this cache is stored in static variables. But I don't think that the "static" word tells too much about the storage/backend. And this is a cache backend implementation. IMO, we need something that tells about the backend where the cache is stored and that would be "memory", so yes, "memory_cache" (and MemoryCache for class) would be more relevant.
The parent class ensures that
$tagscontain unique values. Shouldn't we do the same here?This line is redundant. If the caller doesn't send the service instance,
NULLis passed to the parent constructor. And the parent constructor knows how to deal withNULL.|null
The protected property
$entitiesshould be also removed. And I think we can get rid also of the__sleep()implementation.Unrelated changes.
Comment #129
claudiu.cristeaComment #130
catchThanks for the review!
#1. Agreed. We use static cache elsewhere for protected properties, but we should move away from that.
#2. The patch probably predates us doing this, added it.
#3. Good spot, removed these lines.
#4. Done.
#5. Yes, and nice spot.
#6. I wish these were unrelated changes, but there is a bug in the testing system where \Drupal::container() and $this->container are not always identical instances, which was causing test failures in this patch. Uploading two versions of the patch in case this has changed though.
Comment #131
catchComment #134
amateescu commentedThis won't work :)
Comment #135
catchMade a mistake with git mv.
I think that was partly due to thinking about the naming similarity between MemoryBackend and MemoryCache. We might want to rename MemoryBackend to TestingBackend or similar.
Comment #136
hchonovI've found only nitpicks...
Incorrect indentation.
Build or Builds?
Comment #137
alexpottWe can throw a silenced deprecation if $memory_cache is NULL so everyone knows to start passing in entity.memory_cache in D8.
Incorrect indentation.
If we say that objects must be the same instance what does that mean for swapping this out with something like Redis? Ie. https://www.drupal.org/project/entitycache in D8. Does that mean once you've created the object all further retrievals for the same object must return the same instance?
Comment #140
hchonovRe #137.4:
@alexpott, this is how the current implementation of the static cache behaves and I think there is code relying on this. I don't think that we could use Redis here, because the static entity cache is meant to be per request and not shared between requests. Additionally by using Redis we'll be losing the references, as for the Redis cache we have to serialize the data. @see
\Drupal\redis\Cache\CacheBase::createEntryHash().P.S. Redis can be used for the persistent entity cache.
Comment #141
mile23It's not a bug in the testing system. It's an anti-pattern in how we do dependency injection, which is revealed by tests: #2066993-46: Use magic methods to sync container property to \Drupal::getContainer in functional tests
Way out of scope here, though.
Comment #142
catchWe already have a persistent cache that you can swap out with Redis. This is an entirely new thing that replaces 'using an array of stuff on a protected property as a static cache'.
As such it's only possible to use memory here - even APCu isn't an option - and you wouldn't want to use APCu anyway because it can't handle this much data.
However we do want to be able to use alternative implementations, because one already exists #1199866: Add an in-memory LRU cache - I've also had more or less since opening this issue a plan to do an implementation using http://php.net/manual/en/class.weakref.php so that PHP's own garbage collection handles removing entities from the cache (if that works).
From the test fails in #145, the problem with \Drupal::container() vs. $this->container is no longer an issue, so very happily ditching that hunk from the patch!
P.S. Nearly the 6th birthday of this issue..
Comment #144
alexpottDidn't know about weakref. Neat.
Needs a link to the change record as per policy. I guess this issue needs one anyways.
Really nice to see this disappear. I'm a bit torn about adding entity.memory_cache to the constructor since Config doesn't support it atm but I think keeping Config and Content entity storages aligned seems like a good idea. The other option would be to introduce a setter on EntityStorageBase but that feels more risky. Maybe someone somewhere has enabled static caching of config entities.
This versus \Drupal\Core\Entity\ContentEntityStorageBase::buildCacheId() is confusing. Shall we pull buildCacheId() up to StorageBase() so it's consistent?
Are we sure about changing these names - might some custom / contrib entity storages rely on these?
Really nice to see this go. I think it shows this change is in the right direction.
Comment #145
amateescu commentedThis parameter doesn't seem to be optional.
We could use
static::CACHE_PERMANENThere instead. And then the use statement at the top can be removed as well."The memory cache..."
The second line needs to go :)
There are quite a few places that need to be updated to use MemoryCache instead.
Comment #146
catch#144:
Yeah it'll have to be contrib because it requires a PECL extension but still.
Added a change record.
#144.2. Yes moved it up a level to remove the duplication.
#144.3. I forgot they weren't added by the patch, changed them back. We can always do a rename/deprecate later.
#144.4. - Yeah makes me wonder if there are other places we could use this once it's in beyond entities, just to save the serialization issues.
Patch should fix everything in #145 too, and hopefully most or all of the deprecation notices that are now getting picked up.
Comment #148
amateescu commentedI think the intended default value for this param is
FALSE? Should also be mentioned at the end of the docblock: "Defaults to FALSE."Here and a few other places: if we call it "the memory cache *backend*", it can be easily confused with
\Drupal\Core\Cache\MemoryBackend.But that's already the case, so we should have a followup to clean up this naming problem.
This change needs to be reverted as well.
Maybe this should be renamed to
memoryCacheTag?Removing this comma should show us some real results from the testbot :)
Comment #149
catch#148.1 - doh, fixed.
#148.2 - Agreed, I'm not sure what a good solution is yet, but opened #2973286: Clean up MemoryCache and MemoryBackend naming issues with a couple of notes.
#148.3 - done.
#148.4 - agreed, done.
#148.4 - oh dear, done.
Comment #150
catchMessy interdiff.
Comment #152
catchComment #154
amateescu commentedCleaned up the rest of the places that were triggering the deprecation warnings, as well a couple of coding standard issues.
Comment #155
amateescu commentedGetting there :)
Comment #157
amateescu commentedThird time is the lucky one?
Comment #159
claudiu.cristeaI would RTBC but still needs profiling.
Comment #160
sam152 commentedI spent some time profiling this patch. Here is what I came up with using the following conditions:
profiling-script.patch, for 500 nodes load each one 1000 times.The results were:
I don't have much experience analysing these results, but the initial outlook shows the patch being quicker and using less memory.
Comment #161
catchI'm seeing 392 977 calls to Node::load() in one of those profiles, and 367 724 calls in the other, so they probably aren't comparable results unless we're either introducing or fixing a bug (which seems unlikely).
To be honest I don't think we need to profile here compared to other issues that get into core, but maybe a simpler test would be to compare how many function calls there are for calling Node::load() twice on the same node, or something like that.
Comment #162
sam152 commentedI thought loading the node a bunch of times would help highlight any performance regressions, but I suppose the script may have inadvertently timed out or the reporting isn't perfect for a really large number of calls. I might retest this and make sure the calls to ::load at least equal out across tests.
Comment #163
sam152 commentedNew profiling results:
Loading a single node twice truncated the function calls to actually load the node off the report, so I bumped it up to make sure they appeared. I can confirm we're calling load 1000 times in both sets of results at least.
Comment #164
claudiu.cristeaI agree with @catch on #161. Also the results, how significant they would be, are not showing to much difference. I see all other concerns were addressed.
Comment #165
alexpottThe patch looks great. Just wondering whether or not...
we should do
MemoryCacheInterface $memory_cache = NULLon all of these just in case contrib / custom has extended one of these. They will get the deprecation messages (once make it easier for people to test this outside of core) but at least nothing with break for 8.6.x.Comment #166
alexpottDiscussed #165 with @catch yesterday. @catch pointed out that as storages are constructed by their own createInstance method if someone has overridden the constructor they are going to need to change it anyway. So we can ignore #165.
My only other question is why...
... do we need to do this?
Comment #167
amateescu commented@alexpott, that's because
UserUpdateOrderPermissionsTest::testPermissionsOrder()is using the entity API to load a user role before running the updates, so the role entity objects ends up in the static cache.However, we don't support using any API functions in update tests before running the update, so here's an alternative fix for that problem.
Comment #169
amateescu commentedI managed to screw up the patch somehow... This one should be better, it's #157 with the interdiff from #167.
Comment #170
berdir@amateescu: +1 on that fix. That said, the role was in the static cache before as well, so something isn't quite transparent compared to before. The difference is that runUpdates() calls
$needs_updates = \Drupal::entityDefinitionUpdateManager()->needsUpdates();, which itself calls useCaches(FALSE)/useCaches(TRUE), which has the side effect of dropping any entity handlers including their static caches.Can we keep that behavior for the new approach as well? I've actually used those methods in projects before exactly for that purpose (clearing the entity static cache), so if that doesn't work anymore then those projects would get into troubles until the the memory cache includes an LRU or some other fixed-size/purging mechanism.
Comment #171
alexpott@amateescu my question is really how come HEAD passes - roles are being statically cached there too.
@Berdir ah yeah... that's what I was asking - thanks!
Comment #172
amateescu commented@Berdir, very good point! Let's not break BC unnecessarily.
Comment #173
catch#172 is a very nice fix. Since it's a one line change and this was already RTBC, I think it's OK for me to put it back to RTBC again.
Comment #174
alexpottUpdating review credits. Crediting @pounard, @fago, @alexpott, @fubhy, and @hchonov for comments that affected the patch or made contributions to the discussion.
Comment #175
alexpottCommitted d8029d7 and pushed to 8.6.x. Thanks!
Comment #177
catchOooh it's great seeing this go over the line, thanks all!
Comment #178
amateescu commentedYay! #2924218: Clearing the persistent entity cache every time we switch between workspaces is super wasteful is now blocked only on #2928888: Add a hook_entity_preload() for modules that need to load a different revision than the default one :)
Comment #179
tacituseu commented+ assert('\Drupal\Component\Assertion\Inspector::assertAllStrings($tags)', 'Cache Tags must be strings.');Makes it fail on 7.2.x with
assert(): Calling assert() with a string argument is deprecated, see:https://www.drupal.org/pift-ci-job/978615
https://www.drupal.org/pift-ci-job/978614
Comment #181
alexpottCommitted 1e37fc8 and pushed to 8.6.x. Thanks!
I just pushed a follow-up quick fix by copying \Drupal\Core\Cache\ApcuBackend::set()
Comment #182
tacituseu commentedYikes, haven't attached the patch to my comment on purpose because it felt wrong piggybacking like that on a 6 year old awesome issue I had no part in ;).
Comment #183
alexpott@tacituseu it felt more important to just fix it. And since you reported I credited you on the commit but not this issue.
Comment #184
mxh commentedSince storage handlers are basically free to choose their own memory cache backend, shouldn't
EntityStorageInterfacebe complemented with a::getStaticCache()method or at least have a::resetStaticCache()method? Same requirement applies for the persistent cache too.Comment #185
mxh commentedAttaching a patch which can be applied to 8.5, as this might be urgent for production.
Comment #186
berdirAs a workaround on 8.5.x, you can use $entity_type_manager->useCaches(FALSE) and then again useCaches(TRUE), that resets the entity handlers, you can get a new one and it will have a reset static cache (that assumes that it won't be injected somewhere, but it will at most have the double static cache if you're doing this in a long loop.
Not opposed to having a method on the storage for resetting a single entity type's static cache, I'd suggest you open an issue.
Comment #187
mxh commented#186 Thanks a lot, you saved my day.