Problem/Motivation
Long running processes - migrations and any big drush command can run into memory issues, especially with entity static caching.
Proposed resolution
#375494: Restrict the number of nodes held in the node_load() static cache added a memory cache - putting the 'static' cache into a service that can be swapped out, and cleared independently from entity persistent caching. This will allow drush's memory reclaim logic to work the same way that it did in Drupal 7 (when it cleared drupal_static).
However we can go a step or two further, and add an LRU implementation of the memory cache API, which holds a maximum number of items and cycles out once it goes over. This would allow drush or migrate to replace the service, and just load entities as much as it likes without requiring manual memory reclamation.
Remaining tasks
User interface changes
API changes
Data model changes
#375494: Restrict the number of nodes held in the node_load() static cache adds an LRU cache for the entity cache, I wanted to see if we could make a generic in-memory LRU cache that could maybe be used elsewhere.
Turns out to be reasonably simple to write. I haven't benchmarked this or anything yet, just wanted to get something that works first. It does not yet allow for adding multiple items to the cache at once, that may be a useful feature to add for the entity case.
Patch only adds the class and a unit test, but marking CNR for the bot and hopefully feedback.
| Comment | File | Size | Author |
|---|---|---|---|
| #139 | 1199866-nr-bot.txt | 2.39 KB | needs-review-queue-bot |
| #137 | 1199866-nr-bot.txt | 2.62 KB | needs-review-queue-bot |
Issue fork drupal-1199866
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
- 1199866-memory-lru-cache
changes, plain diff MR !10336
Comments
Comment #1
franzJust tested, seems to be working neatly.
I was mislead by this comment to believe the item was being replaced with the new one - which is not the case and would be nonsense. Maybe something like "Drop the first item - least recently used - to make room for the new item." would be clearer.
So currently entity controllers do like '$cache += $array_of_items;' which is not supported by ArrayObject. I guess a simple method that gets the items array as argument and iterates over them adding could do for now, right? Unless there is some performance impact in adding one by one.
Comment #2
catchI've been pondering multiple add but haven't yet figure it out.
There's two issues:
- iterating is probably going to be a bit slower than +=, although since it only happens on (static) cache misses it might be fine.
- there is a chance that more items get added to the cache than it has capacity for - say you have a 30 limit but a list of 50 items all loaded at once.
Comment #3
franzI thought about the second issue. And if you merge both - iterating a number of items above limit - we would most likely get a performance issue because of all the removals.
We could have a private property "$this->bypass" that defaults to FALSE. If it is true, offsetSet() would only call parent::offsetSet(). So this property can be used by an addMultiple() method so that it would handle removals and additions to $this->positions[] on itself and only once, avoiding endless iterations.
Concerning the second issue again, a simple solution would be to just cache a chunk of the maximum size, could be the last n items of the loaded items.
Comment #4
pounardMaybe this should be implemented using the cache backend interface: we could then have an in memory-lru working implementation ready to be used there #1596472: Replace hard coded static cache of entities with cache backends, which would then make #375494: Restrict the number of nodes held in the node_load() static cache a duplicate that should be closed.
One stone, two birds.
Comment #5
catchYep, beat me to it :)
Comment #6
alexpottMoved catch's LRU implementation to a cache backend.
There's more work to do sort out getMultiple, deletePrefix, expire, garbageCollection, invalidateTags...
Comment #7
pounardThis should probably implement the full cache backend interface, I see some missing methods.[EDIT: Sometime I read too quickly sorry.] This also probably would mean it'd need to be decoupled from the LRU class and use its own implementation instead (because of advanced features such as tags etc)?Comment #8
alexpottMerged work from #1637478: Add a PHP array cache backend to fully implement CacheBackend
Comment #9
pounardI think you are doing the work in the wrong way, we should first finish the work of #1637478: Add a PHP array cache backend then continue this issue on a sane basis?
Comment #10
alexpottJust for fun and because I'm not always sane :) ... here's is an implementation that extends Pounard's latest patch on #1637478: Add a PHP array cache backend. This will fail tests as that issue is not committed yet.
Comment #11
pounardYou're as sane as me, just the code from the other issue definitely wasn't.
Comment #13
alexpottThis patch is a reroll and extends the
GenericCacheBackendUnitTestBasefrom #1637478: Add a PHP array cache backendComment #14
pounardSerious stuff can now happen, good to see this.
Comment #15
alexpottGo testbot go!!!!
Comment #16
catchWe should use straight ArrayAccess for the LRU cache, doesn't need to be an ArrayObject I think, can't remember why I did that in the first place.
Or do we want to go as far as inlining the logic into the backend itself?
Thanks for keeping this updated!
Comment #17
lars toomre commentedFYI ... LRUArrayObject.php file is missing @file docblock at the top of the source code.
Comment #18
pounardI'm OK with straight array access, but even better we could just implement it as a normal cache backend and use it into the CacheChain as soon as it can land. Then, the LRU usage would be fully transparent and its interface as a clear as the cache interface itself.
Comment #19
alexpott@pounard it is
a CacheBackend :) - because MemoryBackend is
Comment #20
alexpottdreditor changed the status... not fair.
Comment #21
alexpottIf I change it to ArrayAccess I get the following error:
Implemented #17
Comment #22
catchShould just be implements instead of extends?
Comment #23
catchHmm not quite though, we might need to add a bit more that ArrayObject gives us for free currently, but shouldn't be a big change I'd hope.
Comment #24
alexpottIt's extending the new MemoryBackend so it doesn't have to have exactly the same functions... otherwise it'd be implementing the CacheBackend interface.
Comment #25
fubhy commentedThis might be a really stupid question but does it maybe make sense to go with memory_get_usage() and restrict the cache by memory usage instead of slots? The slots don't really resemble the actual memory usage usually.
Comment #26
fabianx commented#25: That IMHO would only work if several items are added to one slot, but the problem then is that those can be updated independently so I don't see how this could work.
Comment #27
catchFor actual memory usage I've been wanting to try using weak references (http://php.net/manual/en/book.weakref.php), I don't really see a way to take into account the memory limit without something like that. It's a PECL module so would need to be contrib though.
Comment #28
fubhy commentedWe would have to track the memory usage of an item when we add it to the cache... Like so:
Will get a bit more tricky once we start kicking out items once we hit the limit. But yeah, it's possible by simply using memory_get_usage(). The only question is how well that would perform in general.
Comment #30
fabianx commentedSo what is nice about #28 is that we could say:
- Store only 8 MB of node objects in the static node cache, but allow a minimum of 8 slots.
- It is getting a little tricky if a reference is dropped however, as the node's object memory would be 0 effectively at time of insertion
But if we assume that its cached when loaded fresh, that should work reasonabiy well and with low memory it
To the patch itself:
Should unset() not update the positions array as well, to free up slots?
Comment #32
catchHere's a re-roll just under four years later. No point doing an interdiff.
Depends on #1596472: Replace hard coded static cache of entities with cache backends.
Adds a new class LruStaticCache which extends from StaticCache. This implements the same LRU logic as the previous patches. Added support for invalidate/invalidateTags etc. and new unit tests.
Next step would be to swap the service definition for static_cache to use the LRU version.
To see if this works we really want a script that creates then loads a few thousand nodes recording the memory usage before/after.
Comment #34
catchComment #35
catchUnit test is green.
Here's a patch that uses this for the entity cache, if the whole test suite passes then memory test should be viable.
Comment #36
dawehnerWhat happens if you set a value which is already in the slots itself? Couldn't you optimize to not store one value twice?
What happens if the item doesn't exist yet. Shouldn't we check whether there is an item first?
It would be nice to explain the difference between MemoryBackend and StaticCache
Do we need the BC in every of those 3 levels?
Comment #37
catchPatch should fix the first three points of #36, those were bugs in the original patch from four years ago too. Added some extra test coverage for the first case.
With #4, you could extend from any one of those classes individually, so I think we do?
Comment #38
dawehnerWon't ConfigEntityStorage and ContentEntityStorageBase fallback to EntityStorageBase, given that we pass along
$static_cacheto it?Comment #40
dawehnerIs this really a jenkins issue?
Comment #42
catchIf you instantiate ConfigEntityStorage (which we do in unit tests), then you want to pass in the dependencies to the constructor. Without the unit test updates to pass it in, the unit tests fail on the service not being available in the \Drupal call. Not sure exactly what you think can be ditched here tbh.
Comment #43
dawehnerI was simply thinking of this ...
Comment #46
andypostComment #47
jofitzRe-rolled.
Comment #50
catch#1596472: Replace hard coded static cache of entities with cache backends is in. Here's a (hit and hope) re-roll.
Comment #51
catchComment #52
amateescu commentedI recently found this implementation of a in-memory LRU cache: https://github.com/aws/aws-sdk-php/blob/master/src/LruArrayCache.php and it seems a bit simpler than the one from the current patch. We would simply work on the
\Drupal\Core\Cache\MemoryBackend::$cachevariable, without the need for$positionsand$availableSlots. What do you think?Comment #53
catchThat seems much simpler, but I'm sure there was a reason we didn't go for that approach in the first place though. Wondering if it's consistency between numeric and string cache IDs?
Maybe there wasn't a very good reason for the current approach though. We have the test coverage here so it should be easy enough to verify.
Comment #54
catchOK so I can only think of two reasons for the current complexity, neither are good:
1. The current code avoids using count(), but the extra complexity involved is not worth it and it's only needed on set anyway.
2. The ::invalidate() logic needs some extra care because an array_unshift() on a numeric array will reindex the array, but we can use array_reverse() to put an item at the beginning of an array without a re-index.
Broke the tests a couple of times when implementing this, so slightly improved the test assertions to make debugging easier.
So here's a new patch which is closer to the implementation in https://github.com/aws/aws-sdk-php/blob/master/src/LruArrayCache.php
Comment #56
catchinterdiff was right, patch was the wrong file :(
Comment #57
amateescu commentedNice, that looks much better!
Just a few more nitpicks left:
Extra empty lines here :)
We should also specify which methods are covered by this test.
Should have
Also, I don't see any test coverage for
invalidateTags(). Do you think it would be useful or not really?Comment #58
catchFixed 1-3.
Invalidate tags... to me the combination of the existing coverage for MemoryBackend::invalidateTags() and the coverage for ::invalidate() here seems like enough, we'd be testing one method calls another method in terms of actual logic.
On the other hand it wouldn't hurt. Did not include that for now.
Comment #59
catchinterdiff without the cruft.
Comment #60
catchgrrr
Comment #61
amateescu commentedI don't feel strongly about covering
invalidateTags(), so I think this is ready :)Comment #62
hchonovUnder which circumstances could diff be bigger than 1?
Comment #64
alexpott(Optional) - I'd default this to 100 and always set it. That way we'd lose a condition and the default value is exposed by IDEs when you construct.
Comment #65
catch@hchonov so it should be zero, but that reminded me we should probably handle the case where there is a setMultiple() of more items than the allowed number of items. Do we want to temporarily go over the limit (but enforce it next time there's a set) or enforce it even if it means immediately discarding things?
@alexpott that makes sense.
Comment #66
berdir* Do we have an issue where we're actually going to use this for the entity cache (there's the one about node_load() mentioned, we could use that I guess?)
* Would 100 make sense for the entity cache.. we have to consider that this is for all entity types in the system AFAIK, so 100 isn't *that* much, especially if we start to statically cache config entities more than we do now? If not, should increase the default and/or expose this as a container parameter/service argument (possibly both?)
* Issue summary says this hasn't been profiled yet, do we need to do that? Or will we profile this instead when used in action as the entity cache backend?
Comment #67
catchFor the entity cache the limit ought to be at least 300 since that's the maximum default comments per page, and we started static caching them in 8.x iirc.
However I'm not convinced we want to use this for the entity cache by default, I was thinking more that drush and similar could replace the service specifically for long-running operations.
Comment #68
dawehnerHow about expand the documentation to tell why we are dong that. "Move the item to the end of the array, so it'll be cleaned up last".
May I ask why this code cannot just be using array_slice?
Comment #69
ndobromirov commentedArray slice makes a copy of the array for big arrays it could be a memory issue but
array_splicecould work-around that though. Overall agree with #68.2Comment #70
amateescu commentedWe can't use
array_splice()either because it doesn't preserve numeric keys.Comment #71
catchThis addresses #64 and #68 (see interdiff). I've made the default limit 300, since we static cache comments and 300 comments are allowed per page via configuration.
It does not address hchonov's #62.
Here's what I think we should do, but would like a second opinion before making the change either way:
At the moment the patch allows the number of items to be progressively reduced down to the limit when an item is set. However we have no way to end up going over the limit, so the while loop there is redundant. It could be simplified to "one in, one out".
There's an alternative though, which is to allow a setMultiple() to add all the items passed in to the cache (even if it's 1000). We could do this by overriding setMultiple, temporarily setting the items limit to the count of items passed in, then setting it back at the end. Then the next item added to the cache would bring things down to the limit again. I was tempted by this originally, but leaning towards just sticking to the limit and simplifying the code - next comment will have that patch.
Comment #72
catchOK here's the simplified version, removes the while loop, slightly amends the comment but retains pointing out why we don't use the various array_*() functions to take the item off.
Comment #73
amateescu commentedI think going over the limit could lead to out of memory errors, right? So I agree that we should keep it simple for now and stick to the limit until we have some evidence that going over the limit is actually useful at some point :)
Comment #77
grahlThis looks great, what's missing for it to be added?
Comment #78
ghost of drupal pastThanks, this is awesome and useful. Two possible enhancements:
https://www.php.net/manual/en/function.array-key-first.php it has a polyfill, too.
If I gather correctly you want to move the item to end and it is already
unset. What about$this->cache += [$cid => $item]instead of the doublearray_reversehttps://3v4l.org/82ZV8Comment #79
hchonovarray_key_first()is only available for PHP 7 >= 7.3.0.------
Why invalidating an entity makes it the least recently used one? If the invalidated entity is loaded by setting
$allow_invalid = TRUE, then it will become the most recently used one. Therefore I don't really see a reason for the overhead of reordering the entities when they are being invalidated. Yes, one could argue that we want to remove first the invalidated entities when there is no place left, but then this isn't pure LRU implemention anymore and I can see people complaining about this inconsistency. I would rather leave the invalidation part away from the LRU implemention.Comment #80
ghost of drupal pastSorry I didn't elaborate enough: array_key_first is 7.3.0 but that's okay because the manual gives us a very simple polyfill and if we add said polyfill then all code under Drupal can enjoy that useful feature before 7.3 is required and don't need to change once it is. (As a footnote, I really hope we won't rush with 7.3 requirement, that serialize business is very problematic.)
Comment #81
vijaycs85I am not sure if it is true. here is current vs
+=: https://3v4l.org/JYCGaComment #82
ghost of drupal pastAh. Thanks for the demo.
$this->cache = [$cid => $item] + $this->cacheis still better than two array_reverses :) bonus: you don't need the unset. https://3v4l.org/6uEX1Comment #83
vijaycs85#82: +1. nice!
Comment #84
fgmRegarding #67 / #67 for the number of keys, lots of D8 sites are using Paragraphs, which tends to cause an explosion of the number of entities actually loaded on node loads. So 100, or even 300, is probably erring on the low side. I'd likely add at least one zero, especially if this is only an optionally switchable service for drush and other non-default environments, which have higher memory resources than the minimal web-oriented defaults.
Comment #85
vijaycs85100 could be removed here as constructor has default 300.
Comment #86
ndobromirov commentedIs there a new service that will allow custom code to use that? For example
cache.memory.lru. I was not able to see a service definition on the patch.What are the plans to use the LRU for, except the discussed entity cache?
Have you considered the new PHP's (7.4) weak references?
This way GC will handle the clean-up when there is memory pressure.
Comment #87
andypost@ndobromirov nice point about php version, it sounds like needs polyfill but mostly reminds me about "hot cache preload" discussions years ago, means it very probably will be implemented in contrib first like chainedFast and so on
Comment #88
ghost of drupal pastWhile
WeakReferencewould be nice, I can't imagine how that could be polyfilled so I think that is not useful for core for the time being (7.4 is not even out yet).Comment #89
catch@ndobromirov on weakreferences, this issue is a spin-off from #1237636-7: Lazy load multiple entities at a time using fibers and I wanted to do a weakreferences version since the start, but had been waiting for the non-weakref version (i.e. this patch) to land before discussing this. Cool that it's gone from an rfc to PECL to core in the intervening eight years though. We should open an actual follow-up although contrib might be a better option until we require PHP 7.4 (which would be Drupal 9 at the earliest).
entity caching is the only real use-case we have at the moment, although we've started using the memory cache for other services. See #3047289: Standardize how we implement in-memory caches and #2488350: Switch to a memory cache backend during installation, so we'll potentially have more use-cases soon.
@chx - we could maybe bring in this polyfill? It seems a bit overkill for just this patch but it would be very useful in general. https://github.com/symfony/polyfill-php73. Other changes look good too.
@hchonov that was the idea behind invalidate, but I'm fine with removing that hunk from the patch.
Comment #91
andypostFollow up for weak references probably should go into 9.1
Comment #92
catchPHP 8 will have weakmap: https://www.php.net/manual/en/class.weakmap.php
Comment #95
heddnDoes minimal changes to respond to the latest feedback on the patch. Doesn't address concerns with invalidate from @hchonov.
Comment #96
andypost@catch there's implementation in symfony which already in dependencies https://github.com/symfony/polyfill/issues/228#issuecomment-591843988
Comment #97
heddnWe also depend on 7.3 at this point in the drupal core development lifecycle, so even better.
Comment #98
catchIt looks like the weakmap polyfill didn't make it into Symfony, although it's available for PHP 7.4 as its own library. Either way, nine years later have opened #3190992: Add a WeakReference memory cache implementation. Now that Drupal 10 gives us an actual timeline for requiring PHP 8, we could potentially jump over this issue and just do weakref/weakmap instead.
Comment #99
catchComment #100
andypostIn 8.1 under discussion [RFC] CachedIterable (rewindable, allows any key&repeating keys)
Comment #101
neclimdulSince we're back here I did a review. Everything applied cleanly since its just new classes
I think I echo a lot of other reviews in that it feels like the array mangling could be optimized but pretty sure it would need to be decoupled from the MemoryCache and add a lot of complexity. I wouldn't hold up committing it on that since there's no core usage and its sat long enough. We can work within the interface with faster implementations if someone comes up with something clever.
These are void methods so we don't need to worry about the returns. I don't see any tests of the returns so easy fix.
Comment #103
jeroentI removed the returns as mentioned in #101.
Comment #104
jeroentComment #105
jeroentComment #106
mxh commentedThanks @JeroenT for applying the suggestions. I reviewed #105 and it looks that we're on a good track here.
Some notes:
1.
Can we please be more consistent here and at least have one default value? I'm fine with removing the default value of the property here.
I'd recommend to see that added LRU in-memory cache only as an additional goodie that site owners may use to replace existing caches, as they are available as services and thus replaceable as such anyway. Changing existing ones could mean major side effects on existing installations, and those who had problems with memory overflows, they mostly have it solved already on their own, also maybe using a custom strategy that works well for their site, but the LRU could destroy it.
2.
Guess this was already discussed, still the
countdoesn't feel right to me (unless PHP is already heavily optimized when counting arrays, then it's fine). I'm thinking of having large lists with super-tiny items, where I'd prefer an increment property and safely rely on it (I mean the cache property is protected, so what's the deal with not having a simple incement?).Comment #107
neclimdulre #106
1. I noticed that too. I think you just don't provide the default on the property if its always set in the constructor.
2. Its a function call but my understanding (and reading of the source just now) is that the function is just going to return the value of a property on the zval. A lot of time we try to avoid count() and try to use empty() because of that function call but being on the write side it seems appropriate here.
Comment #108
mxh commentedI'd agree when the
countwould happen withinsetMultiple, but it's being called whenever a single item is being added. It's probably only of a minor concern, but would be great to have it as efficient as possible right away so that it doesn't hurt also when handling lots of (tiny) items.Comment #112
andypostDoes it still makes sense?
Comment #113
robin.houtevelts commentedI think we are missing a return statement in the if-condition.
Comment #117
gappleUpdated the patch from #105 in a MR
- Move
allowedSlotsproperty declaration to constructor; replaces mismatched default value on property (#106)- Added missing early return in
set()(#113)- Override
setMultiple()to onlycount()extra items once before removal (#108)- Code style fixes
Comment #118
jweowu commentedJust cross-referencing with another old issue: https://www.drupal.org/project/drupal/issues/1961934#comment-7274730
Back when I raised that I was seeing a pattern of entity cache problems, and observed that a LRU approach was similarly problematic in the circumstances in question ("With LRU and limited memory, you can easily throw out everything you want to keep"), so I proposed a way of providing hints to the cache to enable it to be smarter.
I haven't tried to catch up with what's currently happening in this issue, so apologies in advance if it's not relevant any more; but seeing as how there's current activity on this LRU caching issue, I thought I'd add a link in case it was still a going concern.
The example scenario was:
* You are looping through a potentially-large list of entities (e.g. apache solr doing an indexing cron run).
* Each entity must be loaded for indexing, but you have no reason to expect anyone else to be requesting it in the near future.
* Conversely, the entity load cache is full of things which users have been actively requesting, so we want to retain all of those entities.
* So for each indexing loop iteration we want to ensure that the indexed entity only ends up in the load cache if it was already there beforehand. Any freshly-loaded entities in the indexing loop should be discarded from memory, uncached; because despite being "more recently used" we don't need them -- caching them all would merely push out a lot of much-more-useful cache entries.
Comment #119
alexpott@jweowu we're talking about static memory caches here so what other users are doing is not really relevant. A static memory cache is empty at the start of every request.
Comment #120
jweowu commentedVery true, "other users" are only relevant to the persistent caches (I was considering both in the other issue).
I don't believe that renders the notion irrelevant, though. After all, no cache-expiry scheme has any advantage over any other if the cache doesn't fill up. We're only concerned about what happens when, for whatever reason, the cache fills up, and my suggestion is that giving the programmer some control over what should be flushed and what should be kept in those situations would be an improvement, and that a simple LRU approach isn't always going to be the best approach.
Comment #121
catchWould also add the original issue described for Drupal 7 was fixed for Drupal 8+ in this issue #1596472: Replace hard coded static cache of entities with cache backends - which is what allows this one to happen cleanly e.g. the entity static cache can already be reset independently of the persistent cache.
Adding a reference to #2577417: Add an entity iterator to load entities in chunks too which would help when iterating over a very large number of entities.
Comment #122
alexpottPushing on this one again. The entity static cache filling up on long running jobs is an issue I run into quite often work around and then when I hit it again I get sad that we're get to fix this.
Comment #123
catchI opened this issue, but the last time I worked on the actual code here was 7 years ago, so I think I can RTBC it.
Comment #124
kristiaanvandeneyndeI'll have a quick look myself to see if I can +1 this.
Comment #125
kristiaanvandeneyndeOkay so the code will work, I really like the thorough test coverage and the fact that we uncovered MemoryCache itself didn't have a test extending GenericCacheBackendUnitTestBase yet. Only remark I have in general is that some methods are dedicated to numeric keys while other methods cover both string and numeric keys in one go. But don't let that detract from the overall excellent coverage.
I found a tiny nit regarding the word pidgin and think the test could benefit from renaming the assertion and an extra comment or two. This is why I'm setting back to NW.
The comment about moving the shared logic I'll leave up to you to decide, as MemoryBackend also seems to copy-paste things across methods occasionally. So I wouldn't be too bothered if we copy-paste a few lines of code here too. So feel free to RTBC without addressing this.
Now that this introduces another memory cache, would it be prudent to add a line to MemoryCache and LruMemoryCache's docblocks to tell people they should declare these bins using the cache.bin.memory tag?
Comment #126
alexpottI fixed 3/4 of the comments on the MR. There was one I didn;'t think needed doing.
Re the cache.bin.memory tag - interestingly the entity.memory_cache does not use this tag and it's not really managed like other cache bins or memory cache bins. I think we should consider adding documentation about somewhere but I don't think it should be in this issue. I also think we could consider defaulting to a LRU cache for all memory caches.
Comment #127
kristiaanvandeneyndeOne final nit, but can be accepted upon commit.
I agree that we should create a follow-up for checking if we have sufficient documentation re cache.bin.memory and why entity storages don't use it like that.
Comment #128
alexpottI tested this code on a long running batch process from the Entity Usage module - to build the entity_usage table from a large site 30,000+ nodes and 200000+ paragraphs using a module I made with this code - see https://www.drupal.org/project/entity_lru_cache - and it makes a massive difference. Memory no longer spirals to 90% of VM I'm running the job in and everything is a bit quicker too.
Comment #129
kristiaanvandeneyndeOh wow, that's great. We should consider making it the default then, no?
Comment #130
alexpott@kristiaanvandeneynde see #3498154: Use LRU Cache for static entity cache
Comment #131
nicxvan commentedI'm running a migration that usually takes 40 hours ish.
As a control I first am running it with
attemptMemoryReclaimcommented out on Drupal 11.1.0I will add info from
ps aux --sort -rss |grep 'drush'too periodically.I will update periodically.
It's several migrations chained
11280/81350 8min 43s/1hr 2min 24MiB
24038/81350 22min 37s/1hr 15min 26MiB
41000/81350 40min/1hr 19min 26.5MiB
Memory 23.3% VSZ690000 RSS107000
@alexpott on Slack said I should see a difference almost immediately, I will kill it and compare after adding this:
I am running it with
attemptMemoryReclaimcommented out on Drupal 11.1.0 and the above module enabled.11000/81350 7min 5s/52min 30s 24Mib
20.7%
25000/81350 22min 8s/1hr 11min 26.0MiB
22.4%
71000/ 1hr 10min/1hr 18min 31MiB
22.3%
Gonna wait for some of the bigger migrations.
We've gotten to the bigger migration, this is with the LRU cache
1682/1146871 21min/10days3hrs 710MiB
82%
2134/1146871 41min/15days 12hrs 732MiB
91.8%
Ok I killed it when it got to 27 days.
I am rerunning 11.x base just the long migration.
Comment #132
alexpottDiscussed sizing a bit @catch and we agreed to remove the default as any code that uses a LRU cache should consider the number of slots for the use-case.
Comment #133
nicxvan commentedSince my last comment I ran the migration on production with and without the LRU cache. Both took 39 hours.
@alexpott mentioned this is a positive result since it's the same, this update will either improve or do nothing, my migration helps prove if it's not needed it does not make the situation worse.
Comment #134
andypostComment #135
chi commentedWonder if we could go with LFU cache instead. Will the implementation be more complex?
Comment #136
catchLFU would be much more complex yes, you have to keep a count of how often an item has been accessed and then evict the least frequently accessed item. And the current implementation is being careful to add as little CPU overhead as possible.
In general on a normal web request, we've set the limits high enough that items should never be evicted, so it will only kick in for migrations and similar tasks, or pages that load an extreme number of entities for some reason.
Comment #137
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #138
alexpottFixed the PHPStan issue.
Comment #139
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #140
kristiaanvandeneyndeFixed the sniffer error using a suggestion.
Comment #141
catchTagging for release highlights, would just be a sentence in the performance improvements section but this is a good step towards making long-running processes easier to manage.
Comment #145
nod_Committed 4a3fbdb and pushed to 11.x. Thanks!
Comment #146
catchReally nice to get this in nearly 14 years after opening it!