Problem/Motivation

Long running processes - migrations and any big drush command can run into memory issues, especially with entity static caching.

Proposed resolution

#375494: Restrict the number of nodes held in the node_load() static cache added a memory cache - putting the 'static' cache into a service that can be swapped out, and cleared independently from entity persistent caching. This will allow drush's memory reclaim logic to work the same way that it did in Drupal 7 (when it cleared drupal_static).

However we can go a step or two further, and add an LRU implementation of the memory cache API, which holds a maximum number of items and cycles out once it goes over. This would allow drush or migrate to replace the service, and just load entities as much as it likes without requiring manual memory reclamation.

Remaining tasks

User interface changes

API changes

Data model changes

#375494: Restrict the number of nodes held in the node_load() static cache adds an LRU cache for the entity cache, I wanted to see if we could make a generic in-memory LRU cache that could maybe be used elsewhere.

Turns out to be reasonably simple to write. I haven't benchmarked this or anything yet, just wanted to get something that works first. It does not yet allow for adding multiple items to the cache at once, that may be a useful feature to add for the entity case.

Patch only adds the class and a unit test, but marking CNR for the bot and hopefully feedback.

CommentFileSizeAuthor
#139 1199866-nr-bot.txt2.39 KBneeds-review-queue-bot
#137 1199866-nr-bot.txt2.62 KBneeds-review-queue-bot
#105 interdiff-1199866-104-105.txt1.76 KBjeroent
#105 1199866-105.patch7.71 KBjeroent
#104 interdiff-1199866-103-104.txt2.12 KBjeroent
#104 1199866-104.patch7.71 KBjeroent
#103 interdiff-1199866-95-103.txt1.17 KBjeroent
#103 1199866-103.patch7.38 KBjeroent
#95 interdiff_72-95.txt1.08 KBheddn
#95 1199866-95.patch7.42 KBheddn
#72 1199866-72.patch7.5 KBcatch
#72 interdiff-1199866-71-72.txt1.25 KBcatch
#71 interdiff-1199866-60-71.txt1.71 KBcatch
#71 1199866-71.patch7.55 KBcatch
#60 1199866-60.patch7.38 KBcatch
#60 interdiff-1199866-59-60.txt629 bytescatch
#59 interdiff-1199866-56-58.txt1.5 KBcatch
#58 1199866-58.patch7.38 KBcatch
#58 interdiff-1199866-56-58.txt42.08 KBcatch
#56 1199866-54.patch7.18 KBcatch
#54 interdiff-1199866-50-54.txt5.97 KBcatch
#54 1199866.patch35.82 KBcatch
#50 1199866-50.patch8.4 KBcatch
#47 1199866-47.patch34.71 KBjofitz
#43 interdiff.txt4.16 KBdawehner
#37 1199866.patch35.82 KBcatch
#37 interdiff.txt3.96 KBcatch
#35 1199866-combined.patch34.15 KBcatch
#32 1199866-combined.patch34.15 KBcatch
#32 1199866-do-not-test.patch6.73 KBcatch
#21 13-21-interdiff.txt400 bytesalexpott
#21 1199866-21-lru-cache-backend.patch6.88 KBalexpott
#13 1199866-13-lru-cache-backend.patch6.8 KBalexpott
#10 1199866-10-lru-cache-backend.patch5.54 KBalexpott
#8 1199866-8-lru-cache-backend.patch9.92 KBalexpott
#6 1199866-6-lru-cache-backend.patch6.7 KBalexpott
lru.patch4.31 KBcatch

Issue fork drupal-1199866

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

franz’s picture

Just tested, seems to be working neatly.

// Replace the item in first position in the array.

I was mislead by this comment to believe the item was being replaced with the new one - which is not the case and would be nonsense. Maybe something like "Drop the first item - least recently used - to make room for the new item." would be clearer.

So currently entity controllers do like '$cache += $array_of_items;' which is not supported by ArrayObject. I guess a simple method that gets the items array as argument and iterates over them adding could do for now, right? Unless there is some performance impact in adding one by one.

catch’s picture

I've been pondering multiple add but haven't yet figure it out.

There's two issues:

- iterating is probably going to be a bit slower than +=, although since it only happens on (static) cache misses it might be fine.

- there is a chance that more items get added to the cache than it has capacity for - say you have a 30 limit but a list of 50 items all loaded at once.

franz’s picture

I thought about the second issue. And if you merge both - iterating a number of items above limit - we would most likely get a performance issue because of all the removals.

We could have a private property "$this->bypass" that defaults to FALSE. If it is true, offsetSet() would only call parent::offsetSet(). So this property can be used by an addMultiple() method so that it would handle removals and additions to $this->positions[] on itself and only once, avoiding endless iterations.

Concerning the second issue again, a simple solution would be to just cache a chunk of the maximum size, could be the last n items of the loaded items.

pounard’s picture

Maybe this should be implemented using the cache backend interface: we could then have an in memory-lru working implementation ready to be used there #1596472: Replace hard coded static cache of entities with cache backends, which would then make #375494: Restrict the number of nodes held in the node_load() static cache a duplicate that should be closed.

One stone, two birds.

catch’s picture

Yep, beat me to it :)

alexpott’s picture

StatusFileSize
new6.7 KB

Moved catch's LRU implementation to a cache backend.

There's more work to do sort out getMultiple, deletePrefix, expire, garbageCollection, invalidateTags...

pounard’s picture

This should probably implement the full cache backend interface, I see some missing methods. [EDIT: Sometime I read too quickly sorry.] This also probably would mean it'd need to be decoupled from the LRU class and use its own implementation instead (because of advanced features such as tags etc)?

alexpott’s picture

StatusFileSize
new9.92 KB

Merged work from #1637478: Add a PHP array cache backend to fully implement CacheBackend

pounard’s picture

I think you are doing the work in the wrong way, we should first finish the work of #1637478: Add a PHP array cache backend then continue this issue on a sane basis?

alexpott’s picture

StatusFileSize
new5.54 KB

Just for fun and because I'm not always sane :) ... here's is an implementation that extends Pounard's latest patch on #1637478: Add a PHP array cache backend. This will fail tests as that issue is not committed yet.

pounard’s picture

You're as sane as me, just the code from the other issue definitely wasn't.

Status: Needs review » Needs work

The last submitted patch, 1199866-10-lru-cache-backend.patch, failed testing.

alexpott’s picture

StatusFileSize
new6.8 KB

This patch is a reroll and extends the GenericCacheBackendUnitTestBase from #1637478: Add a PHP array cache backend

pounard’s picture

Serious stuff can now happen, good to see this.

alexpott’s picture

Status: Needs work » Needs review

Go testbot go!!!!

catch’s picture

We should use straight ArrayAccess for the LRU cache, doesn't need to be an ArrayObject I think, can't remember why I did that in the first place.

Or do we want to go as far as inlining the logic into the backend itself?

Thanks for keeping this updated!

lars toomre’s picture

FYI ... LRUArrayObject.php file is missing @file docblock at the top of the source code.

pounard’s picture

I'm OK with straight array access, but even better we could just implement it as a normal cache backend and use it into the CacheChain as soon as it can land. Then, the LRU usage would be fully transparent and its interface as a clear as the cache interface itself.

alexpott’s picture

Status: Needs review » Needs work

@pounard it is

+++ b/core/lib/Drupal/Core/Cache/MemoryLRUBackend.phpundefined
@@ -0,0 +1,95 @@
+class MemoryLRUBackend extends MemoryBackend {

a CacheBackend :) - because MemoryBackend is

alexpott’s picture

Status: Needs work » Needs review

dreditor changed the status... not fair.

alexpott’s picture

StatusFileSize
new6.88 KB
new400 bytes

If I change it to ArrayAccess I get the following error:

Fatal error: Class Drupal\Component\Utility\LRUArrayObject cannot extend from interface ArrayAccess in /Users/alex/dev/sites/drupal8alt.dev/core/lib/Drupal/Component/Utility/LRUArrayObject.php on line 70

Implemented #17

catch’s picture

Should just be implements instead of extends?

catch’s picture

Hmm not quite though, we might need to add a bit more that ArrayObject gives us for free currently, but shouldn't be a big change I'd hope.

alexpott’s picture

It's extending the new MemoryBackend so it doesn't have to have exactly the same functions... otherwise it'd be implementing the CacheBackend interface.

fubhy’s picture

This might be a really stupid question but does it maybe make sense to go with memory_get_usage() and restrict the cache by memory usage instead of slots? The slots don't really resemble the actual memory usage usually.

fabianx’s picture

#25: That IMHO would only work if several items are added to one slot, but the problem then is that those can be updated independently so I don't see how this could work.

catch’s picture

For actual memory usage I've been wanting to try using weak references (http://php.net/manual/en/book.weakref.php), I don't really see a way to take into account the memory limit without something like that. It's a PECL module so would need to be contrib though.

fubhy’s picture

We would have to track the memory usage of an item when we add it to the cache... Like so:

$before = memory_get_usage();
add_item_to_cache($item);
$after = memory_get_usage();
$usage_of_item = $after - $before;

Will get a bit more tricky once we start kicking out items once we hit the limit. But yeah, it's possible by simply using memory_get_usage(). The only question is how well that would perform in general.

fabianx’s picture

Issue summary: View changes
Status: Needs review » Needs work

So what is nice about #28 is that we could say:

- Store only 8 MB of node objects in the static node cache, but allow a minimum of 8 slots.
- It is getting a little tricky if a reference is dropped however, as the node's object memory would be 0 effectively at time of insertion

But if we assume that its cached when loaded fresh, that should work reasonabiy well and with low memory it

To the patch itself:

Should unset() not update the positions array as well, to free up slots?

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

catch’s picture

Version: 8.1.x-dev » 8.3.x-dev
Status: Needs work » Needs review
Parent issue: » #1596472: Replace hard coded static cache of entities with cache backends
StatusFileSize
new6.73 KB
new34.15 KB

Here's a re-roll just under four years later. No point doing an interdiff.

Depends on #1596472: Replace hard coded static cache of entities with cache backends.

Adds a new class LruStaticCache which extends from StaticCache. This implements the same LRU logic as the previous patches. Added support for invalidate/invalidateTags etc. and new unit tests.

Next step would be to swap the service definition for static_cache to use the LRU version.

To see if this works we really want a script that creates then loads a few thousand nodes recording the memory usage before/after.

Status: Needs review » Needs work

The last submitted patch, 32: 1199866-combined.patch, failed testing.

catch’s picture

Status: Needs work » Needs review
catch’s picture

StatusFileSize
new34.15 KB

Unit test is green.

Here's a patch that uses this for the entity cache, if the whole test suite passes then memory test should be viable.

dawehner’s picture

  1. +++ b/core/lib/Drupal/Core/Cache/StaticCache/LruStaticCache.php
    @@ -0,0 +1,164 @@
    +    // If there are still available slots, decrement the counter.
    +    if ($this->availableSlots) {
    +      $this->availableSlots--;
    +    }
    +    // Otherwise remove the least recently used item.
    +    else {
    +      $key = array_shift($this->positions);
    +      unset($this->cache[$key]);
    +    }
    

    What happens if you set a value which is already in the slots itself? Couldn't you optimize to not store one value twice?

  2. +++ b/core/lib/Drupal/Core/Cache/StaticCache/LruStaticCache.php
    @@ -0,0 +1,164 @@
    +    $this->availableSlots++;
    +    $current_position = array_search($cid, $this->positions);
    +    unset($this->positions[$current_position]);
    +    parent::delete($cid);
    

    What happens if the item doesn't exist yet. Shouldn't we check whether there is an item first?

  3. +++ b/core/lib/Drupal/Core/Cache/StaticCache/StaticCache.php
    @@ -0,0 +1,65 @@
    +/**
    + * Defines a static cache implementation.
    + *
    + * Stores cache items in memory using a PHP array.
    + *
    + * @ingroup cache
    + */
    +class StaticCache extends MemoryBackend implements StaticCacheInterface {
    

    It would be nice to explain the difference between MemoryBackend and StaticCache

  4. +++ b/core/lib/Drupal/Core/Config/Entity/ConfigEntityStorage.php
    @@ -104,9 +105,12 @@ class ConfigEntityStorage extends EntityStorageBase implements ConfigEntityStora
    +    $static_cache = isset($static_cache) ? $static_cache : \Drupal::service('static_cache');
    
    +++ b/core/lib/Drupal/Core/Entity/ContentEntityStorageBase.php
    @@ -43,9 +44,12 @@
    +    $static_cache = isset($static_cache) ? $static_cache : \Drupal::service('static_cache');
    
    +++ b/core/lib/Drupal/Core/Entity/EntityStorageBase.php
    @@ -73,18 +74,28 @@
    +  public function __construct(EntityTypeInterface $entity_type, StaticCacheInterface $static_cache = NULL) {
    ...
    +    $this->staticCache = isset($static_cache) ? $static_cache : \Drupal::service('static_cache');
    

    Do we need the BC in every of those 3 levels?

catch’s picture

StatusFileSize
new3.96 KB
new35.82 KB

Patch should fix the first three points of #36, those were bugs in the original patch from four years ago too. Added some extra test coverage for the first case.

With #4, you could extend from any one of those classes individually, so I think we do?

dawehner’s picture

With #4, you could extend from any one of those classes individually, so I think we do?

Won't ConfigEntityStorage and ContentEntityStorageBase fallback to EntityStorageBase, given that we pass along $static_cache to it?

Status: Needs review » Needs work

The last submitted patch, 37: 1199866.patch, failed testing.

dawehner’s picture

Is this really a jenkins issue?

09:03:13 Slave went offline during the build
09:03:13 ERROR: Connection was broken: java.io.EOFException
09:03:13 	at org.jenkinsci.remoting.nio.NioChannelHub$3.run(NioChannelHub.java:613)
09:03:13 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
09:03:13 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
09:03:13 	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
09:03:13 	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
09:03:13 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
09:03:13 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
09:03:13 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
09:03:13 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
09:03:13 	at java.lang.Thread.run(Thread.java:745)
09:03:13 

The last submitted patch, 37: 1199866.patch, failed testing.

catch’s picture

Won't ConfigEntityStorage and ContentEntityStorageBase fallback to EntityStorageBase, given that we pass along $static_cache to it?

If you instantiate ConfigEntityStorage (which we do in unit tests), then you want to pass in the dependencies to the constructor. Without the unit test updates to pass it in, the unit tests fail on the service not being available in the \Drupal call. Not sure exactly what you think can be ditched here tbh.

dawehner’s picture

StatusFileSize
new4.16 KB

I was simply thinking of this ...

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

andypost’s picture

Issue tags: +Needs reroll
jofitz’s picture

Status: Needs work » Needs review
Issue tags: -Needs reroll
StatusFileSize
new34.71 KB

Re-rolled.

Status: Needs review » Needs work

The last submitted patch, 47: 1199866-47.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

catch’s picture

Status: Needs work » Needs review
StatusFileSize
new8.4 KB
catch’s picture

Issue summary: View changes
amateescu’s picture

I recently found this implementation of a in-memory LRU cache: https://github.com/aws/aws-sdk-php/blob/master/src/LruArrayCache.php and it seems a bit simpler than the one from the current patch. We would simply work on the \Drupal\Core\Cache\MemoryBackend::$cache variable, without the need for $positions and $availableSlots. What do you think?

catch’s picture

That seems much simpler, but I'm sure there was a reason we didn't go for that approach in the first place though. Wondering if it's consistency between numeric and string cache IDs?

Maybe there wasn't a very good reason for the current approach though. We have the test coverage here so it should be easy enough to verify.

catch’s picture

StatusFileSize
new35.82 KB
new5.97 KB

OK so I can only think of two reasons for the current complexity, neither are good:

1. The current code avoids using count(), but the extra complexity involved is not worth it and it's only needed on set anyway.

2. The ::invalidate() logic needs some extra care because an array_unshift() on a numeric array will reindex the array, but we can use array_reverse() to put an item at the beginning of an array without a re-index.

Broke the tests a couple of times when implementing this, so slightly improved the test assertions to make debugging easier.

So here's a new patch which is closer to the implementation in https://github.com/aws/aws-sdk-php/blob/master/src/LruArrayCache.php

$ diffstat interdiff-1199866-50-54.txt 
 lib/Drupal/Core/Cache/MemoryCache/LruMemoryCache.php |  115 ++++---------------
 tests/Drupal/Tests/Core/Cache/LruMemoryCacheTest.php |   11 +
 2 files changed, 34 insertions(+), 92 deletions(-)

Status: Needs review » Needs work

The last submitted patch, 54: 1199866.patch, failed testing. View results

catch’s picture

Status: Needs work » Needs review
StatusFileSize
new7.18 KB

interdiff was right, patch was the wrong file :(

amateescu’s picture

Nice, that looks much better!

Just a few more nitpicks left:

  1. +++ b/core/lib/Drupal/Core/Cache/MemoryCache/LruMemoryCache.php
    @@ -0,0 +1,111 @@
    +use Drupal\Core\Cache\Cache;
    +
    +
    +/**
    
    +++ b/core/tests/Drupal/Tests/Core/Cache/LruMemoryCacheTest.php
    @@ -0,0 +1,120 @@
    +  }
    +
    +
    +}
    

    Extra empty lines here :)

  2. +++ b/core/tests/Drupal/Tests/Core/Cache/LruMemoryCacheTest.php
    @@ -0,0 +1,120 @@
    +   * Test getting, setting and deleting items from the static cache.
    +   */
    +  public function testGetSetDelete() {
    

    We should also specify which methods are covered by this test.

  3. +++ b/core/tests/Drupal/Tests/Core/Cache/LruMemoryCacheTest.php
    @@ -0,0 +1,120 @@
    +  public function testInvalidate() {
    

    Should have

    @covers ::invalidate
    @covers ::invalidateMultiple
    

Also, I don't see any test coverage for invalidateTags(). Do you think it would be useful or not really?

catch’s picture

StatusFileSize
new42.08 KB
new7.38 KB

Fixed 1-3.

Invalidate tags... to me the combination of the existing coverage for MemoryBackend::invalidateTags() and the coverage for ::invalidate() here seems like enough, we'd be testing one method calls another method in terms of actual logic.

On the other hand it wouldn't hurt. Did not include that for now.

catch’s picture

StatusFileSize
new1.5 KB

interdiff without the cruft.

catch’s picture

StatusFileSize
new629 bytes
new7.38 KB

grrr

amateescu’s picture

Status: Needs review » Reviewed & tested by the community

I don't feel strongly about covering invalidateTags(), so I think this is ready :)

hchonov’s picture

+++ b/core/lib/Drupal/Core/Cache/MemoryCache/LruMemoryCache.php
@@ -0,0 +1,110 @@
+    $diff = count($this->cache) - $this->allowedSlots;
+    // Remove items from the cache entry until it's back to the allowed number
+    // of slots.
+    if ($diff > 0) {
...
+      while ($diff) {
...
+        $diff--;

Under which circumstances could diff be bigger than 1?

The last submitted patch, 58: 1199866-58.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

alexpott’s picture

Status: Reviewed & tested by the community » Needs work
+++ b/core/lib/Drupal/Core/Cache/MemoryCache/LruMemoryCache.php
@@ -0,0 +1,110 @@
+   *   The number of slots to allocate for items in the cache.

(Optional) - I'd default this to 100 and always set it. That way we'd lose a condition and the default value is exposed by IDEs when you construct.

catch’s picture

@hchonov so it should be zero, but that reminded me we should probably handle the case where there is a setMultiple() of more items than the allowed number of items. Do we want to temporarily go over the limit (but enforce it next time there's a set) or enforce it even if it means immediately discarding things?

@alexpott that makes sense.

berdir’s picture

* Do we have an issue where we're actually going to use this for the entity cache (there's the one about node_load() mentioned, we could use that I guess?)
* Would 100 make sense for the entity cache.. we have to consider that this is for all entity types in the system AFAIK, so 100 isn't *that* much, especially if we start to statically cache config entities more than we do now? If not, should increase the default and/or expose this as a container parameter/service argument (possibly both?)
* Issue summary says this hasn't been profiled yet, do we need to do that? Or will we profile this instead when used in action as the entity cache backend?

catch’s picture

For the entity cache the limit ought to be at least 300 since that's the maximum default comments per page, and we started static caching them in 8.x iirc.

However I'm not convinced we want to use this for the entity cache by default, I was thinking more that drush and similar could replace the service specifically for long-running operations.

dawehner’s picture

  1. +++ b/core/lib/Drupal/Core/Cache/MemoryCache/LruMemoryCache.php
    @@ -0,0 +1,110 @@
    +      // Move the item to the end of the array.
    +      unset($this->cache[$cid]);
    +      $this->cache[$cid] = $cached;
    

    How about expand the documentation to tell why we are dong that. "Move the item to the end of the array, so it'll be cleaned up last".

  2. +++ b/core/lib/Drupal/Core/Cache/MemoryCache/LruMemoryCache.php
    @@ -0,0 +1,110 @@
    +    // Remove items from the cache entry until it's back to the allowed number
    +    // of slots.
    +    if ($diff > 0) {
    +      reset($this->cache);
    +      while ($diff) {
    +        unset($this->cache[key($this->cache)]);
    +        next($this->cache);
    +        $diff--;
    +      }
    +    }
    

    May I ask why this code cannot just be using array_slice?

ndobromirov’s picture

Array slice makes a copy of the array for big arrays it could be a memory issue but array_splice could work-around that though. Overall agree with #68.2

amateescu’s picture

We can't use array_splice() either because it doesn't preserve numeric keys.

catch’s picture

Status: Needs work » Needs review
StatusFileSize
new7.55 KB
new1.71 KB

This addresses #64 and #68 (see interdiff). I've made the default limit 300, since we static cache comments and 300 comments are allowed per page via configuration.

It does not address hchonov's #62.

Here's what I think we should do, but would like a second opinion before making the change either way:

At the moment the patch allows the number of items to be progressively reduced down to the limit when an item is set. However we have no way to end up going over the limit, so the while loop there is redundant. It could be simplified to "one in, one out".

There's an alternative though, which is to allow a setMultiple() to add all the items passed in to the cache (even if it's 1000). We could do this by overriding setMultiple, temporarily setting the items limit to the count of items passed in, then setting it back at the end. Then the next item added to the cache would bring things down to the limit again. I was tempted by this originally, but leaning towards just sticking to the limit and simplifying the code - next comment will have that patch.

catch’s picture

StatusFileSize
new1.25 KB
new7.5 KB

OK here's the simplified version, removes the while loop, slightly amends the comment but retains pointing out why we don't use the various array_*() functions to take the item off.

amateescu’s picture

I think going over the limit could lead to out of memory errors, right? So I agree that we should keep it simple for now and stick to the limit until we have some evidence that going over the limit is actually useful at some point :)

The last submitted patch, 71: 1199866-71.patch, failed testing. View results

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

grahl’s picture

This looks great, what's missing for it to be added?

ghost of drupal past’s picture

Status: Needs review » Needs work

Thanks, this is awesome and useful. Two possible enhancements:

  1. +++ b/core/lib/Drupal/Core/Cache/MemoryCache/LruMemoryCache.php
    @@ -0,0 +1,109 @@
    +      reset($this->cache);
    +      unset($this->cache[key($this->cache)]);
    

    https://www.php.net/manual/en/function.array-key-first.php it has a polyfill, too.

  2. +++ b/core/lib/Drupal/Core/Cache/MemoryCache/LruMemoryCache.php
    @@ -0,0 +1,109 @@
    +      unset($this->cache[$cid]);
    +      $reversed = array_reverse($this->cache);
    +      $reversed[$cid] = $item;
    +      $this->cache = array_reverse($reversed);
    

    If I gather correctly you want to move the item to end and it is already unset. What about $this->cache += [$cid => $item] instead of the double array_reverse https://3v4l.org/82ZV8

hchonov’s picture

+++ b/core/lib/Drupal/Core/Cache/MemoryCache/LruMemoryCache.php
@@ -0,0 +1,109 @@
+      reset($this->cache);
+      unset($this->cache[key($this->cache)]);

https://www.php.net/manual/en/function.array-key-first.php it has a polyfill, too.

array_key_first() is only available for PHP 7 >= 7.3.0.

------

+  public function invalidate($cid) {
+    parent::invalidate($cid);
+    // Move the item to the least recently used position if it's not already
+    // there. This cannot use array_unshift() because it would reindex an array
+    // with numeric cache IDs.
+    if (isset($this->cache[$cid])) {
+      $item = $this->cache[$cid];
+      unset($this->cache[$cid]);
+      $reversed = array_reverse($this->cache);
+      $reversed[$cid] = $item;
+      $this->cache = array_reverse($reversed);
+    }
+  }

Why invalidating an entity makes it the least recently used one? If the invalidated entity is loaded by setting $allow_invalid = TRUE, then it will become the most recently used one. Therefore I don't really see a reason for the overhead of reordering the entities when they are being invalidated. Yes, one could argue that we want to remove first the invalidated entities when there is no place left, but then this isn't pure LRU implemention anymore and I can see people complaining about this inconsistency. I would rather leave the invalidation part away from the LRU implemention.

ghost of drupal past’s picture

Sorry I didn't elaborate enough: array_key_first is 7.3.0 but that's okay because the manual gives us a very simple polyfill and if we add said polyfill then all code under Drupal can enjoy that useful feature before 7.3 is required and don't need to change once it is. (As a footnote, I really hope we won't rush with 7.3 requirement, that serialize business is very problematic.)

vijaycs85’s picture

If I gather correctly you want to move the item to end and it is already unset. What about $this->cache += [$cid => $item] instead of the double array_reverse https://3v4l.org/82ZV8

I am not sure if it is true. here is current vs +=: https://3v4l.org/JYCGa

ghost of drupal past’s picture

Ah. Thanks for the demo. $this->cache = [$cid => $item] + $this->cache is still better than two array_reverses :) bonus: you don't need the unset. https://3v4l.org/6uEX1

vijaycs85’s picture

#82: +1. nice!

fgm’s picture

Regarding #67 / #67 for the number of keys, lots of D8 sites are using Paragraphs, which tends to cause an explosion of the number of entities actually loaded on node loads. So 100, or even 300, is probably erring on the low side. I'd likely add at least one zero, especially if this is only an optionally switchable service for drush and other non-default environments, which have higher memory resources than the minimal web-oriented defaults.

vijaycs85’s picture

+++ b/core/lib/Drupal/Core/Cache/MemoryCache/LruMemoryCache.php
@@ -0,0 +1,109 @@
+  protected $allowedSlots = 100;

100 could be removed here as constructor has default 300.

ndobromirov’s picture

Is there a new service that will allow custom code to use that? For example cache.memory.lru. I was not able to see a service definition on the patch.

What are the plans to use the LRU for, except the discussed entity cache?

Have you considered the new PHP's (7.4) weak references?
This way GC will handle the clean-up when there is memory pressure.

andypost’s picture

@ndobromirov nice point about php version, it sounds like needs polyfill but mostly reminds me about "hot cache preload" discussions years ago, means it very probably will be implemented in contrib first like chainedFast and so on

ghost of drupal past’s picture

While WeakReference would be nice, I can't imagine how that could be polyfilled so I think that is not useful for core for the time being (7.4 is not even out yet).

catch’s picture

@ndobromirov on weakreferences, this issue is a spin-off from #1237636-7: Lazy load multiple entities at a time using fibers and I wanted to do a weakreferences version since the start, but had been waiting for the non-weakref version (i.e. this patch) to land before discussing this. Cool that it's gone from an rfc to PECL to core in the intervening eight years though. We should open an actual follow-up although contrib might be a better option until we require PHP 7.4 (which would be Drupal 9 at the earliest).

entity caching is the only real use-case we have at the moment, although we've started using the memory cache for other services. See #3047289: Standardize how we implement in-memory caches and #2488350: Switch to a memory cache backend during installation, so we'll potentially have more use-cases soon.

@chx - we could maybe bring in this polyfill? It seems a bit overkill for just this patch but it would be very useful in general. https://github.com/symfony/polyfill-php73. Other changes look good too.

@hchonov that was the idea behind invalidate, but I'm fine with removing that hunk from the patch.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

andypost’s picture

Follow up for weak references probably should go into 9.1

catch’s picture

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

heddn’s picture

Status: Needs work » Needs review
StatusFileSize
new7.42 KB
new1.08 KB

Does minimal changes to respond to the latest feedback on the patch. Doesn't address concerns with invalidate from @hchonov.

andypost’s picture

@catch there's implementation in symfony which already in dependencies https://github.com/symfony/polyfill/issues/228#issuecomment-591843988

heddn’s picture

We also depend on 7.3 at this point in the drupal core development lifecycle, so even better.

catch’s picture

It looks like the weakmap polyfill didn't make it into Symfony, although it's available for PHP 7.4 as its own library. Either way, nine years later have opened #3190992: Add a WeakReference memory cache implementation. Now that Drupal 10 gives us an actual timeline for requiring PHP 8, we could potentially jump over this issue and just do weakref/weakmap instead.

catch’s picture

andypost’s picture

neclimdul’s picture

Status: Needs review » Needs work

Since we're back here I did a review. Everything applied cleanly since its just new classes

I think I echo a lot of other reviews in that it feels like the array mangling could be optimized but pretty sure it would need to be decoupled from the MemoryCache and add a lot of complexity. I wouldn't hold up committing it on that since there's no core usage and its sat long enough. We can work within the interface with faster implementations if someone comes up with something clever.

+++ b/core/lib/Drupal/Core/Cache/MemoryCache/LruMemoryCache.php
@@ -0,0 +1,106 @@
+      return parent::set($cid, $data, $expire, $tags);
+    }
+    $return = parent::set($cid, $data, $expire, $tags);

These are void methods so we don't need to worry about the returns. I don't see any tests of the returns so easy fix.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

jeroent’s picture

Status: Needs work » Needs review
StatusFileSize
new7.38 KB
new1.17 KB

I removed the returns as mentioned in #101.

jeroent’s picture

StatusFileSize
new7.71 KB
new2.12 KB
jeroent’s picture

StatusFileSize
new7.71 KB
new1.76 KB
mxh’s picture

Status: Needs review » Needs work

Thanks @JeroenT for applying the suggestions. I reviewed #105 and it looks that we're on a good track here.

Some notes:

1.

<?php
+  protected $allowedSlots = 100;
[...]
+  public function __construct(int $slots = 300) {
[...]
?>

Can we please be more consistent here and at least have one default value? I'm fine with removing the default value of the property here.

I'd recommend to see that added LRU in-memory cache only as an additional goodie that site owners may use to replace existing caches, as they are available as services and thus replaceable as such anyway. Changing existing ones could mean major side effects on existing installations, and those who had problems with memory overflows, they mostly have it solved already on their own, also maybe using a custom strategy that works well for their site, but the LRU could destroy it.

2.

<?php
$diff = count($this->cache) - $this->allowedSlots;
?>

Guess this was already discussed, still the count doesn't feel right to me (unless PHP is already heavily optimized when counting arrays, then it's fine). I'm thinking of having large lists with super-tiny items, where I'd prefer an increment property and safely rely on it (I mean the cache property is protected, so what's the deal with not having a simple incement?).

neclimdul’s picture

re #106
1. I noticed that too. I think you just don't provide the default on the property if its always set in the constructor.
2. Its a function call but my understanding (and reading of the source just now) is that the function is just going to return the value of a property on the zval. A lot of time we try to avoid count() and try to use empty() because of that function call but being on the write side it seems appropriate here.

mxh’s picture

I'd agree when the count would happen within setMultiple, but it's being called whenever a single item is being added. It's probably only of a minor concern, but would be great to have it as efficient as possible right away so that it doesn't hurt also when handling lots of (tiny) items.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

andypost’s picture

Does it still makes sense?

robin.houtevelts’s picture

I think we are missing a return statement in the if-condition.

<php
+++ b/core/lib/Drupal/Core/Cache/MemoryCache/LruMemoryCache.php
+    // If the item is already in the cache, just move it to the first position.
+    if (isset($this->cache[$cid])) {
+      unset($this->cache[$cid]);
+      parent::set($cid, $data, $expire, $tags);
+    }
+    parent::set($cid, $data, $expire, $tags);

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

gapple made their first commit to this issue’s fork.

gapple’s picture

Updated the patch from #105 in a MR

- Move allowedSlots property declaration to constructor; replaces mismatched default value on property (#106)
- Added missing early return in set() (#113)
- Override setMultiple() to only count() extra items once before removal (#108)
- Code style fixes

jweowu’s picture

Just cross-referencing with another old issue: https://www.drupal.org/project/drupal/issues/1961934#comment-7274730

Back when I raised that I was seeing a pattern of entity cache problems, and observed that a LRU approach was similarly problematic in the circumstances in question ("With LRU and limited memory, you can easily throw out everything you want to keep"), so I proposed a way of providing hints to the cache to enable it to be smarter.

I haven't tried to catch up with what's currently happening in this issue, so apologies in advance if it's not relevant any more; but seeing as how there's current activity on this LRU caching issue, I thought I'd add a link in case it was still a going concern.

The example scenario was:

* You are looping through a potentially-large list of entities (e.g. apache solr doing an indexing cron run).
* Each entity must be loaded for indexing, but you have no reason to expect anyone else to be requesting it in the near future.
* Conversely, the entity load cache is full of things which users have been actively requesting, so we want to retain all of those entities.
* So for each indexing loop iteration we want to ensure that the indexed entity only ends up in the load cache if it was already there beforehand. Any freshly-loaded entities in the indexing loop should be discarded from memory, uncached; because despite being "more recently used" we don't need them -- caching them all would merely push out a lot of much-more-useful cache entries.

alexpott’s picture

@jweowu we're talking about static memory caches here so what other users are doing is not really relevant. A static memory cache is empty at the start of every request.

jweowu’s picture

Very true, "other users" are only relevant to the persistent caches (I was considering both in the other issue).

I don't believe that renders the notion irrelevant, though. After all, no cache-expiry scheme has any advantage over any other if the cache doesn't fill up. We're only concerned about what happens when, for whatever reason, the cache fills up, and my suggestion is that giving the programmer some control over what should be flushed and what should be kept in those situations would be an improvement, and that a simple LRU approach isn't always going to be the best approach.

catch’s picture

Would also add the original issue described for Drupal 7 was fixed for Drupal 8+ in this issue #1596472: Replace hard coded static cache of entities with cache backends - which is what allows this one to happen cleanly e.g. the entity static cache can already be reset independently of the persistent cache.

Adding a reference to #2577417: Add an entity iterator to load entities in chunks too which would help when iterating over a very large number of entities.

alexpott’s picture

Status: Needs work » Needs review

Pushing on this one again. The entity static cache filling up on long running jobs is an issue I run into quite often work around and then when I hit it again I get sad that we're get to fix this.

catch’s picture

Status: Needs review » Reviewed & tested by the community

I opened this issue, but the last time I worked on the actual code here was 7 years ago, so I think I can RTBC it.

kristiaanvandeneynde’s picture

I'll have a quick look myself to see if I can +1 this.

kristiaanvandeneynde’s picture

Status: Reviewed & tested by the community » Needs work

Okay so the code will work, I really like the thorough test coverage and the fact that we uncovered MemoryCache itself didn't have a test extending GenericCacheBackendUnitTestBase yet. Only remark I have in general is that some methods are dedicated to numeric keys while other methods cover both string and numeric keys in one go. But don't let that detract from the overall excellent coverage.

I found a tiny nit regarding the word pidgin and think the test could benefit from renaming the assertion and an extra comment or two. This is why I'm setting back to NW.

The comment about moving the shared logic I'll leave up to you to decide, as MemoryBackend also seems to copy-paste things across methods occasionally. So I wouldn't be too bothered if we copy-paste a few lines of code here too. So feel free to RTBC without addressing this.

Now that this introduces another memory cache, would it be prudent to add a line to MemoryCache and LruMemoryCache's docblocks to tell people they should declare these bins using the cache.bin.memory tag?

alexpott’s picture

Status: Needs work » Needs review

I fixed 3/4 of the comments on the MR. There was one I didn;'t think needed doing.

Re the cache.bin.memory tag - interestingly the entity.memory_cache does not use this tag and it's not really managed like other cache bins or memory cache bins. I think we should consider adding documentation about somewhere but I don't think it should be in this issue. I also think we could consider defaulting to a LRU cache for all memory caches.

kristiaanvandeneynde’s picture

Status: Needs review » Reviewed & tested by the community

One final nit, but can be accepted upon commit.

I agree that we should create a follow-up for checking if we have sufficient documentation re cache.bin.memory and why entity storages don't use it like that.

alexpott’s picture

I tested this code on a long running batch process from the Entity Usage module - to build the entity_usage table from a large site 30,000+ nodes and 200000+ paragraphs using a module I made with this code - see https://www.drupal.org/project/entity_lru_cache - and it makes a massive difference. Memory no longer spirals to 90% of VM I'm running the job in and everything is a bit quicker too.

kristiaanvandeneynde’s picture

Oh wow, that's great. We should consider making it the default then, no?

alexpott’s picture

nicxvan’s picture

I'm running a migration that usually takes 40 hours ish.
As a control I first am running it with attemptMemoryReclaim commented out on Drupal 11.1.0

I will add info from ps aux --sort -rss |grep 'drush' too periodically.
I will update periodically.
It's several migrations chained
11280/81350 8min 43s/1hr 2min 24MiB
24038/81350 22min 37s/1hr 15min 26MiB
41000/81350 40min/1hr 19min 26.5MiB
Memory 23.3% VSZ690000 RSS107000

@alexpott on Slack said I should see a difference almost immediately, I will kill it and compare after adding this:

composer require drupal/entity_lru_cache
drush en entity_lru_cache -y

I am running it with attemptMemoryReclaim commented out on Drupal 11.1.0 and the above module enabled.
11000/81350 7min 5s/52min 30s 24Mib
20.7%
25000/81350 22min 8s/1hr 11min 26.0MiB
22.4%
71000/ 1hr 10min/1hr 18min 31MiB
22.3%

Gonna wait for some of the bigger migrations.

We've gotten to the bigger migration, this is with the LRU cache
1682/1146871 21min/10days3hrs 710MiB
82%
2134/1146871 41min/15days 12hrs 732MiB
91.8%
Ok I killed it when it got to 27 days.

I am rerunning 11.x base just the long migration.

alexpott’s picture

Discussed sizing a bit @catch and we agreed to remove the default as any code that uses a LRU cache should consider the number of slots for the use-case.

nicxvan’s picture

Since my last comment I ran the migration on production with and without the LRU cache. Both took 39 hours.

@alexpott mentioned this is a positive result since it's the same, this update will either improve or do nothing, my migration helps prove if it's not needed it does not make the situation worse.

andypost’s picture

chi’s picture

Wonder if we could go with LFU cache instead. Will the implementation be more complex?

catch’s picture

LFU would be much more complex yes, you have to keep a count of how often an item has been accessed and then evict the least frequently accessed item. And the current implementation is being careful to add as little CPU overhead as possible.

In general on a normal web request, we've set the limits high enough that items should never be evicted, so it will only kick in for migrations and similar tasks, or pages that load an extreme number of entities for some reason.

needs-review-queue-bot’s picture

Status: Reviewed & tested by the community » Needs work
StatusFileSize
new2.62 KB

The Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".

This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

alexpott’s picture

Status: Needs work » Reviewed & tested by the community

Fixed the PHPStan issue.

needs-review-queue-bot’s picture

Status: Reviewed & tested by the community » Needs work
StatusFileSize
new2.39 KB

The Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".

This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

kristiaanvandeneynde’s picture

Status: Needs work » Reviewed & tested by the community

Fixed the sniffer error using a suggestion.

catch’s picture

Tagging for release highlights, would just be a sentence in the performance improvements section but this is a good step towards making long-running processes easier to manage.

nod_ made their first commit to this issue’s fork.

  • nod_ committed 4a3fbdbb on 11.x
    Issue #1199866 by alexpott, catch, gapple, jeroent, heddn,...

nod_’s picture

Status: Reviewed & tested by the community » Fixed

Committed 4a3fbdb and pushed to 11.x. Thanks!

catch’s picture

Really nice to get this in nearly 14 years after opening it!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.