Under normal circimstances caching every node into the static cache is ok because usually at a guess only 10-20 nodes would be loaded on every page request.

However when doing conversions or in situations when a lot of nodes needs to be loaded at one time this can cause a WSOD because of memory issues.

With the #233091: Move code from update.php into includes/update.inc patch which will allow upgrades to be done from the shell instead of via the browser which can take much longer it is quite possible to load all of the nodes in a system into memory.

This change implements a basic less used removed first purging policy to drop nodes which are just taking up memory and not to be used. Under normal situations this should not be done, but in extreme use cases to will keep the memory useage by the node_load() at a reasonable level.

I am not sure if 20 is a good or bad number, but for testing the purging it was much better to have a lower number to test. I have a couple of drupal_set_messages() which need to be removed, but this to see how often the cache is being purged as it should not be done too much.

Files: 
CommentFileSizeAuthor
#34 lru-375494-34.patch5.25 KBmikeryan
PASSED: [[SimpleTest]]: [MySQL] 29,436 pass(es). View
#31 lru-2.patch5.21 KBdhthwy
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch lru-2.patch. View
#24 lru.patch5.17 KBmoshe weitzman
FAILED: [[SimpleTest]]: [MySQL] 18,757 pass(es), 1 fail(s), and 0 exception(es). View
#21 cache_lru.patch1.74 KBcatch
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch cache_lru_0.patch. View
#19 cache_lru.patch1.74 KBcatch
FAILED: [[SimpleTest]]: [MySQL] 18,134 pass(es), 12 fail(s), and 8,988 exception(es). View
#13 0001-Restrict-caching-of-Nodes-to-reduce-memory-375494.patch4.73 KBgordon
Failed: Failed to apply patch. View
#7 0001-Restrict-caching-of-Nodes-to-reduce-memory-375494.patch4.83 KBgordon
Passed: 12421 passes, 0 fails, 0 exceptions View
#4 node_load_cache.patch1.73 KBgordon
Failed: Failed to apply patch. View
node_load_cache.patch1.73 KBgordon
Failed: 10267 passes, 1 fail, 0 exceptions View

Comments

kbahey’s picture

I support this for the very reason you mentioned. Running any module with a large number of nodes, fails with blowing PHP's memory limit.

The node module is to blame, but equally the function taxonomy_get_tree() and taxonomy_term_count_nodes() are big culprits. So perhaps you can include them in your patch?

gordon’s picture

I don't think they should be done here. taxonomy_get_tree() is a bad one mainly because it loads the entire tree even if you only want a part of it. I have played with this in the past which would build the tree from where you asked for it, so if you were 3 levels down it would reduce the size of the memory needed, esp useful when you have 120000+ terms.

I think we should set these up as different issues. Make it easier to get them into core.

Status: Needs review » Needs work

The last submitted patch failed testing.

gordon’s picture

Status: Needs work » Needs review
FileSize
1.73 KB
Failed: Failed to apply patch. View

Cleaned up patch to work again.

Berdir’s picture

Status: Needs review » Needs work

A few minor things...

+    $nodes+= $queried_nodes;

- Missing whitespace...

// TODO: These messages need to be removed before this is committed.
drupal_set_message(format_plural(count($node_cache)-NODE_LOAD_MAX_CACHE, 'Purge @count node from static cache', 'Purge @count nodes from static cache'), 'warning');

- You shouldn't add stuff to patches that shouldn't be commited :)

- Whouldn't array_slce() be easier to use, given that we only need one chunk? http://ch.php.net/manual/en/function.array-slice.php

gordon’s picture

thanks I will post another patch.

The main reason for the bits to remove was to help stimulate conversation over what NODE_LOAD_MAX_CACHE should be set to. But I will put it to 30, as I think this is something that any more will be just taking up memory for the sake of it.

gordon’s picture

Status: Needs work » Needs review
FileSize
4.83 KB
Passed: 12421 passes, 0 fails, 0 exceptions View

I have fixed up all the issues as listed above.

I also have written a couple of new tests to confirm the new restrictions on the size of the node cache is preforming as expected.

catch’s picture

Had been discussing a similar thing with Damien recently but I don't think there's an issue yet. This could also be centralized for all first class objects if #460320: Standarized, pluggable entity loading (nodes, users, taxonomy, files, comments) gets in.

Damien Tournoud’s picture

Indeed, we discussed with catch implementing an in-memory LRU cache. I even did some testing, but failed to open an issue yet.

moshe weitzman’s picture

If one is doing strange things like iterating over all nodes, then one can clear the cache every so often. once might argue that this is too specialized for core. i'm not very opinionated either way.

gordon’s picture

Yes you can do this, but sometime you don't know how many are loaded, as you could have places where nodes are loading nodes. Also sometimes you don't know.

Also with using things like panels/views a lay-person can easily do massive queries on the page quite easy it will help to keep down the memory footprint.

Gordon

moshe weitzman’s picture

Status: Needs review » Needs work

OK, I remove my objections. This is a good idea.

I would love to change the constant to a variable_get() so that sites can change the cache size as needed. A precedent for that is variable_get('session_write_interval', 180) in _drupal_session_write(). I will RTBC after that change.

If the entity loader gets in first, this should move into there.

gordon’s picture

Status: Needs work » Needs review
FileSize
4.73 KB
Failed: Failed to apply patch. View

I have made the change so that this is a variable_get() and can be set via $conf

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

catch’s picture

This should now go into the entity.inc static caching (which is based on node_load_multiple()) so ought to be more or less cut and paste. Only question then is do we restrict the number held per entity type - you might easily have 300 taxonomy terms on a page, compared to 300 nodes for example.

moshe weitzman’s picture

Anyone up for a reroll? I think 100 items is a nice number. Could be lower I suppose.

catch’s picture

Component: node system » base system

entity.inc needs the patch now.

catch’s picture

Status: Needs work » Needs review
FileSize
1.74 KB
FAILED: [[SimpleTest]]: [MySQL] 18,134 pass(es), 12 fail(s), and 8,988 exception(es). View

Just ran into a huge memory leak when running devel_generate() on article nodes, due to a node_load() in file_field_update(). Here's a re-roll minus the tests putting this in entity.inc just to see what the test bot says.

Status: Needs review » Needs work

The last submitted patch, cache_lru.patch, failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
1.74 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch cache_lru_0.patch. View

Minus the notreallycamelcase.

moshe weitzman’s picture

#21: cache_lru.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, cache_lru.patch, failed testing.

moshe weitzman’s picture

Status: Needs work » Needs review
FileSize
5.17 KB
FAILED: [[SimpleTest]]: [MySQL] 18,757 pass(es), 1 fail(s), and 0 exception(es). View

Rerolled with tests and even better docs. I could use some help on the remaining test failure. Seems like it is a real bug in the LRU code, not in the test.

Status: Needs review » Needs work

The last submitted patch, lru.patch, failed testing.

catch’s picture

Looked at this for a little while with + += array_merge() and array_merge_recursive(), as far as I can see if there's a numeric index, inserting new items will always insert them in numeric order with any of those rather than prepend/append the entire array. Didn't break it down to verify this yet.

moshe weitzman’s picture

Would be good to get this in soon.

webel’s picture

Am very interested in a Drupal 6.16 version of this.

I have a large command line script for generating an offline version of a Drupal site with thousands of nodes.

Despite all other memory reduction strategies, the memory usage just grows and grows until the limit is hit,
and as far as I can tell node_load is the culprit.

webel’s picture

Perhaps a little of topic, but I am using the following and nevertheless memory grows with every node_load:

   $revision = null;
   $reset_cache = true;
   $node = node_load($nid,$revision,$reset_cache);

Grateful for any feedback.

webel’s picture

Moved my questions to support request: #813590: Require assistance with memory usage in node_load

dhthwy’s picture

Status: Needs work » Needs review
FileSize
5.21 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch lru-2.patch. View

an LRU sounds cool.

Previous patch's test failed because the popular node was only cached once, so it never got a chance to get into the front when loaded repeatedly. We need to cacheSet entities even if they're already in the cache for LRU to work right? Headsmacking oversight I think...

retester2010’s picture

#31: lru-2.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, lru-2.patch, failed testing.

mikeryan’s picture

Version: 7.x-dev » 8.x-dev
Status: Needs work » Needs review
FileSize
5.25 KB
PASSED: [[SimpleTest]]: [MySQL] 29,436 pass(es). View

Re-rolled for the 8.x branch.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

Looks proper to me. Nice tests.

fago’s picture

Status: Reviewed & tested by the community » Needs work
-  protected $entityCache;
+  public $entityCache;

Why that?

--- a/includes/common.inc
+++ b/includes/common.inc
@@ -7287,6 +7287,7 @@ function entity_get_info($entity_type = NULL) {
           'fieldable' => FALSE,
           'controller class' => 'DrupalDefaultEntityController',
           'static cache' => TRUE,
+          'static cache max size' => 100,

If we add a new entity info key, we also need to add docs for it.

   /**
-   * Stores entities in the static entity cache.
+   * Stores entities in the static entity cache. Least recently used (LRU) are
+   * purged in order to honor 'static cache max size' property. Use
+   * hook_entity_info_alter() to change this property.

As to the coding standards, the first line of the comment block should be a full sentence.

+      // Add entities to the cache if we are not loading a revision. We need to
+      // cacheSet entities in the cache as well for LRU.

Out of context, this comment makes me just wonder what LRU means.

geerlingguy’s picture

Subscribe.

catch’s picture

Just opened #1199866: Add an in-memory LRU cache, that is not quite ready for here yet, would be good to have the facility to add multiple items to cache at once, just cross-linking for now.

geerlingguy’s picture

Issue tags: +memory

Tagging.

moshe weitzman’s picture

Issue summary: View changes

Is this still unbouded? We did add a persistent entity cache recently.

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

catch’s picture

Version: 8.1.x-dev » 8.3.x-dev
Component: base system » entity system
Assigned: gordon » Unassigned
Status: Needs work » Postponed

Marking postponed on #1199866: Add an in-memory LRU cache. The latest patch here actually solves this issue too.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.