Migrations invalidate entity caches when trying to reclaim memory, should flush [#2558857]

Problem/Motivation

Migrations use a lot of memory, and the current method of reclaiming memory is often ineffective.

Follow-up from #2503999-32: Large volume entity migrations run out of memory

This issue is to follow up on the consequences of invalidating entity caches when reclaiming static memory. Context from catch's comment:

Opened #2558857: Migrations invalidate entity caches when reclaiming memory to track the persistent cache clearing.

There are two cases that can be tricky:

- Migrations that need to load entities as part of the migration may slow down a lot. Some sites take 10 or 20 hours to migrate, so a 10% regression can mean an extra hour or two.
- Partial migrations on a live site.

On the original intent of this issue: the first point isn't really an issue. The cache is only cleared when memory has nearly run out, which will be very infrequent, so migrations that load entities will benefit from the cache for their entire run except shortly after a clear.

Migrations on a live site (such as using Migrate in a feeds-like way for regular imports) would be the greater concern, although periodic imports shouldn't hit the cache-clearing code unless they are very large.

Proposed resolution

The current code for reclaiming memory is:

    // Entity storage can blow up with caches so clear them out.
    $entity_type_manager = \Drupal::entityTypeManager();
    foreach ($entity_type_manager->getDefinitions() as $id => $definition) {
      $entity_type_manager->getStorage($id)->resetCache();
    }

This is complicated, because it loops through all entity types. Worse, when called without IDs, resetCache() only invalidates cache items by tag; it does not delete them, so no memory is reclaimed. Instead, delete the entity-storage cache with:

    \Drupal::service('entity.memory_cache')->deleteAll();
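
The difference between invalidating and deleting can be illustrated with a small stand-alone PHP sketch. The SketchMemoryCache class below is a hypothetical stand-in written for this summary, not Drupal's MemoryCache; only the method names invalidateTags() and deleteAll() mirror the calls discussed above, and the cache ID and tag strings are made up for illustration.

```php
<?php

// Hypothetical stand-in for an in-memory cache. This is NOT Drupal's
// MemoryCache class; it only models "invalidate" versus "delete".
final class SketchMemoryCache {

  // Each entry: ['data' => mixed, 'tags' => string[], 'valid' => bool].
  private array $items = [];

  public function set(string $cid, $data, array $tags = []): void {
    $this->items[$cid] = ['data' => $data, 'tags' => $tags, 'valid' => TRUE];
  }

  // Returns the data for a valid item, or NULL on a miss.
  public function get(string $cid) {
    $item = $this->items[$cid] ?? NULL;
    return ($item !== NULL && $item['valid']) ? $item['data'] : NULL;
  }

  // Marks matching items as invalid but keeps them in the array.
  public function invalidateTags(array $tags): void {
    foreach ($this->items as $cid => $item) {
      if (array_intersect($tags, $item['tags'])) {
        $this->items[$cid]['valid'] = FALSE;
      }
    }
  }

  // Drops every item, so PHP can free the memory they held.
  public function deleteAll(): void {
    $this->items = [];
  }

  public function count(): int {
    return count($this->items);
  }

}

$cache = new SketchMemoryCache();
for ($i = 0; $i < 1000; $i++) {
  // Illustrative cache ID and tag, not the real Drupal formats.
  $cache->set("sketch:node:$i", str_repeat('x', 1024), ['sketch_tag:node']);
}

$cache->invalidateTags(['sketch_tag:node']);
$after_invalidate = $cache->count();  // Items are still held in memory.
$miss = $cache->get('sketch:node:1'); // Reads now miss.

$cache->deleteAll();
$after_delete = $cache->count();      // Nothing is held any more.
```

In this sketch the array still holds every item after invalidateTags(), even though reads miss; only deleteAll() empties it.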

Remaining tasks

User interface changes

None

API changes

None

Data model changes

None

Comment	File	Size	Author
#34	2558857-34.patch	829 bytes	heddn
#34	interdiff_30-34.txt	718 bytes	heddn
#30	2558857-30.patch	864 bytes	heddn
#26	core-limit-entity-in-memory-cache.patch	1015 bytes	grahl
#3	migrations_invalidate-2558857-3.patch	732 bytes	andriyun

Comments

Comment #1

27 August 2015 at 21:35

catch created an issue. See original summary.

Comment #2

andypost commented 27 August 2015 at 22:11

Issue summary:	View changes

Comment #3

andriyun commented 27 August 2015 at 22:33

Assigned:	Unassigned	» andriyun
Status:	Active	» Needs review

Status	File	Size
new	migrations_invalidate-2558857-3.patch	732 bytes

Comment #4

andypost commented 27 August 2015 at 22:35

Assigned:	andriyun	» Unassigned
Status:	Needs review	» Reviewed & tested by the community
Issue tags:		+Quickfix

Great!

Comment #5

berdir commented 27 August 2015 at 22:42

Status:	Reviewed & tested by the community	» Needs work

That fixes #32 but I don't think that's why @catch opened this issue? This is about finding a way to only invalidate static caches but not the persistent ones.

Comment #6

mikeryan commented 27 August 2015 at 22:45

Umm... This issue was not about the little entityManager() cleanup; that should have been a new, separate minor issue.
This issue was to follow up on the consequences of invalidating entity caches when reclaiming static memory. Context from catch's comment:

Opened #2558857: Migrations invalidate entity caches when reclaiming memory to track the persistent cache clearing.

There are two cases that can be tricky:

- migrations that need to load entities as part of the migration may slow down a lot - some sites take 10 or 20 hours to migrate so a 10% regression can mean an extra hour or two.
- partial migrations on a live site

And on the subject of the original intent of this issue: the first point isn't really an issue. The cache is only cleared when memory has nearly run out, which will be very infrequent, so migrations that load entities will benefit from the cache for their entire run except shortly after the cache is cleared.

Comment #7

27 August 2015 at 23:16

The last submitted patch, 3: migrations_invalidate-2558857-3.patch, failed testing.

Comment #8

andriyun commented 28 August 2015 at 12:09

Issue tags:	-Quickfix

1 file was hidden/shown/deleted

Status	File	Size
hidden	migrations_invalidate-2558857-3.patch	732 bytes

Patch for quick fix cleanup moved to issue #2559191: Clean-up migrate executable

Comment #9

andypost commented 8 November 2015 at 22:05

Status:	Needs work	» Reviewed & tested by the community
Issue tags:		+rc eligible, +Quick fix

1 file was hidden/shown/deleted

Status	File	Size
shown	migrations_invalidate-2558857-3.patch	732 bytes

Let's keep the issue as clean-up as #8 said, and continue in #2545632: [PP1] Move memory reclamation out of migrate executable

Comment #10

8 November 2015 at 22:08

Status:	Reviewed & tested by the community	» Needs work

The last submitted patch, 3: migrations_invalidate-2558857-3.patch, failed testing.

Comment #11

damiankloip commented 14 December 2015 at 20:20

Status:	Needs work	» Active

Afaict, that issue won't resolve this one.

Everyone is trying to find workarounds for the simple fact that the ContentEntityStorageBase::resetCache method completely and utterly abuses the API.

Comment #12

14 December 2015 at 20:20

Version:	8.0.x-dev	» 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #13

14 December 2015 at 20:20

Version:	8.1.x-dev	» 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #14

heddn commented 31 October 2016 at 17:31

Issue summary:	View changes

Updated the issue summary.

Comment #15

catch commented 31 October 2016 at 20:24

I think we should mark this duplicate of #1596472: Replace hard coded static cache of entities with cache backends.

Comment #16

31 October 2016 at 20:24

Version:	8.2.x-dev	» 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #17

31 October 2016 at 20:24

Version:	8.3.x-dev	» 8.4.x-dev

Drupal 8.3.6 was released on August 2, 2017 and is the final full bugfix release for the Drupal 8.3.x series. Drupal 8.3.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.4.0 on October 4, 2017. (Drupal 8.4.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.4.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #18

nevergone commented 23 November 2017 at 14:07

Status:	Active	» Closed (duplicate)
Related issues:		+#2559191: Clean-up migrate executable

Comment #19

andypost commented 23 November 2017 at 15:29

Status:	Closed (duplicate)	» Active
Issue tags:	-rc eligible, -Quick fix

1 file was hidden/shown/deleted

Status	File	Size
hidden	migrations_invalidate-2558857-3.patch	732 bytes

The issue summary was updated in #14, and it's exactly about static caches of entity storage handlers.

@nevergone that's why the related issue was decoupled.

Comment #20

andypost commented 23 November 2017 at 15:30

Comment #21

pounard commented 23 November 2017 at 16:03

Did you just mark this issue as being related to itself?

Comment #22

andypost commented 23 November 2017 at 20:28

+#1199866: Add an in-memory LRU cache, +#1596472: Replace hard coded static cache of entities with cache backends

sorry, proper one

Comment #23

catch commented 24 November 2017 at 12:26

Title:	Migrations invalidate entity caches when reclaiming memory	» [PP-1] Migrations invalidate entity caches when reclaiming memory
Status:	Active	» Postponed

Postponing this on those issues (which I'm re-rolling). The first will make reclamation safe to do without affecting the persistent cache, the second may mean we can drop memory reclamation altogether.

Comment #24

24 November 2017 at 12:26

Version:	8.4.x-dev	» 8.5.x-dev

Drupal 8.4.4 was released on January 3, 2018 and is the final full bugfix release for the Drupal 8.4.x series. Drupal 8.4.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.5.0 on March 7, 2018. (Drupal 8.5.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.5.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #25

24 November 2017 at 12:26

Version:	8.5.x-dev	» 8.6.x-dev

Drupal 8.5.6 was released on August 1, 2018 and is the final bugfix release for the Drupal 8.5.x series. Drupal 8.5.x will not receive any further development aside from security fixes. Sites should prepare to update to 8.6.0 on September 5, 2018. (Drupal 8.6.0-rc1 is available for testing.)

Bug reports should be targeted against the 8.6.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #26

grahl commented 6 July 2019 at 16:02

Status	File	Size
new	core-limit-entity-in-memory-cache.patch	1015 bytes

I've been trying to run several large migrations (500k-5m records) with incremental updates and was constantly running into memory issues. I had to set the memory limit quite high so that it reflected a significant portion of overall memory. When Migrate then started reclaiming memory it was not unusual for the process to be killed due to the limited remaining resources.

It turns out that the high memory usage is entirely avoidable if we just clear the MemoryCache. A hacky workaround for this is attached. Obviously the referenced #1199866: Add an in-memory LRU cache would be the better approach. This workaround avoids the issues outlined above with restarting migrations and starting from scratch, and it keeps memory usage nearly constant and reasonable.

(I think batch_size for SQL sources does make a difference there per migration, too, but haven't rerun the numbers.)

Comment #27

kevinquillen commented 29 October 2019 at 15:25

I had similar issues with migrations that used files on disk: JSON files containing anywhere from 100 up to 28,000 items. The larger ones definitely had issues, especially if they generated Paragraphs. The memory reclamation part in migrate seems to fail or do nothing, at least on Pantheon. It will hit 85% used, reclaim a little bit, do a few more rows and exit.

The only way around this for me (Pantheon) was writing a script that broke one large JSON file down into many smaller files, plus a new Drush command that could migrate with the source file URL overridden by one provided via argument, and looping over that. We no longer had memory issues, but it isn't an ideal solution.

Oddly enough, this never happens to anyone on the dev team when doing a migration in the local Docker environment, it only happened on Pantheon. I am not totally sure why.

Comment #28

29 October 2019 at 15:25

Version:	8.6.x-dev	» 8.8.x-dev

Drupal 8.6.x will not receive any further development aside from security fixes. Bug reports should be targeted against the 8.8.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.9.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Comment #29

29 October 2019 at 15:25

Version:	8.8.x-dev	» 8.9.x-dev

Drupal 8.8.7 was released on June 3, 2020 and is the final full bugfix release for the Drupal 8.8.x series. Drupal 8.8.x will not receive any further development aside from security fixes. Sites should prepare to update to Drupal 8.9.0 or Drupal 9.0.0 for ongoing support.

Bug reports should be targeted against the 8.9.x-dev branch from now on, and new development or disruptive changes should be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Comment #30

heddn commented 19 October 2020 at 19:02

Title:	[PP-1] Migrations invalidate entity caches when reclaiming memory	» Migrations invalidate entity caches when reclaiming memory
Status:	Postponed	» Needs review

Status	File	Size
new	2558857-30.patch	864 bytes

1 file was hidden/shown/deleted

Status	File	Size
hidden	core-limit-entity-in-memory-cache.patch	1015 bytes

I've been doing some profiling of a migration recently. What I find is that the biggest issue we are facing is that the entity cache is "reset", which just invalidates the memory. It doesn't actually clear out the memory. There is #1199866: Add an in-memory LRU cache, and that seems like a much better permanent solution, but for now, let's actually reclaim memory. Patch attached. No interdiff, as this is a distinctly different approach.

Comment #31

benjifisher commented 4 January 2021 at 22:14

Version:	8.9.x-dev	» 9.2.x-dev

This will be a sort of rambling review, as I try to understand how this all works. Please confirm if you can. Perhaps some of these comments will be helpful in future reviews.

This is one of the smaller patches I have looked at recently:

    -    // Entity storage can blow up with caches so clear them out.
    -    $entity_type_manager = \Drupal::entityTypeManager();
    -    foreach ($entity_type_manager->getDefinitions() as $id => $definition) {
    -      $entity_type_manager->getStorage($id)->resetCache();
    -    }
    +    // Entity storage can blow up with caches so clear it out.
    +    $memory_cache = \Drupal::service('entity.memory_cache');
    +    $memory_cache->deleteAll();

- Nit: as long as we are updating the comment, can we add a comma before "so"?
- Nit: since we never use the $memory_cache variable again, can we eliminate it?
- The service resolves to Drupal\Core\Cache\MemoryCache\MemoryCache. This class does not know anything about entities, so I guess the only thing that ties it to the entity system is that only entity-related code should be using this service. I searched the codebase for entity.memory_cache and am reviewing the results.
- The code comment for reset() in the parent class says, "This is only used by tests", but this is not enforced and EntityTypeManager::useCaches() breaks that promise.

Here is the code for EntityStorageBase::resetCache(), the method we are removing in this patch:

    public function resetCache(array $ids = NULL) {
      if ($this->entityType->isStaticallyCacheable() && isset($ids)) {
        foreach ($ids as $id) {
          $this->memoryCache->delete($this->buildCacheId($id));
        }
      }
      else {
        // Call the backend method directly.
        $this->memoryCache->invalidateTags([$this->memoryCacheTag]);
      }
    }

As @heddn says in #30, this "… just invalidates the memory. It doesn't actually clear out the memory". That is the case for entity types that cannot be statically cached and, going by the else branch above, whenever resetCache() is called without IDs, as it is in the code we are removing. I do not see a mechanism for deleting invalidated items from the cache. Looking at the code for invalidateTags(), I see that it just sets the expire property to "one second before the request time".

In SqlFieldableEntityTypeListenerTrait::copyData(), I see a similar usage of \Drupal::service('entity.memory_cache')->deleteAll();, so there is precedent for this strategy.

Conclusion

This change looks like a good idea. It actually removes the cache instead of invalidating it. As a bonus, it does not spend a lot of effort looping through entity types to figure out which tags to invalidate. I declare the patch reviewed but not tested.

How do we test this? The question applies to both manual and automated testing.

Comment #32

benjifisher commented 4 January 2021 at 22:25

Title:	Migrations invalidate entity caches when reclaiming memory	» Migrations invalidate entity caches when trying to reclaim memory
Issue summary:	View changes

I am updating the issue summary with some of the notes from my previous comment.

Comment #33

catch commented 5 January 2021 at 14:04

#30 looks good, and I think we should go ahead here. I'm not sure it's possible to add an automated test; memory cache itself does have test coverage and we're just calling one method, so possibly that's enough. For manual testing, we would probably need to run a migration and do tens of thousands of items in one go, or artificially lower the memory limit with a smaller number of items, but we'd need to break things to reproduce the out-of-memory condition first.

If/when we make #3190992: Add a WeakReference memory cache implementation the default entity cache backend, we should be able to remove the remaining line here.

Comment #34

heddn commented 5 January 2021 at 14:17

Status	File	Size
new	interdiff_30-34.txt	718 bytes
new	2558857-34.patch	829 bytes

1 file was hidden/shown/deleted

Status	File	Size
hidden	2558857-30.patch	864 bytes

If you need profiling, I've used this patch on a Pantheon-hosted site that has very limited memory allocated to it. Using this patch, the migrations run and finish successfully. Without it, the migration continually dies a horrible death against memory limits. It doesn't solve all the problems. I still have to wrap the drush migration command in a bash loop:

    drush ms --fields="Migration ID" --tag=Content | grep upgrade_d6 | while read line ; do echo $line; drush mim --continue-on-failure $line </dev/null ; done

But this patch and that loop make things possible on Pantheon that weren't possible previously.

Also addresses feedback from #31.

Comment #35

heddn commented 5 January 2021 at 14:20

Title:	Migrations invalidate entity caches when trying to reclaim memory	» Migrations invalidate entity caches when trying to reclaim memory, should flush

Comment #36

benjifisher commented 5 January 2021 at 15:42

@heddn, thanks for fixing the two nits. I consider this reviewed but not tested (RBNT?). I pinged @kevinquillen (Comment #27) on Slack, and I hope he can do some testing to confirm.

@catch, thanks for the guidance.

Comment #37

joachim commented 4 February 2021 at 11:56

Status:	Needs review	» Reviewed & tested by the community

Patch LGTM, fixes the problem described in the IS.

Working well for me on a Commerce Order rollback which was running out of memory (probably due to Commerce Migrate module loading all the order items for each order).

Comment #38

4 February 2021 at 15:01

catch committed a4095b9 on 9.2.x

Issue #2558857 by heddn, andr1yun, grahl, andypost, catch, benjifisher,...

Comment #39

4 February 2021 at 15:01

catch committed 2cadd7b on 9.1.x

Issue #2558857 by heddn, andr1yun, grahl, andypost, catch, benjifisher,...

Comment #40

catch commented 4 February 2021 at 15:02

Version:	9.2.x-dev	» 9.1.x-dev
Status:	Reviewed & tested by the community	» Fixed

Committed/pushed to 9.2.x, and cherry-picked to 9.1.x, thanks!

Comment #41

18 February 2021 at 15:04

Status:	Fixed	» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Comment #42

kevinquillen commented 18 February 2021 at 15:35

Thumbs up all - seems to work for me too. Sorry for the late response!
