Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
In large migrations, memory can get locked up in the entity system and not recovered. This is because the entity system stores a cache of loaded entities as a property on the storage class.
Proposed resolution
Iterate over entity storage classes and call their static reset method.
Remaining tasks
none
User interface changes
none
API changes
none
Data model changes
none
Original Report
When running a large migration, something internal structure is not getting reclaimed leading to a giant memory leak.
running an import with about 180,000 nodes I got this error:
nodes Running d6_node [ok]
Memory usage is 870.48 MB (85% of limit 1 GB), reclaiming memory. [warning]
Memory usage is now 870.47 MB (85% of limit 1 GB), not enough [warning]
reclaimed, starting new batch
at which point it fails and moves on to and fails importing revisions and other imports.
Comment | File | Size | Author |
---|---|---|---|
#29 | large_volume_entity-2503999-29.patch | 899 bytes | neclimdul |
#2 | migrate_cache_clear.patch | 899 bytes | neclimdul |
Comments
Comment #1
mikeryanWe've had some discussion of this in IRC. Migrate (in 7 & 8) monitors memory usage and when reaching a threshold (defaulting to 85% of memory_limit) issues a drupal_static_reset() and checks to see if "enough" memory was released. This works just fine with Drupal 7 core - node caches and such use drupal_static() and get cleared, migrations generally only run into memory problems with the odd contrib or custom module doing local static caching.
In Drupal 8, however, EntityStorageBase::$entities contains an in-memory cache of entities. To clean them up, migration would need to find all the EntityStorageBase instances and call resetCache() on each of them. First, though, a question - is there any reason EntityStorageBase shouldn't be using drupal_static() to cache the entities?
Comment #2
neclimdulI talked to larowlan and I didn't get the impression he was interested in fixing this in the entity system. I don't know if there is would be support from other people but for the time being here is the patch I'm using to work around the leak as far as migration is concerned.
Comment #3
andypostWould be great to limit that clearing with only target entities from executed migrations
would be great to skip caches that have static disabled
Comment #4
neclimdulI'm not sure I agree. This isn't run on every pass, it is resolving an excessively high memory event and we should try to clear every thing in Drupal. Furthermore, the cost of iterating and clearing is cheap in the context of what we are doing.
Wouldn't they just be a no-op method call? Why would we bother skipping them?
A little more information, the patch running on our migration looks like this:
Comment #5
andypostSo why that does not fire event?
there's no way to other sub-systems to get involved here.
And
\Drupal\Core\Entity\Sql\SqlContentEntityStorage::resetCache()
implementation causes cache_tag invalidation, which could be very expensive.Suppose we need follow-up here or at least document that operation could cause a chained invalidations
I'm mostly worried because this calls Cache::invalidateTags(array($this->entityTypeId . '_values'));
Let's remove that todo or add issue to explore that
Comment #6
neclimdulI'm honestly not comfortable moving this quick fix into the realm of new API/event/hooks. We have a bit of a mess with local variable caches like this in the container and long running tasks but that's no simple fix.
Comment #7
mikeryanIs there a particular reason not to have the in-memory entity cache use drupal_static()? It seems that resetCache() is too heavy-handed, and using drupal_static() would make that in-memory cache work just like all other in-memory caches without undesirable side-effects.
Comment #8
neclimdulCatch brought to light some related issues to this in IRC.
Comment #9
neclimdulgrr, submit just dropped the related issues
Comment #10
mikeryanI do feel that the entity cache needs proper static caching (specifically a means of clearing the static cache without invalidating external caches), and until #2260187: [PP-1] Deprecate drupal_static() and drupal_static_reset() drupal_static() is it.
Entity caching certainly qualifies, and is (unless I'm missing something big somewhere else) the biggest static memory user.
So, attached is a patch to use drupal_static() for the static entity cache. It's pretty simple and should be perfectly transparent - the one known issue is it'll break unit tests, which don't include bootstrap.inc, so we'd need to "mock" drupal_static/drupal_static_reset there. Knowing that, I'll let the testbot take a crack just to be sure there aren't any simpletests broken by this.
Comment #12
neclimdulblocked by #2545632: [PP1] Move memory reclamation out of migrate executable
Comment #13
neclimdulSorry, didn't see the patch. I'm still in favor if fixing this in migrate with an updated version of the previous patch with we've been running in our development environment for 2 months. That approach is blocked on the other issue.
Comment #14
catchSo #2501117: Add static caching for PermissionsHashGenerator::generate() adds a cache.static bin using the memory backend. That addition is just a couple of lines if for some reason that patch doesn't get committed.
#1596472: Replace hard coded static cache of entities with cache backends is very outdated but I was trying to replace the static cache for entities with a cache backend - would need to start from scratch but that could be done with cache.static or cache.entity_static similar to the above. Scope could be a lot narrower than the issue was originally.
Then #1199866: Add an in-memory LRU cache is also outdated but was an LRU version of the memory backend with a limit on the number of items. Could either use a numeric limit on number of items, or just check peak usage in the backend each time there's a set and empty it at 60% like migrate does, or some kind of combination.
Then migrate could just ensure that the LRU / memory checking version of the cache backend is used in the container and might not need to do anything special at all otherwise. Or it can call MemoryBackend::reset() manually.
Comment #15
mikeryanSo, those are all good issues, but are any of them going to make 8.0? I have to admit the drupal_static() approach is also a longshot for 8.0 at this stage, but with no API changes seems like a possibility.
We need some way for migrations to make space for themselves today - of the various issues that involve improving entity caching, which has the best chance to get into 8.0? Whichever one that is, I'm happy to contribute to pushing it forward.
Comment #16
catchThey're all performance/scaling improvements - and in the case of OOM straight bug fixes, so they can go in up to and potentially during release candidate if someone works on them. There's no API change in any of these, only API additions which is fine during beta as long as there's a concrete use case that needs it.
The LRU cache backend I'd personally like available in core, but that could be done in contrib once entities are using the static cache backend/service instead of a raw object property - and migrate can then depend on that either way. Or as mentioned use the reset() method to clear the entire backend manually instead of LRU.
Comment #17
webchickThis sounds pretty bad, so escalating to major.
Comment #18
BerdirDefinitely -1 on introducing drupal_static() in there. Using a memory backend with cache tag support (would also remove 95% of the cases where we need to reset static caches manually in tests) or other OO replacement, maybe, but not that.
Patch in #2 looks good to me as a quick fix for this issue. There are also long-standing issues about introducing a limit of how many nodes are kept in the static cache, that would be another approach on how this could be solved long-term.
Comment #19
neclimdulmaybe the path through is
1) use the approach in #2
2) open a follow up that this introduces a bug into migrate where it can clear the site cache on a live site
3) Resolve the entity cache issue and implement any change needed in the follow from 2)
Side note: The EntityStorageInterface only allows reseting of static caches according to the documentation. The current cache invalidation, while probably pretty darn important, breaks the contract.
Comment #20
BerdirCache tag invalidation on a long running migration (or any sort of long running process) is flawed anyway due of the static cache that prevents repeated invalidations, so only the first node would actually invalidate the list cache tag. And fixing that to do it all the time wouldn't be fun either as that would most definitely bring a live site down ;)
The only thing that might work is doing the actual invalidations after the migration completed, but that has problems too.
Want to re-upload your patch so it's clear which one should be committed? I can RTBC then.
Comment #21
BerdirAlso, issue summary should explain the solution/workaround here.
Comment #22
catchWe could look at rate-limiting that rather than either once or every time - enforce once per configurable-amount-of-seconds per-request. Since it's just a static cache that's not too much extra work to implement.
Comment #23
mikeryanLet me summarize how I understand the issue:
Yes, there are a number of other things that can be done to improve static entity caching, but I still don't understand why whatever static caching is done for entities should be exempt from the documented purpose of drupal_static(). It's certainly poor DX to have a documented "consistent API for resetting any other function's static variables" and yet have to remember that entities are special and you have to call a different API to reset them...
Comment #24
BerdirThe entity system is in no way special. In 7.x, we had 180 calls to drupal_static() according to api.drupal.org while in 8.x, we only have 50 left. And looking at the remaining calls, they are almost exclusively in old code.
That it frees memory is a side effect IMHO, the main use case for it was being able to reset the environment for tests. The way we do that in 8.x now is the container and dependency injection.
Resetting this could in some cases even have undesired effects, for example https://api.drupal.org/api/drupal/core%21includes%21bootstrap.inc/functi..., resetting it might actually result in dropped information that can't be recovered. Just an example, deprecated and very likely unrelevant for processed that migrate, but it also was the very first one that I looked at when searching for such an example, there are likely others.
Comment #25
mikeryanThe ability to free unnecessary memory is a critical use case for migration, and drupal_static() is the only existing general mechanism for that.
As for resetting resulting in dropped information, anywhere that happens is a case of misusing drupal_static() - "...use this function unless it is absolutely certain that the static variable will not need to be reset during the page request".
Must migration have a laundry list of module- or class-specific API calls to make for reclaiming memory? And what about contrib modules - if drupal_static() is being (at least informally) deprecated, will we go back to the D6 days of static caches that can't be reset?
Comment #26
catchTagging this as migrate critical. We could untag it later, but don't want to forget this one.
Comment #27
phenaproximaI haven't been involved in this issue, but reading over it I support using @neclimdul's patch in #2 as a quick fix, because this is a big deal. The larger discussion about how static caching is done for entities is very much worth having, but it's bigger than Migrate and should not block this fix. Once a more consistent static caching system is in core, this can and should be revisited. But until then, we need to be able to reclaim memory effectively.
Comment #28
phenaproximaComment #29
neclimdulSame patch still applies. Re-upload.
Comment #30
neclimdulComment #31
phenaproximaI'm OK with this patch, so I vote RTBC. However, I'd like someone more familiar with the intricacies of the Entity API to review and RTBC it for real.
Comment #32
andypostThere's
\Drupal::entityManager()
for thatComment #33
BerdirNote that this will also invalidate the persistent cache, but running migrations is going to be a severe hit on your performance anyway so this is probably acceptable, at least as a temporary solution.
I had a quick look into just resetting the static caches, we don't really have an API for that, we could reset the handlers property in the entity manager with useCaches(FALSE); useCaches(TRUE), but they might still be injected somewhere and not really freed.
Comment #34
webchickSeems like everyone is on board with this being a reasonable quick fix, and it's unclear whether more expansive fixes are possible. That's a good argument for getting this in, possibly with a follow-up issue to dig in more.
I inquired about test coverage but it's not easy to test memory consumption.
Committed and pushed to 8.0.x. Thanks!
#32 sounds like a good novice issue, but don't want to hold up a migrate critical on that.
Comment #36
catchOpened #2558857: Migrations invalidate entity caches when trying to reclaim memory, should flush to track the persistent cache clearing.
There are two cases that can be tricky:
- migrations that need to load entities as part of the migration may slow down a lot - some sites take 10 or 20 hours to migrate so a 10% regression can mean an extra hour or two.
- partial migrations on a live site
Comment #37
andypostFollow-up (quickfix) for #32
#2559191: Clean-up migrate executable
Comment #39
davidwbarratt CreditAttribution: davidwbarratt at Golf Channel commentedStill seeing the migration run out of memory... #2688297: File migration slows down and eats more and more memory, eventually stops