Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
Part of the objective of #2541580: Remove obsoleted functionality from core Migrate was to remove much of the obsolete or un-related code from the migrate executable. Memory reclaiming is still needed though for the majority of migrations to complete.
Proposed resolution
Refactor memory reclaiming to use existing events.
Remaining tasks
blocked by #2778433: Allow migration events to affect executable loop status.
Comment | File | Size | Author |
---|---|---|---|
#30 | re_add_memory-2545632-30.patch | 28.28 KB | neclimdul |
#27 | 2545632-27.interdiff.txt | 29.28 KB | neclimdul |
#26 | re_add_memory-2545632-26.patch | 11.43 KB | mikeryan |
#22 | re_add_memory-2545632-22.patch | 5.49 KB | neclimdul |
#22 | 2545632-22.interdiff.txt | 752 bytes | neclimdul |
Comments
Comment #2
neclimdulQuick patch. Contains #2503999: Large volume entity migrations run out of memory because I that's the only memory limit I've ever run into and I needed it to test the patch..
Comment #3
neclimdulShould have put this in migrate. Don't know why I put it in migrate_drupal.
Comment #5
neclimdulignore [#3], tried to roll a patch too early.
This correctly moves things to migrate module. #2546004: Add message service to migrate events adds the message service to the event so we can better send messages.
Comment #6
neclimdulComment #8
mikeryanAll right, let's put it back. Maybe it would be better to start from a patch restoring the original code (including tests), then replace the old checkStatus() with the event?
I'm not fond of the resetCache() hack, I still think the entity system needs to use drupal_static() so it's possible to wipe the memory cache only.
Need to fix the namespace as well as move the file.
Comment #9
phenaproximaNeeds a doc block.
Won't $this->t fail unless the subscriber imports StringTranslationTrait and accepts a StringTranslationInterface? onPostRowSave() also has calls to $this->t(), which are currently commented out.
Since the event subscriber is created by the container, there is no need to use \Drupal here. Can you inject the entity manager in the constructor?
Another thought occurs -- maybe $memoryThreshold should be a protected static variable which can altered by a static setter, in case outside code needs to change the memory threshold in mid-stream. Does that make any sense at all, or would it be a pointless edge case?
Beyond that, this looks OK to me. It needs tests, however.
Comment #10
phenaproximaSince there is already a need for big migrations to work smoothly, I think this should be escalated.
Comment #11
mikeryanComment #12
neclimdul#8 done and done. I'll continue following up on the entity cache problem in the other issue.
#9
1) done
2) pulled in trait.
3) removed per #8
4) you almost saw through my clever feature addition. technically you can modify the service container in several ways so the container acts as this static already for the threshold already! The only difference is you can't change it during run time. you'd need a method and additional public interfaces. Feel free to file a follow up though, that might be handy to someone.
Comment #13
neclimduldidn't mean to change that.
Comment #14
neclimdulremove unused use statements. testbot.
Comment #15
phenaproximaShould be
$this->memoryThreshold = $threshold
. Maybe also cast $threshold to a float.Nitpick -- thar be an extra line.
Should we remove the commented-out stuff and file a follow-up issue in its place?
Beyond this, it looks good to me. Still needs tests, though.
And let's file a follow-up issue about clearing the entity caches.
Comment #16
mikeryanI submitted my patch at #2503999: Large volume entity migrations run out of memory.
Comment #17
mikeryanJust wanted to bring up a comment I made in IRC - the event listener could be on MigrateExecutable rather than in a separate class. One advantage would be there wouldn't be any need to figure out how to pass the message interface or any other information from the executable to the listener.
Comment #18
mikeryanRather than throwing an exception, once #2545672: Handle various migration interruption scenarios lands we'll do $migration->setInterruptionResult(MigrationInterface::RESULT_INCOMPLETE) so the front-end tools can do whatever they need to do to resume the migration.
Comment #19
hussainwebStarting with comments in #15 and a few other nitpicks. I renamed the function checkMemory() to something hopefully a bit more clearer. I will check on #17 as well.
Comment #20
hussainweb#17: I am considering on how to put this into MigrateExecutable and I think I prefer this to be a separate class. It provides a separation of concerns and does not clutter up MigrateExecutable further.
As far as saving messages is concerned, the event already passed in the migration object. Can we not just call
->getIdMap()->saveMessage()
on that? What do you think?Comment #21
mikeryanThese messages are not for saving in the message table (we don't really care what item happened to be importing when the memory threshold was hit), these are messages for display on the front-end.
Comment #22
neclimdulre-roll and missing return comment.
Comment #23
neclimdulmergediff incase someone wanted to see the conflict.
Comment #24
benjy CreditAttribution: benjy commentedThis looks good although we should probably wait on #2546004: Add message service to migrate events rather than commit the commented out code?
Also still needs the tests restoring, setting to NW for that.
Comment #25
mikeryanWe have some barriers to getting this resolved ASAP:
I propose that we
Comment #26
mikeryanThis is a straight restoration of the original memory handling code, with one small improvement - replacing the constructor exception with a display() call.
Comment #27
neclimdulI figure if we're doing anything that is not reverting we'll have to review it to make sure we got it right. Here is the subscriber approach with tests. Also, I noticed there is a utility in core for resolving byte notations into integers so I replaced the custom implementation with the better tested utility. the message patch applied so it can test things.
Edit: note. interdiff vs previous version of this patch in #22
Comment #29
hussainweb@neclimdul: It seems that the patch in #27 also contains changes from #2546004-15: Add message service to migrate events. Was that intentional?
Comment #30
neclimdulsimpletest... Who knew you could get failures only when using --all. Also, failures that tell you 100% the wrong thing. Typo in filename caused the failure.
Yes, that was the only way I came up with to effective do tests. clearly this approach is blocked on the other patch but we need to discuss the 2 approaches.
Comment #31
benjy CreditAttribution: benjy commentedWhy not: $migration, $message same as MigrateExecutable
Comment #33
webchickOk I figured this issue would be a quick, ~10 minute fix to just put the code back in that was taken out, but this issue has now taken on a life of its own. In the meantime though we've lost capabilities that several people actively doing migrations have cited as important, and while we're in no danger of rolling a new beta right now, I don't think we want to accidentally put one out with this functionality missing.
Therefore, committed and pushed #26 (simple rollback) to 8.0.x to get us back to where we were prior to #2541580: Remove obsoleted functionality from core Migrate.
That will necessitate a re-roll of this one, but it can now proceed at whatever pace.
I also believe that makes this one much less severe than before, so removing Migrate critical tag and downgrading to normal. If that's wrong, please feel free to re-adjust.
Comment #34
hussainwebThe objective now is to decouple the memory reclamation from migrate core by using events and messages. Rather than renaming the title and issue summary to more accurately describe the new objective, is it better to create a followup? The title currently is "Re-add memory reclamation in migrate system" and we have already done that (in #26, which was committed). Should we mark this as fixed then?
Comment #35
benjy CreditAttribution: benjy commentedWe can do it here, don't think it matters to much.
Comment #36
neclimdulAgreed. Updated title and IS and things.
Comment #38
benjy CreditAttribution: benjy commentedIs postponed the right status here?
Comment #40
mikeryanThis is postponed on ... #2546004: Add message service to migrate events, right?
Comment #41
neclimdulYes.
Comment #44
quietone CreditAttribution: quietone as a volunteer commented#2546004: Add message service to migrate events is in so this is no longer Postponed. Setting to NW since that what it was before being postponed. Not changing the title because I don't know the meaning of [PP1].
Comment #45
neclimdulBecause of recent change to the migration executable this is blocked again. This time by #2778433: Allow migration events to affect executable loop status..
Comment #47
mikeryanJust had a discussion in IRC with neclimdul about this and the child issue #2778433: Allow migration events to affect executable loop status.. I think the child issue is primarily motivated by the fact that right now, an exception thrown from a POST_ROW_SAVE handler would skip saveIdMapping(), which we don't want - we want to make sure the last imported item is properly recorded:
Looking at that code, I think it's wrong - POST_ROW_SAVE should be dispatched after saveIdMapping(). We want to maintain data integrity here - the import of an item, and the recording of its disposition, should not be separated by a potential exception from a POST_ROW_SAVE handler. Thus, my recommendation in IRC was:
But, as I reacquaint myself with this section of the executable processing loop, we already have a mechanism for interrupting the loop - the POST_ROW_SAVE handler could simply do
and it should all work fine.
Thoughts?
Comment #48
benjy CreditAttribution: benjy commentedYeah, +1 for
$migration->interruptMigration(MigrationInterface::RESULT_INCOMPLETE);
, I don't really like the idea of adding another exception to control the flow.Comment #49
catchThe drupal_static_reset() call here won't do much any more in 8.x, since 97% of drupal_static() usage have been removed.
#1199866: Add an in-memory LRU cache might make the explicit memory reclaim unnecessary.
Comment #50
mikeryanThe last patch doesn't really reflect reality - effectively, the drupal_static_reset() call is being moved, not added, and it might as well stay to protect against any contrib users of drupal_static() building a large cache.
Most definitely looking forward to the entity cache issues being addressed, though!
Comment #51
neclimdulWhat is this method? I'll have to take a look...
re #49yeah actually currently the most important thing is actually clearing the pesky on object entity cache. Great for page loads but on a long running process, having every one of your 160k nodes stored as a full blown object in the entity store doesn't work out so well... ;)
Comment #52
heddnThere are too many commits and rollbacks in the issue. Opening a new issue and closing it as duplicate so we have a clean slate to start with. I hope that doesn't ruffle too many feathers.
Comment #53
benjy CreditAttribution: benjy commentedPersonally i think a clean state is worse, it's nice to have all the previous conversation around the issue, plus we have 15 people watching the issue.