Problem/Motivation

Part of the objective of #2541580: Remove obsoleted functionality from core Migrate was to remove much of the obsolete or un-related code from the migrate executable. Memory reclaiming is still needed though for the majority of migrations to complete.

Proposed resolution

Refactor memory reclaiming to use existing events.

Remaining tasks

blocked by #2778433: Allow migration events to affect executable loop status.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

neclimdul created an issue. See original summary.

neclimdul’s picture

Status: Active » Needs review
FileSize
5.37 KB

Quick patch. Contains #2503999: Large volume entity migrations run out of memory because I that's the only memory limit I've ever run into and I needed it to test the patch..

neclimdul’s picture

Should have put this in migrate. Don't know why I put it in migrate_drupal.

Status: Needs review » Needs work

The last submitted patch, 3: re_add_memory-2545632-3.patch, failed testing.

neclimdul’s picture

ignore [#3], tried to roll a patch too early.

This correctly moves things to migrate module. #2546004: Add message service to migrate events adds the message service to the event so we can better send messages.

neclimdul’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 5: re_add_memory-2545632-5.patch, failed testing.

mikeryan’s picture

All right, let's put it back. Maybe it would be better to start from a patch restoring the original code (including tests), then replace the old checkStatus() with the event?

I'm not fond of the resetCache() hack, I still think the entity system needs to use drupal_static() so it's possible to wipe the memory cache only.

+++ b/core/modules/migrate/src/EventSubscriber/MigratePostSaveSubscriber.php
@@ -0,0 +1,140 @@
+namespace Drupal\migrate_drupal\EventSubscriber;

Need to fix the namespace as well as move the file.

phenaproxima’s picture

Issue tags: +Needs tests
+++ b/core/modules/migrate/src/EventSubscriber/MigratePostSaveSubscriber.php
+  public function __construct($threshold = 0.85) {

Needs a doc block.

+++ b/core/modules/migrate/src/EventSubscriber/MigratePostSaveSubscriber.php
+            throw new MigrateException($this->t('Invalid PHP memory_limit !limit',
+              array('!limit' => $limit)));

Won't $this->t fail unless the subscriber imports StringTranslationTrait and accepts a StringTranslationInterface? onPostRowSave() also has calls to $this->t(), which are currently commented out.

+++ b/core/modules/migrate/src/EventSubscriber/MigratePostSaveSubscriber.php
+    // Entity storage can blow up with caches so clear them out.
+    $container = \Drupal::getContainer();
+    /** @var \Drupal\Core\Entity\EntityManagerInterface $manager */
+    $manager = $container->get('entity.manager');

Since the event subscriber is created by the container, there is no need to use \Drupal here. Can you inject the entity manager in the constructor?

Another thought occurs -- maybe $memoryThreshold should be a protected static variable which can altered by a static setter, in case outside code needs to change the memory threshold in mid-stream. Does that make any sense at all, or would it be a pointless edge case?

Beyond that, this looks OK to me. It needs tests, however.

phenaproxima’s picture

Priority: Normal » Major
Issue tags: +Migrate critical

Since there is already a need for big migrations to work smoothly, I think this should be escalated.

mikeryan’s picture

Title: Re-add memory reclimation in migrate system » Re-add memory reclamation in migrate system
neclimdul’s picture

Priority: Major » Normal
FileSize
5.31 KB
5.42 KB

#8 done and done. I'll continue following up on the entity cache problem in the other issue.

#9
1) done
2) pulled in trait.
3) removed per #8
4) you almost saw through my clever feature addition. technically you can modify the service container in several ways so the container acts as this static already for the threshold already! The only difference is you can't change it during run time. you'd need a method and additional public interfaces. Feel free to file a follow up though, that might be handy to someone.

neclimdul’s picture

Priority: Normal » Major

didn't mean to change that.

neclimdul’s picture

Status: Needs work » Needs review
FileSize
686 bytes
5.24 KB

remove unused use statements. testbot.

phenaproxima’s picture

+++ b/core/modules/migrate/src/EventSubscriber/MigratePostSaveSubscriber.php
+  public function __construct($threshold = 0.85) {
+    $this->memoryThreshold = 0.85;
+  }

Should be $this->memoryThreshold = $threshold. Maybe also cast $threshold to a float.

+  public function getMemoryLimit() {
+
+    $limit = trim(ini_get('memory_limit'));

Nitpick -- thar be an extra line.

Should we remove the commented-out stuff and file a follow-up issue in its place?

Beyond this, it looks good to me. Still needs tests, though.

And let's file a follow-up issue about clearing the entity caches.

mikeryan’s picture

And let's file a follow-up issue about clearing the entity caches.

I submitted my patch at #2503999: Large volume entity migrations run out of memory.

mikeryan’s picture

Just wanted to bring up a comment I made in IRC - the event listener could be on MigrateExecutable rather than in a separate class. One advantage would be there wouldn't be any need to figure out how to pass the message interface or any other information from the executable to the listener.

mikeryan’s picture

+++ b/core/modules/migrate/src/EventSubscriber/MigratePostSaveSubscriber.php
@@ -0,0 +1,144 @@
+        throw new MigrateException();

Rather than throwing an exception, once #2545672: Handle various migration interruption scenarios lands we'll do $migration->setInterruptionResult(MigrationInterface::RESULT_INCOMPLETE) so the front-end tools can do whatever they need to do to resume the migration.

hussainweb’s picture

Starting with comments in #15 and a few other nitpicks. I renamed the function checkMemory() to something hopefully a bit more clearer. I will check on #17 as well.

hussainweb’s picture

#17: I am considering on how to put this into MigrateExecutable and I think I prefer this to be a separate class. It provides a separation of concerns and does not clutter up MigrateExecutable further.

As far as saving messages is concerned, the event already passed in the migration object. Can we not just call ->getIdMap()->saveMessage() on that? What do you think?

mikeryan’s picture

Can we not just call ->getIdMap()->saveMessage() on that?

These messages are not for saving in the message table (we don't really care what item happened to be importing when the memory threshold was hit), these are messages for display on the front-end.

neclimdul’s picture

neclimdul’s picture

mergediff incase someone wanted to see the conflict.

 +  migrate_post_save_subscriber:
 +    class: Drupal\migrate\EventSubscriber\MigratePostSaveSubscriber
 +    tags:
 +      - { name: event_subscriber }
+   plugin.manager.migrate.builder:
+     class: Drupal\migrate\Plugin\MigratePluginManager
+     arguments: [builder, '@container.namespaces', '@cache.discovery', '@module_handler']
benjy’s picture

Status: Needs review » Needs work

This looks good although we should probably wait on #2546004: Add message service to migrate events rather than commit the commented out code?

Also still needs the tests restoring, setting to NW for that.

mikeryan’s picture

We have some barriers to getting this resolved ASAP:

  1. This is a refactoring of the original code, so will require more review to be confident in RTBCing.
  2. We don't have a test yet using the refactoring.
  3. This refactoring is dependent on a new issue, #2546004: Add message service to migrate events.
  4. With the memory checking being handled in an event rather than inline, we don't currently have a means to exit the executable import() loop when failing to reclaim enough memory - #2545672: Handle various migration interruption scenarios will provide that mechanism, but that's another external issue dependency.

I propose that we

  1. Restore the memory reclamation exactly as it was - we should be able to turn that patch around quickly.
  2. We get #2545672: Handle various migration interruption scenarios in next.
  3. In a follow-up issue, we can make the reclamation event-driven, using #2545672: Handle various migration interruption scenarios to break the loop on reclamation failure.
mikeryan’s picture

Status: Needs work » Needs review
FileSize
11.43 KB

This is a straight restoration of the original memory handling code, with one small improvement - replacing the constructor exception with a display() call.

neclimdul’s picture

I figure if we're doing anything that is not reverting we'll have to review it to make sure we got it right. Here is the subscriber approach with tests. Also, I noticed there is a utility in core for resolving byte notations into integers so I replaced the custom implementation with the better tested utility. the message patch applied so it can test things.

Edit: note. interdiff vs previous version of this patch in #22

Status: Needs review » Needs work

The last submitted patch, 27: re_add_memory-2545632-27.patch, failed testing.

hussainweb’s picture

@neclimdul: It seems that the patch in #27 also contains changes from #2546004-15: Add message service to migrate events. Was that intentional?

neclimdul’s picture

Status: Needs work » Needs review
FileSize
28.28 KB

simpletest... Who knew you could get failures only when using --all. Also, failures that tell you 100% the wrong thing. Typo in filename caused the failure.

Yes, that was the only way I came up with to effective do tests. clearly this approach is blocked on the other patch but we need to discuss the 2 approaches.

benjy’s picture

+++ b/core/modules/migrate/src/Event/MigrateImportEvent.php
@@ -25,10 +25,13 @@ class MigrateImportEvent extends Event {
+  public function __construct(MigrateMessageInterface $message, MigrationInterface $migration) {

Why not: $migration, $message same as MigrateExecutable

  • webchick committed 2066518 on 8.0.x
    Issue #2545632 by neclimdul, hussainweb, mikeryan, phenaproxima, benjy:...
webchick’s picture

Priority: Major » Normal
Status: Needs review » Needs work
Issue tags: -Migrate critical

Ok I figured this issue would be a quick, ~10 minute fix to just put the code back in that was taken out, but this issue has now taken on a life of its own. In the meantime though we've lost capabilities that several people actively doing migrations have cited as important, and while we're in no danger of rolling a new beta right now, I don't think we want to accidentally put one out with this functionality missing.

Therefore, committed and pushed #26 (simple rollback) to 8.0.x to get us back to where we were prior to #2541580: Remove obsoleted functionality from core Migrate.

That will necessitate a re-roll of this one, but it can now proceed at whatever pace.

I also believe that makes this one much less severe than before, so removing Migrate critical tag and downgrading to normal. If that's wrong, please feel free to re-adjust.

hussainweb’s picture

The objective now is to decouple the memory reclamation from migrate core by using events and messages. Rather than renaming the title and issue summary to more accurately describe the new objective, is it better to create a followup? The title currently is "Re-add memory reclamation in migrate system" and we have already done that (in #26, which was committed). Should we mark this as fixed then?

benjy’s picture

We can do it here, don't think it matters to much.

neclimdul’s picture

Title: Re-add memory reclamation in migrate system » Move memory reclamation out of migrate executable
Issue summary: View changes
Status: Needs work » Postponed

Agreed. Updated title and IS and things.

  • webchick committed 2066518 on 8.1.x
    Issue #2545632 by neclimdul, hussainweb, mikeryan, phenaproxima, benjy:...
benjy’s picture

Version: 8.0.x-dev » 8.1.x-dev
Assigned: neclimdul » Unassigned

Is postponed the right status here?

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.0-beta1 was released on March 2, 2016, which means new developments and disruptive changes should now be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

mikeryan’s picture

neclimdul’s picture

Title: Move memory reclamation out of migrate executable » [PP1] Move memory reclamation out of migrate executable

Yes.

  • webchick committed 2066518 on 8.3.x
    Issue #2545632 by neclimdul, hussainweb, mikeryan, phenaproxima, benjy:...

  • webchick committed 2066518 on 8.3.x
    Issue #2545632 by neclimdul, hussainweb, mikeryan, phenaproxima, benjy:...
quietone’s picture

Status: Postponed » Needs work

#2546004: Add message service to migrate events is in so this is no longer Postponed. Setting to NW since that what it was before being postponed. Not changing the title because I don't know the meaning of [PP1].

neclimdul’s picture

Issue summary: View changes
Status: Needs work » Postponed
Related issues: +#2778433: Allow migration events to affect executable loop status.

Because of recent change to the migration executable this is blocked again. This time by #2778433: Allow migration events to affect executable loop status..

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.0-beta1 was released on August 3, 2016, which means new developments and disruptive changes should now be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

mikeryan’s picture

Just had a discussion in IRC with neclimdul about this and the child issue #2778433: Allow migration events to affect executable loop status.. I think the child issue is primarily motivated by the fact that right now, an exception thrown from a POST_ROW_SAVE handler would skip saveIdMapping(), which we don't want - we want to make sure the last imported item is properly recorded:

$destination_id_values = $destination->import($row, $id_map->lookupDestinationId($this->sourceIdValues));
$this->getEventDispatcher()->dispatch(MigrateEvents::POST_ROW_SAVE, new MigratePostRowSaveEvent($this->migration, $row, $destination_id_values));
// call saveIdMapping()

Looking at that code, I think it's wrong - POST_ROW_SAVE should be dispatched after saveIdMapping(). We want to maintain data integrity here - the import of an item, and the recording of its disposition, should not be separated by a potential exception from a POST_ROW_SAVE handler. Thus, my recommendation in IRC was:

  1. Move the dispatch to after the idmap is saved.
  2. Define a new exception, to be thrown from the POST_ROW_SAVE handler for memory reclamation.
  3. Upon catching the exception, set $return to RESULT_INCOMPLETE and break the processing loop.

But, as I reacquaint myself with this section of the executable processing loop, we already have a mechanism for interrupting the loop - the POST_ROW_SAVE handler could simply do

$migration->interruptMigration(MigrationInterface::RESULT_INCOMPLETE);

and it should all work fine.

Thoughts?

benjy’s picture

Yeah, +1 for $migration->interruptMigration(MigrationInterface::RESULT_INCOMPLETE);, I don't really like the idea of adding another exception to control the flow.

catch’s picture

The drupal_static_reset() call here won't do much any more in 8.x, since 97% of drupal_static() usage have been removed.

#1199866: Add an in-memory LRU cache might make the explicit memory reclaim unnecessary.

mikeryan’s picture

The last patch doesn't really reflect reality - effectively, the drupal_static_reset() call is being moved, not added, and it might as well stay to protect against any contrib users of drupal_static() building a large cache.

Most definitely looking forward to the entity cache issues being addressed, though!

neclimdul’s picture

What is this method? I'll have to take a look...

re #49yeah actually currently the most important thing is actually clearing the pesky on object entity cache. Great for page loads but on a long running process, having every one of your 160k nodes stored as a full blown object in the entity store doesn't work out so well... ;)

heddn’s picture

Status: Postponed » Closed (duplicate)
Related issues: +#2821216: Move memory reclamation out of migrate executable

There are too many commits and rollbacks in the issue. Opening a new issue and closing it as duplicate so we have a clean slate to start with. I hope that doesn't ruffle too many feathers.

benjy’s picture

There are too many commits and rollbacks in the issue. Opening a new issue and closing it as duplicate so we have a clean slate to start with.

Personally i think a clean state is worse, it's nice to have all the previous conversation around the issue, plus we have 15 people watching the issue.