Currently, batch_limit for drush migrate-import is 1/100 of the total items to process. That's fine (or doesn't matter) for sources like SQL, but for fairly large file-based sources (XML in my case) it seems to slow down the process with each new batch. The first 1-3 batches of 25k items import almost instantly, then it starts to slow down considerably. No big deal for a one-time process, but in my case it's a recurring (daily) migration. The source file isn't large, so I think increasing batch_limit would help.
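
For anyone following along, the arithmetic behind that default can be sketched in shell, the same language as the drush commands in this issue. This is an illustration only, not the module's code: the function names are made up, rounding up is an assumption, and the 60000-item total is a hypothetical figure.

```shell
# Default batch limit as described in the summary: 1/100 of the total items.
# Rounding up is an assumption, so that a small source still gets a limit of 1.
default_batch_limit() {
  echo $(( ($1 + 99) / 100 ))
}

# Number of batches needed to cover all items (ceiling division).
batch_count() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

total=60000                             # hypothetical source size
limit=$(default_batch_limit "$total")
echo "limit=$limit batches=$(batch_count "$total" "$limit")"
# prints: limit=600 batches=100
```

With a limit of 1/100 of the total, a run takes about 100 batches however large the source is, so a larger limit directly means fewer batches.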

Is there a way other than hacking the code?

Comments

scoff created an issue. See original summary.

scoff’s picture

Same migration, batch_limit 251, 1000 and 2507 (changed in MigrateBatchExecutable.php)

$ date && drush mim xmlsql && date
Thu Jul 11 13:10:13 +10 2019
Processed 251 items (251 created, 0 updated, 0 failed, 0 ignored) - continuing with 'xmlsql'
...
Processed 219 items (1 created, 0 updated, 0 failed, 218 ignored) - done with 'xmlsql'
Thu Jul 11 13:27:30 +10 2019

~17 minutes

$ date && drush mim xmlsql && date
Thu Jul 11 13:34:17 +10 2019
Processed 1000 items (1000 created, 0 updated, 0 failed, 0 ignored) - continuing with 'xmlsql'
...
Processed 68 items (0 created, 0 updated, 0 failed, 68 ignored) - done with 'xmlsql'
Thu Jul 11 13:39:23 +10 2019

~5 minutes

$ date && drush mim xmlsql && date
Thu Jul 11 13:41:23 +10 2019
Processed 2507 items (2507 created, 0 updated, 0 failed, 0 ignored) - continuing with 'xmlsql'
...
Processed 2505 items (2082 created, 0 updated, 0 failed, 423 ignored) - done with 'xmlsql'
Thu Jul 11 13:43:39 +10 2019

~2 minutes
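
For exact figures, the elapsed time of each run can be worked out from the date output above. A small shell sketch (the helper names to_seconds and elapsed are made up for illustration):

```shell
# Convert an HH:MM:SS wall-clock time to seconds since midnight.
to_seconds() {
  h=${1%%:*}
  rest=${1#*:}
  m=${rest%%:*}
  s=${rest#*:}
  # Strip one leading zero so values such as "08" are not read as octal.
  echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Seconds between a start and an end time on the same day.
elapsed() {
  echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

echo "batch_limit 251:  $(elapsed 13:10:13 13:27:30) s"   # 1037 s
echo "batch_limit 1000: $(elapsed 13:34:17 13:39:23) s"   # 306 s
echo "batch_limit 2507: $(elapsed 13:41:23 13:43:39) s"   # 136 s
```

That works out to 17m17s, 5m06s and 2m16s respectively.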

jlscott’s picture

I have created a patch which allows a new parameter called "batch_size" to be included in the options array provided to the constructor of the MigrateBatchExecutable class. This value is used to set the size of each batch within the limit for each migrate import.

jlscott’s picture

Title: An option to override batch limit? » An option to override batch size?
rob_pr’s picture

I re-rolled the patch for the current 4.x-dev and added a few minor fixes.

Since drush migrate:import will use MigrateBatchExecutable soon, I added "--batch-size" as an option.

heddn’s picture

Version: 8.x-4.x-dev » 8.x-5.x-dev
Status: Needs review » Needs work

Let's re-roll this for 5.x please. It also seems there are still some CS issues.

src/MigrateBatchExecutable.php ✗ 1 more
line 49	Missing @var tag in member variable comment
jlscott’s picture

Patch re-rolled for 5.x branch. Note that this patch also applies to the latest 4.x releases.

jlscott’s picture

Patch updated to add missing @var tag.

jlscott’s picture

Status: Needs work » Needs review
amjad1233’s picture

Status: Needs review » Reviewed & tested by the community

@jlscott Looks good to me. Just deployed it to one of our projects and it seems to be working as expected.

heddn’s picture

Status: Reviewed & tested by the community » Closed (duplicate)

The great gitlab migration is upon us. See https://gitlab.com/drupalspoons/migrate_tools/-/issues/71. The latest patch from here is posted to https://gitlab.com/drupalspoons/migrate_tools/-/merge_requests/3.

zcht’s picture

It seems that I would need exactly this patch for my migration :) However, it doesn't seem to work for me when I take the patch from the #8 comment and install it. I'm using the current 8.x-5.x-dev version of the module, under Drupal 8.9.3. I wanted to do the migration with an example size:

drush migrate:import my_taxonomy_migration --batch-size=5

I get the following error:

<?php
 [error]  Error: Class 'Drupal\migrate\Bytes' not found in Drupal\migrate\MigrateExecutable->__construct() (line 109 of /var/www/html/docroot/core/modules/migrate/src/MigrateExecutable.php) #0 /var/www/html/docroot/modules/contrib/migrate_tools/src/MigrateExecutable.php(99): Drupal\migrate\MigrateExecutable->__construct(Object(Drupal\migrate\Plugin\Migration), Object(Drupal\migrate_tools\Drush9LogMigrateMessage))
#1 /var/www/html/docroot/modules/contrib/migrate_tools/src/Commands/MigrateToolsCommands.php(407): Drupal\migrate_tools\MigrateExecutable->__construct(Object(Drupal\migrate\Plugin\Migration), Object(Drupal\migrate_tools\Drush9LogMigrateMessage), Array)
#2 [internal function]: Drupal\migrate_tools\Commands\MigrateToolsCommands->rollback('my_taxonomy_migration...', Array)
#3 /var/www/html/vendor/consolidation/annotated-command/src/CommandProcessor.php(257): call_user_func_array(Array, Array)
#4 /var/www/html/vendor/consolidation/annotated-command/src/CommandProcessor.php(212): Consolidation\AnnotatedCommand\CommandProcessor->runCommandCallback(Array, Object(Consolidation\AnnotatedCommand\CommandData))
#5 /var/www/html/vendor/consolidation/annotated-command/src/CommandProcessor.php(176): Consolidation\AnnotatedCommand\CommandProcessor->validateRunAndAlter(Array, Array, Object(Consolidation\AnnotatedCommand\CommandData))
#6 /var/www/html/vendor/consolidation/annotated-command/src/AnnotatedCommand.php(302): Consolidation\AnnotatedCommand\CommandProcessor->process(Object(Symfony\Component\Console\Output\ConsoleOutput), Array, Array, Object(Consolidation\AnnotatedCommand\CommandData))
#7 /var/www/html/vendor/symfony/console/Command/Command.php(255): Consolidation\AnnotatedCommand\AnnotatedCommand->execute(Object(Drush\Symfony\DrushArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#8 /var/www/html/vendor/symfony/console/Application.php(1005): Symfony\Component\Console\Command\Command->run(Object(Drush\Symfony\DrushArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#9 /var/www/html/vendor/symfony/console/Application.php(255): Symfony\Component\Console\Application->doRunCommand(Object(Consolidation\AnnotatedCommand\AnnotatedCommand), Object(Drush\Symfony\DrushArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#10 /var/www/html/vendor/symfony/console/Application.php(148): Symfony\Component\Console\Application->doRun(Object(Drush\Symfony\DrushArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#11 /var/www/html/vendor/drush/drush/src/Runtime/Runtime.php(118): Symfony\Component\Console\Application->run(Object(Drush\Symfony\DrushArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#12 /var/www/html/vendor/drush/drush/src/Runtime/Runtime.php(49): Drush\Runtime\Runtime->doRun(Array, Object(Symfony\Component\Console\Output\ConsoleOutput))
#13 /var/www/html/vendor/drush/drush/drush.php(72): Drush\Runtime\Runtime->run(Array)
#14 /var/www/html/vendor/drush/drush/includes/preflight.inc(18): require('/var/www/html/v...')
#15 phar:///usr/local/bin/drush/bin/drush.php(141): drush_main()
#16 /usr/local/bin/drush(10): require('phar:///usr/loc...')
#17 {main}.
?>

As a test, I added the line "use Drupal\Component\Utility\Bytes;" to MigrateExecutable.php, but that doesn't seem to work either. I get the following error:

1/124 [>---------------------------]   0% [error]  Error: Call to undefined method Drupal\Component\Utility\Bytes::toNumber() in Drupal\migrate\MemoryManager->__construct() (line 68 of /var/www/html/docroot/core/modules/migrate/src/MemoryManager.php) #0 /var/www/html/docroot/core/lib/Drupal/Component/DependencyInjection/Container.php(259): Drupal\migrate\MemoryManager->__construct(Object(Drupal\Component\EventDispatcher\ContainerAwareEventDispatcher), '0.9', '0.85')
#1 /var/www/html/docroot/core/lib/Drupal/Component/DependencyInjection/Container.php(173): Drupal\Component\DependencyInjection\Container->createService(Array, 'migrate.memory_...')
#2 /var/www/html/docroot/core/lib/Drupal.php(158): Drupal\Component\DependencyInjection\Container->get('migrate.memory_...')
#3 /var/www/html/docroot/core/modules/migrate/src/MigrateExecutable.php(150): Drupal::service('migrate.memory_...')
#4 /var/www/html/docroot/core/modules/migrate/src/MigrateExecutable.php(466): Drupal\migrate\MigrateExecutable->getMemoryManager()
#5 /var/www/html/docroot/core/modules/migrate/src/MigrateExecutable.php(338): Drupal\migrate\MigrateExecutable->checkStatus()
#6 /var/www/html/vendor/drush/drush/includes/drush.inc(206): Drupal\migrate\MigrateExecutable->rollback()
#7 /var/www/html/vendor/drush/drush/includes/drush.inc(197): drush_call_user_func_array(Array, Array)
#8 /var/www/html/docroot/modules/contrib/migrate_tools/src/Commands/MigrateToolsCommands.php(412): drush_op(Array)
#9 [internal function]: Drupal\migrate_tools\Commands\MigrateToolsCommands->rollback('my_taxonomy_migration...', Array)
#10 /var/www/html/vendor/consolidation/annotated-command/src/CommandProcessor.php(257): call_user_func_array(Array, Array)
#11 /var/www/html/vendor/consolidation/annotated-command/src/CommandProcessor.php(212): Consolidation\AnnotatedCommand\CommandProcessor->runCommandCallback(Array, Object(Consolidation\AnnotatedCommand\CommandData))
#12 /var/www/html/vendor/consolidation/annotated-command/src/CommandProcessor.php(176): Consolidation\AnnotatedCommand\CommandProcessor->validateRunAndAlter(Array, Array, Object(Consolidation\AnnotatedCommand\CommandData))
#13 /var/www/html/vendor/consolidation/annotated-command/src/AnnotatedCommand.php(302): Consolidation\AnnotatedCommand\CommandProcessor->process(Object(Symfony\Component\Console\Output\ConsoleOutput), Array, Array, Object(Consolidation\AnnotatedCommand\CommandData))
#14 /var/www/html/vendor/symfony/console/Command/Command.php(255): Consolidation\AnnotatedCommand\AnnotatedCommand->execute(Object(Drush\Symfony\DrushArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#15 /var/www/html/vendor/symfony/console/Application.php(1005): Symfony\Component\Console\Command\Command->run(Object(Drush\Symfony\DrushArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#16 /var/www/html/vendor/symfony/console/Application.php(255): Symfony\Component\Console\Application->doRunCommand(Object(Consolidation\AnnotatedCommand\AnnotatedCommand), Object(Drush\Symfony\DrushArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#17 /var/www/html/vendor/symfony/console/Application.php(148): Symfony\Component\Console\Application->doRun(Object(Drush\Symfony\DrushArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#18 /var/www/html/vendor/drush/drush/src/Runtime/Runtime.php(118): Symfony\Component\Console\Application->run(Object(Drush\Symfony\DrushArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#19 /var/www/html/vendor/drush/drush/src/Runtime/Runtime.php(49): Drush\Runtime\Runtime->doRun(Array, Object(Symfony\Component\Console\Output\ConsoleOutput))
#20 /var/www/html/vendor/drush/drush/drush.php(72): Drush\Runtime\Runtime->run(Array)
#21 /var/www/html/vendor/drush/drush/includes/preflight.inc(18): require('/var/www/html/v...')
#22 phar:///usr/local/bin/drush/bin/drush.php(141): drush_main()
#23 /usr/local/bin/drush(10): require('phar:///usr/loc...')
#24 {main}.

Any idea how I could fix it? My taxonomy migrations are very small, but I would like to use this patch for my node migrations, keeping the batch size as small as possible, since the node migration is very large (>60k nodes).

Thanks a lot for the great work :)

mrweiner made their first commit to this issue’s fork.

rob_pr’s picture

Status: Closed (duplicate) » Needs review

I updated the patch from #8 for the current 5.x-dev and started a merge request on drupalcode.org.

heddn’s picture

Status: Needs review » Needs work

I like the direction this is going. Can we add test coverage? Either to drush or the UI for adjusting the batch size?

Marios Anagnostopoulos’s picture

Would it be a good addition to allow a more global setting? For example, one in the migration group, which would be overridden per migration when the setting from this patch is filled in.

To implement that we would need changes in both the migrate tools & migrate plus modules though, and I have no idea how we should handle this in the issue queue.

I believe it is worth it though, and am willing to work on it if you can guide me on how to handle patches that depend on one another. Any thoughts?

Edit: Maybe we could also consider adding a setting in the migration configuration, or is that overkill?

Marios Anagnostopoulos’s picture

Additionally, the patch in #8 does not apply for some reason; some line numbers in the patch are off. (I am using 5.1.0 of migrate_tools with no other patch for the module, and also checked it against the 8.x-5.x branch.) Maybe I am doing something wrong, but for anyone facing the same issue, here is a reroll.

james.williams made their first commit to this issue’s fork.

james.williams’s picture

I've just brought across the work from #2701121: New batch started because of not enough reclaimed memory seems to forget which migrations have already run leading to missing migration dependencies. That expands the scope of this issue a bit, but as @heddn suggested in #2701121-53, combining forces on these two closely related issues should be well worth it.

I've stuck with the 'batch-size' option from this issue, rather than using the 'batch' option from the other one. I've then added a single commit of my own work on top, to allow specifying the 'batch-size' option as either a percentage of the total number of items, or an absolute number of items to process in each iteration. This resolves the 'todo' comment in the existing code, which was placed there in the original issue that added batching in any form (see #2470882-16: Implement running migration processes through the UI), so that the limit can be set usefully for both large and small migrations.
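
To illustrate the percentage-or-absolute idea, here is a rough shell sketch. It is not the code from the merge request: the function name resolve_batch_size and the "%" suffix notation are assumptions made for this example.

```shell
# Resolve a batch-size value to a number of items per batch.
#   $1: total number of items in the source
#   $2: batch size, either an absolute count ("500") or a percentage ("5%")
# The "%" suffix is an assumed notation for this sketch only.
resolve_batch_size() {
  total=$1
  spec=$2
  case "$spec" in
    *%)
      pct=${spec%\%}
      # Ceiling division so a non-empty source never resolves to 0.
      echo $(( (total * pct + 99) / 100 ))
      ;;
    *)
      # Absolute values are passed through unchanged.
      echo "$spec"
      ;;
  esac
}

resolve_batch_size 42000 1000   # prints 1000
resolve_batch_size 42000 5%     # prints 2100
resolve_batch_size 50 1%        # prints 1
```

A percentage keeps the number of batches constant as the source grows, while an absolute value keeps the work per batch constant, which is why having both is useful for large and small migrations alike.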

Meanwhile, I'm hiding all the patches here, as the merge request is probably the thing that should be used nowadays. (The most recent patch was simply a re-rolled version that had nothing but trivial differences to the merge request.)

Leaving as 'needs work', as comment 16 sensibly asked for more test coverage.

Note that any potential credit for people from #2701121: New batch started because of not enough reclaimed memory seems to forget which migrations have already run leading to missing migration dependencies should be brought across, should this work get accepted.

Radelson’s picture

MigrateBatchExecutable is not compatible with the --sync option.

Not directly related to this issue's description, but it seems apt to mention it here. The --sync option isn't supported from the UI, but the patch introduces batching support for the Drush command, which advertises the --sync option.

Maybe we shouldn't support the --sync option when it is used with the --batch-size option, and postpone the work to make them compatible? The command could fail and warn the user that the flags aren't compatible.
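
Until the two are compatible, a wrapper script could enforce that on its own. A minimal sketch in shell, assuming a hypothetical check_flags helper that runs before drush is called; it is not part of the merge request or of Drush itself:

```shell
# Return non-zero when both --sync and --batch-size appear in the arguments.
check_flags() {
  has_sync=0
  has_batch=0
  for arg in "$@"; do
    case "$arg" in
      --sync) has_sync=1 ;;
      --batch-size|--batch-size=*) has_batch=1 ;;
    esac
  done
  if [ "$has_sync" -eq 1 ] && [ "$has_batch" -eq 1 ]; then
    echo "--sync and --batch-size cannot be combined" >&2
    return 1
  fi
  return 0
}

# Possible use in a wrapper script:
#   check_flags "$@" && drush migrate:import "$@"
```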

We replaced our existing migrations to use batching and we ended up working around that limitation by running them first without batching and with the --limit option. We sync first and then import in a second step.

HitchShock made their first commit to this issue’s fork.

timohuisman’s picture

FileSize
20.62 KB

We use composer patches to apply patches. Merge request patch URLs are updated for every commit on the branch, which makes them a moving target. Attached is a patch containing the current state of the merge request.

andycarlberg made their first commit to this issue’s fork.

andycarlberg’s picture

Version: 8.x-5.x-dev » 6.0.x-dev
Status: Needs work » Needs review

I rerolled this against the 6.x version and opened a new PR. I'm updating the version on this issue accordingly.

I haven't rerolled against a major version before. If I should have opened a new issue, let me know and I will do so.

gpolonus made their first commit to this issue’s fork.

grifstuf’s picture

To fix a type error, we had to add a boolean to integer type cast in the MigrateBatchExecutable class. This is because drush brings in CLI option flags as booleans but the class requires integers for the update and force options as opposed to booleans. This type cast did not happen automatically because the MigrateBatchExecutable class has strict_types enabled for that file.

While this fixes the type error encountered when running these batch migrations for now, it is fair to say that the MigrateBatchExecutable class should be able to accept booleans as well as integers. The values for the update and force options are used for their falsiness or truthiness, so semantically the class should accept booleans as well as integers for these options. It might be worth having another issue to fix this tech debt for anyone else wishing to interface with this class in the future.

timohuisman’s picture

FileSize
13.29 KB

This patch contains the current state of the merge request.

manuel.adan’s picture

Title: An option to override batch size? » Option to override batch size
Component: Drush commands » Code
Category: Support request » Feature request
FileSize
525 bytes

The new option might take some time to be fully implemented. A "security" max limit could be useful in the meantime.

naveenvalecha’s picture

Assigned: Unassigned » naveenvalecha

Assigning myself to review & test the PR.

jardasmahel made their first commit to this issue’s fork.

jardasmahel’s picture

The tests pass locally, but they fail in CI.
Can someone share some wisdom on this?

lando phpunit --group=migrate_tools
PHPUnit 8.5.31 by Sebastian Bergmann and contributors.

Testing
.................................... 36 / 36 (100%)

Time: 4.84 minutes, Memory: 353.00 MB

OK (36 tests, 328 assertions)

HTML output was generated
https://drupal-contributions.lndo.site/sites/simpletest/browser_output/D...

eleonel made their first commit to this issue’s fork.

naveenvalecha’s picture

Status: Needs review » Reviewed & tested by the community
timohuisman’s picture

FileSize
60 KB

This patch contains the current state of the merge request. See "Patches from drupal.org merge request URLs are dangerous?" for more information.

gaards’s picture

The latest patch provided here didn't apply, so I created a new one from the latest diff of merge request 19.

gaards’s picture

FileSize
16.99 KB

Updated the patch with a small fix to the option name which was incorrect, causing the batch size to not work (thanks to olli for spotting it).

thisisalistairsaccount’s picture

Hoping if things are all good here that we could get this into a future release for the module? This would be beneficial for some people we're working with and would love to see it included :)

Kfootm70’s picture

Much like the above comment, is there a date of release for this module please? Thanks team!

naveenvalecha’s picture

Assigned: naveenvalecha » Unassigned

Forgot to unassign myself.

aurelianzaha’s picture

FileSize
16.62 KB

Attached is the patch re-rolled against 6.0.x.

Kristen Pol’s picture

Status: Reviewed & tested by the community » Needs work

I'm ignoring the patches (though I understand why some people are adding them).

We will use the MR. Since there are 2 open, we will use the more recent reroll.

But, the latest MR (19) needs a merge conflict resolved, so moving to needs work:

https://git.drupalcode.org/project/migrate_tools/-/merge_requests/19/con...

Kristen Pol’s picture

Status: Needs work » Needs review

Back to needs review.

Kristen Pol’s picture

I've compared MRs 7 and 19 to see what has changed other than coding standards and formatting and these are the things I found.

-    $migration_dependencies = $migration_plugin->getMigrationDependencies();
+    $migration_dependencies = $migration_plugin->getMigrationDependencies(TRUE);

-      $migration_dependencies = $migration->getMigrationDependencies();
+      $migration_dependencies = $migration->getMigrationDependencies(TRUE);

+   * @option batch-size Optionally use batch iterations, with a limit on the
+   *   number of items to process in each batch iteration.

The description is substantially shorter in the latest MR, but then that makes it much easier to read. Perhaps the extra details could be moved into the README or some other documentation.

+      // Integer cast required because MigrateBatchExecutable requires the
+      // `update` and `force` options to strictly be integers.
+      $options['update'] = (int) $options['update'];
+      $options['force'] = (int) $options['force'];
+    // If this batch is run via Drush, we need to initialize the progress bar
+    // for the background process.
+    if (PHP_SAPI === 'cli') {
+      $output = Drush::output();
+      $output->setDecorated(TRUE);
+      // Initialize the Symfony Console progress bar.
+      \Drupal::service('migrate_tools.migration_drush_command_progress')->initializeProgress(
+        $output,
+        $migration,
+        $options,
+      );
+    }
+    
+    $executable = new MigrateBatchExecutable($migration, $message, $options);
   public function onPostRowDelete(MigrateRowDeleteEvent $event) {
     if ($this->feedback && ($this->deleteCounter) && $this->deleteCounter % $this->feedback == 0) {
-      $this->rollbackMessage(FALSE);
+      $this->progressMessageEmit(FALSE);
Kristen Pol’s picture

I got confirmation from @heddn via Slack that these changes above look okay so this can move forward with a round of testing for the *new* MR. Once that happens and this is RTBC with the new MR, then we can see if @heddn or one of the other comaintainers can get this in.

emixaam’s picture

Successfully tested MR 19 on a large custom migration (42000 nodes) with --batch-size of 1000 and 2000, with and without the --update flag, in a local environment and on a dev server. migrate-import correctly processes the given number of items, and the duration of each batch stays similar.

Kristen Pol’s picture

Issue tags: +Needs manual testing

Hmm... that's disappointing. Anyone else able to test?