Problem/Motivation

For testing large data sets, without waiting for a full migration to complete, it would be helpful to add a row limit option to limit the number of rows for each migration process.

Proposed resolution

Add a row limit option to MigrateExecutable.

Remaining tasks

  • Write tests.
  • Add the new option.
  • Determine how usage

User interface changes

None, but possibly included in #2281691: User interface for migration-based upgrades.

API changes

@todo.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

bdone’s picture

manual testing of this option with nodes works, but still needs tests.

bdone’s picture

i'm posting a copy of testImportWithValidRow() as a new test, but i've got questions as noted below, (and in this patch)...

    $this->assertTrue(FALSE, 'this will fail. see @todo.');
    // @todo: determine if this is the best place to test, how to
    // how to mock multiple rows, and how to access
    // MigrateIdMapInterface::importedCount()

i'm hoping someone can give pointers on those comments, or let me know if this the wrong direction.

benjy’s picture

Status: Active » Needs review

Seems like a good start, setting to NR for a bot run.

Status: Needs review » Needs work

The last submitted patch, 2: 2282013-row-limit-option-2.patch, failed testing.

dawehner’s picture

  1. +++ b/core/modules/migrate/src/Plugin/migrate/id_map/Sql.php
    @@ -723,13 +723,6 @@ public function rewind() {
    -
    -    // @todo Make this work.
    -    /*
    -    if (isset($this->options['itemlimit'])) {
    -      $query = $query->range(0, $this->options['itemlimit']);
    -    }
    -    */
    

    Would it still make sense to add a similar check given that it could potentially speed up the query? Not sure whether it is worth to consider at all, though.

  2. +++ b/core/modules/migrate/tests/src/MigrateExecutableTest.php
    @@ -366,6 +366,64 @@ public function testImportWithValidRowWithException() {
    +    $row->expects($this->once())
    +      ->method('getSourceIdValues')
    +      ->will($this->returnValue(array('id' => 'test')));
    

    I think we should check with at least entries as currently it "obviously", does not process more items, given that there are no more.

  3. +++ b/core/modules/migrate/tests/src/TestMigrateExecutable.php
    @@ -77,6 +77,26 @@ public function getMaxExecTime() {
    +   */
    +  public function rowLimitExceeded() {
    +    return parent::rowLimitExceeded();
    +  }
    

    Should we just skip this function in the test? The override does not do anything

benjy’s picture

Status: Needs work » Needs review
FileSize
7.22 KB
2.42 KB

@bdone, to answer your questions in #2

How to mock multiple rows? - I updated MigrateExecutableTest::getMockSource(), to take a parameter for the number of valid rows the source will have. Not sure if that is the right approach.

How to access MigrateIdMapInterface::importedCount()? - I don't think we need this? We can test the rowLimit by the fact that the other methods such as lookupDestinationId and getProcessPlugins should only be called once if the import breaks out of the loop.

#5

1. I don't see much benefit in this, a range that includes the entire data set?
2. We now have two rows been tested.
3. This makes rowLimitExceeded() public rather than protected.

hussainweb’s picture

Just a smallish change. I don't think it adds any real value but it doesn't hurt either.

ultimike’s picture

I chatted with chx about this issue this morning, here's a partial transcript:

09:35 chx: the idea is splendid, the execution is wrong
09:35 chx: cweagans was pointing out a migrate d7 issue
09:36 chx: https://www.drupal.org/node/2296187
09:36 Druplicon: https://www.drupal.org/node/2296187 => Add batched query support to MigrateSQLSource #2296187: Add batched query support to MigrateSQLSource => 11 comments, 1 IRC mention
09:36 chx: these are not the same, of course
09:36 chx: yet , the source running unlimited queries is a huge problem
09:36 chx: at least when i search for range() i get nothing in SqlBase
09:37 ultimike: So the unlimited queries part the two issues have in common.
09:37 chx: and the only range() in https://www.drupal.org/files/issues/add_a_row_limit_option-2282013-7.patch is (ironic!) is a removal of a broken range()
09:37 ultimike: Rather, it should be considered in both issues.
09:37 ultimike: heh
09:37 chx: i would say that this issue should be postponed on top of the batch
09:38 chx: cos if you have batch support
09:38 chx: then you can tell the executable to run one batch
09:38 chx: and presto, you have a row limit. although then it's much less necessary
09:38 chx: heck, probably it won't even require a core patch .
09:38 chx: just make sure the runOneBatch or something is public.

With this in mind, I created a new issue #2309695: Add query batching to SqlBase.

-mike

benjy’s picture

Status: Needs review » Postponed
mikeryan’s picture

#2535458: Dispatch events at key points during migration may be another way to accomplish this.

mikeryan’s picture

Status: Postponed » Closed (duplicate)

Contrib can do this itself now: https://www.drupal.org/node/2544874