Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
For testing large data sets, without waiting for a full migration to complete, it would be helpful to add a row limit option to limit the number of rows for each migration process.
Proposed resolution
Add a row limit option to MigrateExecutable.
Remaining tasks
- Write tests.
- Add the new option.
- Determine how usage
User interface changes
None, but possibly included in #2281691: User interface for migration-based upgrades.
API changes
@todo.
Comment | File | Size | Author |
---|---|---|---|
#7 | interdiff-6-7.txt | 727 bytes | hussainweb |
#7 | add_a_row_limit_option-2282013-7.patch | 7.17 KB | hussainweb |
#6 | interdiff.txt | 2.42 KB | benjy |
#6 | 2282013-6.patch | 7.22 KB | benjy |
#2 | 2282013-row-limit-option-2.patch | 6.44 KB | bdone |
Comments
Comment #1
bdone CreditAttribution: bdone commentedmanual testing of this option with nodes works, but still needs tests.
Comment #2
bdone CreditAttribution: bdone commentedi'm posting a copy of testImportWithValidRow() as a new test, but i've got questions as noted below, (and in this patch)...
i'm hoping someone can give pointers on those comments, or let me know if this the wrong direction.
Comment #3
benjy CreditAttribution: benjy commentedSeems like a good start, setting to NR for a bot run.
Comment #5
dawehnerWould it still make sense to add a similar check given that it could potentially speed up the query? Not sure whether it is worth to consider at all, though.
I think we should check with at least entries as currently it "obviously", does not process more items, given that there are no more.
Should we just skip this function in the test? The override does not do anything
Comment #6
benjy CreditAttribution: benjy commented@bdone, to answer your questions in #2
How to mock multiple rows? - I updated MigrateExecutableTest::getMockSource(), to take a parameter for the number of valid rows the source will have. Not sure if that is the right approach.
How to access MigrateIdMapInterface::importedCount()? - I don't think we need this? We can test the rowLimit by the fact that the other methods such as lookupDestinationId and getProcessPlugins should only be called once if the import breaks out of the loop.
#5
1. I don't see much benefit in this, a range that includes the entire data set?
2. We now have two rows been tested.
3. This makes rowLimitExceeded() public rather than protected.
Comment #7
hussainwebJust a smallish change. I don't think it adds any real value but it doesn't hurt either.
Comment #8
ultimikeI chatted with chx about this issue this morning, here's a partial transcript:
09:35 chx: the idea is splendid, the execution is wrong
09:35 chx: cweagans was pointing out a migrate d7 issue
09:36 chx: https://www.drupal.org/node/2296187
09:36 Druplicon: https://www.drupal.org/node/2296187 => Add batched query support to MigrateSQLSource #2296187: Add batched query support to MigrateSQLSource => 11 comments, 1 IRC mention
09:36 chx: these are not the same, of course
09:36 chx: yet , the source running unlimited queries is a huge problem
09:36 chx: at least when i search for range() i get nothing in SqlBase
09:37 ultimike: So the unlimited queries part the two issues have in common.
09:37 chx: and the only range() in https://www.drupal.org/files/issues/add_a_row_limit_option-2282013-7.patch is (ironic!) is a removal of a broken range()
09:37 ultimike: Rather, it should be considered in both issues.
09:37 ultimike: heh
09:37 chx: i would say that this issue should be postponed on top of the batch
09:38 chx: cos if you have batch support
09:38 chx: then you can tell the executable to run one batch
09:38 chx: and presto, you have a row limit. although then it's much less necessary
09:38 chx: heck, probably it won't even require a core patch .
09:38 chx: just make sure the runOneBatch or something is public.
With this in mind, I created a new issue #2309695: Add query batching to SqlBase.
-mike
Comment #9
benjy CreditAttribution: benjy commentedPostponed on #2309695: Add query batching to SqlBase
Comment #10
mikeryan#2535458: Dispatch events at key points during migration may be another way to accomplish this.
Comment #11
mikeryanContrib can do this itself now: https://www.drupal.org/node/2544874