Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
The MigrateExecutable class tracks statistics on the migration being performed, but provides no way to retrieve them. TestMigrateExecutable does.
Proposed resolution
Move public methods getTotalSuccesses(), getTotalProcessed(), getSuccessesSinceFeedback() and getProcessedSinceFeedback() from TestMigrateExecutable to MigrateExecutable.
Remaining tasks
Move the functions.
User interface changes
N/A for core (Migrate Plus will make use of it in contrib).
API changes
Add getTotalSuccesses(), getTotalProcessed(), getSuccessesSinceFeedback() and getProcessedSinceFeedback() to the public API.
Comment | File | Size | Author |
---|---|---|---|
#34 | 2443081-migrate-map-events-34.patch | 19.63 KB | mikeryan |
#29 | interdiff.txt | 5.01 KB | mikeryan |
#29 | make_migrateexecutable-2443081-29.patch | 11.88 KB | mikeryan |
Comments
Comment #1
mikeryanAlso, defaulted the stats to 0 (rather than make any user-facing code have to check for NULL and throw in a 0).
Comment #2
benjy CreditAttribution: benjy at CodeDrop commentedWe should probably add the methods to MigrateExecutableInterface as well.
Comment #3
mikeryanAdded the methods to the interface per @benjy's suggestion.
Also improved the docs.
And, added a $reset parameter to the *SinceFeedback methods - when being retrieved for feedback purposes, passing TRUE will zero the fields.
Comment #4
ultimikeLooks good to me.
-mike
Comment #5
benjy CreditAttribution: benjy at CodeDrop commentedI'm not a huge fan of the $reset parameter to getters to reset the values, wouldn't a reset method be better?
Also, I was thinking, maybe we should move the statistics out of the executable into their own object? The executable could just return a MigrationStatistics object or such. Thoughts?
Comment #6
mikeryanIt seems simpler for clients to make the one method call rather than two (get and reset) - it would be rare to reset the count without reading the previous count. But, no biggie - anyone else want to weigh in?
Interesting. That simplifies the MigrateExecutable API to just the one method, and we wouldn't have to extend it (just the MigrationStatistics class) to add more stats...
Comment #7
mikeryanOn the other hand, where the MigrateExecutable currently simply increments its count members, the MigrationStatistics class would have to (in addition to getters and resetters) need to have incrementers for each stat... - it'll be a lot of methods on that API to just manage a few counters.
Comment #8
benjy CreditAttribution: benjy at CodeDrop commentedIt maybe a lot of methods but it will also be an accurate representation of what statistics are available and the public methods will document the API/stats that are available.
Comment #9
mikeryanHere it is with the additional class.
Comment #10
mikeryanLatest version is not yet rtbc.
Comment #11
benjy CreditAttribution: benjy at CodeDrop commentedIn general I like this, I think it's a good clean-up. Few coding standards and comment nitpicks below;
This should be the full class path.
This should be @inheritdoc on the implementation.
Full class path here.
All method comments on the implementation should just use @inheritdoc
s/boolean/bool
Comment #12
mikeryanUpdated per code review.
But now, much as I hate to bikeshed my own patch, I was wondering if we should make it more general and allow counters to be added dynamically. Something like
So, replace
$this->statistics->incrementSuccesses()
with$this->statistics->incrementCounter('successes')
. As for handling the feedback case, it could be done by also tracking an explicit 'successes_since_feedback' counter, but that would require making two increment calls instead of one at each point. Alternatively we could automatically track two counters per counter name, add a getCounterSinceFeedback($counter_name), and rename resetCounter to resetFeedback.Use case - when developing and testing migration code I'm very frequently counting stuff - how many times we find a matching term instead of inserting a new one, how often a given field has 0, 1, or multiple values, etc. This would make it very easy to throw a counter into any plugin - the drush migrate-import command could have a --dump-counters option or somesuch. The disadvantage is static discovery of available statistics, although we could maintain getTotalSuccesses etc. as wrappers around the general API for the statistics used in core.
Comment #13
mikeryanAnd now thinking more about that "feedback"... In D7, progress messages were being generated within the migration class itself - at the end of the migration process by default, and periodically during it with the "feedback" option enabled. And it was only outputting the feedback counters, not the total counters - looking closely at the code, the total counters were only used for two things: computing overall throughput (983 items/second), and populating the migrate_log table, neither of which we're doing in the core module now.
So, I think we should forget about tracking the "feedback" separately - we just have simple counters with get/increment/reset methods. The web and drush interfaces can easily accumulate overall totals for themselves if they need them.
That being said, how will an external tool get periodic feedback during a migration process? #2309695: Add query batching to SqlBase might be the answer there, so I'm not inclined to worry about it here...
Comment #14
mikeryanNo interdiff, since nearly every line changed. This removes the extra "feedback" counters and provides general-purpose counters rather than hard-coded successes/processed counters. At this point, I'm not sure the new class really adds value over building the counters into MigrateExecutable, but consider this a POC...
Comment #15
mikeryanUgh, let me regenerate that patch file....
Comment #16
mikeryanComment #17
benjy CreditAttribution: benjy at CodeDrop commentedHow about somewhere in-between. Eg, lets keep methods for successes and processed, since they're the most common uses and then either in this issue or a follow-up, also add support for arbitrary counters.
Also, I see we've renamed to MigrationCounters. Do we have any other statistic that doesn't get classified as a counter?
Comment #18
mikeryanHaven't had time to generate an iteration with the explicit successes/processed methods... But re: counters - no, all the execution stats we're tracking are counters, and the API with its increment(Counter) method is assuming counters. I felt the more specific name would be clearer, but I'm not wedded to it - we can go back to Statistics if we think some other kind of statistics might be useful.
Comment #19
benjy CreditAttribution: benjy at CodeDrop commentedI'm not to worried either, counters works for me if you feel confident that's all we have.
Comment #20
mikeryanHere it is with specific methods for success/processed. Keeping in mind that I also want to add error/ignored counters, though, I fear method explosion...
Comment #21
benjy CreditAttribution: benjy at CodeDrop commentedYeah I see what you mean about so many methods. I was thinking methods like incrementSuccesses() and incrementProcessed() added to MigrateExecutable instead, and then we don't add the methods to the MigrationCounters object. That way, it's more readable and defined inside of MigrateExecutable but still leaves MigrationCounters flexible?
Comment #22
mikeryanLike so?
At this point #2443089: Track additional statistics when migrating is so simple I folded it in here, but I can back it out if requested...
edit: linked to wrong issue
Comment #23
mikeryanUploaded wrong interdiff...
Comment #24
benjy CreditAttribution: benjy at CodeDrop commentedThe more i think about this issue the more i'm unsure, I had a go at making the executable have a collection of MigrationCounter objects but I'm not sure about that either.
Now we're back to "dynamically" creating counters, maybe we should just go back to an array of counters on the executable and do away with MigrationCounters object altogether. When I originally suggested it we were calling the object MigrationStatistics with methods such as getTotalSuccesses() but that's all gone now. "simplified" patch attaches showing that approach as well.
Thoughts?
Comment #25
mikeryanNo longer an object.
I would prefer 'resetCount' and 'incrementCount'.
These functions need docs.
simplified++
Thanks!
Comment #26
mikeryanMade the changes myself (other than the name changes - writing the doc comments, I decided the original names were right). Also, today I ran across the source plugin counting statistics that nobody uses, I think this is the right patch to remove those.
Comment #27
benjy CreditAttribution: benjy at CodeDrop commentedCome to think of it do the increment* helper methods need to be public? They're only used by the executable now right? Are they likely to be used by something else in the future?
TODO
* Add missing param docs for incrementCounter, getCount and resetCounter
* Add increment* helper methods back to the interface if they're staying public.
* Add resetCounter, incrementCounter and getCount to the interface.
Comment #28
mikeryanWell... The source plugin might need access to incrementIgnored(), depending on how we resolve #2443617: Find a way for SourcePluginBase to get MigrateExecutable. But without the helpers, it can always use incrementCounter() anyway...
I'm feeling more now that we've moved this all back into MigrateExecutable that the helpers don't really add much - if they're not public (and I don't see a use case for overriding them) they don't add anything useful to the API. Maybe, if want the available counters to be discoverable, have
Definite yes to adding the basic counter APIs to the interface.
Comment #29
mikeryanRemoving the specialized methods, adding the public methods to the interface, adding getCounterList().
Comment #30
benjy CreditAttribution: benjy at CodeDrop commentedWhy would we do this? One documents exactly what it does, the other relies on a random string that I have to go find somewhere?
Comment #31
mikeryanStepping back a bit - this issue exists purely to support contrib UIs, which don't have the information necessary to be able to determine on its own the details of what's been imported/failed/ignored. So, for something completely different - instead of the core API dealing with statistics on its own, how about it dispatches events, enabling contrib to track the stats (and who knows what else)? This patch rips the stat tracking out and replaces it with having the ID map dispatch events when an ID mapping is saved or deleted. Note that it would obviate the need for #2443089: Track additional statistics when migrating as well.
I should admit, this is a foot in the door, I think an event-driven approach will help with other issues including
Comment #33
mikeryanYep, fixed up the unit tests to reflect my changes, forgot the simpletests...
Comment #34
mikeryanOK, this time both unit tests and simpletest pass locally.
I'll post a patch at #2432993: Implement reporting of results on migrate-import using the events to count imported/ignored/failed items.
Comment #35
mikeryanComment #36
benjy CreditAttribution: benjy at CodeDrop commentedIn general I think I like this approach, allows us to move a lot of this stuff to contrib.
Why would the migration offer the event dispatcher? Shouldn't Sql just get it's own?
Comment #37
benjy CreditAttribution: benjy at CodeDrop commentedAlso, we need a new title and IS
Comment #38
mikeryanI'd like to go beyond this, emitting events at various key points where contrib can do various things with them - it seemed simplest to me to have the migration hold the event dispatcher, which could propagate to the various plugins (e.g., so the destination plugin can implement #2501731: Migrate pre/post Import/Rollback methods not invoked with events).
Perhaps it would be best to postpone this issue and open a separate issue for adding all the event dispatching I'm imagining? And once that's in, this and a few other issues could be close as dupe/won't fix...
Comment #39
benjy CreditAttribution: benjy at CodeDrop commentedYes, lets open a new issue but lets not go for every event in one hit, how about a few obvious ones to get the concept into Migrate and then we can add more events as follow-ups?
Comment #40
mikeryanAt the moment, the plan is for this to be accomplished by #2535458: Dispatch events at key points during migration.
Comment #41
mikeryanContrib can do this itself now: https://www.drupal.org/node/2544874