Problem/Motivation

The MigrateExecutable class tracks statistics on the migration being performed, but provides no way to retrieve them. TestMigrateExecutable does.

Proposed resolution

Move public methods getTotalSuccesses(), getTotalProcessed(), getSuccessesSinceFeedback() and getProcessedSinceFeedback() from TestMigrateExecutable to MigrateExecutable.

Remaining tasks

Move the functions.

User interface changes

N/A for core (Migrate Plus will make use of it in contrib).

API changes

Add getTotalSuccesses(), getTotalProcessed(), getSuccessesSinceFeedback() and getProcessedSinceFeedback() to the public API.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

mikeryan’s picture

Status: Active » Needs review
FileSize
3.34 KB

Also, defaulted the stats to 0 (rather than make any user-facing code have to check for NULL and throw in a 0).

benjy’s picture

We should probably add the methods to MigrateExecutableInterface as well.

mikeryan’s picture

FileSize
5.32 KB
3.69 KB

Added the methods to the interface per @benjy's suggestion.

Also improved the docs.

And, added a $reset parameter to the *SinceFeedback methods - when being retrieved for feedback purposes, passing TRUE will zero the fields.

ultimike’s picture

Status: Needs review » Reviewed & tested by the community

Looks good to me.

-mike

benjy’s picture

I'm not a huge fan of the $reset parameter to getters to reset the values, wouldn't a reset method be better?

Also, I was thinking, maybe we should move the statistics out of the executable into their own object? The executable could just return a MigrationStatistics object or such. Thoughts?

mikeryan’s picture

I'm not a huge fan of the $reset parameter to getters to reset the values, wouldn't a reset method be better?

It seems simpler for clients to make the one method call rather than two (get and reset) - it would be rare to reset the count without reading the previous count. But, no biggie - anyone else want to weigh in?

Also, I was thinking, maybe we should move the statistics out of the executable into their own object? The executable could just return a MigrationStatistics object or such. Thoughts?

Interesting. That simplifies the MigrateExecutable API to just the one method, and we wouldn't have to extend it (just the MigrationStatistics class) to add more stats...

mikeryan’s picture

On the other hand, where the MigrateExecutable currently simply increments its count members, the MigrationStatistics class would have to (in addition to getters and resetters) need to have incrementers for each stat... - it'll be a lot of methods on that API to just manage a few counters.

benjy’s picture

It maybe a lot of methods but it will also be an accurate representation of what statistics are available and the public methods will document the API/stats that are available.

mikeryan’s picture

Here it is with the additional class.

mikeryan’s picture

Status: Reviewed & tested by the community » Needs review

Latest version is not yet rtbc.

benjy’s picture

In general I like this, I think it's a good clean-up. Few coding standards and comment nitpicks below;

  1. +++ b/core/modules/migrate/src/MigrateExecutable.php
    @@ -81,6 +58,13 @@ class MigrateExecutable implements MigrateExecutableInterface {
    +   * @var MigrationStatistics
    

    This should be the full class path.

  2. +++ b/core/modules/migrate/src/MigrateExecutable.php
    @@ -605,6 +584,15 @@ protected function getTimeElapsed() {
       /**
    +   * Return an object containing current execution statistics for a migration.
    +   *
    +   * @return MigrationStatistics
    +   */
    

    This should be @inheritdoc on the implementation.

  3. +++ b/core/modules/migrate/src/MigrateExecutableInterface.php
    @@ -45,6 +45,13 @@ public function processRow(Row $row, array $process = NULL, $value = NULL);
    +   * @return MigrationStatistics
    

    Full class path here.

  4. +++ b/core/modules/migrate/src/MigrationStatistics.php
    @@ -0,0 +1,115 @@
    +  /**
    +   * Returns the number of items successfully imported since the last feedback.
    +   *
    +   * @param boolean $reset
    +   *   TRUE to reset the number of successes since feedback to 0.
    +   *
    +   * @return int
    +   *   The number of successes since last feedback.
    +   */
    

    All method comments on the implementation should just use @inheritdoc

  5. +++ b/core/modules/migrate/src/MigrationStatisticsInterface.php
    @@ -0,0 +1,63 @@
    +   * @param boolean $reset
    

    s/boolean/bool

mikeryan’s picture

Updated per code review.

But now, much as I hate to bikeshed my own patch, I was wondering if we should make it more general and allow counters to be added dynamically. Something like

  public getCounter($counter_name);
  public incrementCounter($counter_name);
  public resetCounter($counter_name);

So, replace $this->statistics->incrementSuccesses() with $this->statistics->incrementCounter('successes'). As for handling the feedback case, it could be done by also tracking an explicit 'successes_since_feedback' counter, but that would require making two increment calls instead of one at each point. Alternatively we could automatically track two counters per counter name, add a getCounterSinceFeedback($counter_name), and rename resetCounter to resetFeedback.

Use case - when developing and testing migration code I'm very frequently counting stuff - how many times we find a matching term instead of inserting a new one, how often a given field has 0, 1, or multiple values, etc. This would make it very easy to throw a counter into any plugin - the drush migrate-import command could have a --dump-counters option or somesuch. The disadvantage is static discovery of available statistics, although we could maintain getTotalSuccesses etc. as wrappers around the general API for the statistics used in core.

mikeryan’s picture

And now thinking more about that "feedback"... In D7, progress messages were being generated within the migration class itself - at the end of the migration process by default, and periodically during it with the "feedback" option enabled. And it was only outputting the feedback counters, not the total counters - looking closely at the code, the total counters were only used for two things: computing overall throughput (983 items/second), and populating the migrate_log table, neither of which we're doing in the core module now.

So, I think we should forget about tracking the "feedback" separately - we just have simple counters with get/increment/reset methods. The web and drush interfaces can easily accumulate overall totals for themselves if they need them.

That being said, how will an external tool get periodic feedback during a migration process? #2309695: Add query batching to SqlBase might be the answer there, so I'm not inclined to worry about it here...

mikeryan’s picture

No interdiff, since nearly every line changed. This removes the extra "feedback" counters and provides general-purpose counters rather than hard-coded successes/processed counters. At this point, I'm not sure the new class really adds value over building the counters into MigrateExecutable, but consider this a POC...

mikeryan’s picture

Ugh, let me regenerate that patch file....

mikeryan’s picture

benjy’s picture

How about somewhere in-between. Eg, lets keep methods for successes and processed, since they're the most common uses and then either in this issue or a follow-up, also add support for arbitrary counters.

Also, I see we've renamed to MigrationCounters. Do we have any other statistic that doesn't get classified as a counter?

mikeryan’s picture

Haven't had time to generate an iteration with the explicit successes/processed methods... But re: counters - no, all the execution stats we're tracking are counters, and the API with its increment(Counter) method is assuming counters. I felt the more specific name would be clearer, but I'm not wedded to it - we can go back to Statistics if we think some other kind of statistics might be useful.

benjy’s picture

I'm not to worried either, counters works for me if you feel confident that's all we have.

mikeryan’s picture

Here it is with specific methods for success/processed. Keeping in mind that I also want to add error/ignored counters, though, I fear method explosion...

benjy’s picture

Yeah I see what you mean about so many methods. I was thinking methods like incrementSuccesses() and incrementProcessed() added to MigrateExecutable instead, and then we don't add the methods to the MigrationCounters object. That way, it's more readable and defined inside of MigrateExecutable but still leaves MigrationCounters flexible?

mikeryan’s picture

FileSize
11.46 KB
556 bytes

Like so?

At this point #2443089: Track additional statistics when migrating is so simple I folded it in here, but I can back it out if requested...

edit: linked to wrong issue

mikeryan’s picture

FileSize
6.23 KB

Uploaded wrong interdiff...

benjy’s picture

The more i think about this issue the more i'm unsure, I had a go at making the executable have a collection of MigrationCounter objects but I'm not sure about that either.

Now we're back to "dynamically" creating counters, maybe we should just go back to an array of counters on the executable and do away with MigrationCounters object altogether. When I originally suggested it we were calling the object MigrationStatistics with methods such as getTotalSuccesses() but that's all gone now. "simplified" patch attaches showing that approach as well.

Thoughts?

mikeryan’s picture

  1. +++ b/core/modules/migrate/src/MigrateExecutable.php
    @@ -81,6 +58,13 @@ class MigrateExecutable implements MigrateExecutableInterface {
    +   * An object for tracking statistics.
    

    No longer an object.

  2. +++ b/core/modules/migrate/src/MigrateExecutable.php
    @@ -604,6 +584,49 @@ protected function getTimeElapsed() {
    +  public function resetCounter($name) {
    +    $this->counters[$name] = 0;
    +  }
    +
    +  public function incrementCounter($name) {
    +    if (!isset($this->counters[$name])) {
    +      $this->counters[$name] = 0;
    +    }
    +    $this->counters[$name]++;
    +  }
    +
    +  public function getCount($name) {
    

    I would prefer 'resetCount' and 'incrementCount'.

    These functions need docs.

simplified++

Thanks!

mikeryan’s picture

FileSize
11.42 KB
3.53 KB

Made the changes myself (other than the name changes - writing the doc comments, I decided the original names were right). Also, today I ran across the source plugin counting statistics that nobody uses, I think this is the right patch to remove those.

benjy’s picture

Come to think of it do the increment* helper methods need to be public? They're only used by the executable now right? Are they likely to be used by something else in the future?

TODO

* Add missing param docs for incrementCounter, getCount and resetCounter
* Add increment* helper methods back to the interface if they're staying public.
* Add resetCounter, incrementCounter and getCount to the interface.

mikeryan’s picture

Well... The source plugin might need access to incrementIgnored(), depending on how we resolve #2443617: Find a way for SourcePluginBase to get MigrateExecutable. But without the helpers, it can always use incrementCounter() anyway...

I'm feeling more now that we've moved this all back into MigrateExecutable that the helpers don't really add much - if they're not public (and I don't see a use case for overriding them) they don't add anything useful to the API. Maybe, if want the available counters to be discoverable, have

public function counterList() {
  return array_keys($this->counters);
}

Definite yes to adding the basic counter APIs to the interface.

mikeryan’s picture

FileSize
11.88 KB
5.01 KB

Removing the specialized methods, adding the public methods to the interface, adding getCounterList().

benjy’s picture

+++ b/core/modules/migrate/src/MigrateExecutable.php
@@ -258,7 +258,7 @@
-        $this->incrementSkips();
+        $this->incrementCounter('skips');

Why would we do this? One documents exactly what it does, the other relies on a random string that I have to go find somewhere?

mikeryan’s picture

Stepping back a bit - this issue exists purely to support contrib UIs, which don't have the information necessary to be able to determine on its own the details of what's been imported/failed/ignored. So, for something completely different - instead of the core API dealing with statistics on its own, how about it dispatches events, enabling contrib to track the stats (and who knows what else)? This patch rips the stat tracking out and replaces it with having the ID map dispatch events when an ID mapping is saved or deleted. Note that it would obviate the need for #2443089: Track additional statistics when migrating as well.

I should admit, this is a foot in the door, I think an event-driven approach will help with other issues including

Status: Needs review » Needs work

The last submitted patch, 31: 2443081-migrate-map-events-31.patch, failed testing.

mikeryan’s picture

Yep, fixed up the unit tests to reflect my changes, forgot the simpletests...

mikeryan’s picture

Status: Needs work » Needs review
FileSize
19.63 KB

OK, this time both unit tests and simpletest pass locally.

I'll post a patch at #2432993: Implement reporting of results on migrate-import using the events to count imported/ignored/failed items.

mikeryan’s picture

benjy’s picture

In general I think I like this approach, allows us to move a lot of this stuff to contrib.

+++ b/core/modules/migrate/src/Plugin/migrate/id_map/Sql.php
@@ -139,8 +150,8 @@ class Sql extends PluginBase implements MigrateIdMapInterface {
+    $this->eventDispatcher = $migration->getEventDispatcher();

Why would the migration offer the event dispatcher? Shouldn't Sql just get it's own?

benjy’s picture

Also, we need a new title and IS

mikeryan’s picture

I'd like to go beyond this, emitting events at various key points where contrib can do various things with them - it seemed simplest to me to have the migration hold the event dispatcher, which could propagate to the various plugins (e.g., so the destination plugin can implement #2501731: Migrate pre/post Import/Rollback methods not invoked with events).

Perhaps it would be best to postpone this issue and open a separate issue for adding all the event dispatching I'm imagining? And once that's in, this and a few other issues could be close as dupe/won't fix...

benjy’s picture

Perhaps it would be best to postpone this issue and open a separate issue for adding all the event dispatching I'm imagining?

Yes, lets open a new issue but lets not go for every event in one hit, how about a few obvious ones to get the concept into Migrate and then we can add more events as follow-ups?

mikeryan’s picture

Status: Needs review » Postponed

At the moment, the plan is for this to be accomplished by #2535458: Dispatch events at key points during migration.

mikeryan’s picture

Status: Postponed » Closed (duplicate)

Contrib can do this itself now: https://www.drupal.org/node/2544874