Problem/Motivation

  1. Create a migration where initializeIterator yields. This is a great fit and works well.
  2. Now add this as a requirement to another migration.
  3. SourcePluginBase::count() gets called which in turn does $count = $this->getIterator()->count(); which fatals as Generator does not implement count().
  4. See $this->skipCount at the top of the method, add protected $skipCount = TRUE to the class.
  5. Observe the constructor unconditionally overriding this value.
  6. Tear your hair out.

Proposed resolution

Uh, I dunno, at least check whether the iterator is Countable before trying to count the bloody thing. And/or only pick up the skip_count from the configuration if it is set. This is one of those D7 vestiges which got ported and abandoned to bite us later.

Remaining tasks

doCount could use a little doxygen.

Write a test source plugin which uses yield and try to count it and assert it is -1. Check the configuration keys are picked up properly.

User interface changes

API changes

Data model changes

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

chx created an issue. See original summary.

chx’s picture

Issue summary: View changes
Status: Active » Needs review
FileSize
2.25 KB

How can I credit this to Integral Vision. Odd. Anyways, here's how it could look.

chx’s picture

This organization system drupal.org has is one of most confusing UIs I've ever seen. If the organization is only usable for credits if you enter it precisely then why is there no info saying so? I had "Integral Vision" prior now I have "Integral Vision Ltd" and it works. Anyways, fixing credit.

chx’s picture

Edit: nevermind.

heddn’s picture

+1 on approach. Just skip counting, if

  1. skip_count = TRUE
  2. iterator isn't Countable
heddn’s picture

The downside, contrib and maybe core doesn't always implement \Countable, even though it has a count() method. It would be safer to check for count() on the object than trust that all classes implement the interface. For BC reasons at least.

chx’s picture

Core most of the time extends from SqlBase which has its own count(). EmbeddedDataSource and EmtptySource both return ArrayIterator() which is both Countable and also sets an example for contrib. I wouldn't be so worried.

heddn’s picture

migrate_source_csv might be the only one affected then. migrate_source_json seems to return \ArrayIterator. And I'm reworking it right now so we'll be sure to add \Countable to it. I don't know the status for xml, I seem to think it is less mature.

chx’s picture

FileSize
2.27 KB

Oh I found a fantastic trick in the csv source! Thanks, here's a better patch. I don't particularly think I've heard of iterator_count before or if I did it was so long ago I didn't remember.

This both fixes the issue AND keeps full backwards compatibility.

Edit: we need to review the code flow to ensure the source plugin is recreated before/after counting this way because generators are not resetable. Might not worth the bothering?

dawehner’s picture

Oh, this is a pretty neat patch. Using a generator was always pratical already, just some admin features via drush didn't worked. It is great that we fix it.
IMHO we should add a test with a generator to ensure that generators continue to work from end to end.

benjy’s picture

Issue tags: +Needs tests

Also still need to address the feedback in #9 regarding iterator_count messing the internal pointer of the iterator.

chx’s picture

Or ignore #9 because of it.

heddn’s picture

I really like #9. That's how I unknowingly got around the problem in csv, because there is/was an inherent (hidden) assumption that getIterator() returned an Iterator that was Countable. This approach just hides that.

The warning on iterator_count about not retaining the position of the pointer is a little scary. So we should do some testing in that area.

chx’s picture

FileSize
2.29 KB

So iterator_count is only usable if we re-instantiate the iterator because the way it counts, the only way it possibly can count is to exhaust the iterator. But, turns out we have a method for that... this is the non-scary version of the patch and through the layers of caching it should not be horribly slow but also, what slow, this is migrate.

If you have something like a remote thing where count just takes way too long then by all means add $skipCount to your class or implement your own count. There's no real regression here because the only case when it takes degenerate long previously fataled so there surely was some coding around that already like

  public function __construct(array $configuration, $plugin_id, $plugin_definition, MigrationInterface $migration) {
    parent::__construct($configuration, $plugin_id, $plugin_definition, $migration);
    $this->skipCount = TRUE;
gnuget’s picture

Assigned: Unassigned » gnuget

I'm going to work on this.

edysmp’s picture

Status: Needs review » Reviewed & tested by the community
gnuget’s picture

Status: Reviewed & tested by the community » Needs work

I'm still working on the tests.

Sorry this took me more time than I expected because in my country just finished the holiday season, hope be able to finish my tests this week.

gnuget’s picture

Assigned: gnuget » Unassigned

Sorry, I have been having very hectic weeks and not sure if I'm gonna be able to work on this soon so I switch back the ticket to unassigned.

gnuget’s picture

Assigned: Unassigned » gnuget
gnuget’s picture

Assigned: gnuget » Unassigned
Status: Needs work » Needs review
Issue tags: -Needs tests
FileSize
10.18 KB
12.47 KB

Ok I created a couple migration tests.

Both using a generator instead of an iterator and the second migration has as a dependency the first one.

Status: Needs review » Needs work

The last submitted patch, 20: requiring_a_migration-2684567-20.patch, failed testing.

gnuget’s picture

Status: Needs work » Needs review
FileSize
1.07 KB
12.46 KB

I fixed some typos.

vasi’s picture

Hi gnuget, thanks for the tests. However, they don't test the actual thing we want to test. From the summary: "Create a migration where initializeIterator yields."

That means we want to use a PHP generator: http://php.net/manual/en/language.generators.overview.php

vasi’s picture

Status: Needs review » Needs work
gnuget’s picture

Status: Needs work » Needs review

Hi gnuget, thanks for the tests. However, they don't test the actual thing we want to test. From the summary: "Create a migration where initializeIterator yields."

Actually, it does:

The initializeIterator use GeneratorObject which use:

  /**
+   * readFile.
+   */
+  private function readFile($file) {
+    $f = fopen($file, 'r');
+    try {
+      while ($line = fgets($f)) {
+        yield $line;
+      }
+    } finally {
+      fclose($f);
+    }
+  }
+

I had to do it in this way because the rewind method in the generators doesn't work as expected so I had to wrap the generator in another object and fix the rewind.

Also in the MigrateUsingGenerator I made two tests, one using the generator and a second one using as a dependency a migration which uses the generator plugin.

Thanks for your review!

mikeryan’s picture

Status: Needs review » Needs work
  1. +++ b/core/modules/migrate/src/Plugin/migrate/source/SourcePluginBase.php
    @@ -141,10 +141,11 @@ public function __construct(array $configuration, $plugin_id, $plugin_definition
    -    $this->cacheCounts = !empty($configuration['cache_counts']);
    -    $this->skipCount = !empty($configuration['skip_count']);
    -    $this->cacheKey = !empty($configuration['cache_key']) ? !empty($configuration['cache_key']) : NULL;
    -    $this->trackChanges = !empty($configuration['track_changes']) ? $configuration['track_changes'] : FALSE;
    +    foreach (['cacheCounts' => 'cache_counts', 'skipCount' => 'skip_count', 'cacheKey' => 'cache_key', 'trackChanges' => 'track_changes'] as $property => $config_key) {
    +      if (isset($configuration[$config_key])) {
    +        $this->$property = (bool) $configuration[$config_key];
    +      }
    +    }
    

    I don't see how reorganizing this code is in scope for this patch? Oh, now I do, so skipCount (and cacheCounts) is only overridden if explicitly in configuration. However, cacheKey is not supposed to be a boolean - that's being fixed in #2723115: Count cache configuration does not work as expected - so shouldn't be part of the loop here.

  2. +++ b/core/modules/migrate/tests/modules/migrate_generator_test/src/GeneratorObject.php
    --- /dev/null
    +++ b/core/modules/migrate/tests/modules/migrate_generator_test/src/Plugin/migrate/source/generator.php
    
    +++ b/core/modules/migrate/tests/modules/migrate_generator_test/src/Plugin/migrate/source/generator.php
    @@ -0,0 +1,111 @@
    + * Contains \Drupal\migrate_source_generator\Plugin\migrate\source\generator.
    ...
    +class generator extends SourcePluginBase {
    

    s/generator/Generator/

  3. +++ b/core/modules/migrate/tests/src/Kernel/MigrateUsingGenerator.php
    @@ -0,0 +1,152 @@
    +  protected $countries_file;
    ...
    +  protected $states_file;
    

    Coding standard for member variables is $countriesFile/$statesFile.

The test seems more elaborate than necessary to test the actual problem being fixed (counts) - presumably in response to dawehner's comment "IMHO we should add a test with a generator to ensure that generators continue to work from end to end." Maybe that should be a follow-up.

Speaking of which... I don't see the counts being tested?

vasi’s picture

Hmm. I've tried making migrations with just a generator in initializeIterator(), eg:

  public function initializeIterator() {
    $data = [
      ['title' => 'foo'],
      ['title' => 'bar'],
      ['title' => 'iggy'],
    ];
    foreach ($data as $row) {
      yield $row;
    }
  }

I didn't need to wrap it in an iterator. Did you try that? What went wrong?

I think it could make your tests much simpler. The simpler the better :)

gnuget’s picture

Hi Vasi.

Hmm. I've tried making migrations with just a generator in initializeIterator(), eg:

  public function initializeIterator() {
    $data = [
      ['title' => 'foo'],
      ['title' => 'bar'],
      ['title' => 'iggy'],
    ];
    foreach ($data as $row) {
      yield $row;
    }
  }

I didn't need to wrap it in an iterator. Did you try that? What went wrong?

I think it could make your tests much simpler. The simpler the better :)

As far as I remember when I wrote this the bug is only reproducible when the migration depends on another migration which uses generators.

I will fix the points of #26 and try to make this simpler

Thanks for your time and review!

vasi’s picture

Yes, in my testing I was able to make migrationA use a generator (like the code I pasted above), and then migrationB depend on it, and that triggered the bug from the summary. So it would make a great test!

I didn't see any secondary issue with rewind().

gnuget’s picture

Status: Needs work » Needs review
FileSize
2.26 KB

Alright, I will start fresh here, I will move my end-to-end test to a follow up, and here I will just test the counts value.

So, I attach a reroll of #14. and in my next patch I will include the test.

gnuget’s picture

Hi Vasi, you were right, I was overthinking this, This time I was able to write a much simpler patch, thanks for your help!

Tests and the patch.

gnuget’s picture

The last submitted patch, 31: 2684567_31_only_test.patch, failed testing.

gnuget’s picture

I improved a little the comments and renamed the generator class.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.0-beta1 was released on August 3, 2016, which means new developments and disruptive changes should now be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

mikeryan’s picture

Version: 8.3.x-dev » 8.2.x-dev
Status: Needs review » Reviewed & tested by the community

OK, looks good to me...

heddn’s picture

+++ b/core/modules/migrate/src/Plugin/migrate/source/SourcePluginBase.php
@@ -417,4 +419,15 @@ protected function getCache() {
+  protected function doCount() {
+    $iterator = $this->getIterator();
+    return $iterator instanceof \Countable ? $iterator->count() : iterator_count($this->initializeIterator());

Just did a quick run through this again and refreshed myself why the fancy checking is necessary. It is because SourcePluginBase implements \Countable, therefore absolving any modules that extend from it to implement the count() method. But not all iterators are Countable, therefore the need to iterator_count() instead. A source plugin that yields would also be in that same camp.

+1 on RTBC.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 34: requiring_a_migration-2684567-34.patch, failed testing.

The last submitted patch, 34: requiring_a_migration-2684567-34.patch, failed testing.

The last submitted patch, 34: requiring_a_migration-2684567-34.patch, failed testing.

mikeryan’s picture

Issue tags: +Needs reroll
gnuget’s picture

Status: Needs work » Needs review
Issue tags: -Needs reroll
FileSize
5.12 KB

Here a #34 reroll.

chx’s picture

Status: Needs review » Reviewed & tested by the community

Back to RTBC then.

mikeryan’s picture

Noting here (as chx did in the other issue) this overlaps with #2598670: Source count caching does not work for SQL sources, except it names the new method doCount rather than computeCount. If this patch renamed SqlBase::count() to SqlBase::doCount() the other issue could be closed as a duplicate.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 42: requiring_a_migration-2684567-42.patch, failed testing.

chx’s picture

Status: Needs work » Reviewed & tested by the community
alexpott’s picture

Status: Reviewed & tested by the community » Fixed
Issue tags: +rc eligible

Committed and pushed ae1f599 to 8.3.x and 1d5f0f4 to 8.2.x. Thanks!

Committed to 8.2.x since this is a migrate patch and the module is still experimental.

diff --git a/core/modules/migrate/tests/src/Unit/MigrateSourceTest.php b/core/modules/migrate/tests/src/Unit/MigrateSourceTest.php
index 2d669c0..4cc91fd 100644
--- a/core/modules/migrate/tests/src/Unit/MigrateSourceTest.php
+++ b/core/modules/migrate/tests/src/Unit/MigrateSourceTest.php
@@ -532,4 +532,5 @@ protected function initializeIterator() {
       yield $row;
     }
   }
+
 }

Fixed on commit.

  • alexpott committed ae1f599 on 8.3.x
    Issue #2684567 by gnuget, chx, heddn, vasi, mikeryan: Requiring a...

  • alexpott committed 1d5f0f4 on 8.2.x
    Issue #2684567 by gnuget, chx, heddn, vasi, mikeryan: Requiring a...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

quietone’s picture

publish change record