Problem/Motivation

Using the migrate_drupal module, I was trying to write a module that has a UI that shows total number of rows processed over the total number of actual rows in the source DB. For example, if I have 42 articles in my Drupal 7 site and I process 13 of them, it would have a textual display that shows: 13/42 rows processed.

As I was doing this, I discovered that the Variable source plugin will return 0 for the total count if the configured variables don't exist, but initializeIterator() will always return an iterator with 1 row. This leads to a weird case where my UI shows 1/0 rows processed, which is nonsensical.

Proposed resolution

Check if the source has rows and return an iterator created from an empty array if the count of source rows is 0

User interface changes

None.

API changes

None.

Data model changes

None.

Release notes snippet

Not necessary.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

gabesullice created an issue. See original summary.

gabesullice’s picture

Crediting @Wim Leers because we were pair programming when we found this bug and wrote a patch.

gabesullice’s picture

Status: Active » Needs review
Issue tags: +Needs tests
FileSize
667 bytes

Status: Needs review » Needs work

The last submitted patch, 4: 3096969-4.patch, failed testing. View results

quietone’s picture

Let me begin this with the fact that the Variable is unique and does not behave as other source plugins and the doc does tell us that it always returns a row.

* This source class always returns a single row and as such is not a good
* example for any normal source class returning multiple rows.

As we can see by the failing test having the Variable source plugin return an empty row causes the pipeline to not be processed. That is a significant change and one I don't think we want to make. But let's think about that.

I guess the question is 'what do we do when the variable being requested does not exist?'. Currently, the Variable source plugin does not create a property for a variable that does not exist. But a row *is* created with, at least, an 'id' property set to, I think, the first variable being requested. So, there is always a row and it contains a property for each variable requested and no property when a variable does not exist. Do we need to be explicit when a variable does not exist? If it was we could create a property, say undefined, which could contain all the variable names that do not exist. Then the pipeline could use that information as needed. But if we did that we would still have a row which this issue want to avoid.

I do think this is a won't fix.

Wim Leers’s picture

This is the key thing we would like to see fixed:

This leads to a weird case where my UI shows 1/0 rows processed, which is nonsensical.

This is a UI problem first and foremost.

If we can't change the Variable source plugin because of deep underlying technical reasons, then so be it, but we do need to ensure we inform the user of the Migrate UI correctly. Right now the UI is conveying incorrect information.

Do you see a way forward to improve the accuracy of the UI?

quietone’s picture

I think we are on different pages, the core Migrate UI does not display row counts.

gabesullice’s picture

Maybe the problem is at the other end. If the iterator must always have 1 row, then Variable::count() should always return 1 instead of doing this:

  /**
   * {@inheritdoc}
   */
  public function count($refresh = FALSE) {
    return intval($this->query()->countQuery()->execute()->fetchField() > 0);
  }

which will return 0 if no relevant variables exist on the source site.

gabesullice’s picture

Do we need to be explicit when a variable does not exist? If it was we could create a property, say undefined, which could contain all the variable names that do not exist. Then the pipeline could use that information as needed. But if we did that we would still have a row which this issue want to avoid.

If I'm understanding you correctly, you're saying that it's a "feature" of the Variable source plugin that it always returns a row. This might be useful if you want to define some configuration on the destination site based on the fact that a variable was missing. Is that correct? If so, I think I see how that could be valuable.

I don't see how my proposal in #9 would break that. WDYT?

quietone’s picture

Status: Needs work » Needs review
FileSize
638 bytes

OK, lets see what happens if we force Variable to return a count of 1.

mikelutz’s picture

Status: Needs review » Needs work

Setting to NW for a test. Let's explicitly test that that source returns a count of 1 for a non-existent variable so we don't inadvertently change this back in the future.

quietone’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests
FileSize
1.32 KB
1.94 KB

Right, I meant to come back to this.

Added two tests, one with one of two variable names misspelled and one with two of two variable names misspelled.

mikelutz’s picture

Status: Needs review » Reviewed & tested by the community

Thanks for the test!

  • webchick committed fb65f06 on 9.0.x
    Issue #3096969 by quietone, gabesullice, Wim Leers, mikelutz:...

  • webchick committed 58b8dca on 8.9.x
    Issue #3096969 by quietone, gabesullice, Wim Leers, mikelutz:...

  • webchick committed 2df2354 on 8.8.x
    Issue #3096969 by quietone, gabesullice, Wim Leers, mikelutz:...
webchick’s picture

Status: Reviewed & tested by the community » Fixed

I got a bit confused on what this was doing, as it seemed like we were losing valuable logic checking... @mikelutz explained that the bug was if you asked that source how many rows it was going to send, it would tell you zero if the variable didn’t exist, even though it was still going to send one, because of the variable's ID.

...which still is confusing to me ;) but nevertheless, everyone else here seems to be able to make sense of it, and there is test coverage, so...

Committed and pushed to 9.0.x; 8.9.x; 8.8.x. Thanks!

webchick’s picture

Version: 8.9.x-dev » 8.8.x-dev
Wim Leers’s picture

...which still is confusing to me ;) but nevertheless, everyone else here seems to be able to make sense of it, and there is test coverage, so...

I 100% agree this is still very confusing.

But at least now it's consistently confusing. It makes it impossible to have the UI say "1 of 0 rows processed" — it'll now say "1 of 1 rows processed" even when there's 0 variables. So it's a step in the right direction.

(There are reasons deep in the migrate architecture for why this makes sense for the underlying code/infrastructure — it still doesn't make sense for the end user.)

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.