Problem/Motivation

There was a bug report related to problems with the row hash when using track_changes and migrating using the --update option. Tested in #5 and was not able to reproduce.

Proposed resolution

Add assertions to TrackChangesTest

Remaining tasks

Decide if new test coverage is worth committing.

User interface changes

None

API changes

None

Data model changes

None

Release notes snippet

None

Original Report Below

Problem/Motivation

During a migration process, the Row object contains $source property containing source data. The hash is used for checking if the source data have been changed since the last migration. The hash is calculated from the whole $source property.

When a record is imported for the first time, the $source property array keys migrate_map_sourceid1 and migrate_map_source_row_status are NULL.
When the same record is imported again (e.g. as migrate import with --update parameter), then the keys migrate_map_sourceid1 and migrate_map_source_row_status are not NULL anymore and these changes cause the calculated hash to be different. It means the record is considered as changed, but the source data have not.

Steps to reproduce

Set $this->trackChanges = TRUE, migrate one record, check the hash in the migrate_map.
Migrate the same record again (as --update or --idlist) and check the hash again.
The hash has been changed. It shouldn't, because the data are still the same.

Proposed resolution

Use the same logic as D7 Migrate module and exclude migrate_map_* data from the hash calculation.

Remaining tasks

Would be nice to know what is adding 'migrate_map_' properties to the source

Comments

martin_klima created an issue. See original summary.

martin_klima’s picture

Status: Active » Needs review
StatusFileSize
new891 bytes
martin_klima’s picture

StatusFileSize
new811 bytes

Fix patch: Add missing row.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

quietone’s picture

Issue tags: +Bug Smash Initiative
StatusFileSize
new2.16 KB

What is adding migrate_map properties to the source?

I tested this on Drupal 9.4.x, standard install and was not able to reproduce the problem.

I used the Drupal 7 test fixture as the source and the d7_user_role migration, with track_changes: true added. I ran drush mim d7_user_role. I then used the debugger and checked the hash in Row::rehash() before and after mim --update d7_user_role.

Maybe not needed but I added assertions in TrackChangesTest to confirm the hash does not change when the source does not change.

quietone’s picture

Issue summary: View changes
StatusFileSize
new712 bytes
new2.16 KB

Fix spelling error.

quietone’s picture

StatusFileSize
new626 bytes
new1.84 KB

Remove an unneccesary change.

danflanagan8’s picture

Status: Needs review » Needs work

What is adding migrate_map properties to the source?

That's a great question. This can't be a core source plugin right?

As for the new assertions, it looks like there's some kind of copy-paste typo below:

+++ b/core/modules/migrate/tests/src/Kernel/TrackChangesTest.php
@@ -131,6 +145,16 @@ public function testTrackChanges() {
+    $this->assertNotEquals($original_hash[1], $new_hash[1]);
+    $this->assertEquals($original_hash[2], $new_hash[2]);
+    $this->assertNotEquals($original_hash[1], $new_hash[1]);
+    $this->assertNotEquals($original_hash[1], $new_hash[1]);

I'm thinking this was intended to be more like this:

$this->assertNotEquals($original_hash[1], $new_hash[1]);
$this->assertEquals($original_hash[2], $new_hash[2]);
$this->assertNotEquals($original_hash[3], $new_hash[3]);
$this->assertNotEquals($original_hash[4], $new_hash[4]);

I also have a question. The IS and comments refer to using the --update option when running drush mim, but the test obviously doesn't rely on drush. Do we need to add $id_map->prepareUpdate(); somewhere(s) in the test?

FWIW, the new assertion still pass if I throw add $id_map->prepareUpdate(); before the re-migration.

I'll set this to NW to fix that copy/paste typo, but then it should be good. Thanks!

quietone’s picture

Status: Needs work » Needs review
StatusFileSize
new1.73 KB
new2.51 KB

@danflanagan8, thanks for the review. Good point about using prepareUpdate.

#8. It turns out the both 'Item 2' and 'Item 4' are modified so two of the hashes change. At the end of the test I added the prepareUpdate() and retested.

danflanagan8’s picture

Status: Needs review » Needs work

If we add the re-import after calling prepareUpdate the assertions would be more useful if they were all assertEquals. The assertNotEquals assertions are kind of funny here.

+++ b/core/modules/migrate/tests/src/Kernel/TrackChangesTest.php
@@ -144,6 +169,17 @@ public function testTrackChanges() {
+
+    // Test hashes again after forcing all rows to be re-imported.
+    $id_map->prepareUpdate();
+    for ($i = 1; $i < 5; $i++) {
+      $row = $id_map->getRowBySource(['tid' => $i]);
+      $new_hash[$i] = $row['hash'];
+    }
+    $this->assertNotEquals($original_hash[1], $new_hash[1]);
+    $this->assertEquals($original_hash[2], $new_hash[2]);
+    $this->assertNotEquals($original_hash[3], $new_hash[3]);
+    $this->assertEquals($original_hash[4], $new_hash[4]);

Maybe this last foreach could store the has in a variable called $newer_hash such that each of the assertions would look like $this->assertEquals($new_hash[#], $newer_hash[#]);.

quietone’s picture

Status: Needs work » Needs review
StatusFileSize
new1.17 KB
new2.58 KB

Well, that was a real oops. I did mean to execute the migration again after the prepareUpdate.

#10. Done. I agree, it reads better.

danflanagan8’s picture

Category: Bug report » Task
Issue summary: View changes
Status: Needs review » Reviewed & tested by the community

Double oops: I hadn't even noticed the migration didn't get run again after preparing for update.

The test certainly looks good now. So what do we do here? You've proven the reported bug is not a problem in core. Do we change this to a task for adding coverage related to the hash?

I think it's reasonable coverage to add. The only place I can find explicit coverage related to hashing is in Drupal\Tests\migrate\Unit\RowTest. The new coverage is testing for unexpected interaction between trackChanges and prepareUpdate, which seems ok to add.

I'm going to update the IS and category and RTBC.

quietone’s picture

Title: Recalculate row hash without migrate_map data (avoid false positive track change) » Add tests of row hash to trackChangesTest

How about changing the title to what this is actually doing.

alexpott’s picture

Version: 9.4.x-dev » 9.3.x-dev
Status: Reviewed & tested by the community » Fixed

Committed and pushed be8ec278470 to 10.0.x and 83394db7aa6 to 9.4.x and 7bd674b5249 to 9.3.x. Thanks!

Backported to 9.3.x as additional test coverage is always welcome.

  • alexpott committed be8ec27 on 10.0.x
    Issue #3240873 by quietone, martin_klima, danflanagan8: Add tests of row...

  • alexpott committed 83394db on 9.4.x
    Issue #3240873 by quietone, martin_klima, danflanagan8: Add tests of row...

  • alexpott committed 7bd674b on 9.3.x
    Issue #3240873 by quietone, martin_klima, danflanagan8: Add tests of row...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.