See #2748609: [meta] Preserving auto-increment IDs on migration is fragile.

Problem/Motivation

Preserving serial IDs in the upgrade process (e.g., source site nid 53 => destination site nid 53) means that if any content with the same ID is manually created on the destination side before upgrade, it will be overwritten with data from the source. Since the core upgrade process does not provide an option to not preserve IDs, we should warn the user before executing the upgrade process of any potential conflicts.

Proposed resolution

Run a process before the actual import which identifies conflict risks - cases where an ID in the content to be imported matches an ID that already exists in the destination.

  1. Maintain data integrity
    If this is a required step, the user can choose whether to proceed and overwrite any conflicts, or take an alternative approach (per the docs).
  2. No surprises
    They are told exactly what content could be affected by conflicts.
  3. Provide a path forward
    Yes.
  4. Preserve URLs (e.g., node/17).
    Yes.
  5. Minimize technical debt.
    Audit code should be reasonably isolated and maintainable.
  6. Minimize effort to implement.
    We’d need to first do what the import process normally does, identify and instantiate the necessary migrations. For each migration, we need to identify the destination ID field, see if that field is mapped in the process section, look for any of destination ID values which are not in the migration’s map table, and see if any of *those* IDs are in the source. Oh, but we need to consolidate this across all migrations to the same entity type… Ugh.

Remaining tasks

Implement all the things.

User interface changes

The upgrade summary page (after entering db credentials, before executing migrations) should run the audit and display any conflicts.

API changes

Likely addition of an "audit()" method or somesuch to the Migration plugin.

Data model changes

None.

CommentFileSizeAuthor
#65 interdiff-2876085-63-65.txt23.49 KBmaxocub
#65 2876085-65.patch42.01 KBmaxocub
#63 review.txt32.35 KBmaxocub
#63 interdiff-2876085-62-63.txt13.11 KBmaxocub
#63 2876085-63.patch53.33 KBmaxocub
#62 interdiff-60-62.txt616 bytesmaxocub
#62 2876085-62.patch44.47 KBmaxocub
#60 review.txt23.68 KBmaxocub
#60 interdiff-58-60.txt32.88 KBmaxocub
#60 2876085-60.patch44.47 KBmaxocub
#58 interdiff-55-58.txt4.49 KBmaxocub
#58 2876085-58.patch52.7 KBmaxocub
#55 2876085-55.patch50.63 KBheddn
#55 interdiff_53-55.txt1.3 KBheddn
#53 review.txt28.92 KBmaxocub
#53 interdiff-50-53.txt5.09 KBmaxocub
#53 2876085-53.patch50.26 KBmaxocub
#50 review.txt25.6 KBmaxocub
#50 interdiff-48-50.txt5.19 KBmaxocub
#50 2876085-50.patch46.52 KBmaxocub
#48 review.txt23.14 KBheddn
#48 interdiff_44-48.txt11.14 KBheddn
#48 2876085-48.patch43.76 KBheddn
#44 Selection_173.png41.09 KBheddn
#44 review.txt21.56 KBheddn
#44 2876085-44.patch42.01 KBheddn
#44 interdiff_36-44.txt3.89 KBheddn
#40 interdiff-36-40.txt3.23 KBmaxocub
#40 2876085-38.patch42.27 KBmaxocub
#38 interdiff-36-38.txt3.23 KBmaxocub
#38 2876085-38.patch42.28 KBmaxocub
#36 interdiff_35-36.txt7.13 KBheddn
#36 2876085-36.patch41.56 KBheddn
#35 2876085-35.patch34.57 KBheddn
#35 interdiff_24-35.txt61.59 KBheddn
#24 interdiff_22-24.txt791 bytesheddn
#24 2876085.patch96.16 KBheddn
#22 2876085-22.patch95.39 KBheddn
#22 review.txt14.1 KBheddn
#22 interdiff_19-22.txt1.32 KBheddn
#19 2876085-18.patch95.4 KBheddn
#17 2876085-17.patch97.57 KBheddn
#17 interdiff_13-17.txt84.12 KBheddn
#15 Selection_012.png21.85 KBquietone
#13 2876085-13.patch15.51 KBheddn
#13 interdiff_7-13.txt3.99 KBheddn
#7 2876085-6.patch15.55 KBvasi
Members fund testing for the Drupal project. Drupal Association Learn more

Comments

mikeryan created an issue. See original summary.

vasi’s picture

At the triage, I discussed the possibility of making this much simpler:

The current approach requires looking at all of the source, process and destination in complex ways, to identify exactly which new IDs will be migrated.

Instead, we could simply check "are there any new destination entities of certain types since the last migration (if any)?" If so, warn the user that these new entities might be overwritten.

This risks false positives: It could be that you have created nid 5, and are about to migrate nids 3, 4 and 6. Even though the migration would work, the simple check will warn you. But this case sounds quite rare, and careful users who know they fall in this case can click through the warning.

cilefen’s picture

Priority: Critical » Major
Issue tags: +Triaged D8 major

@xjm, @alexpott, @effulgentsia, @lauriii, @catch and I discussed this issue at a recent meeting and agreed that since the parent is critical, we would downgrade this to major but also make the other related children major.

heddn’s picture

We discussed this in the weekly migrate meeting and mikeryan, quiteone and heddn all agreed with vasi's approach as outlined in #2.

vasi’s picture

Ok, so here's the proposed implementation plan:

  • Create an interface with the getHighestId() method
  • Implement this on id_map, which will yield the highest ID that has been migrated.
  • Implement this on content destinations, which will yield the highest ID that exists of the destination type. Eg: highest node ID on the site.
  • Add a new field 'stable_ids' field to content destination configuration. If this is set and true, on the first import they will check that </code>$id_map->getHighestId() == $destination->getHighestId(). If not, throw an exception. (Maybe we want to do this as a global pre-flight check instead, so as to not waste user time?)
  • Add the stable_ids to all the relevant migration definitions. Eg: Add to d6_node, but not d6_node_translation, since it doesn't use an auto-increment ID. It's too annoying to automatically determine whether a migration is intended to have stable IDs, so I think this is the best option.
mikeryan’s picture

  • Create an interface with the getHighestId() method
  • Implement this on id_map, which will yield the highest ID that has been migrated.
  • Implement this on content destinations, which will yield the highest ID that exists of the destination type. Eg: highest node ID on the site.

Good thought - and if/when we do #2876086: Manipulate auto-increment values to avoid conflicts, we'd implement this on sources as well, so we'd know how high to set the auto-increment value.

(Maybe we want to do this as a global pre-flight check instead, so as to not waste user time?)

Yes, that's the plan as specified in the issue summary here - this is a pre-flight check, runtime checks would be done in #2876090: Warn if migrated content will overwrite manually-created content.

vasi’s picture

Status: Active » Needs review
Issue tags: +Needs tests
FileSize
15.55 KB

Here's a teensy implementation. Still needs tests, and vetting.

vasi’s picture

Status: Needs review » Needs work

The last submitted patch, 7: 2876085-6.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

vasi’s picture

Note also that the patch doesn't really do a good job of deciding when to audit IDs. We'll probably want to add 'preserve_ids: true' to all the relevant YAML files? Or we can try to use heuristics to do the equivalent automatically.

vasi’s picture

At migration triage, it was decided to prefer a top-level property for this. Also, the patch needs to be modified to handle multiple migrations to the same destination type (eg: nodes of different types).

heddn’s picture

  1. +++ b/core/modules/migrate/src/Plugin/migrate/destination/EntityContentBase.php
    @@ -289,4 +297,70 @@ protected function getDefinitionFromEntity($key) {
    +    return (int)reset($found);
    

    Nit.Space.

  2. +++ b/core/modules/migrate/src/Plugin/migrate/destination/EntityContentBase.php
    @@ -289,4 +297,70 @@ protected function getDefinitionFromEntity($key) {
    +  protected function getHighestIdField() {
    +    return $this->getKey('id');
    

    Why is this needed? Should this be public and on the interface?

  3. +++ b/core/modules/migrate/src/Plugin/migrate/destination/EntityContentBase.php
    @@ -289,4 +297,70 @@ protected function getDefinitionFromEntity($key) {
    +    if (isset($this->configuration['preserves_ids'])
    +      && !$this->configuration['preserves_ids']
    

    Can we just check !empty()?

  4. +++ b/core/modules/migrate/src/Plugin/migrate/destination/EntityRevision.php
    @@ -78,4 +78,11 @@ public function getIds() {
    +  public function getHighestIdField() {
    +    return $this->getKey('revision');
    +  }
    

    Why is this public but ECB is protected?

  5. +++ b/core/modules/migrate/src/Plugin/migrate/id_map/Sql.php
    @@ -923,4 +924,22 @@ public function valid() {
    +    return (int)$found;
    

    Nit.Space.

  6. +++ b/core/modules/migrate_drupal_ui/src/Form/MigrateUpgradeForm.php
    @@ -1034,6 +1061,64 @@ public function validateCredentialForm(array &$form, FormStateInterface $form_st
    +    $form['actions']['submit']['#value'] = $this->t('I acknowledge I may lose data, continue anyway');
    

    Nit. Period at end of sentence.

heddn’s picture

FileSize
3.99 KB
15.51 KB
heddn’s picture

Nits in my review are now addressed and we're now looking at a top-level attribute. Next up is manual testing and modifying to handle multiple migrations to the same destination type (eg: nodes of different types).

Plus we'll need automated tests.

quietone’s picture

FileSize
21.85 KB

Just a wee manual test. Installed D8 using my script which will then enabled migrate modules and admin_toolbar. Then went to '/upgrade' and completed the form for a D7 site. The result was the warning message about entities may be overwritten. Yes, there are shortcuts but 'preserves_id' is false for the shortcut migration.

+++ b/core/modules/migrate/src/Plugin/migrate/destination/EntityContentBase.php
@@ -289,4 +297,64 @@ protected function getDefinitionFromEntity($key) {
+    if (!empty($this->migration->getPluginDefinition()['preserves_ids'])) {

Maybe change to isset? And since ids is plural we should change preserves_ids to preserve_ids.


I also think the working needs improvement. But it is too late in the day to do anything about that.

maxocub’s picture

I would also suggest, and I think it was mentioned in a previous meeting, that we should use an 'opt in' tag instead of an 'opt out'. Something like audit_ids: true to audit ids instead of preserves_ids: false to not audit ids.

heddn’s picture

Status: Needs work » Needs review
FileSize
84.12 KB
97.57 KB

What a bummer. That bumps the diff from 15k to 97k. We'll have to work on a review only patch from here on out.

maxocub’s picture

About the multiple migrations to the same destination type (eg: nodes of different types), I don't understand what needs to be done.

If there's an id conflict with migrated nodes, does it makes a difference if it's an article or a page? All of them are stored in the same table and have the same id field.

heddn’s picture

+++ b/core/modules/migrate/src/Plugin/migrate/destination/EntityContentBase.php
@@ -332,22 +332,18 @@ public function getHighestIdField() {
-    if (!empty($this->migration->getPluginDefinition()['preserves_ids'])) {
-      // If IDs are not preserved, we can't have conflicts.
-      return FALSE;
+    // If IDs are are audited, we can't have conflicts.
+    if (!empty($this->migration->getPluginDefinition()['audit_ids'])) {
+      if (!($idMap instanceof MigrateMaxIdInterface)) {
+        // We don't know how to audit IDs without a cooperating ID map.
+        return FALSE;
+      }
+      $highestMigrated = $idMap->getMaxId($this->getHighestIdField());
+      if ($this->highestDestinationId() > $highestMigrated) {
+        // There's a new ID that we might conflict with!
+        return TRUE;
+      }
     }
-
-    if (!($idMap instanceof MigrateMaxIdInterface)) {
-      // We don't know how to audit IDs without a cooperating ID map.
-      return FALSE;
-    }
-
-    $highestMigrated = $idMap->getMaxId($this->getHighestIdField());
-    if ($this->highestDestinationId() > $highestMigrated) {
-      // There's a new ID that we might conflict with!
-      return TRUE;
-    }
-    return FALSE;
   }

For reviewers, above are the relevant changes. The rest are all yaml changes in #17/#18. #18 is a new patch where I backed out the audit_ids config on some test assets:
core/modules/migrate/tests/modules/migrate_external_translated_test/migrations/migrate.migration.external_translated_test_node.yml
core/modules/migrate/tests/modules/migrate_external_translated_test/migrations/migrate.migration.external_translated_test_node_translation.yml
core/modules/taxonomy/tests/modules/taxonomy_term_stub_test/migrations/taxonomy_term_stub_test.yml

The last submitted patch, 17: 2876085-17.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

Status: Needs review » Needs work

The last submitted patch, 19: 2876085-18.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

heddn’s picture

Status: Needs work » Needs review
FileSize
1.32 KB
14.1 KB
95.39 KB

To generate the review.txt, I used the follow:

$ git diff --cached >2876085-22.patch
$ git diff --cached -G "audit_ids: true" > match.txt
$ interdiff match.txt 2876085-22.patch  >review.txt

I also agree with #18. We don't care about the types. Probably up next is to fix the UI tests. What's after that?

Status: Needs review » Needs work

The last submitted patch, 22: 2876085-22.patch, failed testing. View results

heddn’s picture

Status: Needs work » Needs review
FileSize
96.16 KB
791 bytes
quietone’s picture

Why have 'audit_ids: ' defined in all migrations? Surely we can have a default? Will it be the expectation that custom migrations add it? Can we use 'no_audit_ids: true' when we don't want to audit?

maxocub’s picture

I don't think it's necessary to put the audit_id tag everywhere. My understanding was that we were going to put it only in the migrations where we know it is needed.

If we don't want to audit IDs, we don't need to put audit_ids: false since this line evaluate as false if the tag is not present:

+++ b/core/modules/migrate/src/Plugin/migrate/destination/EntityContentBase.php
@@ -289,4 +297,60 @@ protected function getDefinitionFromEntity($key) {
+    if (!empty($this->migration->getPluginDefinition()['audit_ids'])) {
maxocub’s picture

+++ b/core/modules/migrate/src/Plugin/migrate/destination/EntityContentBase.php
@@ -289,4 +297,60 @@ protected function getDefinitionFromEntity($key) {
+      if (!($idMap instanceof MigrateMaxIdInterface)) {
+        // We don't know how to audit IDs without a cooperating ID map.
+        return FALSE;

For now, only EntityContentBase is an instance of MigrateIDAuditInterface, so only migrations with a destination extending EntityContentBase should use audit_ids (if needed). All other migrations that are not content will never have there IDs audited, so the audit_ids tag is really not necessary there.

Also, I tested the patch manually with the D7 fixtures and I found that if you enable the Forum module you will have a taxonomy term (General discussion) created on the destination site, and so you will be presented with the conflict warning page. That is why this line was needed in the tests:

+++ b/core/modules/migrate_drupal_ui/tests/src/Functional/MigrateUpgradeTestBase.php
@@ -147,6 +147,7 @@ public function testMigrateUpgrade() {
+    $this->drupalPostForm(NULL, [], t('I acknowledge I may lose data, continue anyway.'));

Is this a problem? It might confuse people.

quietone’s picture

Is this a problem? It might confuse people.

Yes, the new screen needs work. Do we work on it some more ourselves or tag for Usability review?

vasi’s picture

Sorry I haven't had time to help with this lately!

Unless I'm missing something, the patch still needs to be rearranged to correctly handle multiple migrations with the same destination types. Eg:

  • User runs migrate_upgrade from a D7 site. Five new nodes of type 'page' are created, with nids 1-5
  • User run an incremental migrate_upgrade. No new nodes have been created, so this should succeed.
    • Migration 'node__page' runs 'unsafeIdsExist()'. The ID map says the highest nid migrated is 5, and the destination says the highest nid on the site is 5. Those are equal, so this migration says we're ok!
    • Migration 'node__article' runs 'unsafeIdsExist()`. The ID map says no nid has been migrated, and the destination says the highest nid on the site is 5. Since 5 > 0, the migration incorrectly says we're at risk of overwriting data.

    There are several potential fixes. I think the most resilient one would be to aggregate migrations with the same destination plugin ID—then compare $destination->highestDestinationId() with max(array_map($migrations, 'highestIdField')) (pseudocode). Sound reasonable?

    We probably also have to somehow handle revision IDs, ick.

heddn’s picture

We discussed this in the weekly migrate meeting today. The summary of the conversation: forum module actually introduces the situation that we are trying to resolve with this audit. It inserts a term into a new vocabulary with a term name of 'General Discussion', probably as term #1. This is the topic for the initial forum. If you migrate over from D6/D7 and you have term data (which you probably do) then it will overwrite term #1 with the data that was in D6. This results in your General Discussion term magically getting replaced as 'Cats' or whatever the term was in D6/D7.

To address this, we remembered we had initial thought that we would link out to a handbook page discussing some potential situations and solutions. Forum module is a bit of an edge case, it isn't as popular as some of the other modules. It might just be easiest to add it as a scenario to the docs.

But one thing we didn't know in the meeting was if Forum was unique here. So, let's do some additional testing to see if any other modules in core do similar things.

heddn’s picture

Next steps

  1. Add a handbook page to discuss this.
  2. Add link for handbook page to UI page.
  3. Automated test of an upgrade without forum module enabled.
  4. We don't need to add a test for with, because we already have one.
maxocub’s picture

Status: Needs review » Needs work

Back to NW

krystalcode’s picture

Status: Needs work » Needs review

Added a section at https://www.drupal.org/docs/8/upgrade/known-issues-when-upgrading-from-d..., based on my understanding of the situation. Feel free to make any corrections or additions.

heddn’s picture

re #27: I found a listing of entity things over in #2711099: Categorize migrations according to their type. Going to work on that and adding a test.

+            'd6_aggregator_feed',
+            'd7_aggregator_feed',
+            'd6_aggregator_item',
+            'd7_aggregator_item',
+            'd7_blocked_ips',
+            'd6_book',
+            'd6_comment',
+            'd7_comment',
+            'd6_custom_block',
+            'd7_custom_block',
+            'd6_file',
+            'd7_file',
+            'd6_upload',
+            'd6_menu_links',
+            'd7_menu_links',
+            'd6_profile_values',
+            'd7_shortcut',
+            'd7_shortcut_set_users',
+            'd6_taxonomy_term',
+            'd7_tracker_node',
+            'd7_tracker_user',
+            'd6_url_alias',
+            'd7_url_alias',
+            'd6_user',
+            'd7_user',
+            'd6_user_contact_settings',
+            'd6_user_picture_file',
heddn’s picture

FileSize
61.59 KB
34.57 KB

ok, this only templates from the list above. To reduce the list from *all* templates, I took the list from the related issue and did a quick search/replace to make a series of commands like: git reset HEAD core/*/action_settings.yml, etc.

Interdiff attached if you finding it interesting. Next up is tests.

heddn’s picture

FileSize
41.56 KB
7.13 KB

Status: Needs review » Needs work

The last submitted patch, 36: 2876085-36.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

maxocub’s picture

Status: Needs work » Needs review
FileSize
42.28 KB
3.23 KB

First attempt to handle multiple migrations to the same destination (d7_node:artice, d7_node:page).
Tests will follow.

Status: Needs review » Needs work

The last submitted patch, 38: 2876085-38.patch, failed testing. View results

maxocub’s picture

Status: Needs work » Needs review
FileSize
42.27 KB
3.23 KB

Oups

Status: Needs review » Needs work

The last submitted patch, 40: 2876085-38.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

maxocub’s picture

Since this will be a hard issue to review, would it be OK to concentrated on the code and tests here, with only a few audit_ids to test with, and to open a follow up to tag all migrations that needs auditing? It think it would make our job and the reviewer's job a lot easier.

heddn’s picture

@maxocub to create a review patch, see #22 for instructions.

Now a review of #40.

+++ b/core/modules/migrate/src/Plugin/MigrateMaxIdInterface.php
@@ -10,11 +10,14 @@
+  public function getMaxId($field, $bundles);

I see what we are doing here now. I was curious why this was needed. So bear with me while I verbalize my thoughts. If there is a page and page_custom content types in the source, there would be 2 derivative migrations on the destination aligned with those 2 sources. However, if someone built a custom migration and piped those two sources into the same page content type (because that's what they want them to be on their d8 site), then there will be problems. There would be 2 source bundles but only a single mapping table. Let's call this scenario #1.

Alternatively, some builds a completely custom solution from CSVs and has only a single source but multiple bundles on the destination... then we'd have multiple bundles but only a single destination map. Scenario #2.

Last example. A single page content type on the legacy site. Custom migration. The single source is getting split into 3 content types on the destination with 3 different migrations and ignoring (skipping) content that isn't destined for that content type. Perhaps its breaking things up by media type so videos go one place. Audio another direction and photos the last content type. Scenario #3.

All the above are valid scenarios and aren't even that uncommon. Let's explore why bundle is important now. I'm not sure it is.

In any scenario, if the highest nid, media id, term_id, etc is greater than any mapping table's, we'll throw a warning. It indicates that the destination entity has more entities created than which came from a migration. In many cases, this is fine. Especially if you aren't mapping ids from old system to new. But if you mapping entity ids, then it does become important.

heddn’s picture

Let's see if we can resolve the test failures. Interdiff and changes against #36 (for now). If we decide to, we can add in #40 at a later point.

And from a UX perspective, I've also grabbed a screenshot of the conflicts page. It now has a link off to the recent handbook page entry. Maybe not the prettiest, but I'd don't think its too ugly either. It serves the purpose for solving a critical. If we want to bikeshed much on it, let's fork that to a follow-up, please.

conflict id upgrade page

Status: Needs review » Needs work

The last submitted patch, 44: 2876085-44.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

maxocub’s picture

Quick review of last patch.

1.

+++ b/core/modules/migrate/src/Plugin/MigrateMaxIdInterface.php
@@ -0,0 +1,20 @@
+namespace Drupal\migrate\Plugin;
+
+
+interface MigrateMaxIdInterface {

Extra line.

2.

+++ b/core/modules/migrate/src/Plugin/migrate/destination/EntityContentBase.php
@@ -289,4 +297,60 @@ protected function getDefinitionFromEntity($key) {
+  /**
+   * Check whether unsafe IDs exist that should inhibit migration.
+   *
+   * @param \Drupal\migrate\Plugin\MigrateIdMapInterface $idMap
+   *   The ID map for this migration.
+   *
+   * @return bool
+   *   Whether unsafe IDs exist.
+   */
+  public function unsafeIdsExist(MigrateIdMapInterface $idMap) {

Should use {@inheritdoc}.

3.

+++ b/core/modules/migrate/src/Plugin/migrate/destination/EntityContentBase.php
@@ -289,4 +297,60 @@ protected function getDefinitionFromEntity($key) {
+    // If IDs are are audited, we can't have conflicts.
+    if (!empty($this->migration->getPluginDefinition()['audit_ids'])) {

Should be 'we can have conflicts".

4.

+++ b/core/modules/migrate/src/Plugin/migrate/destination/EntityContentBase.php
@@ -289,4 +297,60 @@ protected function getDefinitionFromEntity($key) {
+   * @param \Drupal\migrate\Plugin\MigrateIdMapInterface $idMap
...
+  public function unsafeIdsExist(MigrateIdMapInterface $idMap) {
...
+      if (!($idMap instanceof MigrateMaxIdInterface)) {
...
+      $highestMigrated = $idMap->getMaxId($this->getHighestIdField());
+      if ($this->highestDestinationId() > $highestMigrated) {

+++ b/core/modules/migrate/src/Plugin/migrate/id_map/Sql.php
@@ -923,4 +924,22 @@ public function valid() {
+    $sqlField = $this->destinationIdFields()[$field];
...
+      ->fields('map', [$sqlField])
+      ->orderBy($sqlField, 'DESC')

+++ b/core/modules/migrate_drupal_ui/src/Form/MigrateUpgradeForm.php
@@ -723,12 +739,18 @@ class MigrateUpgradeForm extends ConfirmFormBase {
+  public function __construct(StateInterface $state, DateFormatterInterface $date_formatter, RendererInterface $renderer, MigrationPluginManagerInterface $plugin_manager, EntityTypeManagerInterface $entityTypeManager, MigrateIdAuditor $idAuditor) {
...
+    $this->entityTypeManager = $entityTypeManager;
+    $this->idAuditor = $idAuditor;

@@ -1082,6 +1109,64 @@ public function validateCredentialForm(array &$form, FormStateInterface $form_st
+    $migrationIds = array_keys($form_state->get('migrations'));
...
+    $typeIds = $this->idAuditor->auditIds($migrations);
...
+    sort($typeIds);
+    foreach ($typeIds as $typeId) {
+      $items[] = $this->entityTypeManager->getDefinition($typeId)->getLabel();

This is maybe too much of a nit, so fell free to ignore, but the coding standard says to use same types of variable names (snake_case / lowerCamelCase) inside a file.

5.

+++ b/core/modules/migrate_drupal_ui/tests/src/Functional/d6/MigrateUpgradeConflicts6Test.php
@@ -0,0 +1,57 @@
+use Drupal\Tests\migrate_drupal_ui\Functional\MigrateUpgradeTestBase;
+use Drupal\user\Entity\User;

+++ b/core/modules/migrate_drupal_ui/tests/src/Functional/d7/MigrateUpgradeConflicts7Test.php
@@ -0,0 +1,57 @@
+use Drupal\Tests\migrate_drupal_ui\Functional\MigrateUpgradeTestBase;
+use Drupal\user\Entity\User;

Unused use statements.

6.

+++ b/core/modules/migrate_drupal_ui/tests/src/Functional/d6/MigrateUpgrade6Test.php
@@ -15,6 +15,21 @@
   /**
+   * Modules to enable.
+   *
+   * @var array
+   */
+  public static $modules = [

+++ b/core/modules/migrate_drupal_ui/tests/src/Functional/d6/MigrateUpgradeConflicts6Test.php
@@ -0,0 +1,57 @@
+  /**
+   * Modules to enable.
+   *
+   * @var array
+   *
+   * Forum module adds content, resulting in content id conflicts.
+   */
+  public static $modules = [

+++ b/core/modules/migrate_drupal_ui/tests/src/Functional/d7/MigrateUpgradeConflicts7Test.php
@@ -0,0 +1,57 @@
+  /**
+   * Modules to enable.
+   *
+   * @var array
+   *
+   * Forum module adds content, resulting in content id conflicts.
+   */
+  public static $modules = [

Should use {@inheritdoc}

7.

+++ b/core/modules/migrate_drupal_ui/tests/src/Functional/d6/MigrateUpgradeConflicts6Test.php
@@ -0,0 +1,57 @@
+    // Ensure submitting the form with invalid database credentials gives us a
+    // nice warning.
+    $this->drupalPostForm(NULL, [$driver . '[database]' => 'wrong'] + $edits, t('Review upgrade'));
+    $this->assertText('Resolve the issue below to continue the upgrade.');

+++ b/core/modules/migrate_drupal_ui/tests/src/Functional/d7/MigrateUpgradeConflicts7Test.php
@@ -0,0 +1,57 @@
+    // Ensure submitting the form with invalid database credentials gives us a
+    // nice warning.
+    $this->drupalPostForm(NULL, [$driver . '[database]' => 'wrong'] + $edits, t('Review upgrade'));
+    $this->assertText('Resolve the issue below to continue the upgrade.');

This is already tested in the parent class.

heddn’s picture

Re #38 & @maxocub: I think I understand the problem now. To solve, I think we need to inspect all the migration destinations and for all that "match" the entity type, see what they have recorded in their mapping table. We loop over all migrations in #2711099: Categorize migrations according to their type, so a similar approach is what I'd expect here.

Next up, I'll work in responses to #46. I'm leaving #38 /
#22 for a later point. Maybe even by someone else.

heddn’s picture

Status: Needs work » Needs review
FileSize
43.76 KB
11.14 KB
23.14 KB

Setting to NR for test bot, but we still need to address #22.

Status: Needs review » Needs work

The last submitted patch, 48: 2876085-48.patch, failed testing. View results

maxocub’s picture

Status: Needs work » Needs review
FileSize
46.52 KB
5.19 KB
25.6 KB

Here's a second POC for handling multiple migrations to the same destination. What do you think? (Still needs tests)

I also changed back the assertSame to assertEquals to make the migrate_drupal_ui tests pass.

Status: Needs review » Needs work

The last submitted patch, 50: 2876085-50.patch, failed testing. View results

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

maxocub’s picture

Status: Needs work » Needs review
FileSize
50.26 KB
5.09 KB
28.92 KB

Here's fixing the Unit tests.

@heddn: I'm not sure I like the idea of extending MigrateUpgrade6Test & MigrateUpgrade7Test for the conflicts test. Those are so long to run and they often need to be updated when we modify the fixtures. If we extend them, it means two times as long to run and two more places to update the entity count.
I did spent a lot of time trying to make them pass, and when I finally got the counts right, there was this block migration that was failing due to the forum module not being enabled.
If you're OK with that, I'm going to try to work on dedicated tests.

Status: Needs review » Needs work

The last submitted patch, 53: 2876085-53.patch, failed testing. View results

heddn’s picture

re #53 we might have to do that, but then we'd be dropping a "test all the things" test case, more or less. Let's try a little longer to see if we can make things pass. Or if you have an idea to keep the "test all the things" and a non-conflicts test case, that might be another option.

maxocub’s picture

Sorry that I didn't uploaded my patch with the updated counts, but it's not just the field_storage_config that needs updating, but also field_config, taxonomy_vocabulary, rdf_mapping, base_field_override (and maybe another one, I don't remember) and you have to lower it in the test where Forum is not enabled and keep the initial value in the test where Forum is enabled. ;)

If we keep the "test all the things" scenario, we could just test that there is conflicts, but not migrate and count entities, and in other tests, do the migration and entity count but on selected migrations?

Status: Needs review » Needs work

The last submitted patch, 55: 2876085-55.patch, failed testing. View results

maxocub’s picture

Status: Needs work » Needs review
FileSize
52.7 KB
4.49 KB

Here's the correct entity counts.
There should now be only one test failing (MigrateUprade6Test) with this error:

1) Drupal\Tests\migrate_drupal_ui\Functional\d6\MigrateUpgrade6Test::testMigrateUpgrade
Migration of a:3:{s:9:"sourceid1";s:5:"forum";s:9:"sourceid2";s:1:"0";s:9:"sourceid3";s:7:"garland";} to a:0:{} as part of the d6_block migration. The source row status is 3

Maybe it's a problem with the bloc migration, but it's unable to migrate the forum blocs if forum is not enabled.

Status: Needs review » Needs work

The last submitted patch, 58: 2876085-58.patch, failed testing. View results

maxocub’s picture

Status: Needs work » Needs review
FileSize
44.47 KB
32.88 KB
23.68 KB

I started to work on tests, using KTB instead of the slow BTB. This is just a begining, I plan on adding more of them.
The interdiff is a bit bloated because I did a little refactoring of the API to make it cleaner, I think.

Status: Needs review » Needs work

The last submitted patch, 60: 2876085-60.patch, failed testing. View results

maxocub’s picture

Oups, wrong class name.

maxocub’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests
FileSize
53.33 KB
13.11 KB
32.35 KB

I added tests auditing the obvious entity migrations for D6 & D7. I'll add more tests for the other places where we need to audit IDs.

heddn’s picture

Status: Needs review » Needs work
Issue tags: +Needs change record
+++ b/core/modules/migrate/src/Plugin/MigrateDestinationAuditInterface.php
@@ -0,0 +1,35 @@
+  public function getEntityType();

We should remove this helper function from the interface so we can have this solution work with non-entities as well. Plus, we should look at i18n and revisions.

maxocub’s picture

I removed the getEntityType() method from the interface and moved things a bit. Now the ID auditor is using the migration's base_id & label for it's report, instead of the entity's id & label, since we want to support more than just entities.

Let's start by auditing IDs of the obvious entity types and then move on to the revisions, translations and other exceptions.