This is an alternative for the derivation strategy in #2911244-141: Field collections deriver and base migration.

Problem/Motivation

Bug (A)

If you try to migrate nested paragraphs with the current paragraphs HEAD, you will get a lot of orphaned paragraphs entities after the migration, with broken references in the parent entities.

But even for having an easier and more straightforward (nested) field collections and paragraphs migrations, it would be nice to derive paragraph and field collection entity migration not just based on their own bundle (this equals to the parent field in case of field collections), but based on their parent field, parent entity type ID and (maybe) parent entity bundle ID.

If paragraphs derives these migrations with this strategy, it will be very easy to represent the migration dependencies in the corresponding field migration plugins' defineValueProcessPipeline method, the references will be kept, and you won't have any orphaned paragraphs.

Work around data integrity issues caused by Drupal 7 paragraphs (B)

There might be other data integrity issues as well: I've seen a client site where paragraphs entities (and almost all of their revisions) were deleted (because user deleted a paragraph in a newer node revision for example). In this casewe cannot migrate anything for some previous revisisions if we have no data, but the revision of the paragraph reference field is kept, so the value process pipeline creates non-updatable stubs in these cases (and most of the cases in the wrong paragraph migrations).

I except that in such cases, the non-resolvable references should be gone on the destination site as well (you don't have this data on the source D7 site) instead of creating invalid stubs.

Proposed resolution

Solution for bug (A):

  1. Apply a new, parent-based paragraphs derivation strategy, including new, or extended tests.
  2. Add new test content, e.g: new content with paragraphs referencing another paragraphs AND even field collections and vice versa.

Solution for (B):

  1. Don't migrate paragraph revisions if the active revision does not exist
  2. Prevent stub creation in value process pipelines.

Remaining tasks

  1. Don't migrate paragraph revisions if the active revision does not exist
  2. Prevent stub creation in value process pipelines.
  3. Apply a new, parent-based paragraphs derivation strategy, including new, or extended tests.
  4. Add new test content, e.g: new content with paragraphs referencing another paragraphs AND even field collections and vice versa.

API changes

Different paragraph and field collection migration derivatives are provided by paragraphs.

Data model changes

Data integrity is kept.

CommentFileSizeAuthor
#45 paragraphs-orphaned_log_instead_of_message-3145755-45.txt8.38 KBhuzooka
#43 interdiff-3145755-40-43.txt9.57 KBhuzooka
#43 paragraphs-bug_derive_by_parent_and_orphans-3145755-43.diff498.61 KBhuzooka
#41 paragraphs-bug_derive_by_parent_and_orphans-3145755-40.diff493.79 KBhuzooka
#40 interdiff-3145755-39-40.txt16.52 KBhuzooka
#40 paragraphs-bug_derive_by_parent_and_orphans-3145755-40.patch533.12 KBhuzooka
#39 interdiff-3145755-28-39.txt46.82 KBhuzooka
#39 paragraphs-bug_derive_by_parent_and_orphans-3145755-39.patch523.63 KBhuzooka
#36 paragraphs-derive_by_parent-3145755-23--complete--mr-5-700e9d11.diff547.63 KBhuzooka
#28 interdiff-3145755-26-28.txt11.75 KBhuzooka
#28 paragraphs-derive_by_parent-3145755--28.diff502.58 KBhuzooka
#26 paragraphs-derive_by_parent-3145755-26--complete.patch490.63 KBhuzooka
#24 interdiff-3145755-23-24.txt253.15 KBhuzooka
#24 paragraphs-derive_by_parent-3145755-24--fix-only.patch50.39 KBhuzooka
#24 paragraphs-derive_by_parent-3145755-24--complete.patch1018.77 KBhuzooka
#24 paragraphs-derive_by_parent-3145755-24--tests-only.patch968.38 KBhuzooka
#23 interdiff-3145755-19-23.txt217.78 KBhuzooka
#23 paragraphs-derive_by_parent-3145755-23--fix-only.patch50.38 KBhuzooka
#23 paragraphs-derive_by_parent-3145755-23--complete.patch900.02 KBhuzooka
#23 paragraphs-derive_by_parent-3145755-23--test-only.patch849.64 KBhuzooka
#20 paragraphs-derive_by_parent-3145755-19--fix-only.patch47.19 KBhuzooka
#19 interdiff-3145755-17-19.txt1.67 KBhuzooka
#19 paragraphs-derive_by_parent-3145755-19--complete.patch801.83 KBhuzooka
#17 interdiff-3145755-13-17.txt21.67 KBhuzooka
#17 paragraphs-derive_by_parent-3145755-17--complete.patch801.82 KBhuzooka
#17 paragraphs-derive_by_parent-3145755-17--test-only.patch754.63 KBhuzooka
#13 interdiff-3145755-7-13.txt13.6 KBhuzooka
#13 paragraphs-derive_by_parent-3145755-13--fix-only.patch44.59 KBhuzooka
#13 paragraphs-derive_by_parent-3145755-13--test-only.patch750.95 KBhuzooka
#13 paragraphs-derive_by_parent-3145755-13.patch795.54 KBhuzooka
#7 interdiff-3145755-5-7.txt42.68 KBhuzooka
#7 paragraphs-derive_by_parent-3145755-7--fix-only--do-not-test.patch39.19 KBhuzooka
#7 paragraphs-derive_by_parent-3145755-7--combined-with-2911244-146.patch866.87 KBhuzooka
#7 paragraphs-derive_by_parent-3145755-7--on-top-of-2911244-146--do-not-test.patch790.39 KBhuzooka
#6 interdiff-3145755-5-6.txt2.37 KBhuzooka
#6 paragraphs-derive_by_parent-3145755-6--combined-with-2911244-146.patch159.05 KBhuzooka
#5 interdiff-3145755-2-5.txt19.53 KBhuzooka
#5 paragraphs-derive_by_parent-3145755-5--fix-only--do-not-test.patch34.3 KBhuzooka
#5 paragraphs-derive_by_parent-3145755-5--combined-with-2911244-146.patch158.76 KBhuzooka
#5 paragraphs-derive_by_parent-3145755-5--on-top-of-2911244-146--do-not-test.patch73.41 KBhuzooka
#2 paragraphs-derive_by_parent-3145755-2--combined-with-2911244-141.patch157.44 KBhuzooka
#2 paragraphs-derive_by_parent-3145755-2--on-top-of-2911244-141--do-not-test.patch72.47 KBhuzooka

Issue fork paragraphs-3145755

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

huzooka created an issue. See original summary.

huzooka’s picture

Patch paragraphs-derive_by_parent-3145755-2--on-top-of-2911244-141--do-not-test.patch requires #2911244-141: Field collections deriver and base migration applied, the combined patch contains both of them.

Wim Leers’s picture

  1. +++ b/src/ParagraphsMigration.php
    @@ -0,0 +1,57 @@
    +/**
    + * Class ParagraphsMigration.
    + */
    

    Nit: meaningless comment.

  2. +++ b/src/ParagraphsMigration.php
    @@ -0,0 +1,57 @@
    +   * Default derivation strategy (BC).
    ...
    +  const DERIVATION_METHOD_DEFAULT = self::DERIVATION_METHOD_LEGACY;
    

    Nit: "strategy" vs "method".

    I think "strategy" is preferable. I was going to say that that is the term that https://www.drupal.org/node/3105503 uses, but that's unfortunately not true. That uses "type" 🤷‍♂️

  3. +++ b/src/ParagraphsMigration.php
    @@ -0,0 +1,57 @@
    +    return in_array($method, $handled_methods, TRUE) ?
    +      $method :
    +      self::DERIVATION_METHOD_DEFAULT;
    
    +++ b/src/Plugin/migrate/D7FieldCollectionItemDeriver.php
    @@ -69,36 +72,95 @@ class D7FieldCollectionItemDeriver extends DeriverBase implements ContainerDeriv
    +          $values['label'] = $derivation_method_by_parent ?
    +            $this->t('@label (in @parent_entity_bundle @parent_entity_type, referenced from @field_label field)', $label_args) :
    +            $this->t('@label (@type)', $label_args);
    
    +++ b/src/Plugin/migrate/D7ParagraphsItemDeriver.php
    @@ -60,37 +62,107 @@ class D7ParagraphsItemDeriver extends DeriverBase implements ContainerDeriverInt
    +            $derivative_id = $derivation_method_by_parent ?
    +              "$field_name:$parent_entity_type:$parent_bundle:$paragraph_bundle" :
    +              $paragraph_bundle;
    

    🤓 Weird formatting IMHO. I think this is what core uses:

    return $a_very_long_condition
      ? $foo
      : $bar;
    
  4. +++ b/src/Plugin/migrate/D7FieldCollectionItemDeriver.php
    @@ -69,36 +72,95 @@ class D7FieldCollectionItemDeriver extends DeriverBase implements ContainerDeriv
    +          // No field collection field found in this source database.
    

    Übernit: s/field collection/field_collection/

  5. +++ b/src/Plugin/migrate/D7FieldCollectionItemDeriver.php
    @@ -69,36 +72,95 @@ class D7FieldCollectionItemDeriver extends DeriverBase implements ContainerDeriv
    +          /** @var \Drupal\migrate\Plugin\Migration $migration */
    +          $migration = \Drupal::service('plugin.manager.migration')
    +            ->createStubMigration($values);
    
    +++ b/src/Plugin/migrate/D7ParagraphsItemDeriver.php
    @@ -60,37 +62,107 @@ class D7ParagraphsItemDeriver extends DeriverBase implements ContainerDeriverInt
    +            /** @var \Drupal\migrate\Plugin\Migration $migration */
    +            $migration = \Drupal::service('plugin.manager.migration')
    +              ->createStubMigration($values);
    

    Nit: IMHO it's always preferable to do assert($something instanceof FQCN) instead of /** @var FQCN $something */. That gives you an extra assurance and is not IDE-specific.

  6. +++ b/src/Plugin/migrate/D7ParagraphsItemDeriver.php
    @@ -60,37 +62,107 @@ class D7ParagraphsItemDeriver extends DeriverBase implements ContainerDeriverInt
    +          // No paragraph fields found in this source database.
    

    Übernit: s/paragraph/paragraphs/

  7. +++ b/src/Plugin/migrate/field/FieldCollection.php
    @@ -66,16 +67,41 @@ class FieldCollection extends FieldPluginBase {
    +    $is_revision_migration = strpos($destination_plugin_id, 'entity_revision:') === 0 || strpos($destination_plugin_id, 'entity_complete:') === 0 || (strpos($destination_plugin_id, 'entity_reference_revisions:') === 0 && !empty($destination_config['new_revisions']));
    +
    +    if ($derivation_method_by_parent) {
    +      $destination_entity_type_id = strpos($destination_plugin_id, 'entity:') === 0 || strpos($destination_plugin_id, 'entity_revision:') === 0 || strpos($destination_plugin_id, 'entity_complete:') === 0 || strpos($destination_plugin_id, 'entity_reference_revisions:') === 0 ?
    
    +++ b/src/Plugin/migrate/field/Paragraphs.php
    @@ -64,22 +65,47 @@ class Paragraphs extends FieldPluginBase {
    +      $is_revision_migration = strpos($destination_plugin_id, 'entity_revision:') === 0 || strpos($destination_plugin_id, 'entity_complete:') === 0 || (strpos($destination_plugin_id, 'entity_reference_revisions:') === 0 && !empty($destination_config['new_revisions']));
    +      $destination_entity_type_id = strpos($destination_plugin_id, 'entity:') === 0 || strpos($destination_plugin_id, 'entity_revision:') === 0 || strpos($destination_plugin_id, 'entity_complete:') === 0 || strpos($destination_plugin_id, 'entity_reference_revisions:') === 0 ?
    

    🤔 What about node_revision and node_complete? These seem to only work for non-Node entity types?

  8. +++ b/src/Plugin/migrate/source/d7/FieldCollectionItem.php
    @@ -24,12 +24,21 @@ class FieldCollectionItem extends FieldableEntity {
    +   * The prefix of the field table that contains the entity properties.
    
    +++ b/src/Plugin/migrate/source/d7/FieldCollectionItemRevision.php
    @@ -21,6 +21,13 @@ class FieldCollectionItemRevision extends FieldCollectionItem {
    +   * The prefix of the field table that contains the entity properties.
    

    🤔 What are "entity properties"?

  9. +++ b/src/Plugin/migrate/source/d7/FieldCollectionItem.php
    @@ -37,23 +46,54 @@ class FieldCollectionItem extends FieldableEntity {
    +      // This configuration item may be set by a deriver to restrict the
    

    🤔 What is a "configuration item"?

  10. +++ b/src/Plugin/migrate/source/d7/ParagraphsItem.php
    @@ -46,22 +49,55 @@ class ParagraphsItem extends FieldableEntity {
    +      $query->join('paragraphs_item', 'p', "p.field_name = :field_name AND p.item_id = fd.{$field_name}_value AND p.revision_id = fd.{$field_name}_revision_id", [
    

    Übernit: why mix single and double quotes in one function call?

huzooka’s picture

Assigned: Unassigned » huzooka
Status: Needs review » Needs work
huzooka’s picture

Addressed everything from #3, except of #10:

In the third argument (in the join conditions) I need variable substitution, while the first and the second args are simple string literals.

huzooka’s picture

Used wrong strategy values in the tests. And they're failing with the new derivation strategy. 😕

huzooka’s picture

I also added a new test content with nested paragraphs. Its migration ends in orphaned paragraphs (and missing paragraph reference values) with the original derivation strategy – so I just skip testing this new content in ParagraphContentMigrationTest and also in MigrateUiParagraphsTest.

The combined and also the on top of #2911244-146: Field collections deriver and base migration patches contain the database fixture update as well. It still needs some cleanup, but it takes too much time, I have given up.

Forgot to address #3.7:
@Wim Leers, I think that you mix the migration (base) plugin ID with the migration destination (base) plugin ID.
There is no such thing as node_complete imho (fixme). There are d6_node_complete or d7_node_complete that are the migrations of complete nodes, but their destination plugin's base ID is entity_complete.

huzooka’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

I just noticed that the acceptance criteria (migrate even nested paragraphs and field collections successfully) isn't met here.

Right now, the new derivation strategy creates these kind of derivatives:

  • d7_paragraphs:paragraph_reference_field:node:article:paragraph_bundle, or
  • d7_paragraphs:paragraph_reference_field_in_nested_para:paragraphs_item:paragraph_bundle:paragraph_bundle

So the pattern is: d7_paragraphs:<parent-field-name>:<parent-entity-type>:<parent-entity-bundle>:<paragraph-bundle>.

The only problem with this is that you may have a nested paragraph entity that contains an another nested paragraph entity. In this situation, the current patch cannot properly set the required migration dependencies, and it can lead to circular dependencies (nested bundle 1 contains nested bundle 2 and vice versa), or to a case where a paragraphs migration depends on itself.

I think that we can handle this by introducing a yet-another derivation level that represents the nesting level of the paragraph entity migration.

So my proposal is: every paragraphs migration which parent entity type is a paragraph will be derived with an extra nesting level bit. Example:

  • Paragraphs referenced directly from a node paragraphs field can remain: d7_paragraphs:paragraph_reference_field:node:article:paragraph_bundle, but:
  • Paragraphs that are referred from an another paragraph will have derivatives like d7_paragraphs:paragraph_reference_field_in_nested_para:paragraphs_item:paragraph_bundle:paragraph_bundle:1, d7_paragraphs:paragraph_reference_field_in_nested_para:paragraphs_item:paragraph_bundle:paragraph_bundle:2 etc

, so the pattern will be: d7_paragraphs:<parent-field-name>:<parent-entity-type>:<parent-entity-bundle>:<paragraph-bundle>:<how-many-nesting-levels-you-have-to-go-back-to-get-a-non-paragraph-parent-entity>.

Back to needs work.

Wim Leers’s picture

RE #3.7 (I wrote a reply to that but apparently failed to post it 😬) — yep, you're right, oops! 🙈😅

#8: thanks for that detailed write-up! 🙏 I am really curious to @heddn's thoughts about this!

huzooka’s picture

Assigned: Unassigned » huzooka
ccjjmartin’s picture

@huzooka It looks like your patches have a bunch of extra files in them. An example I am seeing is file paths committed that show sites/default/files. Can you look at uploading your patches again? The ones closer to 1MB in size seem to be the ones affected.

huzooka’s picture

@ccjjmartin, the "big patches" contain database fixture updates and also add the missing files for being able to test this patch with Drupal core's Migrate UI.

This is exactly the same that I already shared in #7:

The combined and also the on top of #2911244-146: Field collections deriver and base migration patches contain the database fixture update as well. It still needs some cleanup, but it takes too much time, I have given up.

huzooka’s picture

Sad, but true: since the migration plugin IDs of the derived migrations of this patch are longer than the usual, this depends on (and is blocked by) #2845340: migrate mapping & messages table names are truncated, can lead to incorrect mapping lookups.

huzooka’s picture

Another bug discovered when migrating from PostgreSQL source: #3164520: FieldableEntity::getFieldValues() does not guarantee that the returned field values are sorted by their delta.

I will (and I can) work around this issue.

The last submitted patch, 17: paragraphs-derive_by_parent-3145755-17--test-only.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

huzooka’s picture

Adding a "fix-only" patch as well.

Wim Leers’s picture

#13:

  1. +++ b/src/MigrationPluginsAlterer.php
    @@ -51,6 +63,55 @@ final class MigrationPluginsAlterer {
    +      // fields are present. The raw (and at this point, not accurate) migration
    +      // dependency IDs are added in
    

    🤔 "at this point not accurate" — fascinating!

  2. +++ b/src/MigrationPluginsAlterer.php
    @@ -51,6 +63,55 @@ final class MigrationPluginsAlterer {
    +        $types = $this->paragraphTypes ?? NULL;
    +        if ($types === NULL) {
    +          $types = [];
    +          $type_migration_definition = $migrations['d7_paragraphs_type'] ?? NULL;
    +          if ($type_migration_definition) {
    +            $type_source = static::getSourcePlugin('d7_paragraphs_type');
    +            assert($type_source instanceof DrupalSqlBase);
    +
    +            try {
    +              $type_source->checkRequirements();
    +
    +              foreach ($type_source as $row) {
    +                assert($row instanceof Row);
    +                ['bundle' => $type] = $row->getSource();
    +                $types = array_unique(array_merge($types, [$type]));
    +              }
    +            }
    +            catch (DatabaseExceptionWrapper $e) {
    +            }
    +
    +            $this->paragraphTypes = $types;
    +          }
    

    This looks like it could/should be in a helper method.

  3. +++ b/src/MigrationPluginsAlterer.php
    @@ -51,6 +63,55 @@ final class MigrationPluginsAlterer {
    +            $finalized_dependencies = array_map(function ($type) use ($dependency_to_finalize) {
    

    "finalized" is a less ambiguous word 👍 Let's use "unfinalized" and "finalized".

  4. +++ b/src/ParagraphsMigration.php
    @@ -37,6 +37,13 @@ final class ParagraphsMigration {
    +   * Key of the paragraphs migration's raw migration dependencies.
    

    Does "raw" here mean "unfinalized"?

  5. +++ b/tests/src/Functional/Migrate/MigrateUiParagraphsTestBase.php
    @@ -313,22 +313,22 @@ abstract class MigrateUiParagraphsTestBase extends MigrateUpgradeTestBase {
             6 => 'Paragraph Migration Test Content EN > Field Collection Test (previous revision)',
    -        7 => 'Content with nested paragraphs ENG > Any Paragraph > Any Paragraph (nested)',
    -        8 => 'Content with nested paragraphs ENG > Any Paragraph > Any Paragraph (nested)',
    -        9 => 'Content with nested paragraphs ENG > Any Paragraph',
    -        10 => 'Paragraph Migration Test Content UND > Any Paragraph',
    -        11 => 'Paragraph Migration Test Content EN > Any Paragraph',
    -        12 => 'Paragraph Migration Test Content EN > Any Paragraph (previous revision)',
    -        13 => 'Paragraph Migration Test Content UND > Any Paragraph',
    -        14 => 'Paragraph Migration Test Content EN > Any Paragraph',
    -        15 => 'Paragraph Migration Test Content EN > Any Paragraph (previous revision)',
    -        16 => 'Paragraph Migration Test Content EN > Any Paragraph (previous revision)',
    -        17 => 'Content with nested paragraphs ENG > Any Paragraph',
    -        18 => 'Paragraph Migration Test Content UND > Paragraph One Only',
    -        19 => 'Paragraph Migration Test Content EN > Paragraph One Only',
    -        20 => 'Paragraph Migration Test Content EN > Paragraph One Only (previous revision)',
    -        21 => 'Paragraph Migration Test Content UND > Nested FC Outer > Nested FC Inner',
    -        22 => 'Paragraph Migration Test Content UND > Nested FC Outer',
    +        7 => 'Paragraph Migration Test Content UND > Nested FC Outer > Nested FC Inner',
    +        8 => 'Paragraph Migration Test Content UND > Nested FC Outer',
    +        9 => 'Content with nested paragraphs ENG > Any Paragraph > Any Paragraph (nested)',
    +        10 => 'Content with nested paragraphs ENG > Any Paragraph > Any Paragraph (nested)',
    +        11 => 'Content with nested paragraphs ENG > Any Paragraph',
    +        12 => 'Paragraph Migration Test Content UND > Any Paragraph',
    +        13 => 'Paragraph Migration Test Content EN > Any Paragraph',
    +        14 => 'Paragraph Migration Test Content EN > Any Paragraph (previous revision)',
    +        15 => 'Paragraph Migration Test Content UND > Any Paragraph',
    +        16 => 'Paragraph Migration Test Content EN > Any Paragraph',
    +        17 => 'Paragraph Migration Test Content EN > Any Paragraph (previous revision)',
    +        18 => 'Paragraph Migration Test Content EN > Any Paragraph (previous revision)',
    +        19 => 'Content with nested paragraphs ENG > Any Paragraph',
    +        20 => 'Paragraph Migration Test Content UND > Paragraph One Only',
    +        21 => 'Paragraph Migration Test Content EN > Paragraph One Only',
    +        22 => 'Paragraph Migration Test Content EN > Paragraph One Only (previous revision)',
    

    Why is this order changing? (Not that it matters, I would just like to understand.)


#17:

  1. +++ b/src/Plugin/migrate/process/ParagraphsDeltaSort.php
    @@ -0,0 +1,38 @@
    + * This plugin is required only because Drupal core's
    + * FieldableEntity::getFieldValues() does not guarantee that the returned
    + * field values are sorted by their delta, so when the source Drupal database
    + * is PostgreSQL, the values of some multi-value fields are migrated in reversed
    + * order.
    + * @todo remove when https://drupal.org/i/3164520 is fixed for all supported
    + *   Drupal core release.
    + *
    + * @MigrateProcessPlugin(
    + *   id = "paragraphs_delta_sort",
    + *   handle_multiples = TRUE
    + * )
    + */
    +class ParagraphsDeltaSort extends ProcessPluginBase {
    

    👏

  2. +++ b/tests/src/Functional/Migrate/MigrateUiParagraphsTestBase.php
    @@ -584,6 +586,14 @@ abstract class MigrateUiParagraphsTestBase extends MigrateUpgradeTestBase {
    +        if (count(array_filter(array_keys($expected_entity_labels), 'is_int')) === count($expected_entity_labels)) {
    

    🤔 Why do we need the array_filter(…, 'is_int') here?

huzooka’s picture

Assigned: Unassigned » huzooka
Status: Needs review » Needs work
huzooka’s picture

huzooka’s picture

Assigned: huzooka » Unassigned
Status: Needs work » Needs review
FileSize
968.38 KB
1018.77 KB
50.39 KB
253.15 KB

Addressing #21 and #23.

Re #21:

[ #13 ]

  1. 👍
  2. Obsolete lines (removed entirely)
  3. 👍
  4. Yes! "Raw" means "unfinalized".
  5. Well, I reordered these lines again, because of (my) DX.

[ #17 ]

  1. Thanks! 👍
  2. Added a comment. (When keys are numeric, I don't check the IDs; I oncl check the labels.)

I will try to reduce the database fixture changes as much as possible.

The last submitted patch, 24: paragraphs-derive_by_parent-3145755-24--tests-only.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

huzooka’s picture

D9.1 test failures are unrelated to this patch afaict.

This patch contains the db fixture with manually reduced size.

Only one single CS error was fixed in a test, no interdiff.

Wim Leers’s picture

Amazing work!!! 🤩🚀

huzooka’s picture

Wim Leers’s picture

+++ b/src/MigrationPluginsAlterer.php
@@ -100,88 +88,62 @@ final class MigrationPluginsAlterer {
+    switch ($current_value) {
+      case 'field_collection_item':
+        $migration['source']["paragraphs_$process_property"] = 'paragraph';
+        $migration['process'][$process_property] = "paragraphs_$process_property";
+        $migration['migration_dependencies']['required'][] = 'd7_field_collection_type';
+        break;
 
-    $source_is_dynamic = $source === NULL;
-    $source_is_paragraph_or_field_collection = in_array($source, array_keys(self::PARAGRAPHS_ENTITY_TYPE_ID_MAP));
+      case 'paragraphs_item':
+        $migration['source']["paragraphs_$process_property"] = 'paragraph';
+        $migration['process'][$process_property] = "paragraphs_$process_property";
+        $migration['migration_dependencies']['required'][] = 'd7_paragraphs_type';
+        break;
 
-    return $source_is_dynamic || $source_is_paragraph_or_field_collection;
+      case NULL:
+        $entity_type_process = &$migration['process'][$process_property];
+        $entity_type_process[] = [
+          'plugin' => 'static_map',
+          'map' => self::PARAGRAPHS_ENTITY_TYPE_ID_MAP,
+          'bypass' => TRUE,
+        ];
+        $migration['migration_dependencies']['optional'][] = 'd7_field_collection_type';
+        $migration['migration_dependencies']['optional'][] = 'd7_paragraphs_type';
+        break;
+    }

This is the essential change here — everything else is just infrastructure changes that make this code a lot simpler (and hence easier to understand).

What this is doing now is using required dependencies whenever possible. Only when the source value of the process pipeline is dynamic, we add both possible dependnecies, and we mark both as optional, since we cannot be certain that it will be required.

👍

Wim Leers’s picture

The two fails are for coding standards:

line 15	The class short comment should describe what the class does and not simply repeat the class name
line 66	Line exceeds 80 characters; contains 82 characters
iancawthorne’s picture

Deleted.

Wim Leers’s picture

Confirmation over at #3178043-4: Experience report: migrating D7 field collections to D8 paragraphs that this patch is helping making Paragraphs migrations feasible.

iancawthorne’s picture

As per my post mentioned on #33 above, I was initially struggling with this with a migration involving a site with both field collections and paragraphs plus some of the paragraphs being nested.

Some essential steps I found to make it all work were:

I had to run the field and field instance migrations twice, eg:

drush mim upgrade_d7_field
drush mim d7_field
drush mim upgrade_d7_field_instance
drush mim d7_field_instance

d7_field and d7_field_instance were required to get the d7 field collection originating paragraph type fields added while;
upgrade_d7_field and upgrade_d7_field_instance got the d7 paragraph originating paragraph type fields.
There seemed to be no side effects of doing this, but it seemed odd that it was necessary.

then for the data...

The field collections worked absolutely fine, as did the paragraphs which were not nested. For the paragraphs which were nested (to be a bit more clear, by nested I mean paragraph type 1 > paragraph type 2 > content type hierarchy) it was essential to run the first child items first, followed by the "middle" paragraphs, finally the content. Without doing this (which was not protected with migration order dependency - I had to work out which order it needed to be and do them one by one), the paragraphs would get very confused with the wrong "paragraph type" being assigned to the content with empty fields.

So the summary was; you can't just let it run through the migration on auto. It needs some manual working out of the order in which things need to happen.

Wim Leers’s picture

#34:

Without doing this (which was not protected with migration order dependency - I had to work out which order it needed to be and do them one by one), the paragraphs would get very confused with the wrong "paragraph type" being assigned to the content with empty fields.

So the summary was; you can't just let it run through the migration on auto. It needs some manual working out of the order in which things need to happen.

Thanks for sharing that — hopefully this means @huzooka and I can convince the migration system maintainers (and the Paragraphs maintainers) to use more ['migration_dependencies']['required'] than we've been able to convince them to use in Drupal core 🤞

huzooka’s picture

This is the current MR as diff.

huzooka’s picture

Title: Derive Field Collection and Paragraph entity migrations based on their parent entity type ID (and parent entity bundle) » Orphaned (nested) paragraphs entities after migration
Assigned: Unassigned » huzooka
Category: Feature request » Bug report
Issue summary: View changes
Status: Needs review » Needs work
huzooka’s picture

Title: Orphaned (nested) paragraphs entities after migration » Orphaned (nested) paragraphs entities after migration & (invalid) stub paragraph entity leftovers
huzooka’s picture

huzooka’s picture

Adding patch #40 as diff.

huzooka’s picture

Wim Leers’s picture

#39 interdiff:

+++ b/src/ParagraphsMigration.php
@@ -49,22 +14,4 @@ final class ParagraphsMigration {
-  public static function getParagraphsMigrationDerivationMethod() {
-    $method = Settings::get(self::DERIVATION_STRATEGY_KEY, self::DERIVATION_STRATEGY_DEFAULT);
-    $handled_methods = [
-      self::DERIVATION_STRATEGY_LEGACY,
-      self::DERIVATION_STRATEGY_PARENT,
-    ];
-
-    return in_array($method, $handled_methods, TRUE)
-      ? $method
-      : self::DERIVATION_STRATEGY_DEFAULT;
-  }

Well … that's a huge simplification! 🤩


#40:

  1. +++ b/migrations/d7_field_collection_revisions.yml
    @@ -14,6 +14,11 @@ process:
    +      no_stub: true
    +    -
    +      plugin: skip_on_empty
    +      method: row
    +      message: "This field collection entity revision belongs to a missing field collection on the source site. Skipping since there is no data we can migrate."
    

    If I understand it correctly, this simple addition is wholly responsible for ensuring no orphaned paragraphs are migrated — wow! 🤩

  2. +++ b/src/Plugin/migrate/field/FieldCollection.php
    @@ -46,34 +46,47 @@ class FieldCollection extends ParagraphsFieldPluginBase {
    +        // Try to restora some level of data integrity if the corresponding
    

    Übernit: s/restora/restore/


#43:

  1. +++ b/src/MigrationPluginsAlterer.php
    @@ -14,13 +14,19 @@ final class MigrationPluginsAlterer {
       const PARAGRAPHS_ENTITY_TYPE_ID_MAP = [
    -    'field_collection_item' => 'paragraph',
    -    'paragraphs_item' => 'paragraph',
    +    'field_collection_item' => [
    +      'bundle_migration' => 'd7_field_collection_type',
    +      'default_revision_migration' => 'd7_field_collection',
    +    ],
    +    'paragraphs_item' => [
    +      'bundle_migration' => 'd7_paragraphs_type',
    +      'default_revision_migration' => 'd7_paragraphs',
    +    ],
    

    This looks much more robust indeed, and simpler.

  2. +++ b/src/MigrationPluginsAlterer.php
    @@ -185,27 +208,63 @@ final class MigrationPluginsAlterer {
    +      case NULL:
    

    🤔 When can this occur?

huzooka’s picture

This is for logging data integrity issues of paragraphs (revisions) – instead of saving migrate messages. Works if you have the logger process plugin from here: #3232075: Create a logger migrate process plugin

I don't want to rebase my patches on top of the current paragraphs HEAD (lack of time), so I attach an interdiff only.

Wim Leers’s picture

#45:

+++ b/src/MigrationPluginsAlterer.php
@@ -360,6 +362,79 @@ final class MigrationPluginsAlterer {
+   * completely. By default, paragraphs (revision) migrations are saving migrate
+   * messages in these situations, but that means that devs might see a tons of
+   * messages they cannot solve (because its a data integrity issue of the
+   * source Drupal 7 instance), and it might be very frustrating.

… and it is not something that can solved! At least not automatically. And manually it'd also be really hard.

IOW: this is a data integrity issue affecting only past revisions: the default (live) revision MUST work correctly, otherwise the migration will fail.

And that is why this "log to inform but do not complain" solution is deemed acceptable: only if you go digging through the past history (the revisions) of these paragraphs will you run into problems.

pub497’s picture

I was only able to get this to work as #34 did, thanks for that -- side effect was my entity reference fields that used taxonomies got reset so those content types needed to have the taxonomy re-selected and content had to be re-imported

natefollmer’s picture

I can't get the patch to apply cleanly to the latest dev branch. I'm having some issues with nested paragraphs coming in with data-less fields and I'm hoping this patch may be the fix.