Migrate Process: Importing Multiple Paragraphs [#2825565]

OK, I tried working at this one and couldn't figure out the solution, so I'm writing up a request.

I'm trying to import Paragraphs using Migrate Source CSV. I've managed to get single paragraphs to import using Migrate.

However, when I attempt to import multiple paragraphs using the explode plugin, everything seems to break.

Here's my migrate YML:

uuid: 88b8d4b8-0887-4dbf-884e-1d2483c3f2c0
id: import_content_test_1
label: Import Content Test 1

source:
  plugin: csv
  path: 'private://import_csv/content_test_1.csv'
  header_row_count: 1
  keys:
    - id
  column_names:
    -
      id: id
    - 
      title: title
    - 
      body: body
    -
      paragraph_test_1: paragraph_test_1
process:
  title: title
  body: body
  'field_paragraph_test_1/target_id':
    -
      plugin: explode
      source: paragraph_test_1
      delimiter: ,
    -
      plugin: migration
      migration: import_paragraph_test_1
      no_stub: true
    -
      plugin: extract
      index:
        - 0
  'field_paragraph_test_1/target_revision_id':
    -
      plugin: explode
      source: paragraph_test_1
      delimiter: ,
    -
      plugin: migration
      migration: import_paragraph_test_1
      no_stub: true
    -
      plugin: extract
      index:
        - 1
  type:
    plugin: default_value
    default_value: content_test_1
destination:
  plugin: entity:node

My CSV file has rows with multiple values inside a single column, e.g. "11,12", as follows:

id,title,body,paragraph_test_1
1,Title 1,Body 1,"11,12"
2,Title 2,Body 2,"21,22"

When I run the import, it fails, and generates the error:

Value is not a valid entity.                                                                            [error]
(***/modules/entity_reference_revisions/src/Plugin/DataType/EntityReferenceRevisions.php:114)

If I dig into what EntityReferenceRevisions.php is expecting around line 114:

  public function setValue($value, $notify = TRUE) {
    unset($this->target);
    unset($this->id);
    unset($this->revision_id);
  
    // Both the entity ID and the entity object may be passed as value. The
    // reference may also be unset by passing NULL as value.
    if (!isset($value)) {
      $this->target = NULL;
    }
    elseif (is_object($value) && $value instanceof EntityInterface) {
      $this->target = $value->getTypedData();
    }
    elseif (!is_scalar($value['target_id']) || !is_scalar($value['target_revision_id']) || $this->getTargetDefinition()->getEntityTypeId() === NULL) {
      throw new \InvalidArgumentException('Value is not a valid entity.');
    }

EntityReferenceRevisions::setValue is taking in the following as $value:

Array
(
    [target_id] => Array
        (
            [0] => 54
            [1] => 56
        )

    [target_revision_id] => Array
        (
            [0] => 55
            [1] => 57
        )

)

This seems wrong to me. Presumably this should be an array of two numbered items, each containing an array with the keys target_id and target_revision_id. I think it *should* look closer to this:

Array
(
    [0] => Array
        (
            [target_id] => 54
            [target_revision_id] => 55
        )

    [1] => Array
        (
            [target_id] => 56
            [target_revision_id] => 57
        )

)
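
To make the difference between the two shapes concrete, here is a minimal sketch in plain Python. It is not Drupal code, and the function name to_field_items is made up for illustration. It only shows the pivot from "one list per property" to "one item per paragraph", pairing values by position.

```python
def to_field_items(value):
    """Pivot {'target_id': [...], 'target_revision_id': [...]} into per-item dicts."""
    ids = value["target_id"]
    revision_ids = value["target_revision_id"]
    if len(ids) != len(revision_ids):
        raise ValueError("target_id and target_revision_id must have the same length")
    # Pair the n-th id with the n-th revision id.
    return [
        {"target_id": tid, "target_revision_id": rid}
        for tid, rid in zip(ids, revision_ids)
    ]


# The column-style value that setValue() was receiving.
column_style = {"target_id": [54, 56], "target_revision_id": [55, 57]}
items = to_field_items(column_style)
```

Each element of items is then a single scalar pair, which is the shape the scalar checks in setValue() appear to expect for one field item.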

I'm guessing my YML file isn't quite what it should be. I've tried several iterations, usually attempting to put the explode before the breakdown into target_id/target_revision_id, something along the lines of the following (non working) YML code:

uuid: 88b8d4b8-0887-4dbf-884e-1d2483c3f2c0
id: import_content_test_1
label: Import Content Test 1

source:
  plugin: csv
  path: 'private://import_csv/content_test_1.csv'
  header_row_count: 1
  keys:
    - id
  column_names:
    -
      id: id
    - 
      title: title
    - 
      body: body
    -
      paragraph_test_1: paragraph_test_1
process:
  title: title
  body: body
  field_paragraph_test_1:
    plugin: explode
    source: paragraph_test_1
    delimiter: ,
    process:
      target_id:
        -
          plugin: migration
          migration: import_paragraph_test_1
          no_stub: true
        -
          plugin: extract
          index:
            - 0
      target_revision_id:
        -
          plugin: migration
          migration: import_paragraph_test_1
          no_stub: true
        -
          plugin: extract
          index:
            - 1
  type:
    plugin: default_value
    default_value: content_test_1
destination:
  plugin: entity:node

I can't quite figure this out (the sub-process isn't sure of its source?)

Any pointers would be helpful. Thanks in advance!

Comments

Comment #1

7 November 2016 at 23:26

TrevorBradley created an issue. See original summary.

Comment #2

trevorbradley commented 8 November 2016 at 21:50

I got it to work!

And I'm going to document the process here in its entirety so that people who have the same task can copy my work. It's non-trivial.

This will take a couple hours to write up, just wanted to post this quick update before someone replied to my message and tried to tackle the problem.

Comment #3

trevorbradley commented 8 November 2016 at 23:00

OK, so here's how to attach multiple paragraphs. In my context, I'm trying to use Migrate Source CSV to import multiple paragraphs, then attach those paragraphs to content.

In my example I have 4 paragraphs (paragraph_test_1.csv):

id,number_1,text_1
11,111,Text 11
12,112,Text 12
21,121,Text 21
22,122,Text 22

I have two nodes I want to attach those paragraphs to. Two paragraphs are to be added to each node (content_test_1.csv):

id,title,body,paragraph_test_1
1,Title 1,Body 1,"11,12"
2,Title 2,Body 2,"21,22"

There are a number of issues to address. Firstly it's critical that the EntityReference migrate destination patch be applied against Entity Reference Revisions, so that importing paragraphs will store their target_id and target_revision_id in the map: https://www.drupal.org/node/2809793

Here's my YAML file for importing paragraphs. It's fairly simple:

uuid: 88b8d4b8-0887-4dbf-884e-1d2483c3f2c1
id: import_paragraph_test_1
label: Import Paragraph Test 1

source:
  plugin: csv
  path: 'private://import_csv/paragraph_test_1.csv'
  header_row_count: 1
  keys:
    - id
process:
  field_number_1: number_1
  field_text_1: text_1
  type:
    plugin: default_value
    default_value: paragraph_test_1
destination:
  plugin: entity_reference_revisions:paragraph

After running this against your content, pop into sql and check the map tables for your import. There should be destid1 and destid2 fields in the table, your target entity id and target entity revision id.

So, here's where things get messy. The core problem here is that it's not possible to refer to the target_id and target_revision_id subfields using field_paragraph_test_1/target_id and field_paragraph_test_1/target_revision_id. That works fine for a single paragraph, but process is unable to iterate against the paragraphs properly.

It's possible to chain the process plugins like so: explode > migration > extract . However the values that come out of that are all wrong:

Array
(
    [target_id] => Array
        (
            [0] => 54
            [1] => 56
        )

    [target_revision_id] => Array
        (
            [0] => 55
            [1] => 57
        )

)

It's not possible to iterate through paragraphs when the target_id and target_revision_ids are lacking keys. As I said in my original post, you need to be able to iterate through the paragraphs as a list.

So in theory I'd like to run my process migrations in this chain: explode > iterator (migration > extract)

There's a big problem here. iterator expects a keyed array as input, but explode just returns a series of strings. We need to somehow wrap these strings in a keyed array so that iterator can work against them.

Behold, the custom key_wrapper plugin:

<?php
namespace Drupal\migrate_process_key_wrapper\Plugin\migrate\process;

use Drupal\migrate\ProcessPluginBase;
use Drupal\migrate\MigrateExecutableInterface;
use Drupal\migrate\Row;
 
/**
 * Wraps the incoming value in an array keyed by 'key_wrapper'.
 *
 * @MigrateProcessPlugin(
 *   id = "key_wrapper"
 * )
 */
class MigrateProcessKeyWrapper extends ProcessPluginBase {
 
  /**
   * {@inheritdoc}
   */
  public function transform($value, MigrateExecutableInterface $migrate_executable, Row $row, $destination_property) {
    $new_value = array(
        'key_wrapper' => $value,
    );
    $value = $new_value;
    return $value;
  }
 
}

This takes the string "11" and changes it to array('key_wrapper' => 11). And now the iterator can do its magic. Here's the YAML for my node import:

uuid: 88b8d4b8-0887-4dbf-884e-1d2483c3f2c0
id: import_content_test_1
label: Import Content Test 1

source:
  plugin: csv
  path: 'private://import_csv/content_test_1.csv'
  header_row_count: 1
  keys:
    - id
  column_names:
    -
      id: id
    - 
      title: title
    - 
      body: body
    -
      paragraph_test_1: paragraph_test_1
process:
  title: title
  body: body
  field_paragraph_test_1:
    - 
      plugin: explode
      delimiter: ,
      source: paragraph_test_1
    -
      plugin: key_wrapper
    -
      plugin: iterator
      process:
        target_id:
          -
            plugin: migration
            migration: import_paragraph_test_1
            source: key_wrapper
            no_stub: true
          -
            plugin: extract
            index:
              - 0
        target_revision_id:
          -
            plugin: migration
            migration: import_paragraph_test_1
            source: key_wrapper
            no_stub: true
          -
            plugin: extract
            index:
              - 1
  type:
    plugin: default_value
    default_value: content_test_1
destination:
  plugin: entity:node

And bam! The chain works. (source) "11,12" becomes (explode) 11, becomes array('key_wrapper' => 11), which is chewed up by (iterator), passed to migration to become array(54, 55) (the paragraph's id and revision id), and then (extract) to be just 54 for target_id and 55 for target_revision_id.
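
The same chain can be traced outside Drupal. The following is a rough sketch in plain Python, not the actual plugins: ID_MAP is a hypothetical stand-in for the migrate map table of import_paragraph_test_1, the ids in it are made up, and the function names only mirror the plugin names for readability.

```python
# Hypothetical map: paragraph source id -> (entity id, revision id).
ID_MAP = {"11": (54, 55), "12": (56, 57)}


def explode(value, delimiter=","):
    # "11,12" -> ["11", "12"]
    return value.split(delimiter)


def key_wrapper(value):
    # "11" -> {"key_wrapper": "11"}, so the next step has a key to read from.
    return {"key_wrapper": value}


def migration(source_id):
    # Look up the destination ids recorded when the paragraphs were imported.
    return ID_MAP[source_id]


def iterator(rows):
    # For each wrapped row, run migration and then extract index 0 and index 1.
    items = []
    for row in rows:
        dest = migration(row["key_wrapper"])
        items.append({"target_id": dest[0], "target_revision_id": dest[1]})
    return items


wrapped = [key_wrapper(v) for v in explode("11,12")]
field_value = iterator(wrapped)
```

The point of the wrapper step is only to give each exploded string a name that the inner pipeline can use as its source.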

Having written this out, this still doesn't feel quite right. It feels like explode>migration just isn't returning that correct structure, and that if somehow the combo returned

Array
(
    [0] => Array
        (
            [target_id] => 54
            [target_revision_id] => 55
        )

    [1] => Array
        (
            [target_id] => 56
            [target_revision_id] => 57
        )

)

it would work properly.

Anyways, it was important that I documented this. It's likely I'll come back and revise this with a better solution...

Comment #4

trevorbradley commented 9 November 2016 at 22:39

I'm making some real progress here. It's possible to do this without added plugins, but it requires several patches.

#1: EntityReference migrate destination Patch #15: https://www.drupal.org/node/2809793
#2: Migrate SQL Map doesn't get array keys for compound keys Patch #14: https://www.drupal.org/node/2810907
#3: Scalar to Array Migration returns Null: https://www.drupal.org/node/2767643 (Either of my #1 or #3 patches in comments #10 and #12).

Then the simple YML file for import will work:

uuid: 88b8d4b8-0887-4dbf-884e-1d2483c3f2c0
id: import_content_test_1
label: Import Content Test 1

source:
  plugin: csv
  path: 'private://import_csv/content_test_1.csv'
  header_row_count: 1
  keys:
    - id
process:
  title: title
  body: body
  field_paragraph_test_1:
    -
      plugin: explode
      source: paragraph_test_1
      delimiter: ,
    -
      plugin: migration
      migration: import_paragraph_test_1
      no_stub: true
    -
      plugin: iterator
      process:
        target_id: id
        target_revision_id: revision_id
  type:
    plugin: default_value
    default_value: content_test_1
destination:
  plugin: entity:node

The plugins are almost perfect, and the migration plugin returns this structure to save into the node:

    [destination:protected] => Array
        (
            [title] => Title 2
            [body] => Body 2
            [field_paragraph_test_1] => Array
                (
                    [0] => Array
                        (
                            [id] => 82
                            [revision_id] => 84
                        )

                    [1] => Array
                        (
                            [id] => 83
                            [revision_id] => 85
                        )

                )

            [type] => content_test_1
        )

There is a problem here though. The SQL Key patch is returning the keys "id" and "revision_id", but paragraph fields expect keys of "target_id" and "target_revision_id". Adding in the simple iterator plugin to rename the keys solves the problem.

Still looking into it. There has to be an even cleaner solution.

Comment #5

trevorbradley commented 9 November 2016 at 23:01

Comment #6

ruloweb commented 10 November 2016 at 12:43

Maybe we can close this as duplicated of #2809793: EntityReference migrate destination so discussions get in on topic.

Comment #7

trevorbradley commented 10 November 2016 at 18:23

@ruloweb: I agree in principle, but at the moment this is stream-of-consciousness documentation. And there are still changes going on over in #drupal-migrate. Once I settle on how this should actually be solved, I'll post back there. (Shortly!)

Comment #8

trevorbradley commented 10 November 2016 at 18:49

@heddn is saying in #drupal-migrate that the SQL patch may be a no-go, partly because of the key mismatch between paragraphs and their references.

Here is a variant of my YML that works without the SQL patch. It still requires the "scalar to array returns NULL" patch and the Entity Reference Revisions patch.

uuid: 88b8d4b8-0887-4dbf-884e-1d2483c3f2c0
id: import_content_test_1
label: Import Content Test 1

source:
  plugin: csv
  path: 'private://import_csv/content_test_1.csv'
  header_row_count: 1
  keys:
    - id
process:
  title: title
  body: body
  field_paragraph_test_1:
    -
      plugin: explode
      source: paragraph_test_1
      delimiter: ,
    -
      plugin: migration
      migration: import_paragraph_test_1
      no_stub: true
    -
      plugin: iterator
      process:
        target_id: '0'
        target_revision_id: '1'
  type:
    plugin: default_value
    default_value: content_test_1
destination:
  plugin: entity:node

Comment #9

heddn commented 10 November 2016 at 18:57

I think this is about as good as we are going to get. DX is pretty decent. Let's start getting things documented now.

Comment #10

trevorbradley commented 10 November 2016 at 20:18

Not entirely sure why in the iterator plugin, 0 and 1 have to be in quotes. Otherwise the process plugin discards the data.

Comment #11

trevorbradley commented 10 November 2016 at 20:20

@heddn: Agreed. This is sufficiently unmessy for a complex import.

It still requires that migrate patch to function (permitting scalar source keys to map to 2 destination keys) which is a bit concerning.

Comment #12

heddn commented 16 November 2016 at 17:45

Here's why '0' is necessary, look at 'get' process plugin on line 35. It treats them as different.

Comment #13

Lowell commented 16 November 2016 at 20:33

for me the iterator would break on the single pair of destid1,destid2 returned by the migration plugin

I did a very similar migration and had to modify the data slightly for the iterator to work.

  field_conference_speaker:
    -
      plugin: migration
      migration:
        - paragraph_speech
      source: nid
    -
      plugin: nest_in_array
    -
      plugin: iterator
      process:
        target_id: '0'
        target_revision_id: '1'

nest_in_array does this

    $nested_value[0] = $value;
    return $nested_value;

and these are the patches in use
drupal 8.2.3
entity_reference_revisions 1.x-dev

    "patches": {
      "drupal/core": {
        "scalar handling": "https://www.drupal.org/files/issues/2767643-migration-scalar-handling-3.patch"
      },
      "drupal/entity_reference_revisions": {
        "destination plugin": "https://www.drupal.org/files/issues/entityreference_migrate-2809793-28.patch"
      }
    },
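
As a rough illustration of why the nest_in_array step in this comment helps (plain Python, not the actual plugin code; both functions are stand-ins and the ids are made up): the iterator step walks a list of items, so a single [id, revision id] pair has to become a list that contains that pair.

```python
def nest_in_array(value):
    # Wrap one lookup result so it looks like a list with a single item.
    return [value]


def iterator(items):
    # Map index 0 and index 1 of every item to the field's property names.
    return [
        {"target_id": item[0], "target_revision_id": item[1]}
        for item in items
    ]


single_lookup = [123, 124]  # made-up destid1, destid2 pair
field_value = iterator(nest_in_array(single_lookup))
```

Without the wrapping, the loop would visit 123 and 124 as if each were an item of its own, which would explain the iterator breaking on a single pair.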

Comment #14

heddn commented 21 November 2016 at 16:49

Status: Active » Closed (duplicate)

I'm going to close this as a duplicate of #2809793: EntityReference migrate destination. This was just a WIP stream-of-consciousness write-up of how to get a migration to work. See the actual work over there.

Comment #15

captaindav commented 28 September 2017 at 23:17

I have a similar case to the above example:

My paragraphs:
para_id,main_id,text_1
1,1,Text 1
2,1,Text 2
3,2,Text 3
4,2,Text 4

My nodes:
main_id,title,body,paragraph_test_1
1,Title 1,Body 1
2,Title 2,Body 2

So each node should have two paragraphs, using the main_id relationship between the CSV files.

Could someone describe how to construct the node migration, mainly how it would be different from the above migration which had inline id's to relate the CSV files?

Comment #16

trevorbradley commented 29 September 2017 at 01:25

Nodes point to paragraphs. Paragraphs don't point to nodes - they're their own "atomic" entity.

I'm thinking your nodes.csv would be along the lines of:

main_id,title,body,paragraph_test_1
1,Title 1,Body 1,"1,2"
2,Title 2,Body 2,"3,4"

The main_id field in your paragraphs.csv would be unused then. (It's fine to have it in the csv, but your yml files shouldn't make reference to it.)

The migration yml for your nodes would be similar to the one I posted above in #8:

The explode plugin takes the "1,2" and converts it to "1" and "2" and does each paragraph on its own loop.

The migration plugin converts "1" into a target_id and a target_revision_id (internal to drupal entity reference.) "1" would convert to something like: [0=>123, 1=>124] (The target_id and target_revision_id are assigned when the paragraphs are imported). Ditto for what was once "2".

Finally the iterator plugin assigns the target_id and target_revision_id to the appropriate fields in the entity reference. From there, you should have a paragraph attached to your node.

Good luck! It's complicated at first, cranky if you attempt to reference something that isn't imported yet (always import your paragraphs first!), but powerful once it's working.

Comment #17

captaindav commented 29 September 2017 at 12:28

The data I have been given does not have the paragraph ids in-line in the node csv.

In my use case, I have been given two csv files, one has data for "stores" (the nodes) and the other csv has data for "store hours" (the paragraphs). Each store has multiple store hour rows in the store hours csv, one row for each day of the week the store is open. The store hours csv has a "store_number" that points back to the store for which the hours are defined.

Something like:
(nodes)

store_number, store_name
1,Name 1
2,Name 2

(paragraphs)

id,store_number,day,open,close
1,1,Monday,8 am, 6 pm
2,1,Tuesday,8 am, 6 pm
3,2,Monday,9 am, 9 pm
4,2,Tuesday,9 am, 9 pm

So the data in the csv files is essentially reversed from the above example, the paragraphs csv has a pointer (store_number), which points back to the node csv. The data was exported from another system, and there isn't an option to have inline id's as in the above example.

Importing the paragraphs is straightforward:

id: store_hours
source:
  plugin: csv
  path: '../store_hours.csv'
  header_row_count: 1
  keys:
    - id
  column_names:
    -
      id: id
    -
      store_number: store_number
    -
      day: day
    -
      open: open
    -
      close: close

process:
  field_store_number: store_number
  field_day: day
  field_open: open
  field_close: close
  type:
    plugin: default_value
    default_value: store_hours

destination:
  plugin: entity_reference_revisions:paragraph

For the node import, I have tried several variations of the following code with no luck (all of the fields except the paragraph field import):

id: store
source:
  plugin: csv
  path: '../stores.csv'
  header_row_count: 1
  keys:
    - store_number
  column_names:
    -
      store_number: store_number
    -
      store_name: store_name

process:
  title: store_name
  field_store_number: store_number
  field_hours:
    -
      plugin: migration
      migration: store_hours
      source: store_number
      no_stub: true
    -
      plugin: iterator
      process:
        target_id: '0'
        target_revision_id: '1'
  type:
    plugin: default_value
    default_value: store

destination:
  plugin: 'entity:node'

Since each paragraphs csv row contains a pointer (store number) back to the node row, there must be some way to find the appropriate paragraphs for each node. But I am not sure how to configure this in the node migration, any help is appreciated!

Thanks,

captaindav

Comment #18

trevorbradley commented 5 October 2017 at 17:33

At the end of the day, the paragraph id's need to be connected at the node level. That's just how paragraphs work in Drupal.

I'm not sure if there's a great way of doing this other than to write your own process plugin. "Find and iterate through all Paragraphs that have this Store ID"

Alternately, if it's a one time import, and you can't get the CSVs in any other format, I might consider writing a pre-processor to "fix" the CSVs so they're in a format Drupal Expects. Read the hours csv and store the data, read the stores csv, and add a new column based on the data.
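
A pre-processor along those lines could be sketched roughly as follows. This is an illustration under assumptions, not a tested import tool: the sample data is modeled on the CSVs in #17 and kept inline so the snippet is self-contained, and the column name store_hours_ids and the function name add_hours_column are made up.

```python
import csv
import io
from collections import defaultdict

STORES_CSV = """store_number,store_name
1,Name 1
2,Name 2
"""

HOURS_CSV = """id,store_number,day,open,close
1,1,Monday,8 am,6 pm
2,1,Tuesday,8 am,6 pm
3,2,Monday,9 am,9 pm
4,2,Tuesday,9 am,9 pm
"""


def add_hours_column(stores_text, hours_text, column="store_hours_ids"):
    """Return the stores CSV with an extra column listing each store's hour ids."""
    # Pass 1: read the hours CSV and group paragraph ids by store_number.
    hours_by_store = defaultdict(list)
    for row in csv.DictReader(io.StringIO(hours_text)):
        hours_by_store[row["store_number"]].append(row["id"])

    # Pass 2: copy the stores CSV, appending the comma-separated ids.
    reader = csv.DictReader(io.StringIO(stores_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames + [column], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row[column] = ",".join(hours_by_store.get(row["store_number"], []))
        writer.writerow(row)
    return out.getvalue()


fixed_stores_csv = add_hours_column(STORES_CSV, HOURS_CSV)
print(fixed_stores_csv)
```

The new column can then be fed to the explode step of the node migration, the same way paragraph_test_1 is used in #8.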

Alternately, you might consider changing your data model, so that the Store Hours are their own node object, and they externally reference your stores (as your CSVs do). This seems really backward to me though.

This is a closed thread, you might be asking in the wrong place. :)

Comment #19

captaindav commented 13 October 2017 at 00:17

I agree, a pre-processor might be a simpler solution.

I will post a new issue in the Paragraphs issue queue.

Comment #20

edob commented 6 November 2017 at 03:28

Hi all, It's a quick question but does this work with PHP 7 or later??

Comment #21

nicolash commented 15 January 2019 at 23:41

Hey @captaindav,

Did you ever get anywhere with this? I know this is closed, but the discussion didn't really go into this over in the other thread.

I have a very similar data model and tbh, I think it's a far more likely one that people have than having inline references of the parent to the children, since it's how a relational db would also be set up and often that's replicated in CSV files.

I also think it is probably quite simple, as the migration lookup simply needs to return several references from the map table if the source data has some reference from child to parent. If this would be straight code and SQL it wouldn't be a problem at all, but I'm struggling with the yml abstraction and different combinations.

Comment #22

nicolash commented 16 January 2019 at 05:07

Ok, in case anyone gets here looking for something similar, once core migration_lookup allows allow_multiple: true, it's very straightforward.

It's being worked on here and I just used it in a custom processor until it'll be released:
https://www.drupal.org/project/drupal/issues/2890844

Migration example:


  field_test_para_reference:
    -
      plugin: migration_lookup
      migration: test_para
      allow_multiple: true
      no_stub: true
      source: id
    -
      plugin: sub_process
      process:
        target_id: '0'
        target_revision_id: '1'

Comment #23

silverham commented 7 December 2021 at 08:21

The above did not work for me (likely because allow_multiple is not in core yet).

The following worked for me (Drupal 9.2):

Node data:
{
    {
        "my_id": 'node_1',
        "drupal_paragraphs_migration_ids": [
            "my_para_1",
            "my_para_2",
        ],
        [... other fields ...],
    },
}

Paragraph data:
{
    {
        "my_id": "my_para_1",
        [... other fields ...],
    },
}

id: my_migrate_plus_nodes
label: 'My Migration Nodes'
[...]
process:
  field_paragraphs:
    -
      plugin: multiple_values
      source: drupal_paragraphs_migration_ids
    -
      plugin: migration_lookup
      migration: my_migrate_plus_paragraphs
      no_stub: true
    -
      plugin: sub_process
      process:
        target_id: '0'
        target_revision_id: '1'

Comment #24

somebodysysop commented 22 November 2022 at 09:36

@TrevorBradley

OK, I know it's been 6 years, but this is the only code I've found so far that I can halfway understand -- and allow_multiple is still not in core. I wish to migrate multiple paragraph ids into a single paragraph entity reference field.

As I understand it, if I format my source field like this with the source ids: "0,1,2,3,4,5", then the iterator code you provided in #3 https://www.drupal.org/project/entity_reference_revisions/issues/2825565... will work.

I figured out how to get all the paragraph source ids for each node in this format by using prepareRow() in my source plugin.

Do you, or anyone else, know if that code still works?

Comment #25

inversed commented 11 June 2023 at 20:28

If you only need a default paragraph created, there is a module to add this functionality: https://www.drupal.org/project/migration_tools

This is by far the quickest way to generate a paragraph entity during the migration process.

Here is their example YAML:

  field_some_paragraph_entity_ref_revison:
    plugin: create_default_paragraph_revision
    paragraph_default:
      create_paragraph_bundle: paint_recommendation
      field_color: blue
      field_paint_type: latex
      field_coats: 2
      field_exterior_use: true
