Problem/Motivation

The process plugins callback (core) and service (this module) now accept an array of arguments if the option unpack_source is set. What is still missing is an easy way to construct the array to pass to these plugins.

Proposed resolution

Add a process plugin that generates an array based on a template (part of configuration) making the following substitutions:

  1. 'source:foo' is replaced by the source property foo.
  2. 'dest:bar' is replaced by the destination property bar.
  3. 'pipeline:' is replaced by the pipeline value.

All three support sub-properties, like source:body/0/value. Any other value is copied directly from the template.
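To make the three rules concrete, here is a rough sketch in plain PHP (8.0 or later). It is not the code from the merge request: `lookup_path()` and `resolve_template()` are hypothetical names, the source and destination rows are modelled as plain arrays, and returning NULL for a missing key is an assumption rather than documented behavior.

```php
<?php

/**
 * Hypothetical helper: follow a slash-separated path into a nested array.
 * An empty path returns the value unchanged (the bare 'pipeline:' case).
 */
function lookup_path(mixed $value, string $path): mixed {
  if ($path === '') {
    return $value;
  }
  foreach (explode('/', $path) as $key) {
    if (!is_array($value) || !array_key_exists($key, $value)) {
      // Assumption: a missing key yields NULL.
      return NULL;
    }
    $value = $value[$key];
  }
  return $value;
}

/**
 * Hypothetical resolver: recursively apply the three substitutions.
 */
function resolve_template(mixed $template, array $source, array $dest, mixed $pipeline): mixed {
  if (is_array($template)) {
    // array_map() with a single array preserves the template's keys.
    return array_map(
      fn($item) => resolve_template($item, $source, $dest, $pipeline),
      $template
    );
  }
  if (!is_string($template)) {
    return $template;
  }
  if (str_starts_with($template, 'source:')) {
    return lookup_path($source, substr($template, strlen('source:')));
  }
  if (str_starts_with($template, 'dest:')) {
    return lookup_path($dest, substr($template, strlen('dest:')));
  }
  if (str_starts_with($template, 'pipeline:')) {
    return lookup_path($pipeline, substr($template, strlen('pipeline:')));
  }
  // Any other value is copied directly from the template.
  return $template;
}

$result = resolve_template(
  ['target_id' => 'pipeline:0', 'label' => 'source:title', 'type' => 'literal'],
  ['title' => 'Hello'],
  [],
  [12, 34]
);
// $result is ['target_id' => 12, 'label' => 'Hello', 'type' => 'literal'].
```

Keeping the prefix check and the path lookup separate means that all three prefixes share the same sub-property handling.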

This plugin will provide an often simpler alternative to adding constants to the source plugin configuration and to using "pseudo fields" as temporary variables.

Remaining tasks

  1. Add tests.

User interface changes

None

API changes

Add a new process plugin.

Data model changes

None

Comments

benjifisher created an issue. See original summary.

benjifisher’s picture

In #3236774-9: Provide ability to reference current value of process pipeline as a source property, @danflanagan8 pointed out that passing null as part of the source array of a process plugin has the effect of inserting the pipeline value.

At #3436877: [meeting] Migrate Meeting 2024-03-28 2100Z, @mikelutz said that this behavior is a bug, and we should avoid using it.

The process plugin proposed here is more flexible, and it makes the intention clearer.

I think this process plugin is also more flexible than the one proposed in #3314502: Add wrapper process plugin to wrap/unwrap values in arrays.

benjifisher’s picture

Title: Process plugin to build an array » Process plugin: build an array from source, destination, pipeline
Status: Active » Needs work
benjifisher’s picture

Issue summary: View changes
Issue tags: +Needs tests
benjifisher’s picture

Some more examples, from the doc block:

Generic example:

process:
  bar:
    plugin: build_array
    source: foo
    template:
      key: literal string
      properties:
        - source:field_body/0/value
        - dest:field_body/0/value
        - pipeline:some/nested/key

Prepare an entity reference revision (ERR) field

process:
  field_paragraph:
    - plugin: migration_lookup
      # ...
    - plugin: build_array
      template:
        target_id: pipeline:0
        target_revision_id: pipeline:1

Here is an example from my current project.

Create a serialized array for the layout_paragraphs module

process:
  behavior_settings:
    # ...
    - plugin: build_array
      template:
        layout_paragraphs:
          parent_uuid: pipeline:0/value
          region: first
    - plugin: callback
      callable: serialize
danflanagan8’s picture

Very excited to see this, @benjifisher! I haven't reviewed the code yet, but my first impression on the name is that we're dangerously close to the core array_build plugin! The first best replacement name that jumps out at me would be array_template.

benjifisher’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests

@danflanagan8:

Thanks for taking a look!

I added some tests, starting with yours from #3314502: Add wrapper process plugin to wrap/unwrap values in arrays. That way,

  1. Your excellent test coverage does not go to waste if we decide to close that issue in favor of this one.
  2. I confirm that this plugin can do anything the wrap plugin can do (with method: wrap).

When I first wrote this plugin, I called it array_build, but then I realized the problem. I confess I was not very creative when I turned it into build_array, and you are right about the potential for confusion.

I can go with array_template, but other variations are worth considering:

  • array_template
  • template_array
  • template

Or maybe the name should indicate that it can work with source, destination, and pipeline values.

How do you feel about using the short form "dest" instead of spelling out "destination"?

benjifisher’s picture

Assigned: benjifisher » Unassigned
benjifisher’s picture

Issue summary: View changes

I changed the plugin ID to array_template, and I changed the class names (plugin and test classes) to match.

benjifisher’s picture

Here is a more complicated example from my current project. A custom source plugin provides the source field filters, which looks something like this:

[
  [
    'vid' => 'some_vocab',
    'tids' => [1, 2, 3, 5],
  ],
  [
    'vid' => 'another_vocab',
    'tids' => [8, 13, 21],
  ],
]

That represents two vocabularies and a few terms from each vocabulary.

Here is the pipeline:

  field_hwp_default_filter_values:
    - plugin: sub_process
      source: filters
      process:
        data:
          - plugin: array_template
            template:
              target_id: 'pipeline:'
            source: tids
        reference_field:
          - plugin: migration_lookup
            migration: hwp_vocabularies
            source: vid
          - plugin: array_template
            template:
              - field
              - 'pipeline:'
          - plugin: concat
            delimiter: _
    - plugin: single_value
    - plugin: array_template
      template:
        - 'pipeline:'
        - data
        - reference_field
    - plugin: callback
      callable: array_column
      unpack_source: true
    - plugin: callback
      callable: serialize

After the first step in the pipeline (sub_process), the example at the top is converted to this:

[
  [
    'data' => [['target_id' => 1], ['target_id' => 2], ['target_id' => 3], ['target_id' => 5]],
    'reference_field' => 'field_some_vocab',
  ],
  [
    'data' => [['target_id' => 8], ['target_id' => 13], ['target_id' => 21]],
    'reference_field' => 'field_another_vocab',
  ],
]

By default, a process plugin (like array_template) is applied to each element of a source array. In this example, migration_lookup is a no-op.

The single_value process plugin overrides that default behavior, so the next array_template prepares the input for the callback plugin:

[
  [['data' => ..., 'reference_field' => ...], ['data' => ..., 'reference_field' => ...]],
  'data',
  'reference_field',
]

and so the callback plugin returns

array_column(..., 'data', 'reference_field')

or

[
  'field_some_vocab' => [['target_id' => 1], ['target_id' => 2], ['target_id' => 3], ['target_id' => 5]],
  'field_another_vocab' => [['target_id' => 8], ['target_id' => 13], ['target_id' => 21]],
]

The last step in the pipeline uses callback with callable: serialize to serialize that array.
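The last two steps can be reproduced in plain PHP. This is only a sketch: the data is a shortened copy of the example above, and it assumes that unpack_source: true makes the callback plugin pass the elements of the array as separate arguments to the callable.

```php
<?php

// Output of the final array_template step: [rows, column key, index key].
$arguments = [
  [
    [
      'data' => [['target_id' => 1], ['target_id' => 2]],
      'reference_field' => 'field_some_vocab',
    ],
    [
      'data' => [['target_id' => 8]],
      'reference_field' => 'field_another_vocab',
    ],
  ],
  'data',
  'reference_field',
];

// Assumption: unpack_source spreads the array into separate arguments,
// so this is array_column($rows, 'data', 'reference_field').
$result = array_column(...$arguments);

// The last callback step serializes the keyed array.
$serialized = serialize($result);
```

Here $result is keyed by field_some_vocab and field_another_vocab, which is the shape shown above before serialization.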

danflanagan8’s picture

Status: Needs review » Needs work

I played around with this in Migrate Sandbox with great success. I also took my findings over to one of the related issues. (#3314502: Add wrapper process plugin to wrap/unwrap values in arrays)

I love being able to easily mix string literals with source properties and destination properties and (perhaps best of all) the pipeline value.

The test coverage is expansive, too. And the documentation isn't bad at all. Really nice stuff, @benjifisher.

My only complaint (well, I complained about something else way back in #7 that Benji humored me on) is that I feel weird about the trailing colon in pipeline:. It's a strange thing to type.

At the same time, it's consistent with source: and dest:, though with those you have to put something after the colon. It's just that with pipeline: you don't have to put anything after the colon, and I would naively think that I would rarely put anything after it.

And what would I suggest in place of that syntax that wouldn't simply be my personal preference?

At the end of the day, I think I've convinced myself that the syntax on the MR is fine. Phew!

EDIT: I should clarify that I requested a couple tiny changes on the MR. This was the "only complaint" I had that wasn't a focused comment on the MR. What can I say, I'm a complainer.

benjifisher’s picture

Status: Needs work » Needs review

@danflanagan8:

I think I fixed all the things you pointed out on the MR.

I agree that 'pipeline:' is awkward. I even considered allowing pipeline as a synonym, but I think the Migrate API already goes too far in making things "convenient". If I did that, then I would have to add test coverage for it, too. In the end, I decided to keep the PHP simple and accept a little ugliness in the YAML for the sake of consistency.

danflanagan8’s picture

Status: Needs review » Reviewed & tested by the community

This is great stuff. Thanks, @benjifisher!

heddn made their first commit to this issue’s fork.

  • heddn committed d2c5eba3 on 6.0.x authored by benjifisher
    Issue #3440904 by benjifisher, danflanagan8, heddn: Process plugin:...
heddn’s picture

Status: Reviewed & tested by the community » Fixed

Thanks for the contributions.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.