Problem/Motivation

Sometimes I wish a process plugin that is set to "handle_multiples" didn't handle multiples.

Proposed resolution

I ran into another issue that proposes an "iterate" process plugin: #3278748: Add an iterate process plugin. While reviewing that issue, it seemed like a lot of complexity that I'd rather defer to the sub_process plugin, which already handles this kind of thing. The problem is that sub_process requires an array of arrays as input.

A simple way around this would be to add a "wrapper" process plugin that could turn an array into an array of arrays. Then the process plugin I wished didn't handle multiples could be used inside a sub_process. Boom! Oh, and then the "wrapper" process plugin could "unwrap" after the sub_process if needed.

Keeping the "hard part" in sub_process instead of introducing a new "iterate" process plugin makes this very easy to unit test. It's also possible that wrapping or unwrapping a value could be useful outside the context of iteration/sub_processing.

Example:

 * For an input array of arrays, flatten each of the child arrays.
 *
 * Assume the input array my_array is:
 * [
 *   ['a', 'b', ['c', 'd']],
 *   [1, 2, [3, 4]],
 * ]
 *
 * And we want to transform this into:
 * [
 *   ['a', 'b', 'c', 'd'],
 *   [1, 2, 3, 4],
 * ]
 *
 * Since the flatten plugin is designed to "handle_multiples",
 * doing this:
 *
 * @code
 * flatten_wont_work:
 *   plugin: flatten
 *   source: my_array
 * @endcode
 *
 * Will result in a single array:
 * ['a', 'b', 'c', 'd', 1, 2, 3, 4]
 *
 * This is where the wrapper plugin can save the day. We can
 * call wrap on my_array, then use a sub_process, and finally
 * unwrap the result of the sub_process.
 *
 * @code
 * my_desired_output:
 *   -
 *     plugin: wrapper
 *     method: wrap
 *     source: my_array
 *     key: element
 *   -
 *     plugin: sub_process
 *     process:
 *       '0':
 *         plugin: flatten
 *         source: element # This should match the 'key' used when wrapping.
 *   -
 *     plugin: wrapper
 *     method: unwrap
 * @endcode
 *
 * In general terms, the most powerful use case is when you wish
 * that an existing process plugin didn't "handle_multiples". In such
 * a case the pattern above can be used: wrap, sub_process, unwrap.
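To make the data flow concrete, here is a rough Python model of the wrap, sub_process, unwrap pattern applied to my_array. It is only a sketch: the function names mirror the plugin names, but the bodies are my assumptions about their behavior as described above, not the code from the patch or from Drupal core.

```python
def flatten(value):
    """Recursively flatten nested lists into one flat list (models 'flatten')."""
    flat = []
    for item in value:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


def wrap(value, key=0):
    """Assumed behavior of method: wrap. Nest a value one level down under 'key'."""
    return {key: value}


def unwrap(value):
    """Assumed behavior of method: unwrap. Return the only value of a one-item array."""
    if len(value) != 1:
        raise ValueError("unwrap expects a single-valued array")
    return next(iter(value.values()))


def sub_process(rows, process):
    """Run every destination's callable against each row (models 'sub_process')."""
    return [{dest: fn(row) for dest, fn in process.items()} for row in rows]


my_array = [["a", "b", ["c", "d"]], [1, 2, [3, 4]]]

# Flattening the whole value at once merges the children together.
print(flatten(my_array))  # ['a', 'b', 'c', 'd', 1, 2, 3, 4]

# wrap each child, flatten inside the sub_process, then unwrap each row.
wrapped = [wrap(child, "element") for child in my_array]
processed = sub_process(wrapped, {"0": lambda row: flatten(row["element"])})
result = [unwrap(row) for row in processed]
print(result)  # [['a', 'b', 'c', 'd'], [1, 2, 3, 4]]
```

The point of the model is that flatten only ever sees one child array at a time once each child is its own sub_process row.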

Remaining tasks

N/A

User interface changes

N/A

API changes

New "wrapper" process plugin

Data model changes

N/A

Comments

danflanagan8 created an issue. See original summary.

danflanagan8’s picture

Assigned: danflanagan8 » Unassigned
Status: Active » Needs review
File: status new, size 11.93 KB

Here's a patch that has the proposed process plugin with substantial documentation and complete unit tests.

danflanagan8’s picture

Here's an example from Slack where this would help. This goes way back to March of 2021 but it stuck in my head. https://drupal.slack.com/archives/C226VLXBP/p1615575376103200

INPUT
  field_country_code: US
  field_custom_id:
	(
	  [0] => BBB
	  [1] => CCC
	  [2] => DDD
	)

OUTPUT
  field_result:
	(
	  [0] => US-BBB
	  [1] => US-CCC
	  [2] => US-DDD
	)

Here's a situation where we basically want to use concat on each element of an array, but we can't because concat is set to "handle_multiples".

This is cake with the help of wrapper. We turn field_custom_id into an array of arrays, at which point we can use sub_process.

field_result:
  -
    plugin: wrapper
    method: wrap
    source: field_custom_id
    key: id # This just helps us refer to the value inside a sub_process in a semantic way instead of using '0'.
  -
    plugin: sub_process
    include_source: true
    process:
      result:
        plugin: concat
        delimiter: '-'
        source:
          - source/field_country_code
          - id # This is the 'key' as set in the wrapper configuration above
  -
    plugin: wrapper
    method: unwrap
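
For anyone who wants to trace the values, here is a small Python model of that pipeline. It is a sketch under assumptions: `concat` imitates the concat plugin, and the "source" entry imitates what include_source exposes inside the sub_process. None of it is the actual plugin code.

```python
def concat(values, delimiter=""):
    """Join values into one string (models the 'concat' plugin)."""
    return delimiter.join(str(v) for v in values)


source = {
    "field_country_code": "US",
    "field_custom_id": ["BBB", "CCC", "DDD"],
}

# wrap with key: id. Each ID becomes its own one-key row.
rows = [{"id": value} for value in source["field_custom_id"]]

# sub_process with include_source: each row can also reach the parent row
# under "source" (assumed to match the default source key).
processed = []
for row in rows:
    row_with_source = {**row, "source": source}
    country = row_with_source["source"]["field_country_code"]
    processed.append({"result": concat([country, row_with_source["id"]], "-")})

# unwrap: pull the single value back out of each row.
output = [next(iter(row.values())) for row in processed]
print(output)  # ['US-BBB', 'US-CCC', 'US-DDD']
```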

benjifisher’s picture

I think that wrapper with method: wrap can be replaced by the more flexible build_array plugin from #3440904: Process plugin: build an array from source, destination, pipeline; and method: unwrap can be replaced by callback with callable: array_pop.

For example, the pipeline from the issue description,

  -
    plugin: wrapper
    method: wrap
    source: my_array
    key: element
  -
    plugin: sub_process
    process:
      '0':
        plugin: flatten
        source: element # This should match the 'key' used when wrapping.
  -
    plugin: wrapper
    method: unwrap

is equivalent to

  - 
    plugin: build_array
    template:
      element: 'pipeline:'
    source: my_array
  -
    plugin: sub_process
    process:
      '0':
        plugin: flatten
        source: element
  -
    plugin: callback
    callable: array_pop

I verified that they give the same results on the example input using migrate_sandbox.

So I suggest we close this issue in favor of #3440904: Process plugin: build an array from source, destination, pipeline.

danflanagan8’s picture

I'm working my way through reviewing the related array_template process plugin.

I can confirm the part of comment #5 about replacing the wrap method using array_template.

I'm thinking about the array_pop as a replacement for unwrap...it's not exactly the same because the unwrap method fails if the argument is not a single-valued array, but that's not really a big deal.
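Here is a quick Python sketch of that difference. Both functions are stand-ins I wrote for illustration: `unwrap` models the strict behavior described above, and `array_pop` models PHP's function by returning the last element without checking how many there are.

```python
def unwrap(value):
    """Strict: accept only a single-valued array, as the unwrap method does."""
    if not isinstance(value, list) or len(value) != 1:
        raise ValueError("unwrap expects a single-valued array")
    return value[0]


def array_pop(value):
    """Lenient: return the last element, or None for an empty array."""
    return value[-1] if value else None


print(unwrap(["x"]))          # x
print(array_pop(["x"]))       # x
print(array_pop(["x", "y"]))  # y

try:
    unwrap(["x", "y"])
except ValueError as error:
    print(error)  # unwrap expects a single-valued array
```

So the two only disagree on input that isn't a single-valued array, where array_pop quietly returns something and unwrap complains.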

So I'm convinced that the wrapper plugin could be completely reproduced as described by @benjifisher in #5.

The only question is whether there's enough DX value in:
1. the symmetry that comes with wrap/subprocess/unwrap
2. the relative simplicity of the wrap syntax compared to the array_template syntax

joachim’s picture

> The only question is whether there's enough DX value in:
> 1. the symmetry that comes with wrap/subprocess/unwrap
> 2. the relative simplicity of the wrap syntax compared to the array_template syntax

For the syntax, the plugin from this issue is definitely nicer.

The MR from the other issue, #3440904: Process plugin: build an array from source, destination, pipeline, has this example for migrating a paragraph reference:

 * @code
 * process:
 *   field_paragraph:
 *     - plugin: migration_lookup
 *       # ...
 *     - plugin: array_template
 *       template:
 *         target_id: pipeline:0
 *         target_revision_id: pipeline:1
 * @endcode

But with the plugin from this issue, my process array is simpler:

  field_paragraph:
    -
      plugin: migration_lookup
      # SNIP
    -
      plugin: wrapper
      method: wrap

The array_template plugin allows much more complexity and is much more powerful for other use cases, but for this particular use case, having to specify a template for the array when I just want to nest the values down a level feels like overkill.

> 1. the symmetry that comes with wrap/subprocess/unwrap

I'd actually prefer two plugins, called 'wrap' and 'unwrap'.

This would then match the symmetry of the 'multiple_values' / 'single_value' plugins.

(A thought which may muddy the waters, if so, ignore -- what about instead of adding a wrapper plugin, we add an option to work with nesting to the 'multiple_values' / 'single_value' plugins? The functionality of the 'key' property here would be covered by the more complex 'array_template' plugin from #3440904: Process plugin: build an array from source, destination, pipeline.)

mikelutz’s picture

I just want to note that "wrap" is just syntactical sugar for

      plugin: get
      source:
        - ~

and unwrap is just syntactical sugar for

      plugin: extract
      index:
        - 0

With maybe some single_value, multiple_values plugins in there if you need them, though if the purpose is to pipe a value into a handle_multiples plugin that can't handle a single value, there will often not be much point, as those plugins don't care if the value is a multiple or not.

(BTW, I'm starting to come around to just documenting the 'hack' with the get plugin above and officially supporting it. I've found enough cases where it's useful)

benjifisher’s picture

Status: Needs review » Closed (outdated)

Now that #3440904: Process plugin: build an array from source, destination, pipeline is fixed (and part of the 6.0.5 release) I think we can close this issue.

@mikelutz:

> (BTW, I'm starting to come around to just documenting the 'hack' with the get plugin above and officially supporting it. I've found enough cases where it's useful)

Your timing is interesting. Now that we have the array_template plugin, we can get the same functionality (a little more verbose, but perhaps easier to read/clearer intention). So now you decide to change your mind? ;)