Problem/Motivation

Several core migrate process plugins support a list of source properties. These include:

  • concat
  • callback
  • download
  • file_copy
  • menu_link_parent
  • null_coalesce
  • route
  • static map

Currently, these plugins cannot be used with a list of source plugins unless they are the first process plugin in a pipeline. For example let's say you had products and you wanted to process the name so that it has a lowercase i in front of a capitalized name. For example, "product" will become "iProduct". Currently we need a temporary destination property to deal with this.

source:
  plugin: embedded_data
  data_rows:
    -
      'id': 1
      'field_product': 'product'
  ids:
    id:
      type: string
  constants:
    i: 'i'
process:
  temp_value:
    plugin: callable
    source: field_product
    callable: ucfirst
  processed_value:
    plugin: concat
    source:
      - 'constants/i'
      - '@temp_value'

Proposed resolution

Create a keyword for referencing the current value of the process pipeline. I propose %pipeline. The process section above could then be rewritten as:

process:
  processed_value:
    -
      plugin: callable
      source: field_product
      callable: ucfirst
    -    
       plugin: concat
       source:
         - 'constants/i'
         - '%pipeline'

Either way the result for processed_value would be iProduct.

Remaining tasks

TBD

User interface changes

TBD

API changes

TBD

Data model changes

TBD

Release notes snippet

TBD

CommentFileSizeAuthor
#3 3236774--3.patch8.41 KBdanflanagan8
#2 3236774--2.patch8.38 KBdanflanagan8

Comments

danflanagan8 created an issue. See original summary.

danflanagan8’s picture

Status: Active » Needs review
StatusFileSize
new8.38 KB

Here's a patch with the proposed solution. I added a test class that shows how this would work with several of the core process plugins that support a list of source properties.

danflanagan8’s picture

StatusFileSize
new8.41 KB

Cspell....Trying again...

danflanagan8’s picture

Issue summary: View changes

My example in the IS was not so good. I've updated it.

lendude’s picture

Nice!

Just some things I see:

  1. +++ b/core/modules/migrate/src/MigrateExecutable.php
    @@ -67,6 +67,13 @@ class MigrateExecutable implements MigrateExecutableInterface {
    +  protected $value;
    

    If this is public, should it be added to the interface?

  2. +++ b/core/modules/migrate/src/MigrateExecutable.php
    @@ -67,6 +67,13 @@ class MigrateExecutable implements MigrateExecutableInterface {
    +  protected $value;
    

    Should this be something more descriptive than $value? $currentProcessPipelineValue?

  3. +++ b/core/modules/migrate/src/MigrateExecutable.php
    @@ -407,18 +415,28 @@ public function processRow(Row $row, array $process = NULL, $value = NULL) {
    +    return $this->value ?? NULL;
    

    Do we have/need coverage for the NULL return?

danflanagan8’s picture

Thanks for the comments, @Lendude.

1. I believe this was supposed to be in reference to public function getCurrentProcessPipelineValue() I think you're right. It would need to be added to MigrateExecutableInterface. Based on this comment in an unrelated issue, this addition to the interface should be ok from a BC standpoint.

2. I was trying to be as undisruptive as possible, so I just changed $value to $this->value. I agree that this is not descriptive at all! You're probably right that a more descriptive name would be nice.

3. I don't think the null coalesce operator does anything. It should be removed.

benjifisher’s picture

@danflanagan8:

Thanks for opening this issue. I think it would be great to have better ways of supplying arguments to process plugins, and this issue gets the discussion going.

We discussed this issue at #3236461: [meeting] Migrate Meeting 2021-09-23. Two of the maintainers of the migration subsystem (@mikelutz and I) agreed that we do not want to modify the core get plugin for this purpose. From memory, here are some of the reasons:

  1. It does not add new functionality, since you can accomplish the same thing using "pseudofields".
  2. It adds complexity to the get plugin.
  3. We need the get plugin to be reliable, since it is the start of nearly every process pipeline.
  4. The get plugin is already too weird.

I have never seen a pipeline that does not start with get, but I said "nearly" to avoid inciting people to prove me wrong.

Usually, get is implicit, called when another process plugin sets a source option.

I realize that (3) cuts both ways: since get is used so much, and used implicitly, it will be a lot easier to use this feature if it is implemented in get.

I did not bring it up at the meeting, but I also do not like the implementation here, since it relies on the executable class. There are other executable classes out there (Drush, Migrate Tools, Migrate Source UI). I am afraid that migrations using this feature might work with some executables and not others.

Again, the get plugin is weird: its input is an array of source and destination properties. Shouldn't that be configuration instead?

I think a better idea would be to create a new plugin:

  • It has direct access to the pipeline value, since that is the argument to its transform() method. There is no need to get it from a class property on the executable.
  • It has configuration like the arguments to get: an array of source and destination properties.
  • Besides source and destination properties, it can have literal strings and the pipeline value.
  • If we can think of use cases, the configuration could be a nested array

For the sake of discussion, let's use the following syntax for the configuration items:

  • source:field_foo for a source property
  • dest:field_bar for a destination property
  • pipeline: for the pipeline value (argument)
  • literal:any string for a literal string

That is verbose, but it is clear. We can choose something shorter if we actually implement this plugin.

With this proposal, I would handle the example from the issue summary like this:

process:
  processed_value:
    -
      plugin: callable
      source: field_product
      callable: ucfirst
    -
      plugin: get_arguments
       inputs:
         - 'literal:i'
         - 'pipeline:'
    -
       plugin: concat

I think that is only 2 lines longer than the example in the summary, and some of that is saved because you do not need to define contants.i in the source.

If we do create a new process plugin, it will probably be a candidate for the migrate_plus module, not core.

danflanagan8’s picture

Status: Needs review » Closed (works as designed)

Thanks for the careful review, @benjifisher.

It has direct access to the pipeline value, since that is the argument to its transform() method. There is no need to get it from a class property on the executable.

That to me is the most compelling reason to make a new plugin. Well, that and the non-core implementations of MigrateExecutable, of which I was unaware.

We can discuss further if and when we open a new issue for a new plugin in migrate_plus.

danflanagan8’s picture

My mind was just blown. I was thinking about this again, this time thinking that I'd have a SuperGet class that would override Get. Then I saw that Get can already be coaxed into referencing the current value of the pipeline. From its transform method:

foreach ($properties as $property) {
      if ($property || (string) $property === '0') {
        $return[] = $row->get($property);
      }
      else {
        $return[] = $value;
      }
    }

That "else" case puts the current pipeline value into $return. When does that case run? When $property is null an empty string or an empty array. Those are essentially keywords that do the thing I was introducing with %pipeline So we already can do this! The snippet from the IS would be:

process:
  processed_value:
    -
      plugin: callable
      source: field_product
      callable: ucfirst
    -    
       plugin: concat
       source:
         - 'constants/i'
         - null

It's weird and not intuitive and hard to read, but it totally works!!!