Problem/Motivation

When importing data using the Migrate module and the concat process plugin (http://cgit.drupalcode.org/drupal/tree/core/modules/migrate/src/Plugin/m...), if the data source has empty values, the delimiter is included anyways.

Example:

id value_1 value_2
1 'Value 1' 'Value 2'
2 'Value 1'
3 'Value 2'
4

If I'd wanted to have a migration that used concat to join value_1 and value_2 I'd have these settings:

my_field:
    plugin: concat
    source:
      - value_1
      - value_2
    delimiter: ' & '

Which will result in the following data:

id join value
1 'Value 1 & Value 2'
2 'Value 1 & '
3 ' & Value 2'
4 ' & '

I don't think 2, 3 and 4 are the desired result.

Proposed resolution

At a migrate meeting #3160932: [meeting] Migrate Meeting 2020-07-30 it was decided to not change the concat process plugin because this same result can be achieved with a callback using array_filter followed by concat. Improvements to the callback process plugin may also help, there is an issue in Migrate Plus for that, #2882276: Extended callback process plugin to call functions with multiple parameters.

original proposal
Add an array filter to strip out empty values when concat'ing. I don't see a use case of wanting these empty values, but this could be optional.

Comments

pcambra created an issue. See original summary.

pcambra’s picture

Status: Active » Needs review
StatusFileSize
new710 bytes
pcambra’s picture

Issue summary: View changes
jofitz’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Firstly, this will need a test to demonstrate the problem (and prove that the patch fixes it). Secondly, perhaps we should add "allow_empty" to the plugin configuration (probably set to false by default) for those (rare) cases when I user might want ' & Value 2' (or similar).

pcambra’s picture

Thanks for the tags Jo, obviously this needs tests and a decision on the allow_empty option, I'll wait for a component maintainer to chime in before investing in adding those.

quietone’s picture

Yea, Jo Fitzgerald is on it, as usual. Everything in #4 sounds like a good implementation, with the addition of updating the API documentation will need updating.

jofitz’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests
StatusFileSize
new5.6 KB
new3.79 KB
new5.56 KB
  • Added allow_empty to the plugin configuration (and relevant documentation).
  • Added tests (and tidied up the existing tests in the process which I think is acceptable as part of this issue considering the amount the tests are changing).

The last submitted patch, 7: 2913428-7-test_only.patch, failed testing. View results

heddn’s picture

Version: 8.5.x-dev » 8.4.x-dev
Status: Needs review » Reviewed & tested by the community

Looks good to me. Docs are added and tests too. I don't think we need a CR for the very minor API addition. This can probably make its way into 8.4, since migrate is still beta/experimental. So I've switched back versions. And adding a test for good measure.

quietone’s picture

Status: Reviewed & tested by the community » Needs work

Really sorry to set this back to NW. Using the structure 'if set' was pointed out in another migration issue (which of course I can't find) so let's use it here.

+++ b/core/modules/migrate/src/Plugin/migrate/process/Concat.php
@@ -16,6 +16,8 @@
+ * - allow_empty: (optional) Whether empty values should be included in the

It would be better and be consistent with the other process plugin docs to use something like; '(optional) If set, empty values will be included in the output. Defaults to FALSE.'

jofitz’s picture

Status: Needs work » Needs review
StatusFileSize
new642 bytes
new5.56 KB

Updated the comment for consistency, as suggested by @quietone.

heddn’s picture

Status: Needs review » Reviewed & tested by the community

Docs fixes added in response to #10. Back to RTBC.

quietone’s picture

@Jo Fitzgerald, thanks!

alexpott’s picture

Status: Reviewed & tested by the community » Needs review

I wonder if changing the default behaviour is a good idea - someone's migrations could be relying on this? Maybe we should swap the configuration default?

alexpott’s picture

Status: Needs review » Needs work
Issue tags: +Needs change record

Also migrate is beta and this plugin is part of the API so this needs a change record.

jofitz’s picture

Status: Needs work » Needs review
StatusFileSize
new5.35 KB
new6.17 KB

I have attempted to address @alexpott's very valid argument in #14. I have corrected that oversight, but in doing so I have come across a problem: when exclude_empty is TRUE zero values are excluded (when they shouldn't be, imo). Any suggestions?

heddn’s picture

Honestly, this whole piece of work seems a little like we could either 1) move it to contrib - migrate_plus is a good landing place or 2) leave it to a custom extention. Where in core do we need this functionality? Can we get a convincing argument why filtering empty is needed? we can always run array_filter with a callback process plugin beforehand, no?

Status: Needs review » Needs work

The last submitted patch, 16: 2913428-16.patch, failed testing. View results

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.4 was released on January 3, 2018 and is the final full bugfix release for the Drupal 8.4.x series. Drupal 8.4.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.5.0 on March 7, 2018. (Drupal 8.5.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.5.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

sylvainm’s picture

StatusFileSize
new6.21 KB

@Jo Fitzgerald
we could use strlen as second paramater of array_filter, as suggested in a comment of php.net: https://secure.php.net/manual/en/function.array-filter.php

$value = array_filter($value, '\Drupal\Component\Utility\Unicode::strlen');

I just changed that in attached patch

sylvainm’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 20: 2913428-20.patch, failed testing. View results

quietone’s picture

Although initially my reviews suggested that I agree with this change, after all it seems quit logical. However, I am no longer convinced it should be in core. It would be a nice addition to migrate_plus, giving folks an option and could reduce the number of processes in some pipelines.

First, the concat process plugin implements implode. Once you know that you will expect implode not some variation. Now, the name concat is somewhat unfortunate where implode seems to be the best choice but I don't know the history. And after all, there is explode.

Two, it isn't needed in core. And as heddn points out in #17 there are ways to filter out the empty strings if that is what is needed.

Three, changing the default behavior of concat is likely to cause grief as alluded to by alexpott in #14.

heddn’s picture

After thinking about it some more, if we moved to contrib... What would we name this thing? Process plugins work best when they do single chunks of work. And then chain them. Perhaps that's what should be done here. Just use callback with array filter. And this get closed won't fix?

heddn’s picture

Category: Task » Feature request

Re tagging as feature rather than task.

quietone’s picture

Good point about the name.
We could add documentation showing how to use concat with array_filter somewhere in https://www.drupal.org/docs/8/api/migrate-api/migrate-process-plugins. Would that be sufficient?

sylvainm’s picture

With the array_filter callback solution, we can't choose the second parameter to pass to array_filter.

And we need it to avoid the problem of zero values

heddn’s picture

Then perhaps we should add an array_filter process plugin to migrate_plus?

sylvainm’s picture

With such a plugin, we could solve the problem, think I.

I can provide a patch (and an issue in migrate_plus project) if you want

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.6 was released on August 1, 2018 and is the final bugfix release for the Drupal 8.5.x series. Drupal 8.5.x will not receive any further development aside from security fixes. Sites should prepare to update to 8.6.0 on September 5, 2018. (Drupal 8.6.0-rc1 is available for testing.)

Bug reports should be targeted against the 8.6.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.8.x-dev

Drupal 8.6.x will not receive any further development aside from security fixes. Bug reports should be targeted against the 8.8.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.9.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.7 was released on June 3, 2020 and is the final full bugfix release for the Drupal 8.8.x series. Drupal 8.8.x will not receive any further development aside from security fixes. Sites should prepare to update to Drupal 8.9.0 or Drupal 9.0.0 for ongoing support.

Bug reports should be targeted against the 8.9.x-dev branch from now on, and new development or disruptive changes should be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

quietone’s picture

Status: Needs work » Needs review
Issue tags: -Needs change record

Reviewing the issue there seems to be consensus to move this to Migrate Plus. This came after a comment by alexpott that the patch in #14 would break existing migrations. While a later patch, #20, may have fixed that it was after heddn suggested, #17 that this be moved to Migrate Plus. He pointed out that this isn't needed by core, there is a workaround and no one has presented a argument why this is needed. The following comments support the move. The final discussion was to create an array_filter process plugin instead of changing concat.

An issue needs to be made in Migrate Plus and then this becomes a won't fix.

quietone’s picture

Issue summary: View changes
Status: Needs review » Closed (won't fix)
Related issues: +#2882276: Extended callback process plugin to call functions with multiple parameters

Discussed this at the migrate meeting today, #3160932: [meeting] Migrate Meeting 2020-07-30.

The desired result can be achieved by using a pipeline of callback, with array_filter before the concat to obtain the desired result. That kind of processing is what the pipeline is for and it means there isn't a need for change to concat. Creating an array filter process plugin can also be avoided if the callback process plugin is improved. There is an issue in Migrate Plus for that, #2882276: Extended callback process plugin to call functions with multiple parameters.

@pcambra, thank you for suggesting this change to concat and for the patch. Also to jofitx and SylvainM for adding patches and to alexpott for the review that started a rethink of this issue.

byrond’s picture

Here is an example using the pipeline described above in #34:

my_field:
  -
    plugin: callback
    callable: array_filter
    source:
      - value_1
      - value_2
  -
    plugin: concat
    delimiter: ' & '