Problem/Motivation

Currently the SubProcess migrate process plugin has some issues, when dealing with XMLs as source data (in combination with migrate_plus).

Example XML structure:

<?xml version="1.0" encoding="UTF-8" ?>
<article>
  <elements>
    <id>element.text.001</id>
  </elements>
  <elements>
    <id>element.text.002</id>
  </elements>
</article>

The corresponding migration fetches all elements nodes via the following XPath statement: /article/elements.

As a result, I get an array of SimpleXMLElement objects - one for each elements XML node.

I am currently trying to pass this array of SimpleXMLElement objects to the SubProcess plugin. In its transform method, there is the following line of code that creates the new data sub-row (where $new_value is the currently subprocessed SimpleXMLElement of the passed in array):

$new_row = new Row($new_value + $source);

This code unfortunately errors out, due to the fact that you can't merge a SimpleXMLElement object with an array like this.

Proposed resolution

Ensure that the $new_value variable is an array before trying to merge it with $source. Otherwise throw a MigrateException with helpful debugging message.

Remaining tasks

  • Create a patch

User interface changes

n/a

API changes

The SubProcess plugin can throw a MigrateException exception.

Data model changes

n/a

Comments

hctom created an issue. See original summary.

hctom’s picture

Title: SubProcess migrate process plugin does not play nicely with XMLs » SubProcess migrate process plugin does not play nicely with XML-based source data
hctom’s picture

Status: Active » Needs review
StatusFileSize
new759 bytes

Here is a patch with a simple workaround to ensure $new_value is an array.

I'd appreciate your feedback ;)

heddn’s picture

Project: Drupal core » Migrate Plus
Version: 8.8.x-dev » 8.x-4.x-dev
Component: migration system » API

Let's move this over to migrate plus. I think its xml processing should be return array perhaps. I doubt this will ever change in core.

hctom’s picture

...erm, but shouldn't Drupal core catch the case that there might not be an array passed in by whatever preceeding process/source plugin?

heddn’s picture

Good point. It should complain about that. But that fix (throw skip row exception) and this solution here (xml not passed as array) are different things. Let's fix those things separately.

hctom’s picture

Project: Migrate Plus » Drupal core
Version: 8.x-4.x-dev » 8.8.x-dev
Component: API » ajax system
StatusFileSize
new1.06 KB
new1003 bytes

Okay so let's just throw a MigrateSkipRowException instead of casting the value to an array. Attached is an updated patch. But as this has nothing to do with migrate_plus anymore, I also change the project back to Drupal core.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

quietone’s picture

Component: ajax system » migration system
quietone’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

This needs a test.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

joegl’s picture

Is there a corresponding issue to fix this in Migrate Plus?

We are using the patch in #4 because it actually resolves the issue, whereas #8 only throws an exception which is unhelpful while there are no other solutions or workarounds.

danflanagan8’s picture

Status: Needs work » Needs review
StatusFileSize
new92.42 KB
new93.47 KB

I took a stab at some tests. A fail test is attached. I also combined the patch from #8 with the fail test to show that it makes the test pass.

If the tests look good someone can remove the Needs tests tag.

danflanagan8’s picture

StatusFileSize
new1.86 KB
new2.91 KB

Just when I think I'm good at git I do something like that...

Here are patches without a bunch of accidental local changes.

danflanagan8’s picture

StatusFileSize
new1.84 KB
new2.9 KB
new933 bytes

Oh boy...let's try that again again.

The last submitted patch, 18: 3048464--18--FAIL.patch, failed testing. View results

danflanagan8’s picture

Issue tags: -Needs tests

Finally! Hopefully this looks good to the community. I'm going to take off the needs tests tag. This definitely recreates the situation described in the IS.

quietone’s picture

Status: Needs review » Needs work
Issue tags: +Bug Smash Initiative

@danflanagan8, sorry I didn't get to this sooner. This looks great, just a few improvements.

1) I started by looking at and running the test. I got the expected error. Since we are here lets add a few more test cases, say for integers, booleans and NULL and an iterator, that is, other non arrays. I was also wondering why simple values like 1 or TRUE are not tested.

2) Then I looked at the changes to the process plugin and see that this is not testing the value on input but on each element in the array. That makes sense here. Nice to keep this limited to that case.

3) Now, why throw a SkipRow instead of a SkipProcess? I guess because sub_process can't do the work expected of it it is better to skip the whole row, similar to static map throwing a skip row.

4) Then I looked at the exception message. Let's make this like the one in the Extract process plugin because it provides more data. Now that #2976098: MigrateExecutable should add details for the migration & destination property to exceptions that cause a row failure the message will contain the process plugin ID so there will be no confusion on which plugin threw the exception.

danflanagan8’s picture

Status: Needs work » Needs review
StatusFileSize
new3.6 KB
new2.91 KB

Here are some updates, @quietone. Please let me know if I missed anything or did anything extra. I was a little confused about which of you bullet points were actual requests vs. which ones were simply your musings on the issue. :)

Here's a comparison of your bullets and what I did!

  1. I added a bunch of cases. These are unit tests so it's super fast. No problem having lots of cases.
  2. No action
  3. No action
  4. I updated the exception message. Please let me know if the wording could be improved. I had like 10 different variations of this...

Status: Needs review » Needs work

The last submitted patch, 22: 3048464-22.patch, failed testing. View results

danflanagan8’s picture

Status: Needs work » Needs review
StatusFileSize
new6.08 KB
new2.91 KB

Ugh, see I wasn't kidding when I said I went through 10 versions of that language. That was lazy.

Here's a patch I swear passes. I'm not going to trigger tests in an attempt to save resources.

Nope. Ignore this patch because I messed it up also.

danflanagan8’s picture

StatusFileSize
new3.6 KB
new1.01 KB

Ignore the last patch despite my claim that "I swear [it] passes" because I made a git error and didn't inspect the patch manually. This is good advice: don't trust me ever!

So here's a new patch. I'm just going to run the tests again. I'm not clever enough to try to save resources...

quietone’s picture

Status: Needs review » Needs work

@danflanagan8, this looks good and I don't think there is a test case missing.

Unfortunately, the patch no longer applies. And I think the title needs an update, 'Change SubProcess to throw an exception on invalid input'?

danflanagan8’s picture

Title: SubProcess migrate process plugin does not play nicely with XML-based source data » SubProcess migrate process plugin should throw exception on invalid input
Issue summary: View changes
Status: Needs work » Needs review
StatusFileSize
new3.62 KB

Thanks @quietone

This was a trivial reroll (thankfully!). It applies to 9.3.x and 9.4.x. I'm never 100% sure what branch we should be targeting. Probably 9.4.x? I'll leave it as is for now.

I updated the title with a slightly different wording than suggested so that it still sounds like a bug rather than a task. I also touched up a couple things in the IS.

mikelutz’s picture

Status: Needs review » Needs work

This seems to be a duplicate of #2880278: SubProcess should throw an explicit error on bad input.. I'm closing that one because more work seems to have been done here.

As far as the merits of the issue, the original reporter wanted sub_process to work with an array of simplexml elements. It currently doesn't and this works as designed. If you want to feed an array of simplexml elements in you need to convert them to an array of arrays first. The documentation for the plugin clearly states that the input needs to be an array of arrays.

As far as then throwing an exception on bad input, I'm fine with that, but a skip row exception is wrong, we need to throw a MigrateException on bad input. Looking through the process plugins just in the core migrate module, the following test for valid input before processing:

ArrayBuild,
Callback
Concat
Explode
Extract
FileCopy
Flatten
FormatDate
MakeUniqueBase
StaticMap
SubStr
UrlEncode

Of those, Callback throws an InvalidArgumentException and every single other process plugin that checks for valid input throws a MigrateException on invalid input. We have no process plugins now that throw a skip row on invalid input, and for good reasons. If there is invalid input getting into your process plugins, then there is something wrong with your migration, either in your source data, or in your processes. In most of the cases, skipping the row is just going to end up happening over and over anyway due to a bad configuration, so better to dump out of the migration and just fix it. We shouldn't blindly continue processing rows when the migration is broken. We should also do it for consistency, to match what other process plugins do.

mikelutz’s picture

mikelutz’s picture

Category: Bug report » Task
Related issues: +#2880278: SubProcess should throw an explicit error on bad input.
danflanagan8’s picture

Issue summary: View changes
Status: Needs work » Needs review
StatusFileSize
new3.63 KB
new2.49 KB

Thanks @mikelutz

You make a very convincing case! I have changed the exception to MigrateException. I also updated the proposed solution in the IS to reflect that change.

Note: Comment 30 was posted as I was posting these patches. I don't want to rename and reupload so I'm just posting with the wrong comment number. :)

quietone’s picture

Status: Needs review » Reviewed & tested by the community

Ah, so good that mikelutz was able to chime in here. I completely missed that!

The changes asked for in #28 have been made and this is much better now. We already had tests and now we have the correct exception, MigrateException. This is ready now.

@danflanagan8, thanks!

Time to get this in.

danflanagan8’s picture

Yeah, this exception is better. And @mikelutz went ahead and created an issue to get callback in line with the other plugins: #3247950: Throw consistent exceptions on invalid process plugin configurations.

larowlan’s picture

Issue tags: +Needs change record

Discussed with @quietone, @amjad1233 and @fubarhouse at Drupal South and we agreed that because of the new exception being thrown we should write a change record here.

quietone’s picture

Issue tags: -Needs change record

Change record added.

danflanagan8’s picture

Change record text looks good to me. Thanks, @quietone!

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 31: 3048464-30.patch, failed testing. View results

mikelutz’s picture

Status: Needs work » Reviewed & tested by the community

Rerunning tests

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 31: 3048464-30.patch, failed testing. View results

mikelutz’s picture

Status: Needs work » Reviewed & tested by the community

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 31: 3048464-30.patch, failed testing. View results

quietone’s picture

Status: Needs work » Reviewed & tested by the community

Restoring RTBC because the test failure was due to #3255836: Test fails due to Composer 2.2 which has been fixed.

quietone’s picture

StatusFileSize
new3.63 KB

Re-uploading the patch so it will test with 9.4.x every two days. At least, I hope so.

alexpott’s picture

Status: Reviewed & tested by the community » Fixed

Committed and pushed bcb89890b24 to 10.0.x and 200a93269d9 to 9.4.x. Thanks!

  • alexpott committed bcb8989 on 10.0.x
    Issue #3048464 by danflanagan8, hctom, quietone, mikelutz, heddn:...

  • alexpott committed 200a932 on 9.4.x
    Issue #3048464 by danflanagan8, hctom, quietone, mikelutz, heddn:...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.