All my migrations that read XML files encoded in UTF-16LE used to work, but they broke after upgrading to Migrate Plus 4.2. They now fail with:
Drupal\migrate\MigrateException: Fatal Error 73: expected '>'
Line: 542
Column: 20
File: in Drupal\migrate_plus\Plugin\migrate_plus\data_parser\SimpleXml->openSourceUrl() (line 51 of modules/contrib/migrate_plus/src/Plugin/migrate_plus/data_parser/SimpleXml.php).
It turns out that issue #3046753, "Make XML parser more resilient", added a trim() call on the XML data before it is passed to simplexml_load_string():
protected function openSourceUrl($url) {
  // Clear XML error buffer. Other Drupal code that executed during the
  // migration may have polluted the error buffer and could create false
  // positives in our error check below. We are only concerned with errors
  // that occur from attempting to load the XML string into an object here.
  libxml_clear_errors();
  $xml_data = $this->getDataFetcherPlugin()->getResponseContent($url);
  $xml = simplexml_load_string(trim($xml_data));
  foreach (libxml_get_errors() as $error) {
    $error_string = self::parseLibXmlError($error);
    throw new MigrateException($error_string);
  }
  $this->registerNamespaces($xml);
  $xpath = $this->configuration['item_selector'];
  $this->matches = $xml->xpath($xpath);
  return TRUE;
}
The function trim() is not safe when working with multibyte-encoded strings: it operates on single bytes and, by default, also strips NUL bytes, which in UTF-16 are part of ordinary characters. SimpleXML, on the other hand, handles multibyte data perfectly well. I don't think it is necessary to call trim() before simplexml_load_string(). If your XML has an empty line before the opening tag, your XML is not well-formed and requires special treatment. Adding trim() to the generic parser prevents it from working properly with UTF-16 and similar multibyte encodings.
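To illustrate the failure mode, here is a minimal sketch of what trim() does to UTF-16LE data. This is not code from Migrate Plus; the tiny sample document and the variable names are made up for the example, and the string is encoded by hand so that no extension is needed.

```php
<?php
// Hypothetical sample: the ASCII text "<a/>" encoded by hand as UTF-16LE,
// where every character is followed by a NUL high byte.
$utf16le = implode("\0", str_split('<a/>')) . "\0";

// trim() works on bytes and strips " ", "\t", "\n", "\r", "\0" and "\x0B"
// by default, so it removes the NUL that belongs to the final ">".
$trimmed = trim($utf16le);

echo strlen($utf16le), "\n";   // 8
echo strlen($trimmed), "\n";   // 7
echo bin2hex($trimmed), "\n";  // 3c0061002f003e
```

The trimmed string has an odd number of bytes, so its last UTF-16 code unit is incomplete. That would be consistent with the parser complaining about a missing '>' at the very end of the document.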
| Comment | File | Size | Author |
|---|---|---|---|
| #5 | 3051858-migrate_plus-simplexml_remove_trim-5.patch | 676 bytes | nadim hossain |
Comments
Comment #2
sonnykt: Patch to remove the trim call.
Comment #3
sonnykt
Comment #4
3li: The patch in #2, which removes the trim() call, solved my issue.
Comment #5
nadim hossain: Re-rolled the patch against 6.x
Comment #6
nadim hossain
Comment #8
heddn: Thanks for your contributions.
Comment #10
heddn: Reverted the commit because it broke tests. More work is needed.