Problem/Motivation
When migrating into formatted text fields (text_long, text_with_summary), Drupal requires each field delta to include both a value and a format. The documented approach uses sub-property mapping:
field_body/value: source_body
field_body/format:
  plugin: default_value
  default_value: full_html
This works for single-value fields (cardinality = 1), but breaks for multi-value fields (cardinality > 1). When the source is an array of strings — common with XML sources like itemList/item/html — there is no clean way to set the format per-delta using existing plugins:
- The /format + /value sub-property syntax sets one format for the whole field, not per-delta
- sub_process requires arrays of associative arrays, but XML parsers return arrays of plain strings
- array_chunk + sub_process fails when the XML parser returns a scalar string (single child element) instead of a 1-element array
The result: format is stored as NULL in the database, and the HTML renders as raw escaped text instead of being processed by the text format filter pipeline.
This is a longstanding gap. Core issue #2632814 documented the single-value workaround in 2015 but was closed without addressing multi-value fields.
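The failure mode can be illustrated with a simplified model in plain PHP. This is not Drupal code: `model_field_item()` is a hypothetical helper that only mimics the behavior described above, where a bare string populates the item's value and leaves the format unset.

```php
<?php

/**
 * Simplified, hypothetical model of populating one formatted-text field item.
 *
 * Assumption: a non-array input fills only 'value', so 'format' stays NULL.
 */
function model_field_item($raw): array {
  $item = ['value' => NULL, 'format' => NULL];
  if (is_array($raw)) {
    // Keep only the known properties from an associative array.
    return array_merge($item, array_intersect_key($raw, $item));
  }
  $item['value'] = $raw;
  return $item;
}

// What an XML source typically hands over: a list of plain strings.
$source = ['<p>One</p>', '<p>Two</p>'];
$stored = array_map('model_field_item', $source);

// Every delta keeps its HTML but has no format.
var_dump($stored[0]['format'], $stored[1]['format']);
```

Passing `['value' => '<p>One</p>', 'format' => 'full_html']` to the same helper keeps the format, which is the shape the field needs for every delta.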
Steps to reproduce
- Create a content type with a multi-value text_long field (cardinality = unlimited)
- Create a migration that maps an XML source with repeated child elements to that field:
  field_features: field_features
- Run the migration
- Check the database:
  SELECT field_features_format FROM node__field_features LIMIT 5
- Result: NULL for every row; HTML renders as raw text on the page
Proposed resolution
A text_format process plugin that wraps each value with the specified format, handling both scalar strings and arrays:
process:
  field_features:
    plugin: text_format
    source: field_features
    format: full_html
The plugin uses handle_multiples = TRUE and multiple() = TRUE so it receives the raw source (scalar or array), wraps each value into ['value' => $v, 'format' => $format], and returns the structured array that Drupal's entity field system requires.
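The wrapping step can be sketched as a stand-alone PHP function. This is not the code from the attached patch: `wrap_text_format()` is a hypothetical name, and returning an empty array for NULL or empty input is an assumption about how such cases might reasonably be handled.

```php
<?php

/**
 * Hypothetical stand-alone sketch of the wrapping logic, not the patch itself.
 *
 * Accepts a scalar string or a list of strings and returns one
 * ['value' => ..., 'format' => ...] array per delta.
 */
function wrap_text_format($value, string $format = 'basic_html'): array {
  // Assumption: NULL and empty input produce no field items.
  if ($value === NULL || $value === '' || $value === []) {
    return [];
  }
  // A single child element arrives as a scalar; treat it as a 1-item list.
  $values = is_array($value) ? array_values($value) : [$value];
  $items = [];
  foreach ($values as $v) {
    $items[] = ['value' => (string) $v, 'format' => $format];
  }
  return $items;
}

// Scalar input (single XML child element).
$single = wrap_text_format('<p>One</p>', 'full_html');
// Array input (repeated XML child elements), using the default format.
$multi = wrap_text_format(['<p>One</p>', '<p>Two</p>']);
```

Normalizing the scalar case to a 1-item list first means both input shapes go through the same loop, so the output is a list of deltas in either case.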
Configuration keys:
format: (optional) The text format machine name. Defaults to basic_html.
The attached patch includes:
- src/Plugin/migrate/process/TextFormat.php: the process plugin (~70 lines)
- tests/src/Unit/process/TextFormatTest.php: 11 unit tests covering scalar, multi-value, NULL, empty, HTML entities, and format configuration
- tests/src/Kernel/Plugin/migrate/process/TextFormatTest.php: 3 kernel tests that demonstrate the actual problem (bare strings → NULL format) and the fix (plugin output → correct format) against real entity field storage
Remaining tasks
- Review the patch
- Consider whether the default format should be basic_html or configurable-only (no default)
- Add change record documentation
User interface changes
None.
API changes
New process plugin text_format added. No changes to existing APIs.
Data model changes
None. The plugin produces the standard ['value' => ..., 'format' => ...] array structure that Drupal's entity field system already expects.
Issue fork migrate_plus-3586125
Comments
Comment #3
diamondsea