Migrate process overview

Last updated on 9 May 2017

The process key of a migration configuration describes, property by property, how the destination is to be constructed from the source data. The value of the process key is an associative array, where each key is a destination property. The value associated with each key describes how the destination value is created. Core supports the most common cases with shorthands. Less common cases use a more verbose syntax, and anything that cannot be expressed this way can be coded in a custom plugin.

The shorthands

Simple copying

The get plugin is used to copy a value from a source property. Unlike all other process plugins, it can be used without being explicitly named. For example, to copy the value of the source property subject into the destination title:

process:
  title: subject
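
For comparison, the shorthand above can also be written with the get plugin named explicitly. This is a sketch of the long form, assuming get takes a source key in the same way as the other plugins shown on this page; it is expected to behave the same as title: subject.

```yaml
process:
  # Long form of `title: subject`: the get plugin is named explicitly.
  title:
    plugin: get
    source: subject
```

The shorthand is usually preferable for plain copies; the long form is mainly useful for seeing what the shorthand stands for.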

To import the created and changed dates of a node, use the following syntax, where Post Date is a source field containing a timestamp.

process: 
  created: Post Date 
  changed: Post Date

Created by one plugin

The destination might be created by one plugin (in addition to the implicit get). In this case, the value associated with the destination property is an associative array containing a plugin key that identifies the plugin to use, along with any additional keys used by that particular plugin. In this example we use the migration plugin. The source: author line uses the get plugin to fetch the source value before passing it to the migration plugin:

process:
  uid:
    plugin: migration
    migration: users
    source: author

The full pipeline

Sometimes a source value must pass through multiple plugins to end up with the right value and structure for the destination property. In this case, the value associated with the destination key is a list of associative arrays, each containing at least a plugin key plus that plugin's configuration, much like the single-plugin case above. The incoming source value is passed into the first plugin, the output of that is passed to the second plugin, and so on. Only the first plugin needs a source; every later plugin takes the output of the previous plugin as its input. This is why it's called a pipeline.

For example, consider how we translate a Drupal 6 text format's user-visible name to a unique Drupal 8 machine name: the filter format machine name is created from the label by first applying the machine_name plugin to create a machine name and then the dedupe_entity plugin to make it unique.

process:
    format:
        -
            plugin: machine_name
            source: name
        -
            plugin: dedupe_entity
            entity_type: filter_format
            field: format

What this says is that the source property named name is passed into the machine_name plugin, to convert the original string to a lower-case alphanumeric (plus underscores) name. Since this could potentially result in the same machine name for different incoming strings, and we need unique machine names for our filter_format entities, we next invoke the dedupe_entity plugin. The dedupe_entity plugin does not have a source specified; the result of the machine_name plugin is implicitly fed to dedupe_entity, which also takes entity_type and field configuration keys. The result of the dedupe_entity plugin, as the last in the pipeline, is assigned to the destination format property.
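
As a point of comparison, the single-plugin form from the previous section can be read as a pipeline with exactly one step. The sketch below rewrites the uid example as a one-item list. It assumes the migration system treats the two forms the same way, which is an interpretation of the shorthand rather than something stated above.

```yaml
process:
  uid:
    # A pipeline with a single step; assumed equivalent to the
    # associative-array form used in "Created by one plugin".
    -
      plugin: migration
      migration: users
      source: author
```

Writing a single step as a list makes it easy to append further plugins later without restructuring the entry.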

Nested values

To set $destination['display_settings']['label']['format'] or read from $source['display_settings']['label']['format'], write the path with slashes: display_settings/label/format. The same notation works for destination keys and for source values. Example:

process:
  'display_settings/label/format':
    plugin: get
    source: 'display_settings/label/format'

Don't forget the quotes.

If you are migrating two nested values of the same field, for example the content of the body field and its format, use nested keys for both, like this:

process:
  'body/format':
    plugin: default_value
    default_value: full_html
  'body/value': example_matched_field

If you write body: example_matched_field, without /value, for the second property, the whole body field is overwritten with the example_matched_field value and the format value is lost.

Handling multiple values

Many plugins work with a single value as input. The migration system automatically recognizes when the current value in the pipeline is a list instead of a single value and calls the plugin once for each item in the list.
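
A minimal sketch of this behavior, using hypothetical names: tag_names is assumed to be a source property that holds a list of strings, and field_tag_keys is an illustrative destination property. Neither name comes from a real migration.

```yaml
process:
  # field_tag_keys and tag_names are hypothetical names for illustration.
  field_tag_keys:
    plugin: machine_name
    # If tag_names holds a list, machine_name is expected to run
    # once per item rather than once on the whole list.
    source: tag_names
```

Under that assumption, the destination receives a list of machine names, one for each incoming string, and the configuration looks the same as it would for a single value.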

No source

If a destination property should not have a value set at all, it is still advisable to add it to the migration, like this:

process:
  foo: { }

Here { } denotes an empty array ([] can be used as well). The system recognizes the empty pipeline, handles it specially, and does not set the foo property at all. However, this allows an analysis tool or UI to recognize that the migration is aware of this property and simply does not want it to be set. Without this, a warning might be issued that a destination property is left dangling.

See constant values for how to set NULL instead of leaving the value unset.

Which plugins can you use?

Use Drupal Console to get a list:

drupal plugin:debug migrate.process