Problem/Motivation

Currently some non-one-to-one Drupal migrations can result in empty destination properties of really important fields e.g. ID.

e.g. doing a migration lookup against the primary ID, and returning a NULL would result in the ID being set as an "empty property", and subsequently set to NULL everytime the Migration entity is updated as part of the migration, which might not necessarily be the expected behaviour especially for Migration stubs.

And there's currently no way for process plugins to remove empty properties.

Steps to reproduce

Provide a migration which does a lookup against the exisiting ID in some way in order to detect uniqueness, if the lookup fails, the ID will always be set to NULL.

Or at a more abstract level:
1. A migration process sets field 'foo' to NULL, and is subsequently flagged as empty.
2. A separate dynamic process or destination sets 'foo' to 'bar' on the row.
3. The 'foo' destination field has a non-empty value, but is still flagged as empty and will be removed on import.

Proposed resolution

Provide API that allows pre row save hooks to easily stop certain important properties from being explicitly flagged as NULL.

Or provide a plugin property/API which can be used to skip the setEmptyDestinationProperty behaviour similar to the $plugin->multiple(); method.

Or we could explicitly ignore IDs and other special fields from being flagged as "empty destinations" (unsure what consequences this might have).

Remaining tasks

Provide issue fork/patch with tests.

User interface changes

N/A

API changes

Potentially new Migrate methods:

  • \Drupal\migrate\Row::unsetEmptyDestinationProperty($property)

And potentially:

  • \Drupal\migrate\Plugin\MigrateProcessInterface::emptyDestination()
  • \Drupal\migrate\ProcessPluginBase::emptyDestination()

To determine when $row->setEmptyDestinationProperty($destination); is called by the MigrateExecutable.

Data model changes

N/A

Release notes snippet

Issue fork drupal-3278368

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

codebymikey created an issue. See original summary.

codebymikey’s picture

Attached a patch adding the \Drupal\migrate\Row::unsetEmptyDestinationProperty($property) method.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.15 was released on June 1st, 2022 and is the final full bugfix release for the Drupal 9.3.x series. Drupal 9.3.x will not receive any further development aside from security fixes. Drupal 9 bug reports should be targeted for the 9.4.x-dev branch from now on, and new development or disruptive changes should be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.9 was released on December 7, 2022 and is the final full bugfix release for the Drupal 9.4.x series. Drupal 9.4.x will not receive any further development aside from security fixes. Drupal 9 bug reports should be targeted for the 9.5.x-dev branch from now on, and new development or disruptive changes should be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

amaisano’s picture

Not sure if this is 100% the same problem, but I solved this by completely rejecting migration mappings from entering the database if arbitrary criteria was not met. I did this by extending the skip_on_empty plugin to what I call exclude_on_empty.

class ExcludeOnEmpty extends SkipOnEmpty {

  /**
   * Excludes the current row when value is not set.
   *
   * {@inheritDoc}
   */
  public function row($value, MigrateExecutableInterface $migrate_executable, Row $row, $destination_property) {
    try {
      return parent::row($value, $migrate_executable, $row, $destination_property);
    }
    catch (MigrateSkipRowException $e) {
      throw new MigrateSkipRowException($e->getMessage(), FALSE);
    }
  }

}

That FALSE argument says "don't save mapping to DB." So any migration can pass a NULL/empty value to this plugin if the entity being imported isn't "ready" yet.

Version: 9.5.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

danflanagan8’s picture

Here's a process-plugin approach to the same basic problem: #3446932: Add set_on_condition process plugin

I'm not sure it's a good idea, but I wanted to throw it out there.

codebymikey’s picture

Issue summary: View changes

Thanks @danflanagan8, I think this issue addresses a different situation, which is related to the fact that the empty value gets out of sync with the actual state of the row field.

I've updated the issue summary accordingly.

dylan donkersgoed’s picture

I also encountered a problem related to this and codebymikey's solution fixed it for me. I don't think a process plugin would've worked for my use case.

Essentially I have several migrations that would conditionally set the node or user ID (e.g. because the node might already have translations or revisions which I think is a fairly common use case).

So, e.g., I most recently ran into the issue with a user migration that looked like this:

uuid: 337cbd24-5039-4a30-b646-1c2032f68163
langcode: en
status: true
dependencies: {  }
id: upgrade_d7_user
class: Drupal\user\Plugin\migrate\User
field_plugin_method: null
cck_plugin_method: null
migration_tags:
  - 'Drupal 7'
  - Content
migration_group: migrate_drupal_7
label: 'User accounts'
source:
  plugin: hh_migration_d7_user_with_group
process:
  # A subset of users already exist, target their UIDs if they are found.
  uid:
    - plugin: skip_on_value
      source: uid
      method: process
      not_equals: true
      value:
        - 1
        - 14347
        - 14927
        - 16473
        - 18850
    - plugin: static_map
      map:
        1: 1
        14347: 6
        14927: 4
        16473: 5
        18850: 3
      default_value: null
  name:
    -
      plugin: get
      source: name
    -
      plugin: skip_on_value
      method: row
      value:
  pass:
    -
      plugin: get
      source: pass
  mail:
    -
      plugin: get
      source: mail
  created:
    -
      plugin: get
      source: created
  access:
    -
      plugin: get
      source: access
  login:
    -
      plugin: get
      source: login
  status:
    -
      plugin: get
      source: status
  timezone:
    -
      plugin: get
      source: timezone
  langcode:
    -
      plugin: user_langcode
      source: entity_language
      fallback_to_site_default: false
  preferred_langcode:
    -
      plugin: user_langcode
      source: language
      fallback_to_site_default: true
  preferred_admin_langcode:
    -
      plugin: user_langcode
      source: language
      fallback_to_site_default: true
  init:
    -
      plugin: get
      source: init
  roles:
    - plugin: static_map
      source: roles
      map:
        authenticated user: employee
        administrator: administrator
      default_value: employee
  user_picture:
    -
      plugin: default_value
      source: picture
      default_value: null
    -
      plugin: migration_lookup
      migration: upgrade_d7_file
  groups:
    - plugin: static_map
      source: og_gid
      map:
        6093: 3
        18179: 4
      default_value: 1
    - plugin: default_value
      default_value: 1
  field_about_me:
    -
      plugin: get
      source: field_about_me
  field_contact_me:
    -
      plugin: get
      source: field_contact_me
  field_first_name:
    -
      plugin: get
      source: field_first_name
  field_last_name:
    -
      plugin: get
      source: field_last_name
  field_photo:
    -
      plugin: sub_process
      source: field_photo
      process:
        target_id:
          plugin: migration_lookup
          migration: upgrade_d7_file
          source: fid
        alt: alt
        title: title
        width: width
        height: height
destination:
  plugin: 'entity:user'
migration_dependencies:
  optional:
    - upgrade_d7_file

The important section is the uid bit. It will assign the UID for certain users only if the user already exists in Drupal so the migration will map it to the existing user. This is needed to make sure all the user relationships are preserved in revision history etc. even for the users that were already manually created. I've run into the problem in several other places though for migrations that handle node revisions/translations.

The problem was when I ran a migration with the --update flag I would get numerous errors like:

[error] SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '775cbb48-ed5d-4fb0-aa15-09b315905fe7' for key 'user_field__uuid__value': INSERT INTO "users" ("uuid", "langcode") VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1); Array
(
[:db_insert_placeholder_0] => 775cbb48-ed5d-4fb0-aa15-09b315905fe7
[:db_insert_placeholder_1] => en
)

this happened when the uid was configured in the migration even if a process plugin skipped processing that field and no value was set. I think because the migration would somehow end up trying to do a mix of updating the already existing entity based on the existing destination IDs and creating a new one because the uid was not set.

Removing the empty destination property seems to have fixed the issue, but there is no way to do that without codebymikey's patch.

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.