Problem/Motivation
Currently, some migrations that are not one-to-one can end up with really important fields, e.g. the ID, flagged as empty destination properties.
For example, if a migration lookup against the primary ID returns NULL, the ID is flagged as an "empty property" and is then set to NULL every time the migrated entity is updated as part of the migration. That is not necessarily the expected behaviour, especially for migration stubs.
There is currently no way for process plugins to remove empty properties.
Steps to reproduce
Provide a migration that looks up the existing ID in some way in order to detect uniqueness. If the lookup fails, the ID will always be set to NULL.
Or at a more abstract level:
1. A migration process sets field 'foo' to NULL, and the field is subsequently flagged as empty.
2. A separate dynamic process or destination sets 'foo' to 'bar' on the row.
3. The 'foo' destination field has a non-empty value, but is still flagged as empty and will be removed on import.
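The three steps above can be sketched in isolation. This is only an illustration: `SketchRow` is a hypothetical, heavily simplified stand-in for `\Drupal\migrate\Row` that models just the destination values and the list of empty-flagged properties, on the assumption that setting a value does not clear an earlier empty flag.

```php
<?php

// Hypothetical, simplified stand-in for \Drupal\migrate\Row. Only the two
// pieces of state relevant to this issue are modelled.
final class SketchRow {

  private array $destination = [];

  private array $emptyDestinationProperties = [];

  public function setDestinationProperty(string $property, mixed $value): void {
    $this->destination[$property] = $value;
  }

  public function setEmptyDestinationProperty(string $property): void {
    $this->emptyDestinationProperties[] = $property;
  }

  public function getDestination(): array {
    return $this->destination;
  }

  public function getEmptyDestinationProperties(): array {
    return $this->emptyDestinationProperties;
  }

}

$row = new SketchRow();

// Step 1: the process pipeline produced NULL for 'foo', so it is flagged.
$row->setEmptyDestinationProperty('foo');

// Step 2: later code (a dynamic process, a destination, a subscriber) sets
// a real value on the same property.
$row->setDestinationProperty('foo', 'bar');

// Step 3: the value and the flag now disagree.
$hasValue = $row->getDestination()['foo'] === 'bar';
$stillFlagged = in_array('foo', $row->getEmptyDestinationProperties(), TRUE);
```

Both `$hasValue` and `$stillFlagged` end up TRUE, which is the out-of-sync state described above.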
Proposed resolution
Provide an API that allows pre-row-save hooks to easily stop certain important properties from being explicitly flagged as NULL.
Or provide a plugin property/API which can be used to skip the setEmptyDestinationProperty behaviour similar to the $plugin->multiple(); method.
Or we could explicitly ignore IDs and other special fields from being flagged as "empty destinations" (unsure what consequences this might have).
Remaining tasks
Provide issue fork/patch with tests.
User interface changes
N/A
API changes
Potentially new Migrate methods:
\Drupal\migrate\Row::unsetEmptyDestinationProperty($property)
And potentially:
\Drupal\migrate\Plugin\MigrateProcessInterface::emptyDestination()
\Drupal\migrate\ProcessPluginBase::emptyDestination()
To determine when $row->setEmptyDestinationProperty($destination); is called by the MigrateExecutable.
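As a rough sketch of how the first proposed method, `unsetEmptyDestinationProperty()`, could behave: the method name comes from this issue, but the body below is an assumption and not the contents of the attached patch. `SketchRowWithUnset` is a hypothetical stand-in for `\Drupal\migrate\Row`, reduced to the empty-flag list.

```php
<?php

// Hypothetical stand-in for \Drupal\migrate\Row, reduced to the list of
// properties that have been flagged as empty.
final class SketchRowWithUnset {

  private array $emptyDestinationProperties = [];

  public function setEmptyDestinationProperty(string $property): void {
    $this->emptyDestinationProperties[] = $property;
  }

  public function getEmptyDestinationProperties(): array {
    return $this->emptyDestinationProperties;
  }

  // Method name as proposed in this issue; the implementation is assumed.
  // Removes every occurrence of the property and reindexes the list.
  public function unsetEmptyDestinationProperty(string $property): void {
    $this->emptyDestinationProperties = array_values(array_filter(
      $this->emptyDestinationProperties,
      static fn (string $flagged): bool => $flagged !== $property
    ));
  }

}

$row = new SketchRowWithUnset();
$row->setEmptyDestinationProperty('uid');
$row->setEmptyDestinationProperty('mail');

// Clear the flag for the ID while leaving other empty properties alone.
$row->unsetEmptyDestinationProperty('uid');
$remaining = $row->getEmptyDestinationProperties();
```

After the call, only 'mail' remains flagged, so a destination that nulls out empty properties on update would no longer touch 'uid'.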
Data model changes
N/A
Release notes snippet
| Comment | File | Size | Author |
|---|---|---|---|
| #3 | drupal-unset-empty-row-destinations-3278368-3.patch | 2.65 KB | codebymikey |
Issue fork drupal-3278368
Comments
Comment #3
codebymikey commented
Attached a patch adding the \Drupal\migrate\Row::unsetEmptyDestinationProperty($property) method.
Comment #6
amaisano commented
Not sure if this is 100% the same problem, but I solved this by completely rejecting migration mappings from entering the database if arbitrary criteria were not met. I did this by extending the skip_on_empty plugin into what I call exclude_on_empty.
That FALSE argument says "don't save mapping to DB." So any migration can pass a NULL/empty value to this plugin if the entity being imported isn't "ready" yet.
Comment #8
danflanagan8
Here's a process-plugin approach to the same basic problem: #3446932: Add set_on_condition process plugin
I'm not sure it's a good idea, but I wanted to throw it out there.
Comment #11
codebymikey commented
Thanks @danflanagan8, I think this issue addresses a different situation, which is related to the fact that the empty value gets out of sync with the actual state of the row field.
I've updated the issue summary accordingly.
Comment #12
dylan donkersgoed commented
I also encountered a problem related to this and codebymikey's solution fixed it for me. I don't think a process plugin would've worked for my use case.
Essentially I have several migrations that would conditionally set the node or user ID (e.g. because the node might already have translations or revisions which I think is a fairly common use case).
So, e.g., I most recently ran into the issue with a user migration that looked like this:
The important section is the uid bit. It will assign the UID for certain users only if the user already exists in Drupal so the migration will map it to the existing user. This is needed to make sure all the user relationships are preserved in revision history etc. even for the users that were already manually created. I've run into the problem in several other places though for migrations that handle node revisions/translations.
The problem was when I ran a migration with the --update flag I would get numerous errors like:
[error] SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '775cbb48-ed5d-4fb0-aa15-09b315905fe7' for key 'user_field__uuid__value': INSERT INTO "users" ("uuid", "langcode") VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1); Array
(
[:db_insert_placeholder_0] => 775cbb48-ed5d-4fb0-aa15-09b315905fe7
[:db_insert_placeholder_1] => en
)
This happened whenever the uid was configured in the migration, even if a process plugin skipped processing that field and no value was set. I think this is because the migration would somehow end up trying to do a mix of updating the already existing entity, based on the existing destination IDs, and creating a new one, because the uid was not set.
Removing the empty destination property seems to have fixed the issue, but there is no way to do that without codebymikey's patch.