See #2748609: [meta] Preserving auto-increment IDs on migration is fragile.
Problem/Motivation
Preserving serial IDs in the upgrade process (e.g., source site nid 53 => destination site nid 53) means that if any content with the same ID is manually created on the destination side before upgrade, it will be overwritten with data from the source. Since users may not take proper precautions before performing a migration, we could try to detect at import time whether the incoming source data would overwrite pre-existing non-migrated content.
Proposed resolution
Add a process plugin to vulnerable migrations which checks to see if the ID being migrated already exists on the destination side and was not migrated (is not in the migration map table). If the source entity would overwrite a non-migrated entity, throw an error.
- Maintain data integrity
Destination nodes will not be overwritten. Some source records will not be imported.
- No surprises
If someone hasn’t read the docs, they may be surprised that they only got a partial migration.
- Provide a path forward
Difficult path forward - manually migrate the content that got rejected?
- Preserve URLs (e.g., node/17).
Only for the content which was successfully imported. If you’ve manually created, say, node/17, then node/17 on the destination site will show different content from node/17 on the source site.
- Minimize technical debt.
A fairly straight-forward process plugin.
- Minimize effort to implement.
A fairly straight-forward process plugin.
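The check described in the proposed resolution can be sketched as follows. This is only a language-neutral illustration in Python, not the actual plugin, which would be a PHP migrate process plugin. The names `check_id_conflict`, `IdConflictError`, `destination_ids`, and `migrated_ids` are hypothetical, and the two sets stand in for a lookup of existing destination entities and of the migration map table.

```python
class IdConflictError(Exception):
    """Raised when an incoming row would overwrite non-migrated content."""


def check_id_conflict(source_id, destination_ids, migrated_ids):
    """Return source_id if it is safe to import, otherwise raise.

    destination_ids: IDs that already exist on the destination site (assumed lookup).
    migrated_ids: destination IDs recorded in the migration map table (assumed lookup).
    """
    # A conflict exists only when the ID is present on the destination
    # and was NOT put there by a previous migration run.
    if source_id in destination_ids and source_id not in migrated_ids:
        raise IdConflictError(
            f"ID {source_id} already exists on the destination and was not migrated"
        )
    return source_id


# Hypothetical data: node 17 was created by hand, node 53 came from an earlier run.
destination_ids = {17, 53}
migrated_ids = {53}

results = {}
for nid in (17, 53, 99):
    try:
        check_id_conflict(nid, destination_ids, migrated_ids)
        results[nid] = "import"
    except IdConflictError:
        results[nid] = "rejected"

print(results)  # {17: 'rejected', 53: 'import', 99: 'import'}
```

In this sketch node 53 is re-imported because the map table shows it came from the source, while the manually created node 17 is rejected rather than overwritten, which matches the partial-migration outcome noted in the list above.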
Remaining tasks
Implement it.
User interface changes
None.
API changes
None.
Data model changes
None.
Comments
Comment #2
cilefen commented:
@xjm, @alexpott, @effulgentsia, @lauriii, @catch and I discussed this issue at a recent meeting and agreed that since the parent is critical, we would make each potential solution major priority.
Comment #3
rakesh.gectcr
Comment #5
rakesh.gectcr
Isn't this similar to, or a duplicate of, https://www.drupal.org/node/2876085?
Comment #6
quietone at Acro Commerce commented:
Similar in intent: both provide useful information to the user about ID conflicts.
#2876085: Before upgrading, audit for potential ID conflicts does that by auditing migrations before they are run. I've been testing that via the UI and it will display a message if there is existing content on the destination site. Here, by contrast, the proposed resolution is to add a process plugin that will throw an error if there is an ID conflict.
I do wonder whether the work on the other issue informs or changes this issue.
Comment #7
rakesh.gectcr
@quietone, thank you for the confirmation.
Comment #8
rakesh.gectcr
Just confirming: we already have the EntityExists process plugin. Can't we use that instead of creating a new one? If we are creating a new one, what should it be named?
Comment #9
rakesh.gectcr
Comment #10
rakesh.gectcr
I also see that there can be some difference, because we are planning to check both the migration map table and whether the entity exists. I still need more confirmation and clarification to be convinced.
Comment #11
quietone at Acro Commerce commented:
Yes, I think entity_exists will do the job.
Comment #12
heddn
I think this is a duplicate of #2876085: Before upgrading, audit for potential ID conflicts. Should this be closed as a duplicate?
Comment #13
cilefen at Institute for Advanced Study commented:
Not as per #6.
Comment #14
heddn
We have the 'entity_exists' process plugin, which returns the ID or FALSE, and the 'log' process plugin. We also have skip_on_empty. If you chain these together, then it will skip and you can log things. So I think this could be closed as outdated, since not all of these things existed when this was opened.
Comment #15
heddn
Dropping priority here. With the critical blocker in core now, this isn't as important. If it is necessary, the building blocks for doing this already exist. And as there is no patch, changing status.
Comment #21
quietone commented:
@rakesh.gectcr, Hi! How are you? I'm doing some triage and since you haven't gotten to this I am going to un-assign you.