Closed (works as designed)
Project:
Drupal core
Version:
9.5.x-dev
Component:
migration system
Priority:
Normal
Category:
Task
Assigned:
Issue tags:
Reporter:
Created:
3 Jul 2020 at 13:52 UTC
Updated:
5 Sep 2022 at 15:56 UTC
Jump to comment: Most recent, Most recent file
Comments
Comment #2
wim leersComment #3
benjifisherI have not tested, but I am afraid that the proposed resolution will lead to other problems.
Suppose that the files with
fid1, 2, 3 haveuidequal to 3, 2, 1. If stub users are created during thed7_filemigration, then themigrate_maptable will map 1 to 3 and 3 to 1, right? What happens when we get to thed7_usermigration, which hasuid: uid?I hope I am wrong, since then it will be much easier to fix this problem.
Either way, we need some test coverage before we can accept a patch on this issue.
It might be easier to add the tests after we finish #2974445: [Meta] Refactor Migrate Drupal UI Functional tests.
Comment #4
wim leersInteresting question.
But isn't the same true when nodes with
nid1, 2, 3 haveuidequal to 3, 2, 1?Comment #5
benjifisherThat would be a problem if the node migration ran before the user migration and the node migration used
migration_lookup.The only place I see
migration_lookupused in core node migrations is for D6 body formats:When I do a custom migration, I never try to preserve entity IDs, so I use
migration_lookupeverywhere. The core migrations do not work that way.Comment #6
wim leers#5: Fascinating! I never realized that. 😅 So that is intentional? Is this documented somewhere?
Do you see an alternative solution that can solve the problem I reported?
Comment #7
benjifisherI do not know if it is documented anywhere. While thinking about this issue the other day, I did notice this comment in
core/modules/file/migrations/d7_file.yml:IMO the right fix is to change core migrations to do what I do in my custom migrations: do not try to preserve entity IDs. But then you have to worry about
<a hfref="/node/123">About Us</a>in the Body field. We added the DOM process plugins to the Migrate Plus module to handle that. I guess we would also have to bring those into core.If we are going to stick with the core strategy of preserving entity IDs, then we have the chicken-and-egg problem that managed files have a
uidfield and users have auser_picturefield. I see that thed7_usermigration usesmigration_lookupfor that, which is odd becaused7_filepreserves the file ID. I guess the point is thatmigration_lookupwill tell us when a file (user picture) was not migrated.Maybe the simplest solution is to use
default_valuein thed7_filemigration to setuid = 1for all files. Then addd7_file_depends_on_d7_userthat only has two fields in theprocesssection:fid(usemigration_lookupbased ond7_file) anduid(usemigration_lookupbased ond7_user). If we skip any rows, then at least they will have a validuid.Comment #8
wim leersMuch worse: then you need to worry about https://www.w3.org/Provider/Style/URI 😨
I'll need to think more about the last two paragraphs you write in #7. I'm not sure I fully comprehend the consequences yet.
I will already say that I am surprised that your analysis seems to confirm that there is a chicken-and-egg problem WRT users and files (on sites where users have a user picture and hence a file), and that core has not yet solved this very common scenario 🤔
Comment #9
quietone commentedI know the related issue is closed as won't fix but it provides some related history.
Comment #11
quietone commentedWhy set all the files to be owned by uid 1 if the owner is going to be changed in a later migration?
Adding a new migration 'd7_file_depends_on_d7_user' looks like the way to solve this. A new source plugin that just gets files not owned by user 1 may be worth the effort. Then, all that is required is a new source plugin test and new Kernel migration test. All the files in the d7 fixture are owned by uid 1 so that should be updated in the Kernel test before the migration is executed.
I think d6 has the same problem, so that can be done in this issue as well.
Comment #13
wim leers@benjifisher: do you agree with @quietone's proposed plan of attack in #11?
Comment #15
omkar.podey commentedUpdated patch to cover d7_file_private migration.
Comment #17
mikelutzIf your d7 database has files owned by users that have been deleted, then the data integrity problem exists in D7, and it is not the responsibility of the core migrations to manage that or fix it.
Regarding #3 that's not an issue. The destination ids on stubs aren't random or incremental. the stub is created by running the user migration with only the uid in the source data. The destination uid is managed by the migration, i.e. the uid:uid that is used in the user migration is used for both the stub and the actual user migration. Stubs would be rather useless if they weren't creating destination ids the same way that the migration is going to when it runs.
The bigger problem here is that you would end up creating stubbed users that would never be migrated because they don't exist on D7. If I recall correctly Wim is not a big fan of this either( #3156730: Stubs should only be created if the referenced source row actually exists ) If you combine that with this one, Then you are most likely going to end up reassigning these files to anonymous. That's not a decision the migration system can make. Whether these files should be deleted, assigned to uid 0, assigned to uid 1, or left assigned to the deleted user id is a question the administrator needs to answer, but more importantly It has nothing to do with the migration. The admin could fix it in d7, fix it in d9, or ignore it, but the migration system is about duplicating the d7 content in d9 faithfully to the best of its abilities. If that is files owned by deleted users in d7, then they should be files owned by deleted users in d9.
Regarding validation, if you choose to create custom migrations that use validation to try to fix data integrity errors on D7, by all means do so, but you are responsible for the whole process of stubbing references and all that. The core migrations are designed to copy the d7 data over as-is.
Comment #18
anybody@mikelutz: Totally agree with your points in #17!
In my comment here over at #3307506: "Missing migrations upgrade_d7_field_instance" because of duplicate field_name_bundle entries in D7 field_config_instance I suggested the following, where the reason is also an inconsistency in the Drupal 7 source site:
So would it perhaps make sense to discuss any of the suggestions in a separate issue?
We're very technical, while by far not every Drupal 6 or Drupal 7 website owner is a developer and I guess the migration is still very frustrating. With such a preflight-check module (or whatever way) we might be able to provide a solution for some of these typical issues in the (also Drupal!) source.
What do you think?