
See #2748609: [meta] Preserving auto-increment IDs on migration is fragile.
Problem/Motivation
Preserving serial IDs in the upgrade process (e.g., source site nid 53 => destination site nid 53) means that if any content with the same ID is manually created on the destination side before upgrade, it will be overwritten with data from the source. Since the core upgrade process does not provide an option to not preserve IDs, we should warn the user before executing the upgrade process of any potential conflicts.
Proposed resolution
Run a process before the actual import which identifies conflict risks - cases where an ID in the content to be imported matches an ID that already exists in the destination.
- Maintain data integrity
If this is a required step, the user can choose whether to proceed and overwrite any conflicts, or take an alternative approach (per the docs). - No surprises
They are told exactly what content could be affected by conflicts. - Provide a path forward
Yes. - Preserve URLs (e.g., node/17).
Yes. - Minimize technical debt.
Audit code should be reasonably isolated and maintainable. - Minimize effort to implement.
We’d need to first do what the import process normally does, identify and instantiate the necessary migrations. For each migration, we need to identify the destination ID field, see if that field is mapped in the process section, look for any of destination ID values which are not in the migration’s map table, and see if any of *those* IDs are in the source. Oh, but we need to consolidate this across all migrations to the same entity type… Ugh.
Remaining tasks
Code it- Architectural Review
- Usability Review
Change record
User interface changes
The upgrade summary page (after entering db credentials, before executing migrations) should run the audit and display any conflicts.
API changes
EntityContentBase destination plugin and Sql Id map plugin are now implementing HighestIdInterface.
New classes:
- AuditorInterface
- IdAuditor
- AuditResult
- AuditException
- HighestIdInterface
Data model changes
None.
Comment | File | Size | Author |
---|---|---|---|
#257 | interdiff-2876085-254-257.txt | 565 bytes | phenaproxima |
#257 | 2876085-257.patch | 62.87 KB | phenaproxima |
#254 | interdiff-2876085-245-254.txt | 1.33 KB | phenaproxima |
#254 | 2876085-254.patch | 62.75 KB | phenaproxima |
#251 | Selection_011.png | 49.46 KB | quietone |
Comments
Comment #2
vasi CreditAttribution: vasi at Evolving Web commentedAt the triage, I discussed the possibility of making this much simpler:
The current approach requires looking at all of the source, process and destination in complex ways, to identify exactly which new IDs will be migrated.
Instead, we could simply check "are there any new destination entities of certain types since the last migration (if any)?" If so, warn the user that these new entities might be overwritten.
This risks false positives: It could be that you have created nid 5, and are about to migrate nids 3, 4 and 6. Even though the migration would work, the simple check will warn you. But this case sounds quite rare, and careful users who know they fall in this case can click through the warning.
Comment #3
cilefen CreditAttribution: cilefen commented@xjm, @alexpott, @effulgentsia, @lauriii, @catch and I discussed this issue at a recent meeting and agreed that since the parent is critical, we would downgrade this to major but also make the other related children major.
Comment #4
heddnWe discussed this in the weekly migrate meeting and mikeryan, quiteone and heddn all agreed with vasi's approach as outlined in #2.
Comment #5
vasi CreditAttribution: vasi commentedOk, so here's the proposed implementation plan:
</code>$id_map->getHighestId() == $destination->getHighestId()
. If not, throw an exception. (Maybe we want to do this as a global pre-flight check instead, so as to not waste user time?)Comment #6
mikeryanGood thought - and if/when we do #2876086: Manipulate auto-increment values to avoid conflicts, we'd implement this on sources as well, so we'd know how high to set the auto-increment value.
Yes, that's the plan as specified in the issue summary here - this is a pre-flight check, runtime checks would be done in #2876090: Warn if migrated content will overwrite manually-created content.
Comment #7
vasi CreditAttribution: vasi commentedHere's a teensy implementation. Still needs tests, and vetting.
Comment #8
vasi CreditAttribution: vasi commentedFor easier review: https://github.com/drupal/drupal/compare/8.4.x...vasi:2876085
Comment #10
vasi CreditAttribution: vasi commentedNote also that the patch doesn't really do a good job of deciding when to audit IDs. We'll probably want to add 'preserve_ids: true' to all the relevant YAML files? Or we can try to use heuristics to do the equivalent automatically.
Comment #11
vasi CreditAttribution: vasi commentedAt migration triage, it was decided to prefer a top-level property for this. Also, the patch needs to be modified to handle multiple migrations to the same destination type (eg: nodes of different types).
Comment #12
heddnNit.Space.
Why is this needed? Should this be public and on the interface?
Can we just check !empty()?
Why is this public but ECB is protected?
Nit.Space.
Nit. Period at end of sentence.
Comment #13
heddnComment #14
heddnNits in my review are now addressed and we're now looking at a top-level attribute. Next up is manual testing and modifying to handle multiple migrations to the same destination type (eg: nodes of different types).
Plus we'll need automated tests.
Comment #15
quietone CreditAttribution: quietone commentedJust a wee manual test. Installed D8 using my script which will then enabled migrate modules and admin_toolbar. Then went to '/upgrade' and completed the form for a D7 site. The result was the warning message about entities may be overwritten. Yes, there are shortcuts but 'preserves_id' is false for the shortcut migration.
Maybe change to isset? And since ids is plural we should change preserves_ids to preserve_ids.
I also think the working needs improvement. But it is too late in the day to do anything about that.
Comment #16
maxocub CreditAttribution: maxocub commentedI would also suggest, and I think it was mentioned in a previous meeting, that we should use an 'opt in' tag instead of an 'opt out'. Something like
audit_ids: true
to audit ids instead ofpreserves_ids: false
to not audit ids.Comment #17
heddnWhat a bummer. That bumps the diff from 15k to 97k. We'll have to work on a review only patch from here on out.
Comment #18
maxocub CreditAttribution: maxocub commentedAbout the multiple migrations to the same destination type (eg: nodes of different types), I don't understand what needs to be done.
If there's an id conflict with migrated nodes, does it makes a difference if it's an article or a page? All of them are stored in the same table and have the same id field.
Comment #19
heddnFor reviewers, above are the relevant changes. The rest are all yaml changes in #17/#18. #18 is a new patch where I backed out the audit_ids config on some test assets:
core/modules/migrate/tests/modules/migrate_external_translated_test/migrations/migrate.migration.external_translated_test_node.yml
core/modules/migrate/tests/modules/migrate_external_translated_test/migrations/migrate.migration.external_translated_test_node_translation.yml
core/modules/taxonomy/tests/modules/taxonomy_term_stub_test/migrations/taxonomy_term_stub_test.yml
Comment #22
heddnTo generate the review.txt, I used the follow:
I also agree with #18. We don't care about the types. Probably up next is to fix the UI tests. What's after that?
Comment #24
heddnComment #25
quietone CreditAttribution: quietone commentedWhy have 'audit_ids: ' defined in all migrations? Surely we can have a default? Will it be the expectation that custom migrations add it? Can we use 'no_audit_ids: true' when we don't want to audit?
Comment #26
maxocub CreditAttribution: maxocub commentedI don't think it's necessary to put the audit_id tag everywhere. My understanding was that we were going to put it only in the migrations where we know it is needed.
If we don't want to audit IDs, we don't need to put
audit_ids: false
since this line evaluate as false if the tag is not present:Comment #27
maxocub CreditAttribution: maxocub commentedFor now, only EntityContentBase is an instance of MigrateIDAuditInterface, so only migrations with a destination extending EntityContentBase should use audit_ids (if needed). All other migrations that are not content will never have there IDs audited, so the audit_ids tag is really not necessary there.
Also, I tested the patch manually with the D7 fixtures and I found that if you enable the Forum module you will have a taxonomy term (General discussion) created on the destination site, and so you will be presented with the conflict warning page. That is why this line was needed in the tests:
Is this a problem? It might confuse people.
Comment #28
quietone CreditAttribution: quietone commentedYes, the new screen needs work. Do we work on it some more ourselves or tag for Usability review?
Comment #29
vasi CreditAttribution: vasi commentedSorry I haven't had time to help with this lately!
Unless I'm missing something, the patch still needs to be rearranged to correctly handle multiple migrations with the same destination types. Eg:
There are several potential fixes. I think the most resilient one would be to aggregate migrations with the same destination plugin ID—then compare
$destination->highestDestinationId()
with max(array_map($migrations, 'highestIdField')) (pseudocode). Sound reasonable?We probably also have to somehow handle revision IDs, ick.
Comment #30
heddnWe discussed this in the weekly migrate meeting today. The summary of the conversation: forum module actually introduces the situation that we are trying to resolve with this audit. It inserts a term into a new vocabulary with a term name of 'General Discussion', probably as term #1. This is the topic for the initial forum. If you migrate over from D6/D7 and you have term data (which you probably do) then it will overwrite term #1 with the data that was in D6. This results in your General Discussion term magically getting replaced as 'Cats' or whatever the term was in D6/D7.
To address this, we remembered we had initial thought that we would link out to a handbook page discussing some potential situations and solutions. Forum module is a bit of an edge case, it isn't as popular as some of the other modules. It might just be easiest to add it as a scenario to the docs.
But one thing we didn't know in the meeting was if Forum was unique here. So, let's do some additional testing to see if any other modules in core do similar things.
Comment #31
heddnNext steps
Comment #32
maxocub CreditAttribution: maxocub commentedBack to NW
Comment #33
krystalcode CreditAttribution: krystalcode at Acro Commerce commentedAdded a section at https://www.drupal.org/docs/8/upgrade/known-issues-when-upgrading-from-d..., based on my understanding of the situation. Feel free to make any corrections or additions.
Comment #34
heddnre #27: I found a listing of entity things over in #2711099: Categorize migrations according to their type. Going to work on that and adding a test.
Comment #35
heddnok, this only templates from the list above. To reduce the list from *all* templates, I took the list from the related issue and did a quick search/replace to make a series of commands like:
git reset HEAD core/*/action_settings.yml
, etc.Interdiff attached if you finding it interesting. Next up is tests.
Comment #36
heddnComment #38
maxocub CreditAttribution: maxocub commentedFirst attempt to handle multiple migrations to the same destination (d7_node:artice, d7_node:page).
Tests will follow.
Comment #40
maxocub CreditAttribution: maxocub commentedOups
Comment #42
maxocub CreditAttribution: maxocub commentedSince this will be a hard issue to review, would it be OK to concentrated on the code and tests here, with only a few audit_ids to test with, and to open a follow up to tag all migrations that needs auditing? It think it would make our job and the reviewer's job a lot easier.
Comment #43
heddn@maxocub to create a review patch, see #22 for instructions.
Now a review of #40.
I see what we are doing here now. I was curious why this was needed. So bear with me while I verbalize my thoughts. If there is a page and page_custom content types in the source, there would be 2 derivative migrations on the destination aligned with those 2 sources. However, if someone built a custom migration and piped those two sources into the same page content type (because that's what they want them to be on their d8 site), then there will be problems. There would be 2 source bundles but only a single mapping table. Let's call this scenario #1.
Alternatively, some builds a completely custom solution from CSVs and has only a single source but multiple bundles on the destination... then we'd have multiple bundles but only a single destination map. Scenario #2.
Last example. A single page content type on the legacy site. Custom migration. The single source is getting split into 3 content types on the destination with 3 different migrations and ignoring (skipping) content that isn't destined for that content type. Perhaps its breaking things up by media type so videos go one place. Audio another direction and photos the last content type. Scenario #3.
All the above are valid scenarios and aren't even that uncommon. Let's explore why bundle is important now. I'm not sure it is.
In any scenario, if the highest nid, media id, term_id, etc is greater than any mapping table's, we'll throw a warning. It indicates that the destination entity has more entities created than which came from a migration. In many cases, this is fine. Especially if you aren't mapping ids from old system to new. But if you mapping entity ids, then it does become important.
Comment #44
heddnLet's see if we can resolve the test failures. Interdiff and changes against #36 (for now). If we decide to, we can add in #40 at a later point.
And from a UX perspective, I've also grabbed a screenshot of the conflicts page. It now has a link off to the recent handbook page entry. Maybe not the prettiest, but I'd don't think its too ugly either. It serves the purpose for solving a critical. If we want to bikeshed much on it, let's fork that to a follow-up, please.
Comment #46
maxocub CreditAttribution: maxocub commentedQuick review of last patch.
1.
Extra line.
2.
Should use {@inheritdoc}.
3.
Should be 'we can have conflicts".
4.
This is maybe too much of a nit, so fell free to ignore, but the coding standard says to use same types of variable names (snake_case / lowerCamelCase) inside a file.
5.
Unused use statements.
6.
Should use {@inheritdoc}
7.
This is already tested in the parent class.
Comment #47
heddnRe #38 & @maxocub: I think I understand the problem now. To solve, I think we need to inspect all the migration destinations and for all that "match" the entity type, see what they have recorded in their mapping table. We loop over all migrations in #2711099: Categorize migrations according to their type, so a similar approach is what I'd expect here.
Next up, I'll work in responses to #46. I'm leaving #38 /
#22 for a later point. Maybe even by someone else.
Comment #48
heddnSetting to NR for test bot, but we still need to address #22.
Comment #50
maxocub CreditAttribution: maxocub commentedHere's a second POC for handling multiple migrations to the same destination. What do you think? (Still needs tests)
I also changed back the assertSame to assertEquals to make the migrate_drupal_ui tests pass.
Comment #53
maxocub CreditAttribution: maxocub commentedHere's fixing the Unit tests.
@heddn: I'm not sure I like the idea of extending MigrateUpgrade6Test & MigrateUpgrade7Test for the conflicts test. Those are so long to run and they often need to be updated when we modify the fixtures. If we extend them, it means two times as long to run and two more places to update the entity count.
I did spent a lot of time trying to make them pass, and when I finally got the counts right, there was this block migration that was failing due to the forum module not being enabled.
If you're OK with that, I'm going to try to work on dedicated tests.
Comment #55
heddnre #53 we might have to do that, but then we'd be dropping a "test all the things" test case, more or less. Let's try a little longer to see if we can make things pass. Or if you have an idea to keep the "test all the things" and a non-conflicts test case, that might be another option.
Comment #56
maxocub CreditAttribution: maxocub commentedSorry that I didn't uploaded my patch with the updated counts, but it's not just the field_storage_config that needs updating, but also field_config, taxonomy_vocabulary, rdf_mapping, base_field_override (and maybe another one, I don't remember) and you have to lower it in the test where Forum is not enabled and keep the initial value in the test where Forum is enabled. ;)
If we keep the "test all the things" scenario, we could just test that there is conflicts, but not migrate and count entities, and in other tests, do the migration and entity count but on selected migrations?
Comment #58
maxocub CreditAttribution: maxocub commentedHere's the correct entity counts.
There should now be only one test failing (MigrateUprade6Test) with this error:
Maybe it's a problem with the bloc migration, but it's unable to migrate the forum blocs if forum is not enabled.
Comment #60
maxocub CreditAttribution: maxocub commentedI started to work on tests, using KTB instead of the slow BTB. This is just a begining, I plan on adding more of them.
The interdiff is a bit bloated because I did a little refactoring of the API to make it cleaner, I think.
Comment #62
maxocub CreditAttribution: maxocub commentedOups, wrong class name.
Comment #63
maxocub CreditAttribution: maxocub commentedI added tests auditing the obvious entity migrations for D6 & D7. I'll add more tests for the other places where we need to audit IDs.
Comment #64
heddnWe should remove this helper function from the interface so we can have this solution work with non-entities as well. Plus, we should look at i18n and revisions.
Comment #65
maxocub CreditAttribution: maxocub commentedI removed the getEntityType() method from the interface and moved things a bit. Now the ID auditor is using the migration's base_id & label for it's report, instead of the entity's id & label, since we want to support more than just entities.
Let's start by auditing IDs of the obvious entity types and then move on to the revisions, translations and other exceptions.
Comment #66
heddn@catch, if we fix this for everything but i18n and revisions but can prove that this solution in the API will work, can we mark the API module stable? Or is the API module's stability dependent on completely solving this for those two pieces as well. I'm hoping we can prove it *can* solve the problem for revisions, but leave the solving of the problem to migrate drupal. And then mark the API module stable a little bit earlier.
Comment #67
heddnComment #68
heddnComment #69
heddnMissed some hunks on the re-roll.
Comment #70
heddnComment #72
webchickInventing a new tag to illustrate that this is a Migrate critical specifically blocking Migrate API's stability. (Please advise if this is wrong.)
Comment #73
heddnLet's see if that fixes the test failures.
Comment #75
heddnAdded initial support for revision. One question I have for i18n, do we need to make the audit API support more than a single "key". I have a feeling we might, but perhaps someone else can confirm?
Comment #77
heddnComment #78
heddnSorry for all the noise. Not sure why the tests are failing. And I cannot reproduce it locally.
Comment #81
heddnThis should fix the tests. Interdiff attached that goes back to #75.
'Entities may be overwritten' != 'Entities may be overwritten?'
Comment #83
maxocub CreditAttribution: maxocub commentedRe #75
Since translating a node does not create a new node, the highest ID does not increase. So if we only support one key (the nid) the audit won't show any conflicts in this scenario:
1. You migrate d7_node first
2. You add translations manually (no new nids, only new langcodes)
3. You migrate d7_node_translation. There should be conflicts even though there's no new nids.
So I guess only supporting one key (the nid) is not sufficient, but how can we find the highest destination ID when we are dealing with langcodes, that I really don't know.
Comment #84
heddnThe other thing to consider is that paragraphs uses a composite identifier of an id and its revision. I'm not sure if the revision is sequential and you can "just" use it, or if we need to handle the both.
Comment #85
heddnFor i18n, we can probably check if the destination supports translation and use that as part of the conflict identification.
Maybe here we just need to check if translations are enabled on the destination and then check if en or es, etc was already inserted into the destination. basically filter getHighestMigratedId by langcode.
I wonder if this should be a break instead of a return. If a single mapping table doesn't exist, it doesn't mean that other mapping tables don't.
Potentially we could filter by langcode. Which would change the interface of this function.
Comment #86
heddnChecked with Berdir and paragraphs, while it uses a composite id, the revision_id is sequential. This answers my question from #84 that getAuditedIdFieldName can just be a single field name's value. We just need to decide if getHighestDestinationId should key its values by langcode.
Comment #87
heddnThis is a blocker for #2859225: [policy, no patch] Mark Migrate (not Migrate Drupal) as Stable
Comment #88
heddnI've opened #2905743: Add UI components of audit/conflicts to drush runner to create the drush equivalent to migrate_drupal_ui's execution of the audit.
Comment #89
heddnThe tests came back green on local. Fingers crossed.
Comment #91
heddnI think we've figured out how to handle revisions in this patch. I could use someone else's eyes to confirm that.
What's remaining is the i18n/translations. That one is proving hard. Per #83, we might have to solve it entirely differently. I'm proposing a very ugly solution, but it would at least keep us moving.
If the destination is translatable, then just warn the user that it *could* result in issues. Link to a d.o. page that discusses some of the technicals. If a site has i18n on it, I cannot think of a scenario where there isn't going to be some custom migration work anyway. So warn and work on fixing it with follow-ups. And for the rest of the world who isn't i18n, they won't even know about this scary part of the Drupal universe.
Comment #92
heddnI've opened a follow-up for the i18n. I haven't currently marked it migrate critical, since I'm hoping that we can just handle i18n translations with documentation, as I'm doing in the attached patch. Tests will need some work as this new approach is untested. Looking for feedback.
Follow-up: #2905759: Before upgrading, audit for potential i18n ID conflicts
Screenshot of the new conflicts page:
Comment #93
quietone CreditAttribution: quietone commented@heddn, @maxocub, @vasi, Awesome work!
Had a read through and found a bunch of nits. There are two code questions, the first item and then another about which assertion to use.
If there is no label nothing will be recorded? I know we check for the existence of labels in core, but if someone adds audit to their migration without a label, there will be no notification.
Remove line to be consistent with previous method.
Remove extra line
s/later/latter/ and s/is/are/
s/Get/Gets/
Isn't this supposed to have a one line summary followed by a blank line and the a more detailed description?
The use of 'should' here doesn't make it clear, to me, what to do. My first thought, was what happens if I don't?
s/should be/will be/
Having the doc here can make one believe that audit_ids is a configuration on the destination, when it is the migration. What is a better place?
Don't think this is throw an error. Just providing a nice warning message in large friendly letters?
Is the phrase, 'to have access to the list of migrations.' needed? It reads funny to me.
At first I thought this foreach was just for derived migrations. It isn't. Can we change the comment to say that it gets the ID from all tables in the list?
Why not use $this->assertEmpty?
Comment #94
quietone CreditAttribution: quietone commentedTested the patch on a clean install and the warning message was not displayed. Canceled that and then used Devel to general taxonomies, users, content and menus. The warning screen displayed but unexpectedly it lists 'files' twice. Maybe one for private files and one for public files? It was a d7 site I tested with. I expected the list to include users as well, I have made 50 users.
Comment #95
heddnAll feedback in #93 and #94 is addressed.
Comment #97
heddnComment #98
quietone CreditAttribution: quietone commented@heddn, thx. Yes, looks like all my points are addresses. Glad to see the audit_id configuration explained in Migration.php. Much better.
Haven't run a manual test, yet.
And, at what point can we get UX folks to look at the warning screen?
Comment #99
quietone CreditAttribution: quietone commentedRan the test and the results look perfect! Both public and private files are shown and so are users. I also generated comments this time, and they are included in the warning notice.
Are translations being done here?
Setting to NW for the change record.
Comment #100
quietone CreditAttribution: quietone commentedUsing the same site I added translations for a node, comment, term and menu. The warning message now includes 'node translations' but not the comments, terms or menu.
Comment #101
quietone CreditAttribution: quietone commentedDiscussed at the migrate meeting and I agree with heddn's suggestion in #91. That is, provide warnings about translations, documentation how to handle it, and make follow up issues. As much as I want to be inclusive waiting to got these 'tricky' issues solved will just take too long. And sites using translations are more likely to need custom migrations.
Also, tagging Usability for the warning screen.
And, from the call I realized that the testing I did above was with a D7 site. The previous warning is correct when migrating from a D7 site.
I repeated the test with s D6 source sites. The warning message is again correct.
Comment #102
heddnbased on #101, I think this can go back to NR.
I think the outstanding things are:
Comment #103
heddnUpdated IS and added a screenshot to it to aid in UX review.
Comment #104
heddnAnd found an issue when looking at that screen. Fixed and posting new patch.
Comment #105
heddnComment #106
phenaproximaThis is pretty nice and clear, but I have a lot of concerns about the current state of the patch, including architectural concerns.
This service has no dependencies or state, so it doesn't need to be a service. Consuming code can simply create a
new MigrateIdAuditor()
.This is a nit, but to reduce the amount of nesting (thus increasing readability), can we do this instead:
This could use more detail -- perhaps a longer description, after the introductory line?
These are both missing descriptions.
I don't understand why we're keying by the base ID only. Wouldn't we want to collect more specific information, and key by plugin ID?
This is a bit messy, and I don't like the repeated calls to getArguments(). Can it look more like this:
Nit: Should be "if the former is greater than..."
The description here seems to conflict with what I'm seeing in buildConflict(). At the very least, buildConflict() needs some inline comments to clarify.
Maybe add a @see comment or two referencing MigrateIdAuditorInterface?
This is a long class name. Seeing as how this is already in Migrate's namespace, would it be kosher to rename this to DestinationAuditInterface?
I'm not sure zero should be an acceptable return value. I would think that, if the concept of a highest ID is not meaningful, the destination should not be implementing this interface.
How is this related to ID auditing? I think this might need more explanation in the doc comment.
Okay, so...as far as I can tell, we are now going to be providing two separate interfaces to do the same exact thing, which is "return the highest value". This surely could be a single interface implemented by both ID maps and destinations: MaximumInterface or something. Not sure how to name it. Hell, maybe even just \Countable would suffice. Then we wouldn't have to implement stuff like getAuditedIdFieldName() on destinations that support auditing.
Point is, I feel like we're adding a lot of moving parts to Migrate indiscriminately. The API is already very expansive and difficult to grok, so I feel like we should be sparing about adding even more moving parts when existing things (like \Countable) could work just fine.
Let's add a @see referencing MigrateIdAuditorInterface.
It'd be preferable to use Prophecy here (also ::class syntax, rather than writing out the entire class name).
Ditto.
There's a lot of commonality between this test and the D6 one. Can we combine some of the shared functionality in a trait?
Is $form supposed to be passed by reference?
Let's not repeatedly call getDestinationPlugin().
Not sure $form should be a reference.
Again, not sure $form should be a reference here.
I don't get why we'd clone this. How about just making MigrateIdAuditor::buildConflict() a public static method, so it can be reused? In fact, we could add that to the interface too.
Comment #107
phenaproximaThis patch refactors #105 and addresses a lot of my expressed concerns. I condensed a bunch of the stuff we had and introduced only one interface for destinations and ID maps to share. Amazingly...this passed the tests on the first try. O_o
Not bothering with an interdiff this time, because the changes are numerous enough make it impossible to read. :)
Comment #109
phenaproxima#107 was rolled against 8.5.x.
Comment #110
jofitzUnable to replicate test fails locally so corrected minor coding standards errors and resubmitting to the testbot. Will look again if the errors occur.
Comment #112
maxocub CreditAttribution: maxocub commentedThe tests are failing because there's two class keys in d7_user.
Before, we were using a tag 'audit_ids', but if we are now using class, I don't know how to fix this. Any idea @phenaproxima?
Comment #113
heddnWould this fix it?
Comment #114
maxocub CreditAttribution: maxocub commentedNice, I haven't realized that IdAuditingMigration was extending Migration, so I guess that will fix it.
Comment #116
heddnHere's a quick review.
Do we really need this? If you are using the IdAuditingMigration, can we just assume a certain set of implementation details? I almost want to throw an exception/warning if these don't implement, rather than just silently skip.
This isn't needed/used any more.
We cannot assume this will be available from the destination, unless we check if it is EntityContentBase, no?
Comment #117
heddnHopefully this picks up the failures and my feedback from #113. I also added an interface for the auditing and moved over the comments to its class doxygen from the base migration plugin class.
Comment #119
heddnComment #121
heddnGood call adding the InvalidPluginDefinitionException, it caught that we were trying to audit field migrations, when that simply won't work.
Comment #123
xjmPer discussion with @catch, we're promoting the "Migrate critical" issues to critical given the importance of a stable migration path at this point in the release cycle. (The framework and release managers will confirm this in a later triage for individual issues.)
Comment #124
phenaproximaI did some detective work and determined, as you guys already did (I should read more) that the patch in #121 is failing tests is because UserMigration extends FieldMigration, which doesn't extend IdAuditingMigration.
I think the solution is to make the contents of IdAuditingMigration a trait, which can fulfill the implementation of an interface (AuditableMigrationInterface is one possibility for what to call that). Then we can use it on anything, without worrying about breaking the inheritance chain. That should fix it. Then we can get this thing in.
Comment #125
maxocub CreditAttribution: maxocub for Acquia commentedAdding a tag so we can see this issue in the kanban.
Comment #126
maxocub CreditAttribution: maxocub for Acquia commentedOups, forgot the tag.
Comment #127
maxocub CreditAttribution: maxocub for Acquia commentedAssigning for fixing the test and adding the trait.
Comment #128
phenaproximaI'm going to reassign this to myself because honestly, my thinking about this patch is kind of scattered and I want to settle down on a preferred architecture rather than send someone else through a labyrinthine maze of my personal thought process.
Comment #129
phenaproximaLet's try this on for size. It passed all Migrate tests locally...
No interdiff because it's significantly different from #121.
Comment #130
phenaproximaAdded doc comments.
Comment #131
heddnLet's move this to another issue. That would let a fairly trivial change get fixed pretty easily.
Nit: sprintf would make this easier to read.
Won't this cause d6_node:page and d6_node:blog to overwrite each other? With a key of d6:node.
This isn't on the interface, so we cannot trust it will exist. Ah, I see. We are checking later on the form if this is an ECB. That's fine.
Does this cause a BC break? It could break a lot of contrib tests as we'll have to add more to the mock object. We'll break BC if we need to, but I think this is the first real breaking change.
This means either needs work to add the issue to fix this or we need to add the logic now.
Comment #132
quietone CreditAttribution: quietone commentedTo me this sounds like the whole approach is different. Can you say a few words about that?
Comment #133
phenaproximaThe whole approach is not enormously different; there are just enough changes between the two patches that there wasn't going to be much benefit to an interdiff. I can post one anyway if you want.
Comment #134
heddnI've added #2916642: Private and Public file migrations use the same name. We can still keep those changes in this patch, but let's hope that the follow-up will get committed first.
Comment #135
quietone CreditAttribution: quietone commented@phenaproxima, no need for an interdiff. Thanks for the offer.
Comment #136
phenaproximaAddressed all of @heddn's points in #131. Except for:
Yes, it will, but since both migrations have the same destination, and that destination's max() method is not filtering by node type, conflicts detected for any given derivative of d6_node will be detected for any other derivative. All migrations that are opting into auditing follow a similar pattern. So it makes sense for the base ID to be used as the key for the conflicts.
I'm un-tagging this as a BC break, because...it doesn't break BC.
Comment #137
quietone CreditAttribution: quietone commentedI read the code (skimmed the changed to all the yml) but didn't apply the patch and test this time. Most of what gives me pause is the language in the displayed messages.
Interesting. Can this use DERIVATIVE_SEPARATOR?
Just a bit confused by the names. To me, i18n refers to the i18n module. But, here i18n is being used to identify 'translation' destinations. If one of these is for a destination then what is conflicts? I guess it is content_conflicts and translated_content_conflicts.
Can we expect all users of the UI to know what an entity is?
This should use the same language as on /upgrade.
s/brand new sites/clean and empty new install of Drupal 8.
And 'sites with previously upgraded data' conflicts with having a clean and empty new install.
Ditto about entities.
s/translatable entities/translated content/
Again with entities
quotes not needed.
Comment #138
heddn#2916642: Private and Public file migrations use the same name landed, so this needs a reroll.
Comment #139
phenaproximaThose name changes were removed in #136, so I don't think we need to reroll this...
Comment #141
maxocub CreditAttribution: maxocub as a volunteer commentedIt needed a re-roll because of the change in context lines. Here it is.
Comment #142
jofitzAddressing @quietone's review in #137.
Comment #143
jofitzResponse to #137:
Comment #144
maxocub CreditAttribution: maxocub as a volunteer commentedI very much like the new approach. Every thing seems clear. I just have a few nits and a question:
Comment should be on next line.
I don't think that's needed anymore.
So I understand that this is for not breaking BC, but just out of curiosity, when will this @TODO be addressed? Drupal 9?
Comment #145
phenaproximaI think it'll have to be.
Comment #146
phenaproximaAddressed points 1 and 2 from #144.
Comment #147
jofitz#146 addresses the first 2 points from #144, #145 addresses the 3 point. Therefore this can be moved to RTBC.
Comment #149
phenaproximaFailed to apply? Methinks a reroll is in order.
Comment #150
maxocub CreditAttribution: maxocub as a volunteer commentedHere it is!
Comment #151
heddnBack to RTBC. I think this was caused by #2905227: Migrate UI: Improve 'Review Upgrade' page UX
Comment #152
larowlanThis is progressing well - nice work folks
This is a bit nasty. Are we sure we want to embed this kind of deep knowledge of translatable markup behaviour here? What is the cost of just using
$conflict[$base_id] = (string $label;
I think the name here should be a bit more verbose, e.g
getMaximumId()
, max doesn't really convey what is going on that clearlynit: the second sentence here is disjointed.
suggest: 'If the highest destination is greater than the highest source ID, a warning will be displayed that entities might be overwritten'
intended change?
What if there is more than one ID field? Do we care? Should we throw an Exception because all bets are off?
nit, ===
let's do that here
nit ===, these are strings
the highest of the highest?
out of scope/unintended?
nice
missing a docblock?
duplicated? could go in a trait and share between both tests?
sidenote: this form is getting rather large now. follow-up to split it into separate helper classes for each step for maintenance sake?
If we have an interface for auditors, and the description above effectively says 'could audit in any way they see fit', how does one use a custom auditor? Should this be done in an extensible way?
ah nvm comment above about protected->public change
intended?
nit ===, the parent casts the return to an integer so we should be type-safe here
Also, looks like we still need a change record?
Comment #153
heddnWorking to address feedback in #152.
Comment #154
heddnAll but #5 from #152 is now addressed. I agree we need to handle multiple destination ids. Paragraphs requires it. I just don't have time to work on it right now. I'm figuring we should do something similar to EntityContentBase::getIds().
For #14, I've opened #2918761: Break up MigrateUpgradeForm into smaller forms
Setting to NR for testbot. We, still need to add a CR and address #5.
Comment #156
maxocub CreditAttribution: maxocub as a volunteer commentedThis should fix the failing tests.
These fails show that this BC break might cause a lot of test failures in contrib, as mentioned in #131.5.
Comment #157
maxocub CreditAttribution: maxocub as a volunteer commentedThis property should be in lowerCamelCase.
Comment #158
maxocub CreditAttribution: maxocub as a volunteer commentedI don't know if we can do that, but one way we could avoid the BC break is by changing the visibility of the getMigrationPluginManager() method on the Migration Plugin:
If it was public, the Sql class could have access to the manager without changing it's constructor.
Is that something we can do or does it have a very good reason for being protected?
Comment #159
quietone CreditAttribution: quietone commented#152.14 - Glad to see this. I've been wanting to do that for some time. Even tried when I was just learning but found I prefer migraty stuff to forms.
Comment #160
maxocub CreditAttribution: maxocub as a volunteer commentedBack to NW for #152.5 and the change record.
Comment #161
heddnre #158, one good reason to not do that is getMigrationPluginManager is not on the MigrationInterface. So you cannot be always sure that it will be available. We're stuck with either not DI and saving BC or doing DI and breaking BC.
Comment #162
heddnLet's see how this fares with tests. I've picked up #152.5
Comment #163
maxocub CreditAttribution: maxocub as a volunteer commentedWe should do two change records, one for the BC break (Sql class) and one for the new audit step.
Comment #164
heddnTagging for BC break. The alternatives are to add an TODO and open a follow-up issue for 9.x to fix this more better.
Comment #165
larowlanI think if we're breaking BC just for DI, we should just use the singleton behind a protected accessor with a todo to resolve in 9.0
Plugin constructors are considered internal, so we don't have to retain BC, but providing a change-record is a considerate thing to do.
We had this discussion recently, where I pointed out that those sub-classing plugins should use setter methods from the factory.
This means you can extend a plugin, and be insulated against changes in the parent constructor.
Comment #166
maxocub CreditAttribution: maxocub as a volunteer commentedMy understanding of #165 is that we can either break BC if we provide a nice change record, or not break BC and add a todo for Drupal 9.
So let's not break BC since Migrate is in beta. Here's the follow up for Drupal 9: #2919158: Add the MigrationPluginManager to Drupal\migrate\Plugin\migrate\id_map\Sql.
(We can still go back and forth with this BC / no BC thing if everyone is not on board.)
Comment #167
heddnI like it. But I cannot RTBC.
Comment #168
quietone CreditAttribution: quietone commentedI started reviewing last night and now have time to finish and comment.
From the review in #152, everything has been done, except #4. That was a asking a question which has not been explicitly answered. The question was if the change from protected to public on isTranslationDestination is intended. The answer is yes. That method is part of the destination plugin EntityContentBase and until now it was only needed by destinations. The auditor needs to know about translations to function properly.
That leaves responding to #165, which maxocub did and heddn likes. I don't quite get that so will refrain from comment.
I tested the patch, with D7 and 1 article. A screen shot is attached. I am still concerned about the text. I think it is a bit long and still mentions entities. And 'Conflicting content' is a strange way to say, 'these may be destroyed'. What is the way to get UI input to this? Or do we do that in follow up? Sorry, marking as NW for this.
Updated the screenshot in the IS to use this one.
Comment #169
quietone CreditAttribution: quietone commentedOh, and why 'highest' and not 'maximum'?
Comment #170
maxocub CreditAttribution: maxocub as a volunteer commentedRe #169: I'm not sure which one is better, 'getHighestId()', 'getMaximumId()' or 'getMaximumValue()' (since it's the MaximumValueInterface)?
Re #168: There's actually a problem in your screenshot, the fact that the label is displaying the node type (Article).

It's ok in your case because you added an article, but if you added a page, it would also display 'Node (Article)'. The problem is that the auditor, have no idea what the bundle is and it should just display the entity type. That is what we were doing before with the clumsy translatable markup and arguments:
That way it was only displaying 'Node' or 'Node revision' without the (maybe) wrong and misleading bundle.
This was removed after @larowlan's comment #152:
I don't know how it could be done otherwise, any ideas?
Comment #171
quietone CreditAttribution: quietone commentedGiven those choices, I prefer using a similar name as the interface, either getMaximumId or getMaximumValue. This is really looking as ids so getMaximumId(). But, if everyone else is happy with getHighestId(), I can live with the status quo.
Re #168. Good catch. I just added a single page and tested again. Sure enough, the message states that the conflicting content is Article node types no mention of Page. Clearly this needs to be changed. We can either have a generic message, "One or more of the Content Types on this site" (not suggesting that text, just making the point) or return to the more specific one.
As a user, I would prefer to know exactly content type I might lose. If there is agreement on that the user should know, then restore the code, add a explanatory comment that this isn't ideal. And a follow to fix it if and when a better way is found.
Comment #172
larowlanYeah agree - my comment was more of an observation, even if we had a helper class with the same logic in it would be better
Comment #173
heddnIn the migrate maintainers meeting in IRC we discussed this issue. In attendance were @rakeshjames, @phenaproxima, @quietone, @maxocub, @heddn.
We decided to use Highest for the interface and for labels the following:
We like "node" (or rather "content", the human-readable plural label of the node entity type) without bundle -- but we have to find a way to do that without assuming the destination is an entity destination. To do that, get the destination plugin's label. $destination->getPluginDefinition()['label']. And just have EntityContentBase override that, in the constructor, with the human-readable name of the entity type it's going to save. That way we are compatible with all destinations, and the auditor is not making assumptions.
Comment #174
maxocub CreditAttribution: maxocub commentedThis needed a re-roll since #2917883: Rename 'migration_templates' directories in core modules to 'migrations' got in.
Comment #176
maxocub CreditAttribution: maxocub commentedOups, 2nd try.
Comment #177
maxocub CreditAttribution: maxocub commentedWe discussed this issue in yesterday's maintainers meeting on IRC with @heddn, @phenaproxima, @quietone, @masipila and myself.
We thought that the auditor should not return only migration labels, but that it should return an object (AuditResult?) which references the migration, and also an array of strings (information), plus boolean flags indicating if the audit passed or failed, with getter methods to access those properties (getMigration(), getReasons(), getStatus()?).
Comment #178
phenaproximaI propose the following things about AuditResult:
Is there any reason not to do these things?
Comment #179
phenaproximaHere's a first attempt at implementing all the stuff that has been discussed amongst the maintainers since @larowlan's review.
Comment #181
phenaproximaI should try running the tests first. Hopefully this fixes 'em.
Comment #182
masipila CreditAttribution: masipila as a volunteer commentedHi,
Nice to see how this is evolving! A couple of cosmetic coding standard nits.
1. Class descriptions should start with a third person verb.
2. Why don't we just call this getStatus()?
3. Extra line break here before the bullet list.
4. Description should start with a third person verb i.e. 'Returns the migration plugin manager'.
Markus
Comment #183
masipila CreditAttribution: masipila as a volunteer commentedComment #184
phenaproximaAll fixed, except for this one:
In the context of a value object encapsulating the result of a testing process, I think passed() more clearly expresses the intent of the method in calling code. I'll change it if enough people want me to, but for now I like it the way it is :)
Comment #185
masipila CreditAttribution: masipila as a volunteer commentedI manually tested #184 with both Migrate Drupal UI (i.e. web browser user interface) and with Drush.
I have read through the patch and commented in #182 the nits that I was able to spot and those have now been addressed. I think @heddn and @maxocub have been contributing to the patch so @quietone, could you please give this another pair of eyeballs and RTBC this when you have had time to review this?
Cheers,
Markus
Comment #186
phenaproximaThey'll have to address it in follow-ups, since it's the responsibility of the calling code to instantiate the auditor(s) and run them.
Comment #187
heddnre #185: I was going to suggest we open a follow-up for migrate tools, then I found #2840154: Implement migrate-key-audit command for D8.
Comment #188
yoroy CreditAttribution: yoroy at Roy Scholten for Dropsolid commentedJust skimming the t() strings here I found "I acknowledge I may lose data, continue anyway.", so linking #2773205: Come up with a design for highly destructive operations in confirm forms here :)
Comment #189
masipila CreditAttribution: masipila as a volunteer commentedRe: #188. I created a follow-up for improving this warning UI, see #2923052: Improve the UI of upgrade page when ID conflicts are detected.
I read through the patch again today on my way home from work and I couldn't find a nit (I love all the 'wonkiness' and 'craziness' in the tests). However, I'll let the other maintainers to do a second review for this.
Cheers,
Markus
Comment #190
maxocub CreditAttribution: maxocub commentedI really like the new AuditResult. Just a few nits:
I don't think this is required anymore.
Unused use statement.
Unused use statements
I also tested this patch with a D7 site with all the supported entities (aggregator feed, aggregator feed items, custom blocks, comments, custom menu links, files, nodes, node revisions, taxonomy terms, users). Here's what it looks like:
I think there's two 'file entities' because of the public and private files. I only uploaded one public file on the D8 site.
I think there's four 'content items' because I have two content types, Page & Article, and that node revisions are also displayed as 'content items'. Also, I only created 1 page on the D8 site.
Comment #191
phenaproximaThanks for the review and thorough test, @maxocub! This oughta fix all of that :)
Comment #192
maxocub CreditAttribution: maxocub commentedI did a quick test of the new patch:
This is better.
The file entities now appears only once
The content items still appears twice.
I guess it's the 'node' and 'node revision' which use the same label...
Also, I was not able to display the warning about i18n.
I did create translations on both the D7 and the D8 site, so there should be a warning.
Comment #193
heddnThis hopefully disambiguates the revisions from the original content. Still need test coverage for it and figure out why i18n translations didn't appear.
Comment #194
maxocub CreditAttribution: maxocub as a volunteer commentedRe #193: Good idea for the revisions disambiguation, but let's also make it translatable.
About the i18n conflicts form not showing, it was because the translations are not yet audited so they are not yet failing, and thus they were not added to the conflicts array. This should fix this.
Comment #196
jofitzCorrect the test failures.
Comment #197
maxocub CreditAttribution: maxocub as a volunteer and commentedAdded tests for the conflicts on the migrate upgrade form.
Comment #198
xjmComment #199
maxocub CreditAttribution: maxocub as a volunteer commentedLet's not forget that we still need a change record. I can write it tomorrow if nobody do it before.
Comment #200
maxocub CreditAttribution: maxocub as a volunteer commentedMaximumValueInterface
toHighestIdInterface
as was decided in #173Comment #201
neclimdulIs this how its different from watermarks? The method is confusingly close to me.
s/implemented plugin/destination/ ?
I'm not following the conflicts result here. Its just results right?
This should be an attribute shouldn't it? Seems we should go ahead and give it a sane default for contrib migrations.
why?
as was brought up in #106.12, not following this change. Don't see a follow up comment.
this doesn't seem needed. isn't that forward compatibility exactly why the create method exists?
Handy for these tests but doesn't it seem a bit too generically named?
Making the assumption these are connected to the trait, couldn't they be in a helper to help maintenance with less repetition and avoiding drift if some other content gets added?
Comment #202
heddnHighwater can use ID, but it can just as easily use any other value like a timestamp.
And I ran out of time here, so I'll leave someone else to correct my comments and finish out the rest of questions.
Comment #203
maxocub CreditAttribution: maxocub as a volunteer commented#201.8: What about the mouthful
MigrateDrupalCreateTestContentTrait
?#201.9: Moved all the
installEntitySchema()
in a helper method of the trait, but left all theinstallSchema()
in place since they are not related.#201.7: Not sure what to do here. We added that to maintain BC. If we add something to the constructor, a lot of contrib tests are going to be broken since they'll have to mock more things. That's what we're trying to avoid. But if there's a way to avoid having this getter and not break contrib test, I don't know it yet.
But according to Drupal 8 backwards compatibility and internal API policy, this kind of constructor is considered @internal, so in theory we could break BC:
Comment #204
heddnCan we move it out of the Kernel namespace since it seems we can also use it for Functional tests too?
Some thoughts on the naming (which is hard). MigrateDrupalCreateTestContentTrait repeats the namespace of migrate_drupal What about CreateTestContentEntitiesTrait
Comment #205
maxocub CreditAttribution: maxocub as a volunteer commentedComment #206
maxocub CreditAttribution: maxocub as a volunteer commentedRe #204:
Comment #207
maxocub CreditAttribution: maxocub as a volunteer and commentedOups, previous patch will fail, I forgot to remove Kernel in the namespace, it's been a long day.
Also, I found out that we cannot put classes directly in the Drupal\Tests\migrate_drupal namespace so I put the trait in the Drupal\Tests\migrate_drupal\Traits namespace.
Comment #209
heddnRe #207 Trait namespace: I think I knew that. There's an open issue on it somewhere. I just don't know the nid off-hand.
Can anyone RTBC this thing? I think I'm disqualified since I've been fairly involved from the beginning. But +1 from me.
Comment #210
quietone CreditAttribution: quietone commentedGreat job everyone!
Installed Welsh and Icelandic and used devel to add content, users, taxonomies, well everything really. My test db was the d7 test fixture. It worked as expected. Read through the issue again, taking note of items for follow ups and found that everything needing a follow up, including my own niggles, has a follow up. I considered if comments needed to be added for the questions in #201, and decided not is not necessary.
To be inclusive and show what this can do I recommend that this change record, When migrating from the UI, a warning is now displayed if content may be overwritten uses an image that includes translations. Such as the result of my testing, attached.
The other change record, New classes for auditing migrations for ID conflicts, is accurate but would benefit from a clearer section what contrib developers need to do to add auditing. For example, if my contrib module is doing a migration with
destination: entity:xyzzy
, what exactly do I need to do or add? Is addingaudit: true
sufficient?I also found 2 nits, which don't prevent this going RTBC. However, I am setting to NW to get feedback on the comments about the change records.
Found 2 nits.
Extra semicolon
These could be changed to use
$session = $this->assertSession();, which is defined later, after this new block of code.
Comment #211
maxocub CreditAttribution: maxocub as a volunteer commentedThanks @quietone for the review!
Comment #212
masipila CreditAttribution: masipila as a volunteer commentedI had a biref chat with quietone on IRC and reviewed the change records for readability. I made some small updates to both of them.
https://www.drupal.org/node/2928118
https://www.drupal.org/node/2928113
I also created a documentation follow-up for adding the ID auditing examples to Migrate API documentation handbook, see #2929671: Add ID auditing to Migrate API handbook
Cheers,
Markus
Comment #213
quietone CreditAttribution: quietone commentedThank you maxocub!! Especially for the extra step of sorting the conflict results for the user. How kind of you.
Thank you masipila!! You stopped whatever you were doing and reviewed and updated the change records. And gave yourself a new issue to update documentation. Good call.
Oh my, everything has been addressed here. Awesome work from all on a fiddly problem
I have kept myself away from this issue, only doing review, not only because the problem is fiddly but because I could now do this.
Pause, drum roll.....
Setting to RTBC!
Committers, please include masipila.
Comment #214
larowlanTagging as needs ux review to get the text reviewed, I'll ping @yoroy to let him know.
Will review the patch either today (Sun) or tomorrow (Mon)
Comment #215
larowlanAdding review credits
Comment #216
larowlanCouple of questions - glad to see we're getting close here
Nice, this is much cleaner than the previous time I reviewed
Should this call
->allRevisions()
to ensure the highest ID considers all revisions, not just the default one (i.e. a draft forward revision)? If so, means we're missing test coverage for that scenario too.This is a reasonable trade off, pragmatism++
:)
Is there a reason why we don't check
$result->passed()
for translations? If so - can we add a comment to that effect? If not and this is a bug, we're missing tests? I think its intentional because for translations we're just not sure (the warning message seems to indicate that), but a comment here to clarify why would help in the future.Comment #217
maxocub CreditAttribution: maxocub as a volunteer commentedThanks @larowlan for the review!
2. Good catch.
This code does not even do what we want. Instead of returning the highest revision ID, it's returning the entity ID of the highest revision ID.
I wrote a failing test to show this bug, but I haven't found a solution yet.
5. No, translations are not yet supported by the audit system, it will be done in this follow-up: #2905759: Before upgrading, audit for potential i18n ID conflicts. I added a comment in the code.
Comment #218
masipila CreditAttribution: masipila as a volunteer commentedI updated the Known Issues page quite heavily for better readability (diff).
I have a question on the text that we're showing in the UI, see conflicts.png in #211.
We are using quite strong language when warning about the translations. The Known Issues page needs an example of this scenario, currently the examples don't say anything about translations.
Cheers,
Markus
Comment #219
larowlanSomething like this might work.
When you call
::allRevisions
in an entity query, the returned values are keyed by revision ID. But the values are still the entity IDs, so you can pass it along to::loadMultiple
- which is the typical use case.Comment #220
masipila CreditAttribution: masipila as a volunteer commentedRe #218. I discussed this with maxocub in IRC.
So how about changing the wording something like this (commenting here instead of providing a patch to save time).
Current warning
It looks like you are migrating translated content. Be extra cautious and make sure you aren't introducting any conflicts. For more on the subject, see the online handbook entry for handling migration conflicts.
Proposed warning
It looks like you are migrating translated content. The possible ID conflicts for translations are not automatically detected in the current version of Drupal. Refer to the upgrade handbook for instructions on how to avoid ID conflicts with translated content.
Cheers,
Markus
Comment #221
masipila CreditAttribution: masipila as a volunteer commentedAnd now that we're anyway touching the patch, I propose some rewording also for the first warning.
Proposed wording
The upgrade should be performed on a clean Drupal 8 installation. It looks like you have content on your site which may be overwritten if you continue to run the upgrade. For more information, refer to the upgrade handbook.
--clips--
I also suggest to add a prefix 'WARNING' to the heading i.e. 'WARNING - Content may be overwritten'
Comment #222
maxocub CreditAttribution: maxocub as a volunteer commentedGreat, I didn't know that about allRevisions(), this should fix the failing tests.
I also updated the warning messages as suggested by @masipila.
I'll update the screen shots later tonight or tomorrow.
Comment #225
quietone CreditAttribution: quietone commentedTest failures seem to be in code unrelated to the patch. Try a retest.
Comment #226
quietone CreditAttribution: quietone as a volunteer commentedGood, tests are passing. Here is a screenshot
edit: add comma
Comment #227
heddnLeaving at NR but I think the only thing stopping RTBC again is the UX review. Am I right?
Comment #228
maxocub CreditAttribution: maxocub as a volunteer commentedYes, the UX review is the only thing left (for now).
Comment #229
phenaproximaTagging for usability review (I believe "Needs UX review" is now considered defunct).
Comment #230
webchickGreat, we have UX calls on Tuesdays at 12:30pm Pacific time in #ux on Drupal Slack (what is that in my time zone?). Feel free to come and demo your patch and we can get you unblocked. :)
Comment #231
heddn@webchick we discussed who on the team could be available and I'll be joining the call to present.
Comment #232
heddnI was also kindly reminded about #2923052: Improve the UI of upgrade page when ID conflicts are detected by @masipila.
Comment #233
neclimdulSorry for the review and the sudden disappearance. Sudden flu.
Say no to complexity guys. This is an anti pattern we accepted to deal with hard BC rules but this is clearly not a BC, there isn't any pragmatism here. lets not dig a bigger hole and just do it right. If someone's code does breaks using an internal method, they move to the actual interface(::create) and life is good, nothing breaks for them again in d8 and they learn their lesson about contain object constructors.
Comment #234
yoroy CreditAttribution: yoroy at Roy Scholten for Dropsolid commentedApologies for this drive-by style comment in a long hard issue, but we're discussing this in depth in our upcoming UX call later today and I want to put in some thoughts on the texts:
Comment #235
heddnre #233, but sure to comment over in #2919158: Add the MigrationPluginManager to Drupal\migrate\Plugin\migrate\id_map\Sql to that effect.
re #234: I've addressed each of your points in this latest patch. Screenshot added to IS.
Comment #236
masipila CreditAttribution: masipila as a volunteer commented#234: I love these improvements, thanks yoroy! And thanks heddn for taking the time to prepare this for the UX call!
@yoroy, I'll comment on the general usability issue #2773205: Come up with a design for highly destructive operations in confirm forms, hopefully you'll notice the comment before the call today.
Cheers,
Markus
Comment #238
maxocub CreditAttribution: maxocub as a volunteer commentedThis should fix the failing tests.
Comment #239
yoroy CreditAttribution: yoroy at Roy Scholten for Dropsolid commentedNotes from our review during the UX call:
- In the first screen we establish definitions for old site, new site. We should incorporate those here to be as clear as possible about where the conflicting items live (in the warning message and in the explanatory paragraphs, maybe even in the two sub headings?)
- Wording of the button: replace comma with full stop or semi-colon, not sure what we ended up on.
- Reshuffle the text so that the explanatory paragraphs are under their respective heading. We considered Heading, Paragraph, List and Heading, List, Paragraph. Although the first option seemed to be "nicer", the latter is actually better: we don't have to rewrite the current headers, don't have to introduce another header to define what's in the list and by ending with the paragraph we end with the links for more information in handbook pages so that's a good thing to end on:
So I think we should go with the latter: Heading, List, Paragraph.
Comment #240
maxocub CreditAttribution: maxocub as a volunteer commentedThanks for reviewing this during your UX call!
Here's the updated patch and a screenshot.
Comment #241
maxocub CreditAttribution: maxocub as a volunteer commentedOups, I forgot to modify the warning message and the button text. Will do in an upcoming patch.
Comment #242
maxocub CreditAttribution: maxocub as a volunteer commentedComment #244
maxocub CreditAttribution: maxocub as a volunteer commentedForgot to update the tests, this should do the trick.
Comment #245
heddnMinor wording change on the i18n warning to incorporate the language of old site.
- It looks like you are migrating translated content.
+ It looks like you are migrating translated content from your old site.
Comment #248
quietone CreditAttribution: quietone as a volunteer commentedOoo, the testbot is back. The errors look unrelated to the work at hand. Let's retest.
Comment #249
quietone CreditAttribution: quietone as a volunteer commentedThe last items here were about not using DI (see #233) and then engaging with and responding to a usability review.
For the former, changing to use dependency injection for the MigrationPluginManager will done in 9.x #2919158: Add the MigrationPluginManager to Drupal\migrate\Plugin\migrate\id_map\Sql. For the latter, the result of the usability review (#239) has been implemented, tests updated accordingly and there is consistent wording on the new form and the upgrade form when referring to each site. Thanks yoroy helping to get this right and providing the summary.
However, this change record needs a new screenshot.
Comment #250
masipila CreditAttribution: masipila as a volunteer commentedI have updated the Change Record to contain the latest screenshot. https://www.drupal.org/node/2928113
Comment #251
quietone CreditAttribution: quietone as a volunteer commentedGreat thanks, masipila. Since you beat me to uploading a screenshot to the CR I've added the one I made to the IS.
Back to RTBC.
Comment #252
neclimdulI don't think this is RTBC, lets slow down.
This is still incorrectly done. It needs to be an annotation attribute.
Comment #253
phenaproximaEDIT: After discussing with @neclimdul on IRC, I realized that what is being asked here is for the 'audit' property to be defined as a canonical part of the migration's plugin definition. Normally, this would be done by adding the property to the annotation class for migrations; however, migration plugins are entirely YAML-based (for backwards compatibility reasons) and therefore have no annotation class. One determines a migration's "auditability" thusly:
$migration->getPluginDefinition()['audit']
That's certainly not the most optimal thing in the world, but I don't think we should block this critical issue on that. I propose we open a follow-up, then, to add an isAuditable() method to MigrationInterface (along with a base implementation in Migration), in order to seal the abstraction correctly.
Comment #254
phenaproximaDiscussed further with @neclimdul on IRC and we arrived at a compromise.
This patch moves the $audit property, and its associated documentation, to the Migration class itself. We will open a follow-up to add a new isAuditable() method to MigrationInterface, which will seal the abstraction.
Comment #255
neclimdulPerfect. Thanks for sticking out and listening even though I was using the wrong word to get my meaning across.
For committers this is making a concrete property for something that's already happening in the constructor and consistently documenting it. The part that might seem missing is that existing constructor code but its there.
Comment #256
phenaproximaOpened the follow-up: #2930832: Add isAuditable() method to MigrationInterface, postponed on this one.
Comment #257
phenaproximaAdded a @TODO pointing to the follow-up, at @heddn's request. Leaving at RTBC because all I did was add a comment.
Comment #258
heddn+1 on RTBC!
Comment #260
catchPatch looks good. I think we need to leave the meta issue open (if not blocking) to keep looking at ways to stop people getting into this situation i the first place (like set autoincrement).
Did find two issues though, small but important so CNW for that:
This needs an ->accessCheck(FALSE) - if the latest node is hidden by node grants we might not get it.
So does this.
Comment #261
maxocub CreditAttribution: maxocub as a volunteer commentedAdded the accessCheck().
Comment #262
phenaproximaThanks, @maxocub! Can we also add a test of the access check, just to cover our asses? I figure that we can modify the existing tests to create at least one inaccessible piece of content (along with the accessible pieces of content), then ensure that it's considered during the audit...
EDIT: @catch said this to me in Slack, and it may prove useful for testing this: "it should be possible to re-use a test entity grants module for it"
Comment #263
maxocub CreditAttribution: maxocub as a volunteer commentedHere's some tests.
I used the node_test module which implements hook_node_grants() and hook_node_access_records(), I hope that's what was meant by "it should be possible to re-use a test entity grants module for it".
The test-only patch should prove that the bug was real and that the fix works.
No interdiff since the test-only patch is an interdiff (except for the commented fix).
Comment #265
maxocub CreditAttribution: maxocub as a volunteer commentedOups, this test-only patch is just wrong. Let's try the right way.
Comment #267
phenaproximaWe’re good!
Comment #268
larowlanAdding @neclimdul to review credits
Comment #269
larowlanAnd @catch
Comment #271
larowlanCommitted as 5ea62f1 and pushed to 8.5.x.
Published the change record
Unpostoponed the follow up
Thanks everyone for the continued efforts here
Comment #272
heddnI feel this is appropriate,happy dance. Thanks everyone for such a nice early Christmas present.
Comment #274
jhodgdonI think this audit is too aggressive and just filed a related issue.