TLDR: If you have a site that was originally a D6 site and was upgraded to D7 using the D7 upgrade system, the machine names of your text formats are numeric values, and those values do not get handled correctly in a D7 to D8 migration. This is likely to affect a lot of sites, and more of them will hit it as older sites move to D8. I've run into it on several sites already.
In D6, all text formats had numeric IDs. In D7, text formats were changed to use machine names. But if you upgraded your site from D6 to D7 using Drupal's upgrade system, your text formats were never renamed to the standard D7 machine names; they were left as numeric values, or rather the string form of the original numeric value.
The Drupal to Drupal migration from D6 to D8 correctly handles the switch of the numeric values to machine names. The Drupal to Drupal migration from D7 to D8 makes the assumption that the D7 values are the ones you would have had if you built a D7 site from scratch, machine names that match what D8 uses.
The problem this creates for legacy D7 sites being migrated to D8 is that every text field ends up with an invalid format, the original numeric value, which does not match any actual format. If you edit your content you'll see that no format is selected for any of these fields. If you examine the entity, you'll see that the format is set to '1' or whatever number it originally had, rather than any valid format name.
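To make the symptom concrete, here is a minimal Python sketch (not Drupal code) of the check described above: compare the format stored on each migrated field value against the formats that actually exist on the new site. The format names and sample rows are hypothetical and only illustrate the idea.

```python
# Hypothetical data, for illustration only; not read from a real Drupal site.
# Machine names of the text formats assumed to exist on the D8 site.
VALID_FORMATS = {"basic_html", "full_html", "plain_text"}

# Migrated field values as (entity id, field name, stored format).
FIELD_VALUES = [
    (1, "body", "1"),
    (2, "body", "full_html"),
    (3, "body", "3"),
]


def find_orphaned(field_values, valid_formats):
    """Return the rows whose stored format matches no existing format."""
    return [row for row in field_values if row[2] not in valid_formats]


orphaned = find_orphaned(FIELD_VALUES, VALID_FORMATS)
for entity_id, field_name, fmt in orphaned:
    print(f"entity {entity_id}: {field_name} uses unknown format {fmt!r}")
```

Any row this reports corresponds to a field that would show no format selected on the edit form.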
It seems like the D7 migration could use the same system to convert numeric values that is used for the D6 migration. I've spent several hours trying to make that change, but nothing I've tried works.
In the meantime my fix was to go back to my D7 site and update it to change the numeric formats to the machine names the migration expects. I created some code at https://github.com/karens/Pre-8-Cleanup. That is obviously a less than ideal solution but I posted it in case it helps anyone else until this gets taken care of.
Comments
Comment #6
mikeryan commented
Hmm, while not pretty, I would expect that the D7 formats with numeric names would get created on D8 with numeric names. They wouldn't align with the defaults (say, the numeric format corresponding to Filtered HTML with the existing Filtered HTML), but they should at least work.
A failing test would be a good start here, which will need such a format added to the fixture.
Thanks.
Comment #7
karens commented
For some reason the numeric formats don't get created, at least not for me. And then the content that uses those formats is orphaned. I've run into this on several different sites so far, so I don't think this is specific to some odd configuration setting.
Comment #12
heddn commented
Does the migration of formats actually fail? Or does it just fail to rename them to non-numeric names? The first would be a major issue, but one I haven't encountered. I've done a migration from d6=>d7 that was later upgraded d7=>d8, and I didn't encounter the problem. So I wonder if this is more an issue of auto-renaming not functioning. Tagging and changing status to request more details.
Comment #13
jeffwpetersen commented
Legacy text format machine names in Drupal 6 were numeric. These numeric machine names are being imported correctly and assigned correctly (D7 -> D8). But they are duplicates of the default formats. A plugin can easily be written to reassign the node text format in node_contentType.yml, except for the Plain Text format.
Plain Text needs to be reassigned when the field is created in the migration process. I have yet to locate the plugin that manages assigning text formats for fields during creation.
In my installation:
Plain Text = 4
Full HTML = 3
Filtered HTML = 1
What is needed is a way to map the numeric text formats to the default formats. I assume this will need to happen when the text format is created, when a field is created and when a node is migrated.
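As a rough illustration of that mapping, here is a minimal Python sketch (not Drupal code, and not how core does it). It uses the numbers from my installation; plain_text is the name needed for Plain Text, while full_html and filtered_html are assumed machine names that should be checked against the target site. The function name map_format is made up for this example.

```python
# Hypothetical mapping built from the numbers quoted above. The machine names
# other than plain_text are assumptions; check them against your own site.
FORMAT_MAP = {
    "4": "plain_text",
    "3": "full_html",
    "1": "filtered_html",
}


def map_format(source_format, format_map=FORMAT_MAP):
    """Translate a legacy numeric format to a machine name.

    Values missing from the map pass through unchanged, so formats that
    already use machine names are left alone. None stays None.
    """
    if source_format is None:
        return None
    key = str(source_format)
    return format_map.get(key, key)


# The same lookup would be applied at each of the three points:
# when formats are created, when a field is created, and when a node is migrated.
migrated_formats = [map_format(f) for f in ("1", "3", "4")]
field_default_format = map_format(4)
node_body = {"value": "<p>Hello</p>", "format": map_format("1")}
```

Using one shared map at all three points keeps the format, the field settings and the node content in agreement with each other.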
Comment #14
heddn commented
Re #13: this is possible with static_map. However, doing that is an advanced feature and should be done on a case-by-case basis, not by default. Migrating from a previous version of Drupal sucks over the data without any assumptions. If a specific site wants to map things, then it can. Remember, the Full HTML, etc. formats are all provided by the standard profile. Many folks start out with the Minimal profile when migrating, which doesn't provide those formats.
Comment #15
jeffwpetersen commented
The issue is the Plain Text format. We need to find a way to make sure that number (4 in my case) maps to plain_text when the field is created as well as when the node is created.
So we need an optional static_map, and instructions on implementing it, to reassign text field formats wherever they are used.
Comment #16
heddn commented
Changing this to a support request. I think the original question is answered.
Comment #18
heddn commented
Based on the logic in #2720271: Document how to perform format mappings in D6/D7 upgrade, closing this as won't fix.
Comment #19
jeffwpetersen commented
https://events.drupal.org/losangeles2015/sessions/writing-custom-migrati...
Having absorbed the valuable DrupalCon Los Angeles 2015 presentation on migrate, I have gathered that what I need to do to assign a static_map is...
Hack Core.
And add a static map process to core/modules/text/src/Plugin/migrate/cckfield/TextField.php, in processCckFieldValues() or getFieldFormatterMap()
or some such business.