Problem/Motivation

Right now, when you complete a Drupal 8 migration, AFAICS in all cases, you get your node/comment titles moved over, but the bodies display as blank:

Titles, but no bodies.

The data for the fields is in the database, but getting stripped out on the front end.

According to the migrate team, this happens when filter formats not recognized by Drupal 8; they get migrated as filter_null, a filter that simply returns an empty string. However, to an "end user" it just appears that migrations don't work, even for very simple core cases where no fancy non-core input formats were used.

Major/Migrate critical because this represents a pretty significant usability hurdle, and harms user confidence in the system early on.

Proposed resolution

Instead, map mismatched filter formats to fallback_format (defaults to plain text).

The one possible reason not to do this is PHP code filter, where you could inadvertently expose database credentials that would now be exposed in plain text. However, we could probably deal with this case specially. (For example, explicitly setting that one to filter_null to indicate it needs attention.)

Remaining tasks

  • Write a patch + tests
  • Review it
  • Commit it!
  • Publish change record

User interface changes

By default, migrations will now display at least something for body field content for nodes and comments. (As well as other rich text fields.)

API changes

?

Data model changes

N/A?

Release notes snippet

Migrations now default to the fallback filter format, fixing a problem where rich text fields were not displaying post-migration.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

webchick created an issue. See original summary.

mikelutz’s picture

Of course, in actually looking into the code after the call and reminding myself how it actually works and why it's not going to be quite that simple. :-)

Formats such as plain_text, full_html, etc in the standard profile consist of a collection of filter plugins such as filter_html_escape(Display any HTML as plain text) or filter_url(Convert URLs into links) or the dreaded php_code filter included in D7. Each filter is run in succession, taking the output of the previous filter transforming it somehow, and then passing it on.

Formats are all migrated from the D7 site. If you had plain_text in D7, migrate creates a plain_text in D8. If you have a php format in d7, you get a php format in D8. When we migrate the formats, we migrate each filter in the list. They changed from an id and callback returned from hook_filter_info() in D7 to to a plugin in D8 but the core filters all have the same id.

Filters live in code. They are defined in hooks in D7 and as plugins in D8, but they cannot be migrated, they must be provided by core or an installed module.

During the format migration, the system loops through the format's filters if a D7 filter id doesn't match a D8 filter plugin id, it's the filter that is replaced with filter_null in the format's chain (which takes anything and spits out an empty string to the next filter in the chain). In core, it's usually the php_code filter that is the one getting stripped out, although a contrib or custom filter that is missing in D8 would also be converted to filter_null in the chain.

So the field has a valid format, it's just the format itself that contains one or more invalid filters.

So, we have to make some decisions. I strongly suspect the biggest offender will be the php_code filter, since that was provided in core in D7 and there is no replacement in D8. I wonder if it would be reasonable to add a new php_code filter to core which strips out php code without evaluating it (we can certainly change the name/id and map it in a migration). I also wonder if it would be a good idea to add an empty or pass-through filter to core, and maintain a list of popular contrib d7 filters that are considered safe to replace with a pass through, rather than forcing any filter that doesn't exist on the new site to be replaced with filter_null. At least then we could try to keep as many of the site's formats as possible.

Otherwise, what the original plan translates to is essentially:
Attempt to migrate a format
Loop through the format's filters
IF a filter is not found THEN
IF the filter is php_code THEN
replace the filter with filter_null
ELSE
abort migrating the format at all

After which, the field migration will run and the text fields will migrate with a invalid format, and then fallback to the fallback format.

Thoughts?

heddn’s picture

Issue tags: +Needs change record

Here's my thoughts after reviewing the notes from above.

  1. simple solution = skip_row for format if a filter is not found
  2. complex solution = let folks map filters correctly in the UI (or drush)

Let's work on simple solution first. It would require a CR, so tagging.

bogdog400’s picture

I found this page because my body fields were disappearing. After pulling out some hair, I found this error in the log:

Missing filter plugin: filter_null.

After stumbling around, I found that if I went to Home>Administration>Configuration>Content authoring and started editing the content types, I would start getting messages that "filter_null" couldn't be found and so it would be deleted when I saved the text format. After saving the text format for Full HTML, the body text started appearing.

I think it would be better to have a better default than filter_null that just deletes content.

mikelutz’s picture

@bogdog400 nothing is deleted, you simpler have to configure your formats to allow it to display.

It's tricky from a security issue. Because we don't know what the missing filter actually did, we don't know what of the content is actually safe to display, so we really can't display any of it by default. The purpose of this issue is to find some way for a person to make that decision during the migration rather than having to find out what to do later.

bogdog400’s picture

Yes, I understand that it doesn't delete content permanently, but it disappears from the page in a way that could be caused by dozens of different issues. Perhaps it's a misplaced block. Perhaps it's a permissions issue. I went through a dozen hypotheticals before finding this solution.

It might better to have an error filter that just inserts an error message with a pointer to an issue page.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Wim Leers’s picture

#2946889-22: Missing migration filters that are replaced with filter_null may have invalid settings applied is a step in this direction.

I think the next step is to build a registry of filters and/or an audit step ahead of the migration that allows the user to choose to drop certain filters.

For example, the https://www.drupal.org/project/typogrify and https://www.drupal.org/project/image_resize_filter filters are purely cosmetic. If we'd have a registry for those, we could safely drop those during migration, which would then result in the body fields showing up right away.

Contrast this with e.g. https://www.drupal.org/project/markdown — without this, you'd get nonsensical output.

Wim Leers’s picture

Title: Set filter format in migrations to fallback filter, rather than filter_null, so that body fields show up » If no Drupal 8 equivalent filter is found during migration, drop transformation-only filters, so that body fields show up
Status: Active » Needs review
Issue tags: +Needs tests
FileSize
4.82 KB

A different proposal

  1. simple solution = skip_row for format if a filter is not found
  2. complex solution = let folks map filters correctly in the UI (or drush)

I propose an intermediate solution which combines the best of both.

In Drupal 8, filters must declare their type — this was introduced in #1782838: WYSIWYG in core: round one — filter types (change record:
https://www.drupal.org/node/1817474).

If we'd have similar metadata on Drupal 7 filters, we could make much smarter decisions automatically. Like I already mentioned in #8, not all filters are equally important.

Today's reality

Right now, any text format that uses a contrib filter gets a filter_null filter inserted automatically (the only exception appears to be the Markdown module + filter, thanks to #2711053: Migration from D7 broken in 8.1.x). Any time filter_null is used, all body fields using that text format show up as empty.

Even if the filter is purely cosmetic: typographical enhancements, URL transformations, tokens getting expanded — those things not working automatically after a migration is acceptable. It's easy to install add a filter to a text format after the migration. That's the whole point of the filter system: filters are non-destructive. Which means the user's original input data is never modified, and any fancy transformations happen at render time only.

This looks broken and is scary: it seems as if all content was lost during the migration. It makes it seem as if content was destroyed.

Proposed new reality

Maintain in Drupal 8 core's \Drupal\filter\Plugin\migrate\process\FilterID filter types for the most commonly installed contributed filters.

If a filter does not have a Drupal 8 equivalent and it is a transformation-only filter:

  1. drop that filter from the text format
  2. warn the user

EDIT: the D7 contrib filters in the patch are the first ten I encountered while going through https://www.drupal.org/project/project_module?f%5B0%5D=&f%5B1%5D=&f%5B2%... :)

Wim Leers’s picture

Rebased.

Wim Leers’s picture

Assigned: Unassigned » Wim Leers

See #3104111: [meeting] Migrate Meeting 2020-01-02 — we have +1s from Migration system maintainers @mikelutz and @heddn! 🥳

Wim Leers’s picture

I audited the top 300 editor/filter modules, or the first 6 pages out of a total of 14 pages. The pages were sorted by most installed first. Module 300 had only 89 sites using it. That's less than 0.013%.

Which means we're firmly in the 99.9% territory of D7 sites benefiting from the metadata I've gathered here.

(Note that I first used http://grep.xnddx.ru/search?text=filter_info%28%29&filename= to find all modules providing filters, but I noticed pretty quickly that the vast majority of modules were not being found, so I had to resort to a manual process of browsing d.o, navigating to gitlab, selecting the right branch, searching for filter_info.)

Wim Leers’s picture

Assigned: Wim Leers » Unassigned

I was uniquely qualified to perform the analysis in #12. But adding the test coverage and refactoring per @mikelutz and @heddn's guidance can be done by anybody who is familiar with the migration system. So, unassigning.

(I might get back to this, but if somebody else can push this across the finish line in the mean time, that'd be amazing.)

heddn’s picture

Status: Needs review » Needs work
+++ b/core/modules/filter/src/Plugin/migrate/process/FilterID.php
@@ -75,6 +77,14 @@ public function transform($value, MigrateExecutableInterface $migrate_executable
+        $message = $this->t('Filter @plugin_id could not be mapped to an existing filter plugin; omitted since it is a transformation-only filter. Install and configure a successor after the migration.', [
+          '@plugin_id' => $plugin_id,
+        ]);

We don't use t() in messages. Its because they are like exceptions.

Wim Leers’s picture

Some more stats for #12:

  • 110 contrib filters have a filter type assigned
  • 8 of which are TYPE_MARKUP_LANGUAGE, so cannot be dropped (because without them, it's not humanly possible to compare the content on the source and destination)
  • 8 of which are TYPE_HTML_RESTRICTOR, so cannot be dropped (for security reasons)
  • the 94 other filters are transformation filters, so can be dropped, because the transformations are either cosmetic or not essential for migration purposes (think: typografy, ads, tokens, animations, formatting …)

In other words, #12 enables many more D7 sites to migrate to D8 without necessarily having the same filters be available. In some cases, the filter is entirely cosmetic, in others it's only used in a relatively small number of cases so easy to fix manually, and in others it's going to be desirable to port the filter later — during the "polish" phase of the project.

mradcliffe’s picture

We don't use t() in messages. Its because they are like exceptions.

I do like this message having the plugin id so I think that sprintf would work if the ids are safe values to print. Does the message does get printed in the UI when using Migrate Drupal UI? I don't remember.

Wim Leers’s picture

#16: No, messages are not exposed in migrate_drupal_ui, they're only available via the existing "watchdog" UI.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
1.53 KB
17.96 KB
heddn’s picture

Status: Needs review » Needs work

Still needs tests. Since there's a lot of these, probably a sampling of the filter ids are needed.

+++ b/core/modules/migrate/src/Plugin/migrate/process/SubProcess.php
@@ -207,6 +207,9 @@ public function transform($value, MigrateExecutableInterface $migrate_executable
+        if ($key === NULL) {
+          continue;
+        }

This could use some docs on why this is needed. or split it out into another issue. I don't see any reason for its existence previously mentioned in this issue.

Wim Leers’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests
FileSize
4.28 KB
19.94 KB

Added test coverage and fixed coding standards violations.

Wim Leers’s picture

RE: SubProcess: I'd swear I needed to do this for stuff to work … but now it's been a month since I wrote that code and I stupidly didn't write down the "why" back then 🙈

So … let's remove it and let's see if it passes!

Wim Leers’s picture

Issue tags: -Needs change record

🥳 Green! Happy to see the SubProcess hunk gone :)

Change record created: https://www.drupal.org/node/3109982

mradcliffe’s picture

Status: Needs review » Needs work
FileSize
12.87 KB
6.35 KB

I applied the patch to 8.8.1 and ran a migration from Drupal 7. This didn't seem to get me back to a working state with content being displayed on nodes.

On Drupal 7, the Filtered HTML (id: 1) text format has the following filters enabled (url, autop, wysiwyg, htmlcorrector). The only non-core one is wysiwyg filter provided by wysiwyg_filter. This should get picked up by the patch on migration, and it does, but the end result does not display the content.

On Drupal 8, the filter format migration failed to import the Filtered HTML (id: 1) text format. There are two messages:

1. Filter wysiwyg could not be mapped to an existing filter plugin; defaulting to filter_null
2. The configuration property filters.filter_null.settings.nofollow_domains.0 doesn't exist. (/mnt/gfs/ACQUIASITE/livedev/docroot/core/lib/Drupal/Core/Config/Schema/ArrayElement.php:76)

The nodes that have this text format do not show any content.

So it looks like the patch is working, but still failing to do what is proposed because the configuration schema doesn't match for filter_null and it's still trying to migrate the settings from drupal 7.

Node edit page with Text format not selected

Node view page with no content displayed

mradcliffe’s picture

Status: Needs work » Needs review

Oh, I see. There's a change in 8.9.x that makes this not compatible with 8.8.x :(

I'll put this back in Needs review and see if I can replicate in 8.8.x by applying the change to FilterSettings.

mradcliffe’s picture

Once I applied the commit from #2946889: Missing migration filters that are replaced with filter_null may have invalid settings applied, re-ran the filter format migration as an update, then the Filtered HTML text format was created, saved it, and the node edit screen displayed correctly in Drupal 8.8.1.

Wim Leers’s picture

#25: any chance you could include before vs after screenshots of what you describe in #25? 🙏😊

(Thanks for the manual testing by the way!)

myrto_a’s picture

for me it was resolved by going to admin/config/content/formats and by enabling CKeditor where it was not set

mikelutz’s picture

Status: Needs review » Reviewed & tested by the community

I like the approach and the list of contrib filters seems reasonable. The format migration is not generally one that it ran incrementally or updated, so this should provide no issues for people running ongoing migrations. It will, however help alleviate a common and confusing issue when running migrations, where body text doesn't appear to come through

alexpott’s picture

Version: 8.9.x-dev » 8.8.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

Committed and pushed f8c0cf381d to 9.0.x and adf48cb3cc to 8.9.x. Thanks!

I think we should consider backporting this task to 8.8.x as it makes migrating to D8 simpler.

  • alexpott committed f8c0cf3 on 9.0.x
    Issue #3061571 by Wim Leers, mradcliffe, webchick, heddn, mikelutz: If...

  • alexpott committed adf48cb on 8.9.x
    Issue #3061571 by Wim Leers, mradcliffe, webchick, heddn, mikelutz: If...
Wim Leers’s picture

I think we should consider backporting this task to 8.8.x as it makes migrating to D8 simpler.

+1

mradcliffe’s picture

mikelutz’s picture

#2946889: Missing migration filters that are replaced with filter_null may have invalid settings applied had already been backported to 8.8. I don't object to putting this in 8.8 too. It technically changes how a process plugin works, but as I said above, this is a migration you run once, so you either have the old behavior or the new, we are not breaking any ongoing process, plus the change is only ever an improvement, plus it's just so important to the project to make migrating as easy as possible that I'm +1 to putting this in 8.8 if possible.

alexpott’s picture

This broke PHP 7.4 due to

PHPUnit 8.5.2 by Sebastian Bergmann and contributors.

Runtime:       PHP 7.4.1
Configuration: /Users/alex/dev/sites/drupal8alt.dev/core/phpunit.xml

Testing Drupal\Tests\filter\Kernel\Plugin\migrate\process\FilterIdTest
.....PHP Fatal error:  Uncaught Exception: Serialization of 'ReflectionClass' is not allowed in /Users/alex/dev/sites/drupal8alt.dev/sites/simpletest/TestCase.php:708

I hotfixed the test to not create an exception but pass a flag in instead.

  • alexpott committed fd4711c on 9.0.x
    Issue #3061571 follow-up by alexpott: If no Drupal 8 equivalent filter...

  • alexpott committed 9dd40a1 on 8.9.x
    Issue #3061571 follow-up by alexpott: If no Drupal 8 equivalent filter...
alexpott’s picture

Here's a complete patch for 8.8.x

mikelutz’s picture

Status: Needs review » Reviewed & tested by the community
je1’s picture

Applied this patch to 8.8.x, and did "drush mim upgrade_d7_node_blog --update", the blog entries updated with correct created dates and titles but the blog entry body and summary are blanks.

got several of these from the drush command:
Missing file with ID 24647. ImageItem.php:329 [warning]

reports shows:
Location http://.../admin/config/content/formats/manage/php_code?_wrapper_format=drupal_ajax&ajax_form=1&destination=%2Fadmin%2Fconfig%2Fcontent%2Fformats
Message Missing filter plugin: filter_null.

Not sure how to go about fixing that, just migrated from d7 to d8 and need the content migrated. The book type contents worked but not the basic pages or blogs content.

Wim Leers’s picture

#21: found why it was necessary — see #3126063: Harden SubProcess process plugin.

xjm’s picture

Title: If no Drupal 8 equivalent filter is found during migration, drop transformation-only filters, so that body fields show up » [backport] If no Drupal 8 equivalent filter is found during migration, drop transformation-only filters, so that body fields show up

Got confused trying to apply this to 9.1.x. It's already been committed and the 8.8 patch is in #38.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 38: 3061571-36.patch, failed testing. View results

quietone’s picture

Status: Needs work » Reviewed & tested by the community

Not sure why this is NW the tests are passing. And I reran them to be sure. So, restoring the RTBC from #40.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 38: 3061571-36.patch, failed testing. View results

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.7 was released on June 3, 2020 and is the final full bugfix release for the Drupal 8.8.x series. Drupal 8.8.x will not receive any further development aside from security fixes. Sites should prepare to update to Drupal 8.9.0 or Drupal 9.0.0 for ongoing support.

Bug reports should be targeted against the 8.9.x-dev branch from now on, and new development or disruptive changes should be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

benjifisher’s picture

Title: [backport] If no Drupal 8 equivalent filter is found during migration, drop transformation-only filters, so that body fields show up » If no Drupal 8 equivalent filter is found during migration, drop transformation-only filters, so that body fields show up
Status: Needs work » Active

The bot moved this issue from 8.8.x to 8.9.x. Since this issue has already been fixed for 8.9.x and was open as a backport to 8.8.x, I am marking it fixed and removing "backport" from the title.

quietone’s picture

Issue summary: View changes

There is still a bit to do here. The change record needs to be published. It looks fine to me but I don't know much about filters.

benjifisher’s picture

Status: Active » Needs review

As my comment (#47) says, I meant to mark this issue Fixed, but I seem to have marked it Active. #multitasking

Yes, the CR is still unpublished. That CR refers to both this issue and the follow-up #3126063: Harden SubProcess process plugin. I am not sure what the right thing to do is. Maybe we should do our best to get that issue fixed, or maybe we should update the CR to describe how the system currently works.

I am also not sure what the status of this issue should be right now. Maybe NR is best, since at least that means it is more likely to be noticed at the weekly migration meeting.

benjifisher’s picture

Status: Needs review » Fixed

At #3154841: [meeting] Migrate Meeting 2020-07-09, we agreed that the CR can be published even though the follow-up issue is still RTBC. I published the CR, so now I will mark this issue as Fixed.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

TomChiverton’s picture

Still happens on fresh D7 to D9 migration.

Kristen Pol’s picture

I'm getting:

Missing filter plugin: filter_null.

errors during a Drupal 6 => Drupal 9 migration via Migrate Drupal UI.

UPDATE: It seems like going through and saving all the text filters may have fixed this. When I re-saved the node, it didn't but when I went and saved all the text filters, I see the content now. It's possible it's from something else but it's worth a try if people are having issues.

quietone’s picture

This results when the source uses a filter that does not exist on the destination. So not really a bug and the data has been migrated. To get it to display see the sections about filters at Known issues when upgrading from Drupal 6 or 7 to Drupal 8