Problem/Motivation

Add validation for so called "global" attributes (see https://html.spec.whatwg.org/multipage/dom.html#global-attributes) allowed or forbidden on all elements to Drupal\ckeditor5\Plugin\Validation\Constraint\FundamentalCompatibilityConstraintValidator.

FilterHtml supports:

<* dir="ltr rtl">
https://html.spec.whatwg.org/multipage/dom.html#attr-dir
<* lang>
https://html.spec.whatwg.org/multipage/dom.html#attr-lang

As proven in #27 and further clarified in #29's refined test coverage (\Drupal\Tests\ckeditor5\FunctionalJavascript\CKEditor5Test::testFilterHtmlAllowedGlobalAttributes()), this represents a data loss bug in HEAD.

Steps to reproduce

  1. Data loss: content like this would have its lang and dir attributes stripped:
    <p dir="ltr">Hello World</p><p dir="rtl">مرحبا بالعالم</p>
    
  2. Validation: no warning about not supporting these global attributes is provided.

Proposed resolution

  1. Add support for <*> to \Drupal\ckeditor5\HTMLRestrictions as per \Drupal\filter\Plugin\Filter\FilterHtml::getHTMLRestrictions() — and require that this tag has attributes listed for it, because this tag on its own is absolutely meaningless.
  2. Fix the data loss bug by adding ckeditor5_globalAttributeDir + ckeditor5_globalAttributeLang CKEditor 5 plugins that use GHS to prevent the data loss, and automatically enable them for text formats using filter_html.

Remaining tasks

Reviews.

User interface changes

None.

API changes

None. \Drupal\ckeditor5\HTMLRestrictions is entirely internal.

Data model changes

None.

CommentFileSizeAuthor
#74 3231334-74-9.3.x.patch53.69 KBWim Leers
#70 3231334-70.patch54.63 KBWim Leers
#64 3231334-64.patch50.63 KBWim Leers
#59 3231334-59.patch42.18 KBWim Leers
#59 interdiff.txt3.35 KBWim Leers
#53 3231334-53.patch40.19 KBWim Leers
#53 interdiff.txt5.67 KBWim Leers
#50 3231334-50.patch41.04 KBWim Leers
#50 interdiff.txt1.01 KBWim Leers
#48 3231334-48.patch40.08 KBWim Leers
#48 interdiff.txt1.15 KBWim Leers
#44 3231334-43.patch39 KBWim Leers
#44 interdiff.txt862 bytesWim Leers
#42 3231334-41.patch38.51 KBWim Leers
#42 interdiff.txt3.2 KBWim Leers
#39 3231334-39.patch35.67 KBWim Leers
#39 interdiff.txt5.54 KBWim Leers
#38 3231334-37.patch30.2 KBWim Leers
#38 interdiff.txt3.05 KBWim Leers
#36 3231334-36.patch27.76 KBWim Leers
#36 interdiff.txt1.32 KBWim Leers
#35 3231334-35.patch28.5 KBWim Leers
#35 interdiff.txt15.48 KBWim Leers
#33 3231334-32-rebased.patch25.47 KBWim Leers
#32 3231334-32.patch25.36 KBWim Leers
#32 interdiff.txt1.16 KBWim Leers
#30 3231334-30.patch25.53 KBWim Leers
#30 interdiff.txt1.48 KBWim Leers
#29 3231334-29.patch25.46 KBWim Leers
#29 interdiff-24-29.txt2.45 KBWim Leers
#29 interdiff.txt5.56 KBWim Leers
#27 3231334-27.patch26.15 KBWim Leers
#27 interdiff.txt3.16 KBWim Leers
#24 3231334-24.patch23.07 KBWim Leers
#24 interdiff.txt2.67 KBWim Leers
#23 3231334-23.patch22.28 KBWim Leers
#23 interdiff.txt3.18 KBWim Leers
#21 3231334-21.patch19.56 KBWim Leers
#21 interdiff.txt5.33 KBWim Leers
#19 3231334-18.patch14.68 KBWim Leers
#19 interdiff.txt1.18 KBWim Leers
#17 3231334-17.patch14.36 KBWim Leers
#17 interdiff.txt3.15 KBWim Leers
#15 3231334-15.patch13.08 KBWim Leers
#15 interdiff.txt2.6 KBWim Leers
#14 3231334-14.patch10.46 KBWim Leers
#10 3231334-9.patch10.9 KBWim Leers
#10 interdiff.txt2.36 KBWim Leers
#8 3231334-8.patch9.67 KBWim Leers
#8 interdiff.txt2.21 KBWim Leers
#7 3231334-7.patch7.84 KBWim Leers
#7 interdiff.txt6.41 KBWim Leers
#4 3231334-4.patch2 KBWim Leers

Issue fork drupal-3231334

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

lauriii created an issue. See original summary.

Wim Leers’s picture

Title: Add validation for attributes allowed or forbidden on all elements » [PP-1] Add validation for attributes allowed or forbidden on all elements
Project: CKEditor 5 » Drupal core
Version: 1.0.x-dev » 9.3.x-dev
Component: Code » ckeditor5.module
Status: Active » Postponed
Related issues: +#3228334: Refactor HTMLRestrictionsUtilities to a HtmlRestrictions value object

This should wait for #3228334: Refactor HTMLRestrictionsUtilities to a HtmlRestrictions value object to finish; that will make this simpler.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Wim Leers’s picture

Title: [PP-1] Add validation for attributes allowed or forbidden on all elements » Add validation for attributes allowed or forbidden on all elements
Status: Postponed » Needs review
FileSize
2 KB

#3228334: Refactor HTMLRestrictionsUtilities to a HtmlRestrictions value object is in!

This issue will have to make it possible to delete the current work-around in the 3 places where it occurs. I.e. it should make the attached patch pass tests.

Status: Needs review » Needs work

The last submitted patch, 4: 3231334-4.patch, failed testing. View results

Wim Leers’s picture

Wim Leers’s picture

Status: Needs work » Needs review
Issue tags: +stable blocker
FileSize
6.41 KB
7.84 KB

Add support for the * special case used by FilterHtml.

Wim Leers’s picture

Update FundamentalCompatibilityConstraintValidator to be aware of the * special case.

The last submitted patch, 7: 3231334-7.patch, failed testing. View results

Wim Leers’s picture

Assigned: Wim Leers » Unassigned
FileSize
2.36 KB
10.9 KB

Turns out that a few of the @todos were wrongly pointing to this issue when they should've been pointing to #3231336: Simplify HtmlRestrictions and FundamentalCompatibilityConstraintValidator now that "forbidden tags" are deprecated.

That means this is basically … done! The only thing that remains is adding some test coverage for operations.

Wim Leers’s picture

Issue tags: +Needs tests

The last submitted patch, 8: 3231334-8.patch, failed testing. View results

Status: Needs review » Needs work

The last submitted patch, 10: 3231334-9.patch, failed testing. View results

Wim Leers’s picture

Assigned: Unassigned » Wim Leers
Status: Needs work » Needs review
FileSize
10.46 KB
Wim Leers’s picture

Let's run only CKE5 tests to make DrupalCI faster.

Status: Needs review » Needs work

The last submitted patch, 15: 3231334-15.patch, failed testing. View results

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
3.15 KB
14.36 KB
+++ b/core/modules/ckeditor5/src/HTMLRestrictions.php
@@ -136,6 +143,14 @@ private static function validateAllowedRestrictionsPhase1(array $elements): void
+      // The `*` HTML tag is a special case: it allows specifying specific
+      // attributes that are allowed on all tags (f.e. `lang`, because
+      // translations are an orthogonal concern) or disallowed on all tags (f.e.
+      // `style`, because security is an orthogonal concern).
+      if ($html_tag_name === '*' && !is_array($html_tag_restrictions)) {
+        throw new \InvalidArgumentException(sprintf('The value for the special "*" HTML tag must be an array of attribute restrictions.'));
+      }

This added validation, but we didn't have tests for it yet.

This adds those.

Status: Needs review » Needs work

The last submitted patch, 17: 3231334-17.patch, failed testing. View results

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
1.18 KB
14.68 KB
+++ b/core/modules/ckeditor5/tests/src/Unit/HTMLRestrictionsTest.php
@@ -101,6 +101,34 @@ public function providerConstruct(): \Generator {
+    yield 'VALID: joker wildcard with attribute allowed, specific attribute values forbidden' => [
+      ['*' => ['foo' => ['a' => FALSE, 'b' => FALSE]]],
+      NULL,
+    ];

This case fails. Because it's the first time we need to deal with "attribute values are forbidden".

Nothing in Drupal core can configure this, not even FilterHtml.

Theoretically, there could be filters in custom/contrib modules that implement this. But we've yet to see this.

So I want to go ahead with just expecting this test failure and adding a @todo.

Status: Needs review » Needs work

The last submitted patch, 19: 3231334-18.patch, failed testing. View results

Wim Leers’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests
FileSize
5.33 KB
19.56 KB

Thorough unit test coverage. I expect this to fail, hard. Because none of the logic in \Drupal\ckeditor5\HTMLRestrictions has been updated to support this special "joker wildcard tag" yet.

Once these new unit tests pass, all tests should pass. 🤞

Status: Needs review » Needs work

The last submitted patch, 21: 3231334-21.patch, failed testing. View results

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
3.18 KB
22.28 KB

Add support for operations involving the joker wildcard tag *.

This will fix the \Drupal\Tests\ckeditor5\Unit\HTMLRestrictionsTest::testOperations() failures.

Wim Leers’s picture

What's special about the joker wildcard tag: it does not resolve into concrete tags. But it is a wildcard tag. Update the logic accordingly.

This will fix the \Drupal\Tests\ckeditor5\Unit\HTMLRestrictionsTest::testSubsets() failures.

The last submitted patch, 23: 3231334-23.patch, failed testing. View results

Status: Needs review » Needs work

The last submitted patch, 24: 3231334-24.patch, failed testing. View results

Wim Leers’s picture

Priority: Normal » Critical
Status: Needs work » Needs review
Issue tags: +data loss, +Needs title update, +Needs issue summary update
FileSize
3.16 KB
26.15 KB

While working on this, I realized that this in filter_html is not actually supported by CKEditor 5 today:

    // The 'style' and 'on*' ('onClick' etc.) attributes are always forbidden,
    // and are removed by Xss::filter().
    // The 'lang', and 'dir' attributes apply to all elements and are always
    // allowed. The list of allowed values for the 'dir' attribute is enforced
    // by self::filterAttributes(). Note that those two attributes are in the
    // short list of globally usable attributes in HTML5. They are always
    // allowed since the correct values of lang and dir may only be known to
    // the content author. Of the other global attributes, they are not usually
    // added by hand to content, and especially the class attribute can have
    // undesired visual effects by allowing content authors to apply any
    // available style, so specific values should be explicitly allowed.
    // @see http://www.w3.org/TR/html5/dom.html#global-attributes
    $restrictions['allowed']['*'] = [
      'style' => FALSE,
      'on*' => FALSE,
      'lang' => TRUE,
      'dir' => ['ltr' => TRUE, 'rtl' => TRUE],
    ];

… which means that content in Drupal 8/9 with CKEditor 4 that uses the lang attribute, or sets the dir="rtl" attribute, would be lost upon editing that content with CKEditor 5.

Here's a failing test that proves it.

Status: Needs review » Needs work

The last submitted patch, 27: 3231334-27.patch, failed testing. View results

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
5.56 KB
2.45 KB
25.46 KB

While #27 proved the bug, it mixes it with other test coverage. That's bad.

Furthermore, once the #3259555: Organize and rename CKEditor 5 tests and test classes test reorganization happens, it'll be even more confusing.

So let's get this test right the first time.

Wim Leers’s picture

Fix cspell + typo.

Status: Needs review » Needs work

The last submitted patch, 30: 3231334-30.patch, failed testing. View results

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
1.16 KB
25.36 KB

The test coverage in #7 was wrong: it shouldn't have expected the * tag to be returned by ::getAllowedElements(), only by ::getAllowedElements(FALSE) — because only the latter returns wildcards.

Wim Leers’s picture

Status: Needs review » Needs work

The last submitted patch, 33: 3231334-32-rebased.patch, failed testing. View results

Wim Leers’s picture

I've been working to get SmartDefaultSettingsTest to pass …

😬 Only now I discovered https://html.spec.whatwg.org/multipage/dom.html#global-attributes, which in turn led me to find #2549077: Allow the "Limit allowed HTML tags" filter to also restrict HTML attributes, and only allow a small whitelist of attributes by default which introduced

    // The 'style' and 'on*' ('onClick' etc.) attributes are always forbidden,
    // and are removed by Xss::filter().
    // The 'lang', and 'dir' attributes apply to all elements and are always
    // allowed. The list of allowed values for the 'dir' attribute is enforced
    // by self::filterAttributes(). Note that those two attributes are in the
    // short list of globally usable attributes in HTML5. They are always
    // allowed since the correct values of lang and dir may only be known to
    // the content author. Of the other global attributes, they are not usually
    // added by hand to content, and especially the class attribute can have
    // undesired visual effects by allowing content authors to apply any
    // available style, so specific values should be explicitly allowed.
    // @see http://www.w3.org/TR/html5/dom.html#global-attributes
    $restrictions['allowed']['*'] = [
      'style' => FALSE,
      'on*' => FALSE,
      'lang' => TRUE,
      'dir' => ['ltr' => TRUE, 'rtl' => TRUE],
    ];

😬 This made me realize that the approach in all of the patches above has been wrong on two counts:

  1. The <*> "joker" tag is misnamed: it should be named the "global attributes" tag.
  2. It should not behave like a wildcard tag, but like a concrete tag: precisely because it is to allow a particular attribute globally, it should not be resolved into a concrete set of tags.
Wim Leers’s picture

As a nice consequence, the solution becomes very elegant too, and the original scope of the issue easily gets solved: FundamentalCompatibilityConstraintValidator starts to complain about global attributes supported by filter_html but not by CKEditor 5! 🥳

(This reverts #8's change to FundamentalCompatibilityConstraintValidator.)

The last submitted patch, 35: 3231334-35.patch, failed testing. View results

Wim Leers’s picture

As of #36, we should start to see concrete helpful failures. But … they're lacking a list of unsupported tags and attributes! That's because the violation constraint messages are using ::toFilterHtmlAllowedTagsString(), but as of #35, that's been updated to not ever generate a <* something> entry for the allowed_tags setting of FilterHtml, because that is not supported by that plugin.

So, we need to generate a list of unsupported tags/attributes slightly differently.

Now we should start to see The following elements are not supported: <* lang dir="ltr rtl"> as expected 🤓

Wim Leers’s picture

So now we've established that <*> is the global attribute `*` HTML tag. Great.

… but ckeditor5_sourceEditing today has this in its plugin definition:

     # This is the only CKEditor 5 plugin allowed to generate a superset of elements.
     # @see \Drupal\ckeditor5\Plugin\CKEditor5Plugin\SourceEditing::getElementsSubset()
    elements: ['<*>']

… that is now problematic. And it sort of always was, but that's only become clear now 🙈

Let's refactor it to set elements: [] by default.

The last submitted patch, 36: 3231334-36.patch, failed testing. View results

The last submitted patch, 38: 3231334-37.patch, failed testing. View results

Wim Leers’s picture

Yay, #36 is showing a huge increase in the number of test failures! That proves that the original scope of this issue has been met: validation for attributes allowed/forbidden on all elements (aka global attributes that are allowed or forbidden).

Time to add the plugins that we need to solve the data loss bug!

Introducing:

ckeditor5_globalAttributeDir
Supports <* dir="ltr rtl"> by using GHS
ckeditor5_globalAttributeLang
Supports <* lang> by using GHS

They're automatically enabled for text formats using filter_html.

The last submitted patch, 39: 3231334-39.patch, failed testing. View results

Wim Leers’s picture

I think this should be the final thing — making sure that we do not treat CKEditor 5 not supporting forbidding global attributes a problem at all, because CKEditor 5 specifically requires everything to be explicitly supported. If it's not supported at all, there's also no need to then explicitly forbid it.

Hopefully this will be green 🤞

The last submitted patch, 42: 3231334-41.patch, failed testing. View results

Wim Leers’s picture

Title: Add validation for attributes allowed or forbidden on all elements » Global attributes (<* lang> and <* dir="ltr rtl">): validation + support (fix data loss)
Issue summary: View changes
Issue tags: -Needs title update, -Needs issue summary update

Status: Needs review » Needs work

The last submitted patch, 44: 3231334-43.patch, failed testing. View results

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
1.15 KB
40.08 KB

The WildcardHtmlSupportTest failures are simple to fix.

Status: Needs review » Needs work

The last submitted patch, 48: 3231334-48.patch, failed testing. View results

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
1.01 KB
41.04 KB

One particular edge case that is being tested in \Drupal\Tests\ckeditor5\Kernel\ValidatorsTest::testPair() needed to have an expectation updated — easy!

Status: Needs review » Needs work

The last submitted patch, 50: 3231334-50.patch, failed testing. View results

Wim Leers’s picture

Looks like the remaining failures are a genuine regression that this introduced. Will look into that tomorrow.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
5.67 KB
40.19 KB

First, clean-up. Let's make this more consistent.

This way, reviews could already start without being too confusing 🤓

Status: Needs review » Needs work

The last submitted patch, 53: 3231334-53.patch, failed testing. View results

bnjmnm made their first commit to this issue’s fork.

bnjmnm’s picture

The current test failures seem to be related to html restrictions no longer restricting as expected (for example, a not-allowed <div> is permitted by CKEditor 5 instead of being stripped/converted to something supported).
I believe this is due to CKEditor 5's htmlSupport syntax behaving differently from Drupal's core

So a rule like

htmlSupport:
        allow:
          -
            name:
              regexp:
                pattern: /.*/
            attributes:
              - key: dir
                value:
                  regexp:
                    pattern: /^(ltr|rtl)$/

Seems to be saying ANY tag is allowed, and those tags may have a dir attribute.

With the plugins added in this issue I'm able to add any tag when editing source and it remains there. The newly-allowed tags can only use the attributes specified in ckeditor5_globalAttributeDir + ckeditor5_globalAttributeLang.
This could be addressed (I believe) by creating plugin classes that dynamically provide the GHS config, I'm not sure if there are aspects that should be handled upstream instead.

Wim Leers’s picture

#57 is spot-on. The solution is very simple though: we just need to not pass this global attribute * HTML tag to CKEditor 5, because it's a Drupal-specific concept, not a CKEditor 5 one … and we can instruct CKEditor 5 exactly which tags it should allow these attributes on, because we know what the full list of supported HTML tags is!

That idea is very simple, now to actually implement it 🤓

Wim Leers’s picture

Assigned: Wim Leers » Unassigned
Status: Needs work » Needs review
FileSize
3.35 KB
42.18 KB

This implements #58.

Status: Needs review » Needs work

The last submitted patch, 59: 3231334-59.patch, failed testing. View results

Wim Leers’s picture

Status: Needs work » Needs review
Wim Leers’s picture

100% unrelated test failures, from Quick Edit to Migrate 🤪

bnjmnm’s picture

Status: Needs review » Reviewed & tested by the community

Feedback addressed, this is RTBC!

Wim Leers’s picture

lauriii’s picture

Status: Reviewed & tested by the community » Needs work

Posted some feedback on the MR.

Wim Leers’s picture

Status: Needs work » Needs review

All feedback addressed 👍

Wim Leers’s picture

Error: Call to undefined method Drupal\ckeditor5\HTMLRestrictions::isEmpty()

/var/www/html/core/modules/ckeditor5/src/SmartDefaultSettings.php:178

This makes no sense at all, because that line in this MR contains return [$editor, $messages]; 🤪🙃.

There are only two possible ways that this only reason I could possibly understand this happening: GitLab is wildly broken and testing the wrong commit, or the branch needs to be rebased. Let's assume it's the latter.

Wim Leers’s picture

Ah, now it makes sense! #3261943: Confusing behavior after pressing "Apply changes to allowed tags" with invalid value just landed and it added code to SmartDefaultSettings, and that needs to be updated still. 👍

Fixed now 😄

bnjmnm’s picture

Status: Needs review » Reviewed & tested by the community

Back to RTBC. Leaving the method renaming thread open since that's a bit subjective, but I offered some feedback there.

Wim Leers’s picture

Commit 5c0879dec73257c25762825bf8b4427cab0f3b71 in patch form, should apply to all 3 branches.

  • lauriii committed 464291f on 10.0.x
    Issue #3231334 by Wim Leers, bnjmnm: Global attributes (<* lang> and...

  • lauriii committed f6058b8 on 9.4.x
    Issue #3231334 by Wim Leers, bnjmnm: Global attributes (<* lang> and...
lauriii’s picture

Version: 9.4.x-dev » 9.3.x-dev

Committed 464291f and pushed to 10.0.x. Also cherry-picked to 9.4.x. Thanks!

Leaving open for 9.3.x but that requires another patch.

Wim Leers’s picture

Wim Leers’s picture

  • lauriii committed d65692c on 9.3.x
    Issue #3231334 by Wim Leers, bnjmnm: Global attributes (<* lang> and...
lauriii’s picture

Status: Reviewed & tested by the community » Fixed

Committed d65692c and pushed to 9.3.x. Thanks!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Wim Leers’s picture