Problem/Motivation
There's currently a lack of clarity in the half a dozen methods on HTMLRestrictionsUtilities
. \Drupal\ckeditor5\Plugin\CKEditor5PluginManager::getReadableElements()
and \Drupal\ckeditor5\Plugin\CKEditor5PluginManager::getBlockElementList()
should also be refactored out of there.
IMHO by far the clearest solution here is to stop treating this specially structured as an array. We already started doing that by introducing that utility class in the first place in #3216015: Generate CKEditor 5 configuration based on pre-existing text format configuration, where we added \Drupal\ckeditor5\HTMLRestrictionsUtilities::diffAllowedElements()
. #3222852: <dl> <dt> <dd> by introducing "Manually editable HTML tags" configuration on Source Editing made it grow a lot, and just made it clear we really need better developer ergonomics around this — the risk of bugs is otherwise very high.
Therefore I think introducing a HTMLRestrictions
value object would make this a lot less brittle, and far simpler to test:
final class HTMLRestrictions {
public static function create(string $string): HTMLRestrictions;
public function diff(HTMLRestrictions $other): HTMLRestrictions;
public function intersect(HTMLRestrictions $other): HTMLRestrictions;
public function toCKEditor5ElementsArray(): string[];
public function toFilterHtmlAllowedTagsArray(): string[];
public function toGeneralHtmlSupportConfig(): string[];
}
Steps to reproduce
Proposed resolution
Remaining tasks
User interface changes
API changes
Data model changes
Comment | File | Size | Author |
---|---|---|---|
#65 | filter_html.zip | 9.7 KB | aitala |
#62 | D937-error.txt | 9.01 KB | aitala |
#54 | diff-between-patches.txt | 577 bytes | Wim Leers |
#54 | 3228334-54-mr-commit-1632af75-9.3.x.patch | 130.25 KB | Wim Leers |
#54 | 3228334-54-mr-commit-1632af75-9.4.x-and-10.0.x.patch | 130.25 KB | Wim Leers |
|
Issue fork drupal-3228334
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Issue fork ckeditor5-3228334
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
- 3228334-refactor-htmlrestrictionsutilities changes, plain diff MR !130
Comments
Comment #2
Wim LeersAnd as @bnjmnm pointed out: this allows us to add unit tests too! :)
Comment #3
Wim LeersWhile working on this, identified #3231327-13: Plugin definition DX: validate ckeditor5.drupal.elements items . That will mean one less
HTMLRestrictionsUtilities
call to convert.Comment #4
Wim LeersI've been working on unit tests first and foremost and have found some issues. Doing this refactor will make it far easier to detect those subtle bugs ahead of time, that would also make the automatic upgrade path not fail in those subtly broken ways (when it comes to interpreting differences between two sets of HTML restrictions, to determine what the appropriate course of action is during the automatic upgrade path).
More news tomorrow.
Comment #5
Wim Leersℹ️See #3222852-18: <dl> <dt> <dd> by introducing "Manually editable HTML tags" configuration on Source Editing and #3222852-20: <dl> <dt> <dd> by introducing "Manually editable HTML tags" configuration on Source Editing by @bnjmnm and I, where we both strongly agreed that unit tests for all these complex comparisons are essential.
Comment #7
Wim LeersCrediting Ben and I already, for the work that led to this issue.
Comment #9
Wim LeersYay, the unit test is working, and already finding problems! 🤓🥳
Comment #10
Wim LeersThe new unit test's results before
88b0f21ed7bfe56a0479063bcd51b3d423572ee4
("Fix ::intersect()"):vs after:
Progress!
Comment #11
Wim LeersYay #3231327-13: Plugin definition DX: validate ckeditor5.drupal.elements items landed :)
Comment #12
larowlanComment #13
Wim LeersPer #3249240-21: HTMLRestrictionsUtilities:: providedElementsAttributes() causes deprecations on PHP 8.1 by @larowlan: this is considered important by core committers. Not in the least because not having this done already meant it was pretty painful to figure out what were the acceptable changes to get the current code to work on PHP 8.1.
Comment #14
Wim LeersI'm working on the port of this MR!
Comment #15
Wim LeersPostponed #3231336: Simplify HtmlRestrictions and FundamentalCompatibilityConstraintValidator now that "forbidden tags" are deprecated, #3231334: Global attributes (<* lang> and <* dir="ltr rtl">): validation + support (fix data loss), #3231328: SmartDefaultSettings should select the CKE5 plugin that minimizes creation of HTML restriction supersets and #3228691: Restrict allowed additional attributes to prevent self XSS on this.
Comment #18
Wim LeersPostponed #3256616: AssertionError: assert($value === TRUE || Inspector::assertAllStrings($value)) in assert() (line 242 of core/modules/ckeditor5/src/HTMLRestrictionsUtilities.php) on this.
Comment #20
Wim LeersLooks like I triggered an infinite retesting loop in DrupalCI: https://www.drupal.org/pift-ci-job/2289771 → https://dispatcher.drupalci.org/job/drupal_patches/109074/console + https://dispatcher.drupalci.org/job/drupal_patches/109076/console + https://dispatcher.drupalci.org/job/drupal_patches/109077/ …
🙈
Asked in
#drupal-infrastructure
on Slack to stop it — I don't have the power.But … at least locally, I'm now seeing
which is identical to what I had in #10 👍 (and also identical to the last test result that was tested against the contrib project: https://www.drupal.org/pift-ci-job/2211769 for https://git.drupalcode.org/project/ckeditor5/-/merge_requests/130/diffs?...)
Comment #21
Wim LeersYay, tests are working now! The new test case I added based on the bug report in #3256616: AssertionError: assert($value === TRUE || Inspector::assertAllStrings($value)) in assert() (line 242 of core/modules/ckeditor5/src/HTMLRestrictionsUtilities.php) is passing. So this will fix that issue too. Marked that issue as a duplicate.
Please also credit
beatrizrodrigues
andDieterHolvoet
for their work on that issue! 🙏Comment #24
tim.plunkettComment #25
beatrizrodriguesthank you @Wim Leers and @tim.plunkett :)
Comment #26
DieterHolvoet CreditAttribution: DieterHolvoet as a volunteer commentedThanks for the credits and the quick fix!
Comment #27
Wim Leers76d34c21 added much-needed validation that will prevent much future pain. It caused a significant number of additional tests to fail.
4970a06a fixed the bugs that the new validation uncovered.
c65b4550 added more test coverage that I realized was necessary thanks to the preceding commit, which in turn caused more fixes.
Comment #28
Wim Leers05310198 added test coverage for
HTMLRestrictions::union()
, which was completely missing until now. Note that the implementation of::union()
is identical to that inHEAD
!Test results for the commit prior to that one:
Tests results for that commit:
So … that indicates there were multiple bugs in
HEAD
that this issue is fixing! (Now removing the tag — I will continue adding more test coverage, but at this point I think I've proven beyond a doubt that there are a number of bugs that this issue is solving. 👍)73ac5784956f5b777c2c29510130fe1433e2568f should fix those failures.
Comment #29
Wim Leers3 of the 4 remaining failures (yes that "1 fail" is actually four … blame DrupalCI) should be fixed in https://git.drupalcode.org/project/drupal/-/merge_requests/1628/diffs?co...
Comment #30
Wim Leers4f07de09 passed testing. 😱 That means that the last caller of any
HTMLRestrictionsUtilities
code is in a section ofSmartDefaultSettings
with zero test coverage.So, to be able to proceed, we need test coverage that tests that untested code path, which should then trigger a test failure. I'll add that here, but it really should not land in this MR — that makes this MR too unwieldy to review. Opened #3259174: Add missing CKE5 SmartDefaultSettings test coverage (wildcard tag with unsupported attribute) for that.
Comment #31
Wim LeersAlso filed #3259179: Split ckeditor5_alignment CKEditor 5 plugin, to allow for more precise upgrade path.
Pushed #3259174-3: Add missing CKE5 SmartDefaultSettings test coverage (wildcard tag with unsupported attribute) as a commit on this MR. That should now cause this to fail, fixing the scare we had in 4f07de09! 👍
Comment #32
Wim LeersYay, that resulted in the failure we hoped:
Comment #33
tim.plunkettBoth blockers have been committed!
Comment #34
Wim LeersWonderful! Just pushed a rebased branch. The following 3 commits are now omitted from the branch thanks to those issues landing:
That should trigger some new failures, because more bugs will now be exposed. In particular, the data structure validation added to the constructor (in
validateAllowedRestrictionsPhase4()
) should detect that there is a problem with the computed union for wildcard tags (<$block>
).It is no coincidence that wildcard handling also is the very last remaining thing for which
SmartDefaultSettings
is callingHTMLRestrictionsUtilities
. I already have extra explicit test coverage locally that I will push tomorrow, as well as progress on fixes for those.More tomorrow!
Comment #35
Wim LeersThat happened 🥳 This is the error that is causing all test failures:
… but there is no unit test yet showing what the root cause. The root cause is that the computed union (of all elements supported by all available CKEditor 5 plugins) is not correct. This is also broken in HEAD but the absence of validation makes it impossible to see.
So, just pushed https://git.drupalcode.org/project/drupal/-/merge_requests/1628/diffs?co... which adds a unit test that makes that easy to see 👍
Comment #36
Wim Leers"Not currently mergeable" → due to #3248177: Language toolbar item cannot be removed from the toolbar just having landed half an hour ago, this resulted in a simple conflict. Rebased!
Comment #37
Wim LeersAlright, as promised in #35: pushed wildcard test cases for
::union()
that reproduced the failure. Then started to apply the same wildcard test cases to::diff()
and::union()
. I realized I was doing a lot of copy/pasting … and so I refactored that into a single test method that tests all 3 operations, and a single data provider that has a lot of:test cases in its data provider. This ensures that every operation has the exact same test coverage (I had made mistakes in my copy/pasting before! 😬), and also shows very clearly how different operations have different effects. Also, I added the ability to not copy/paste needlessly: instead of a
HtmlRestrictions
object as an expectation, you can also specify'a'
or'b'
— meaning the result ofa→operation(b)
is … exactly operanda
or operandb
. This makes the test coverage a lot more maintainable.Next up: making all those tests actually pass!
Comment #38
Wim LeersWe went from
to the commit that fixed
::union()
getting us to:… and no other failures 😄
Now the commit fixing
::diff()
and::intersect()
should get us to 100% green again. 🤞Once that has happened, I can refactor away the last use of
\Drupal\ckeditor5\HTMLRestrictionsUtilities::getWildcardTags()
fromSmartDefaultSettings
like I said yesterday in #34 … 🤓🥳Comment #39
Wim LeersVICTORY: https://git.drupalcode.org/project/drupal/-/merge_requests/1628/diffs?co... → this was the last use of
HTMLRestrictionsUtilities
.#3259174: Add missing CKE5 SmartDefaultSettings test coverage (wildcard tag with unsupported attribute) added test coverage to HEAD for this piece of code. Now it is drastically simplified, thanks to the new
HTMLRestrictions
value object having::diff()
,::intersect()
and::union()
operators. That means that individual pieces of code no longer need to have complex logic (in this case: eight levels of nestedif
andfor
statements!!!!!) to interpret/compare\Drupal\filter\Plugin\FilterInterface::getHTMLRestrictions()
arrays! 🥳Note that I’m only doing minimal simplification here — the goal here is to refactor
HTMLRestrictionsUtilities
away, not to refactorSmartDefaultSettings
. So I’m resisting that temptation, to land this as soon as possible.In the commit linked above, you can see ~60 lines of complex code disappear. The complexity is now in those 3 operations. We just use those operations. And those operations are very thoroughly unit-tested. Therefore fixing any edge cases that I might have missed will be orders of magnitude simpler the next time 😊
Note that this merge request only handles the
allowed
top-level key of\Drupal\filter\Plugin\FilterInterface::getHTMLRestrictions()
—forbidden_tags
is currently ignored. But that is no regression compared to HEAD anyway. #3231336: Simplify HtmlRestrictions and FundamentalCompatibilityConstraintValidator now that "forbidden tags" are deprecated will have to add support for that. We could do it here too, but then the scope will become even bigger.Note that if you look at the total set of changes at https://git.drupalcode.org/project/drupal/-/merge_requests/1628/diffs, then you’ll notice that every file sees a net reduction in lines of code — with the exception of the new value object’s class and its test coverage of course. There’s a net increase of ~1000 lines of code. But ~700 of those are for test coverage! Much of the remaining 300 is due to additional comments.
(And again, note that
SmartDefaultSettings
will be able to get simplified further, but I consider that out of scope here.)Follow-ups that will be unblocked once this lands:
I'll push a few clean-up commits tomorrow, but this is fundamentally ready for review. I'd be happy to walk reviewers through the changes in a call! 🤓
Comment #40
Wim LeersSince my last comment:
HTMLRestrictions
FundamentalCompatibilityConstraintValidator
to use the new value object operations instead of custom array logic in 39c99b2f + de2a1b57SmartDefaultSettings
to use the new value object operations instead of custom array logic in f9956225 + 5603a62cComment #41
Wim LeersThanks to this issue, we discovered #3259179: Split ckeditor5_alignment CKEditor 5 plugin, to allow for more precise upgrade path, which led to #3259367: [upstream] Allow configuring *which* HTML tags can be aligned, which now led to #3260869: Resolve mismatch between <$block> interpretation by CKEditor 5 and Drupal …
Comment #42
Wim LeersWalked @bnjmnm through most of the commit history starting at the beginning and going all the way to https://git.drupalcode.org/project/drupal/-/merge_requests/1628/diffs?co.... Things made sense to him so far. He made 3 great suggestions:
::parse()
to::fromString()
, for symmetry with::fromFilterPluginInstance()
and::fromTextFormat()
@see
makes that very clear::union()
to::merge()
, because A)::union()
sounded unfamiliar and::merge()
does not (becausearray_merge()
), B) "union" is not actually a verb/action, and "diff" and "intersect" are, but "unionize" … means something entirely different :PComment #43
bnjmnmWhoa!
HTML restrictions are represented in several different ways depending on where they are being processed - Strings of HTML with special wildcard characters, HTML Filter config arrays, CKEditor 5 allowed tags config (that support different wildcard characters), and CKEditor 5 GHS config. These various formats need to be compared and combined - the process for doing this was distributed across multiple classes. They might be stored as a string, inside a text format, or an object such as a plugin defintion. This centralizes these representations/locations into a single object and that normalizes for comparison/combining. The HTMLRestrictions constructor accepts every possible representation and is able to be compared/combined with any other HTMLRestrictions object, regardless of how it was constructed.
@Wim Leers walked me through the entire MR. Despite the size of the diff, this is largely repackaging logic that already existed in CKEditor 5 in a manner that makes it easier to test, then improving that logic based on new tests that failed. Many of the use cases that surfaced were due to PHP's array intersect/merge etc. functions being insufficient because HTMLRestrictions config for tags and attributes can be an array or a boolean values.
Attached patch in case we've reached "needs rebase" fatigue.
Comment #45
lauriiiMoving back to RTBC due to random failure in a test case that is known for random test failures.
Comment #46
Wim LeersFYI:
1711 insertions, 511 deletions.
→ this breaks down into:
\Drupal\ckeditor5\HTMLRestrictions
\Drupal\ckeditor5\HTMLRestrictionsUtilities
\Drupal\Tests\ckeditor5\Unit\HTMLRestrictionsTest
HTMLRestrictionsUtilities
have stayed equally big (2 files), or became smaller (7 files), and their logic always became simpler.Comment #47
larowlanHiding patches because there's an MR
Comment #48
larowlanGreat work, left a review.
Love how much cleaner consuming code is with this patch.
Moving back to NR because there's a few unresolved items from both myself and @lauriii
Comment #49
Wim LeersAddressed all of @lauriii's feedback. Implicitly addresses some of @larowlan. Will do Lee's tomorrow :)
Comment #50
Wim LeersNow also addressed all of @larowlan's feedback.
P.S.: it took four minutes for this to even begin queueing a test run 🤯 Also, all those "this line changed in version X of the diff" auto-comments are useless in the context of the d.o issue, because the parent context is not displayed. So much distraction on this issue 😔
Comment #51
Wim LeersPatch to make this committable to 9.3, 9.4 and 10.0.
Comment #52
lauriiiThank you! The MR looks good for me now. All feedback has been resolved with the exception of one comment where I wasn't fully sure if it has been addressed: https://git.drupalcode.org/project/drupal/-/merge_requests/1628#note_69392. Asked @Wim Leers about that and he said he is 99% certain that he has been able to address the feedback from @larowlan. Based on that, moving to RTBC.
Comment #53
Wim LeersYep, that specific comment by @larowlan was also addressed. I just didn't mark it resolved to give @larowlan the opportunity to provide feedback, since that was the only non-trivial change in response to his feedback 😊🤓
Comment #54
Wim LeersApparently
9.3.x
differs from9.4.x
in nits —$ git diff origin/9.3.x origin/9.4.x -- core/modules/ckeditor5/src/Plugin/Validation/Constraint/FundamentalCompatibilityConstraint* core/modules/ckeditor5/src/Plugin/Validation/Constraint/SourceEd*
yields9.3-differences.txt
.git apply -3
handles these graciously.patch -p1
does not.Uploading a
patch
-compatible9.3.x
patch. Re-uploading the patch in #51 to make it clear it's targeted for both9.4.x
and10.0.x
. Also uploadingdiff-between-patches.txt
to show the actual differences between these two huge patches: it's caused entirely by9.3-differences.txt
!Comment #55
bnjmnmI'm prepared to commit this after the code freeze.
While it's certainly possible there are a few unturned stones in this giant MR, I see no evidence of anything that would have adverse impact. A few nits might still be hiding amongst the ~2000 changed lines of code. Were these to exist, they would be overshadowed by the substantial improvements provided here. Between @Wim Leers providing me a guided tour via Zoom, hours spent combing and Xdebugging this code, and most importantly the thorough + perceptive reviews from @lauriii and @larowlan, this strikes me as ready for commit. I'll be on the lookout for the thaw.
Comment #58
bnjmnmThis beast is in!
Patch applied to 10.0.x, then cherry-picked to 9.4.x. Applied the 9.3.x patch (with its adorable differences) to 9.3.x since CKEditor 5 is experimental and the only user facing changes are objective improvements in how the allowed html tags field generates its value.
Comment #60
Wim LeersYAY! This unblocks seven issues, all now have patches, one RTBC'able, one in progress, one PoC, 4 indicating next steps:
Comment #62
aitala CreditAttribution: aitala commentedThe change to FilterHtml.php from 9.36 to 9.37 took down my Drupal site... When I replaced FilterHtml.php with the previous version, the site came back up.
The change in 9.37 was:
I've attached the error that was dumped...
Thanks,
Eric
Comment #63
aitala CreditAttribution: aitala commenteddeleted
Comment #64
Wim Leers@aitala: woah! Can you share your exact
filter_html
configuration, or alternativelydrush cget filter.format.YOUR_TEXT_FORMAT
? That way I can try to reproduce the bug. I don't yet see how it's possible that this change caused breakage for you, but I take it very seriously! If you could provide me that info, I promise to dig into it immediately.Comment #65
aitala CreditAttribution: aitala commentedHere is an export of my text formats. I think review_html.yml may be the one to look at
Eric
Comment #66
acbramley CreditAttribution: acbramley at PreviousNext commentedThis broke our site as well. It will happen if any tag has a
*
for attributes. E.g I have<drupal-entity * data-*>
(obviously this looks like semi broken config but it shouldn't cause a Fatal).What happens is the code mentioned in #62 sets the allowed tags to TRUE, then in
filterAttributes
we have the following code:At this point
$tag_attributes
is TRUE and therefore we get an error in the foreachEDIT: Created #3268983: [regression] FilterHtml throws Unsupported operand types error when * used in tag attribute
Comment #67
pingers CreditAttribution: pingers at University of Adelaide commentedThanks for posting @acbramley - also experiencing the same issue with our sites D9 sites because of
<{some element} *>
Comment #68
plachThis bit us as well, I just posted a mitigation patch at #3268983-8: [regression] FilterHtml throws Unsupported operand types error when * used in tag attribute in case anyone needs it.