Problem/Motivation
Xss::filter()
does not handle HTML tags inside attributes. The respective core issue for this is #3227831: Xss::filter() does not handle HTML tags inside attribute values. However, since the core issue requires changing Xss::filter()
, it's unlikely to be solved any time soon. Ideally we could change how attributes are encoded to ensure compatibility with Xss::filter()
. This is required for for #3222808: Follow-up for #3201646: markup in image captions is lost. However, this could be reproduced outside of that issue in some edge cases. For example, if I wanted to add alt="</em> a strange alt text"
attribute to an img
element, CKEditor 5 would happily allow us to do that through the UI.
Proposed resolution
Encode attributes to match expectations by Xss::filter()
.
Remaining tasks
User interface changes
API changes
Data model changes
Issue fork ckeditor5-3247246
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
Wim LeersBut this is not a case where we need to support it: we only need to support it for
data-caption
. For all other attributes, we can strip away all HTML, can't we? Even if CKEditor 5 does not do that by default.Comment #3
effulgentsia CreditAttribution: effulgentsia at Acquia commentedRe #3222808-20: Follow-up for #3201646: markup in image captions is lost, HEAD's
Html::normalize()
function already converts<
and>
inside of attribute values to<
and>
.So a possible solution here is to make sure
Html::normalize()
is called beforeXss:filter()
. The question is where? I think the best option would be to callHtml::normalize()
when saving. Not sure where the best place to do that would be if the goal is to only do it when saving a textarea that was edited by CKE5.Comment #4
Wim LeersWe can't do that, at all.
We should not be processing the HTML generated by CKE5 after it's done its job and before it gets saved. That'd be violating the abstraction layer provided by the Text Editor module.
Comment #5
effulgentsia CreditAttribution: effulgentsia at Acquia commentedAnother approach is to change
FilterHtml::process()
to callHtml::normalize()
before callingXss::filter()
. It already normalizes the HTML after calling that, viafilterAttributes()
, so given that, I don't think that normalizing before as well would regress anything. But if we're going to take this approach, let's do it as a separate core issue, since then that's not scoped to ckeditor5 module.Comment #6
Wim LeersThat's my point exactly: either we need to modify Drupal core (and that has huge risks, not in the least in terms of timeline), or we need to tweak CKEditor 5. We previously concluded we needed to tweak CKEditor 5. We got the infrastructure we needed to be able to do that. But … it appears to be inadequate, I just don't understand why yet. See #3222808-26: Follow-up for #3201646: markup in image captions is lost, where I'm asking @lauriii for clarifications — because I definitely feel lost right now 😳.
Comment #7
lauriiiWe can encode the attributes to match
Xss::filter()
expectations by overriding the CKEditor HTML Writer as mentioned in #3222808-26: Follow-up for #3201646: markup in image captions is lost. I created this spinoff because I realized this problem goes beyond just caption markup. This is about<
and>
characters in all attributes.If we want to match CKEditor 4 behavior closely as possible, we would have to encode HTML in all attributes.
Comment #9
lauriiiComment #10
Wim LeersEpic work here, @lauriii!
Comment #11
Wim Leers🚀
Comment #12
Wim LeersAny idea what could cause this? 😳
Comment #13
Wim LeersThis looks great, no further remarks, just the single failure left to be addressed! 🥳
EPIC work!
Comment #16
lauriii