Problem/Motivation
CKEditor 5 restructures HTML as soon as content is loaded into the editor. Existing pages keep their original markup until the corresponding node is opened in /edit mode.
Steps to reproduce
- Set up a D10 site.
- Enable CKEditor 5.
- Configure any text format to use "CKEditor 5" as the text editor in /admin/config/content/formats
- Input the following in "Source":
<div class="social-media"> <span>Share</span> <span class="icon"> <a href="#" target="_blank" rel="noopener"> <em class="fa-fw fa-twitter fab"> </em> </a> </span> </div>
- The structure gets changed into:
<div class="social-media"> <span>Share</span> <a href="#" target="_blank" rel="noopener"> <em class="fa-fw fa-twitter fab"> <span class="icon"> </span> </em> </a> </div>
Proposed resolution
Make sure that the HTML Structure doesn't get changed.
Comments
Comment #2
wim leers
That markup looks like you're using https://www.drupal.org/project/fontawesome. Are you?
See #3274028-8: CKEditor 5 compatibility and #3274028-29: CKEditor 5 compatibility.
Comment #3
ighosh commented
Hi Wim, here is another example of the DOM being restructured in a way I don't want -
Input -
<div class="container"> <span class="icon"><a href="#" target="_blank"><em>SOMETHING</em></a></span></div>
Output -
<div class="container"><a href="#" target="_blank"><em><span class="icon">SOMETHING</span></em></a></div>
The main issue with this kind of restructuring is that it affects every existing node the moment it is opened in "/edit" and re-saved. I am currently tracing what happens when I click the "Source" button. Maybe some .js file gets called which in turn filters and restructures the DOM(?).
Comment #4
wim leers
I can see how that's disruptive. But that sure looks like some pretty questionable HTML 😅 That makes this difficult to describe and report. Can you still reproduce this without the <em> too? Try to find the smallest possible pattern, and then verify that it works with multiple tag combinations. That'd help report this upstream, and would result in a higher priority upstream.
Comment #5
ighosh commented
@Wim, I removed <em> from my DOM -
It is getting changed into -
However, upon further testing, this is not the only instance where the DOM is getting changed. I am checking on a few more instances of HTML structure, where the DOM is getting changed. Will keep everything updated here.
Comment #6
wim leers
Thanks!
Comment #7
ighosh commented
Tested for these cases -
Input -
<a href="#"> </a>
Output - The entire thing got removed. However, if I enter <a href="#">Lorem Ipsum</a>, it works.
Another test case -
Input -
Output -
Here, the anchor tag that contains the main <svg><path></path></svg> is being copied after the main anchor tag, and put inside a paragraph (<p></p>).
Comment #8
ighosh commented
Comment #9
mvonfrie commented
That is related to the HTML normalization "feature" of CKEditor 5. See https://github.com/ckeditor/ckeditor5/issues/16203 for more examples.
Comment #10
quietone commented
Comment #11
ighosh commented
Regarding this issue, I found that there was no easy way to "fix" the problem, because it is not really a bug in the first place: CKEditor 5 was altering the HTML because the markup itself was wrong (obviously). So I updated the DOM structure in code. An update hook queued all the nodes that needed DOM processing, and a QueueWorker then processed them.
Here is a gist of how the work was done. Please note that I targeted only nodes that use certain paragraph components, since the DOM alteration only occurred in nodes containing those components.
Update Hook -
QueueWorker -
Then, in the QueueWorker, I used a switch-case to target each paragraph type and apply its corresponding DOM processing.
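For readers who want a picture of the pattern described above, here is a minimal sketch written in Python rather than Drupal PHP. It is not the code used on the site. The component name `social_links`, the item fields (`nid`, `component`, `html`) and the `unwrap_icon_span` rewrite are all hypothetical, chosen only to illustrate queueing items and dispatching each one to a per-component DOM fix. Whether the rewritten markup survives CKEditor 5 would still need to be checked in the editor.

```python
from collections import deque
import xml.etree.ElementTree as ET


def unwrap_icon_span(html: str) -> str:
    """Hypothetical fix: turn <span class="icon"><a>..</a></span> into <a class="icon">..</a>.

    Assumes the fragment is well-formed enough to parse as XML.
    """
    root = ET.fromstring(html)
    # Collect matches first so the tree is not modified while iterating it.
    targets = [
        (parent, child)
        for parent in root.iter()
        for child in parent
        if child.tag == "span"
        and child.get("class") == "icon"
        and len(child) == 1
        and child[0].tag == "a"
    ]
    for parent, span in targets:
        anchor = span[0]
        anchor.set("class", "icon")   # move the class onto the anchor
        anchor.tail = span.tail       # keep whatever followed the span
        index = list(parent).index(span)
        parent.remove(span)
        parent.insert(index, anchor)
    return ET.tostring(root, encoding="unicode", method="html")


# Stand-in for the switch-case: one processor per (hypothetical) component type.
PROCESSORS = {"social_links": unwrap_icon_span}


def process_queue(queue: deque) -> dict:
    """Drain the queue, applying the processor registered for each item's component."""
    results = {}
    while queue:
        item = queue.popleft()
        processor = PROCESSORS.get(item["component"])
        # Items with no registered processor are passed through unchanged.
        results[item["nid"]] = processor(item["html"]) if processor else item["html"]
    return results
```

In Drupal the queueing side would live in an update hook and the dispatch in a QueueWorker plugin; the sketch only shows the shape of the logic.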
Hope this helps someone :)
Comment #12
mvonfrie commented
Why is your first example
<a href="#"> </a>
(obviously) wrong? Syntactically it is entirely correct, although semantically it makes little sense because a user will never be able to click this link. If it is used as a trap link (a kind of honeypot) with a special URL, so that you know the link was followed by a robot and not a human, then it makes sense again.
It would be interesting to see what CKEditor 5 does with this:
<a name="top"> </a>
This is an invisible anchor that can be used as a jump target, for example for a "Top" button at the end of the page, or floating at the bottom, that jumps back to the start of the page (after the header, banner image, etc.).
In my opinion, CKEditor 5 should correct syntactically wrong markup but not interpret syntactically correct markup that may appear to make no sense, as it cannot know a developer's intentions.
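A rough way to find stored markup containing the whitespace-only anchors discussed in this thread, before it is opened in the editor, is to scan it with a small parser. The sketch below uses Python's standard `html.parser`. The names `EmptyAnchorFinder` and `find_empty_anchors` are made up for illustration, and treating any nested element as "content" is a simplifying assumption, not a statement about what CKEditor 5 keeps or removes.

```python
from html.parser import HTMLParser


class EmptyAnchorFinder(HTMLParser):
    """Collects the attributes of <a> elements that contain only whitespace."""

    def __init__(self):
        super().__init__()
        self.empty_anchors = []
        self._open = []  # one [attrs, has_content] entry per currently open <a>

    def handle_starttag(self, tag, attrs):
        # Assumption: any nested element counts as content for enclosing anchors.
        for entry in self._open:
            entry[1] = True
        if tag == "a":
            self._open.append([dict(attrs), False])

    def handle_data(self, data):
        # Non-whitespace text counts as content for every open anchor.
        if data.strip():
            for entry in self._open:
                entry[1] = True

    def handle_endtag(self, tag):
        if tag == "a" and self._open:
            attrs, has_content = self._open.pop()
            if not has_content:
                self.empty_anchors.append(attrs)


def find_empty_anchors(html: str) -> list:
    """Return the attribute dicts of all whitespace-only anchors in the markup."""
    finder = EmptyAnchorFinder()
    finder.feed(html)
    finder.close()
    return finder.empty_anchors
```

Calling `find_empty_anchors` on each stored text field would flag both the `href="#"` example and the `name="top"` jump target, so they can be reviewed by hand.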
Comment #13
skowyra commented
We've also been running into this behavior: HTML tags and classes get stripped out in CKEditor 5. I can see how normalization could be the culprit, but in our case we have a clunky workaround for when it occurs. If re-saving doesn't work after several attempts, we copy the content (Source), paste it into a text editor, add the new class or HTML there, and copy the updated content back. That usually works.
The fact that we can eventually save it suggests that normalization may not be the root cause. Let me add that this behavior occurs in nodes and webforms, and we also use Site Studio, where it occurs in our components.
This started happening when we upgraded to CKEditor 5. We're currently on Drupal 10.2.4, but will be going up to 10.3 very soon.
Comment #14
lisa.rae commented
Also encountered this issue on a site that was recently upgraded from CKEditor 4 to CKEditor 5. I edited a footer block that was created with CKEditor 4 to make a minor text edit. The block also contained FontAwesome social media icons, which were not touched by the text edits.
Saving the footer resulted in the FontAwesome social media icons getting wrapped in <em></em> tags.
Original content:
Changed to: