Problem/Motivation

When adding a <script> or <style> elements to a filtered text field Html::escapeCdataElement() gets called by a number of filters via Html::normalize() or Html::serialize() which encloses the contents of the elements in a CDATA, but this is causing some issues.

Basically I have found 3 issues.

1. The opening and closing comments are being wrapped in a HTML comment which is not applicable to a script or a style element. See #3162421: Upgrade HTML::escapeCdataElement to correctly comment CDATA for a fix.
2. Since Html::escapeCdataElement() is called from multiple filters, (i.e. the HTML corrector and the Alignment filters) the CDATA's are not being nested correctly. Infact http://wikipedia.org/wiki/CDATA#Nesting and when it is being called 7 times you get a real mess.

<style type="text/css">
/*<![CDATA[*/

/*<![CDATA[*/

/*<![CDATA[*/

/*<![CDATA[*/

/*<![CDATA[*/

/*<![CDATA[*/

/*<![CDATA[*/
.content { max-width: 100%; }
/*<!]]]]]]]]]]]]]]><![CDATA[><![CDATA[><![CDATA[><![CDATA[><![CDATA[><![CDATA[>*/

/*<!]]]]]]]]]]]]><![CDATA[><![CDATA[><![CDATA[><![CDATA[><![CDATA[>*/

/*<!]]]]]]]]]]><![CDATA[><![CDATA[><![CDATA[><![CDATA[>*/

/*<!]]]]]]]]><![CDATA[><![CDATA[><![CDATA[>*/

/*<!]]]]]]><![CDATA[><![CDATA[>*/

/*<!]]]]><![CDATA[>*/

/*<!]]>*/
</style>

3. Lastly I do not think that we should be nesting CDATA. It is not needed,

Steps to reproduce

In a text filtered field add the following

<style>.content { max-width: 100%; }</style>

If you are using a text format which uses multiple filters like HTML filter, HTML Corrector filter, Align filter, it will result in the multiple nesting.

Proposed resolution

Add a check in the escapeCdataElement() if data contains CDATA and wrap that way.

Previous proposal
I am not sure we need the nesting behaviour, But how we check for it I am not 100% sure. I.e. Should be just search for CDATA or search for "\n{$comment_start}

Html::escapeCdataElement() is only called from Html::serialize() (In core) for script and style elements so really the nesting functionality could be removed.

API changes

If we correct the nesting behaviour to produce correct nesting we may want to add a flag to turn on and off the nesting just in case someone is relying on this feature.

Comments

gordon created an issue. See original summary.

gordon’s picture

Issue summary: View changes
StatusFileSize
new1.98 KB

Here is a first try.

I have included #3162421: Upgrade HTML::escapeCdataElement to correctly comment CDATA in this patch as it is such a small change in a larger issue.

I have decided that nesting should never be done and removed it altogether. However the way I have added this it would be easy to add it back in and resolve how it should nest correctly.

I have made the check so that it will only check for the presents of "CDATA" to determine if it should not be wrapped. This means that if the user wants to handle the handling of the CDATA wrapping themselves they can do it and it will not be done. Also if they do not want the CDATA wrapping they can just add a commented out CDATA and it will not changing anything.

gordon’s picture

StatusFileSize
new1.96 KB
new665 bytes

Small cleanup on the formatting of the CDATA.

gordon’s picture

StatusFileSize
new1.96 KB
new647 bytes

Missed removing a trailing >

gordon’s picture

StatusFileSize
new1.96 KB
new596 bytes

Make sure the output is identical to the examples in https://en.wikipedia.org/wiki/CDATA#Use_of_CDATA_in_program_output

cilefen’s picture

Status: Active » Needs review
Issue tags: +Needs tests

Let’s send this to the test bots. This will need a new unit test or alterations to an existing one, if any exists.

cilefen’s picture

Status: Needs review » Needs work

Setting to needs work for test coverage.

gordon’s picture

StatusFileSize
new5.15 KB
new2.99 KB

I have fixed up the tests. Lets see if I got them all.

gordon’s picture

StatusFileSize
new5.24 KB
new708 bytes

Correct a couple of tests.

gordon’s picture

StatusFileSize
new5.23 KB
new498 bytes

Fix up another test issues.

gordon’s picture

StatusFileSize
new5.23 KB
new437 bytes

Found another issue. Try again.

gordon’s picture

StatusFileSize
new5.21 KB
new429 bytes

This must be it.

gordon’s picture

Status: Needs work » Needs review

The tests are passing now. Basically it was just changing expected results to be the new format.

gordon’s picture

Update customer.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

smustgrave’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: -Needs tests

Still applies and passes 9.5 tests.

Test cases were also added

quietone’s picture

Title: Html::escapeCdataElement() not adding CDATA correctly. » Html::escapeCdataElement() not adding CDATA correctly
Status: Reviewed & tested by the community » Needs review

Nice to see a working patch with tests.

The IS suggests adding a flag and I see no discussion here on that point. If this is changing behavior I expect we do need to consider not breaking things.

This still needs a code review and a fail patch. Setting back to NR.

smustgrave’s picture

Issue tags: +Bug Smash Initiative

Here's a tests only patch and reupload of #12

This issue came up today so thought I would move it forward.

As far as a flag change I'm not sure if that's needed anymore (could be mistaken) the current solution does not make use of any flag.

So maybe we drop that API change proposal?

smustgrave’s picture

StatusFileSize
new3.19 KB
new4.78 KB

Not sure what happened with the file upload

The last submitted patch, 19: 3259255-19-tests-only.patch, failed testing. View results

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

catch’s picture

Status: Needs review » Needs work
+++ b/core/lib/Drupal/Component/Utility/Html.php
@@ -345,17 +345,16 @@ public static function serialize(\DOMDocument $document) {
+        $data = $child_node->data;
+        if (strpos($child_node->data, 'CDATA') === FALSE) {
+          $embed_prefix = "\n{$comment_start}<![CDATA[{$comment_end}\n";

Can this be !str_contains() now?

The issue summary could use an update, I'm also wondering whether the current test coverage covers the nesting case that's in the issue summary.

smustgrave’s picture

Issue summary: View changes
Status: Needs work » Needs review
Issue tags: +Needs Review Queue Initiative
StatusFileSize
new0 bytes
new5.21 KB
new1.45 KB
larowlan’s picture

Status: Needs review » Needs work

The existing test method \Drupal\Tests\filter\Kernel\FilterKernelTest::testHtmlCorrectorFilter is testing \Drupal\Component\Utility\Html::normalize but none of the existing or new cases call Html::normalize twice per the originally reported issue.

So I think we should add an extra test case that does that.

I think we should also open a follow up to move that test coverage out of the filter module, into the Utility component, it doesn't use any filter features.

smustgrave’s picture

Status: Needs work » Needs review
StatusFileSize
new988 bytes
new5.73 KB

Opened [#3352404

Added another assertion that calls normalize() twice.

borisson_’s picture

Status: Needs review » Reviewed & tested by the community

This looks great, great test coverage!

  • catch committed 16a44263 on 10.1.x
    Issue #3259255 by gordon, smustgrave, cilefen, quietone, catch, larowlan...
catch’s picture

Status: Reviewed & tested by the community » Fixed

Committed 16a4426 and pushed to 10.1.x. Thanks!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.