Hi. There seems to be a problem with JavaScript filtering. Steps to reproduce:

1.- Install a new Drupal 8.1.10 site with the standard profile (it also fails with 8.2.x, tested on simplytest.me).
2.- Edit the Basic HTML text format and add <script type> to the allowed HTML tags.
3.- Add a new Basic page node and enter the following code in HTML Source mode:

<script type="text/javascript">
document.write('<p>This is a <a href="http://www.drupal.org/">test</a></p>');
document.write('<p>This is a <a href="http://www.drupal.org/">test</a></p>');
</script>

4.- After saving the node, Drupal generates the HTML you can see in the attached screenshot.

Comment | File | Size | Author
#17 | fix_script_parsing_2822525-17.patch | 2.57 KB | stevenlafl
#15 | fix_script_parsing_2822525-15.patch | 2.56 KB | Anonymous (not verified)
#15 | fix_script_parsing_2822525-15-test-only.patch | 852 bytes | Anonymous (not verified)
#14 | 2822525-14.patch | 1.42 KB | Anonymous (not verified)
- | Captura de pantalla 2016-10-27 a las 12.15.09.png | 281.46 KB | skuark

Comments

skuark created an issue. See original summary.

skuark’s picture

Title: Problem with basic HTML filter » Problem with basic HTML javascript filtering

cilefen’s picture

Priority: Major » Normal
Status: Needs work » Postponed (maintainer needs more info)

Basic HTML is a text format; the settings within it are filters. The filter you are dealing with is "Limit allowed HTML tags and correct faulty HTML" (FilterHtml).

I have no idea why you would want to add script to the allowed tags. But understand that the entered text is filtered for a number of malformations beyond simply the allowed tags. Is there a reason why the CDATA comments are a problem?

skuark’s picture

Sorry, I worded that badly. I was indeed talking about a FilterHtml problem.

Do you mean I have to include the JS code inside a CDATA comment, like this?

<script type="text/javascript">
//<![CDATA[
document.write('<p>This is a <a href="http://www.drupal.org/">test</a></p>');
document.write('<p>This is a <a href="http://www.drupal.org/">test</a></p>');
//]]>
</script>

I've tried it, with the same result as before: the HTML filter breaks the code in the same way. I also have to add <script type> to the allowed tags. Without it, the previous code disappears completely after saving the node.

Anyway, I don't know whether you are referring to this or to the CDATA comments generated by Drupal that appear in the screenshot, but those are not the problem. If you look at the screenshot, you can see the following JS code instead of the previous JS:

<script type="text/javascript">
<!--//--><![CDATA[// ><!--

//<![CDATA[
document.write('<p>This is a <a href="http://www.drupal.org/">test');
document.write('<p>This is a <a href="http://www.drupal.org/">test');
//]]]]><![CDATA[>

//--><!]]>
</script>

Notice the missing closing </a> and </p> tags. That's the problem I was referring to.

skuark’s picture

Title: Problem with basic HTML javascript filtering » Problem with HTML Filter and embedded JS code

cilefen’s picture

Title: Problem with HTML Filter and embedded JS code » FilterHtml leaves unclosed tags inside inline scripts
Status: Postponed (maintainer needs more info) » Active

Just to be clear: letting people type inline JavaScript into body text is a bad idea, even when that person is a trusted administrator.

skuark’s picture

I agree it's a bad idea, but unfortunately it's a client requirement in our latest project. More specifically, we need to support legacy HTML with tons of script tags, migrated from a Drupal 6 site.

Anyway, I submitted this issue because, even though it is bad practice, the filter is not working as expected.

droplet’s picture

Status: Active » Postponed (maintainer needs more info)

Tested, no problems. Make sure it is not CKEditor that mangled your JS code.

cilefen’s picture

Good point.

Anonymous’s picture

Status: Postponed (maintainer needs more info) » Postponed
Related issues: +#1333730: [Meta] PHP DOM (libxml2) misinterprets HTML5

This is a very old libxml2 problem: it parses tags inside <script> incorrectly. You can read more by searching for "DOMDocument::loadHTML() Unexpected end tag in Entity". Drupal does not show the error because it calls the parser with errors suppressed (the @ operator):

//@file core/lib/Drupal/Component/Utility/Html.php, load()
@$dom->loadHTML($document);
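
To see what libxml2 is actually complaining about, the messages can be collected instead of suppressed. The following is only a minimal sketch and not code from Drupal core: the function name collect_html_parse_errors() is hypothetical, and the exact messages depend on the installed libxml2 version (newer releases may report nothing for this input).

```php
<?php

/**
 * Parses an HTML string and returns the messages libxml2 reported.
 *
 * Hypothetical helper for illustration only; it is not part of Drupal core.
 */
function collect_html_parse_errors(string $html): array {
  // Buffer libxml errors internally instead of emitting PHP warnings.
  $previous = libxml_use_internal_errors(TRUE);
  libxml_clear_errors();

  $dom = new DOMDocument();
  $dom->loadHTML($html);

  $messages = [];
  foreach (libxml_get_errors() as $error) {
    $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
  }

  // Empty the buffer and restore the previous error-handling mode.
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  return $messages;
}

// Print whatever the parser reports for an inline script containing markup.
$sample = "<script>document.write('<p>test</p>');</script>";
foreach (collect_html_parse_errors($sample) as $message) {
  echo $message, PHP_EOL;
}
```

Running this against the snippet from the issue summary should make it easier to tell whether a given libxml2 build is affected.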

skuark’s picture

Thanks. I've tested without CKEditor and got the same problem. You can see the steps I followed in this GIF I recorded: https://dl.dropboxusercontent.com/u/2430190/filter.gif

Thanks @vaplas. I'll be watching #1333730: [Meta] PHP DOM (libxml2) misinterprets HTML5.

droplet’s picture

Status: Postponed » Active

Sorry, I overlooked your point.

Anonymous’s picture

@skuark, it is a related issue, but it will not help you right now.

PS. Amazing GIF! We believe you; we just did not understand correctly at first. Here is a way to reproduce it:

$dom = new \DOMDocument();
$start = "<script><b>Hello</b> DOM</script>";
// The @ operator hides the warning that libxml2 raises here.
@$dom->loadHTML($start);
$end = $dom->saveHTML();

# $start: <script><b>Hello</b> DOM</script>
#  $end: <script><b>Hello DOM</script>  (only the <script> element is shown)

Anonymous’s picture

Status: Active » Needs review
Status | File | Size
new | 2822525-14.patch | 1.42 KB

OK, this patch works around the incorrect parsing of <script> by using stubs. It does not look pretty, but does anyone know a better solution while we are still using libxml?
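
For readers who want the general idea, the stub approach can be sketched roughly as follows. This is not the code from the attached patch: the function names protect_script_bodies() and dom_round_trip() and the token format are hypothetical, and the regular expression is a simplification that assumes well-formed <script> elements.

```php
<?php

/**
 * Runs $process on HTML while keeping <script> bodies away from the parser.
 *
 * Hypothetical sketch of the stub idea; not the implementation in the patch.
 */
function protect_script_bodies(string $html, callable $process): string {
  $bodies = [];

  // Swap every script body for an inert alphanumeric token.
  $stubbed = preg_replace_callback(
    '#(<script\b[^>]*>)(.*?)(</script\s*>)#is',
    function (array $m) use (&$bodies): string {
      $token = 'SCRIPTSTUB' . count($bodies) . 'X';
      $bodies[$token] = $m[2];
      return $m[1] . $token . $m[3];
    },
    $html
  ) ?? $html;

  // The parser only ever sees the tokens, never the markup inside the script.
  $processed = $process($stubbed);

  // Put the original script bodies back.
  return $bodies ? strtr($processed, $bodies) : $processed;
}

/**
 * Example processing step: a DOMDocument round trip of a body fragment.
 *
 * Assumes plain ASCII input; encoding handling is left out of this sketch.
 */
function dom_round_trip(string $fragment): string {
  $dom = new DOMDocument();
  @$dom->loadHTML('<!DOCTYPE html><html><body>' . $fragment . '</body></html>');
  $out = '';
  foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $node) {
    $out .= $dom->saveHTML($node);
  }
  return $out;
}

$input = '<script type="text/javascript">'
  . 'document.write(\'<p>This is a <a href="http://www.drupal.org/">test</a></p>\');'
  . '</script>';
echo protect_script_bodies($input, 'dom_round_trip'), PHP_EOL;
```

The trade-off is that the script content bypasses the parser entirely, so it is neither corrected nor validated; only the surrounding markup is.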

Anonymous’s picture

Status | File | Size
new | fix_script_parsing_2822525-15-test-only.patch | 852 bytes
new | fix_script_parsing_2822525-15.patch | 2.56 KB

A minor performance improvement, plus a test.

skuark’s picture

@vaplas Great! Your patch works for me :-)

Thanks a lot!

stevenlafl’s picture

Status | File | Size
new | fix_script_parsing_2822525-17.patch | 2.57 KB

I've simply made this compatible with PHP 7.2, which deprecates create_function().

stevenlafl’s picture

Version: 8.1.x-dev » 8.6.x-dev

Version: 8.6.x-dev » 8.8.x-dev

Drupal 8.6.x will not receive any further development aside from security fixes. Bug reports should be targeted against the 8.8.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.9.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.7 was released on June 3, 2020 and is the final full bugfix release for the Drupal 8.8.x series. Drupal 8.8.x will not receive any further development aside from security fixes. Sites should prepare to update to Drupal 8.9.0 or Drupal 9.0.0 for ongoing support.

Bug reports should be targeted against the 8.9.x-dev branch from now on, and new development or disruptive changes should be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.2.x-dev

Drupal 8 is end-of-life as of November 17, 2021. There will not be further changes made to Drupal 8. Bugfixes are now made to the 9.3.x and higher branches only. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.2.x-dev » 9.3.x-dev

gordon’s picture

Status: Needs review » Needs work

I have looked at this in relation to CDATA issues. I believe that this is "Works as intended". We really do not want Drupal playing with embedded JS. This is an advanced feature and could break very easily.

If you do need to fix this, create a custom filter that makes these corrections for you. This doesn't need to be resolved in core, since the fix will be specific to individual sites and custom filters can be created to handle it.

Lastly, if this content is being migrated and this is a common issue (i.e. the code was just cut and pasted from node to node), then it could be fixed during the migration.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.15 was released on June 1st, 2022 and is the final full bugfix release for the Drupal 9.3.x series. Drupal 9.3.x will not receive any further development aside from security fixes. Drupal 9 bug reports should be targeted for the 9.4.x-dev branch from now on, and new development or disruptive changes should be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.9 was released on December 7, 2022 and is the final full bugfix release for the Drupal 9.4.x series. Drupal 9.4.x will not receive any further development aside from security fixes. Drupal 9 bug reports should be targeted for the 9.5.x-dev branch from now on, and new development or disruptive changes should be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

andypost’s picture

Version: 9.5.x-dev » 11.x-dev
Related issues: +#2441811: Upgrade filter system to HTML5

I bet this could be closed now that the filter system has been upgraded to HTML5.

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.