[D6, D7, D8] Regression in filter_xss on php 5.4

Project:Drupal core
Component:Code
Category:bug report
Priority:Moderately critical
Assigned:Unassigned
Status:Closed (can be public)
Description

While working on [#2006568] I've discovered what appears to be a regression in filter_xss(). The unit test that should strip javascript:

<img src=\" &#14;  javascript:alert(0)\">

works fine on php 5.3, but is failing to sanitize on php 5.4 (PHP 5.4.6-1ubuntu1.2 (cli) (built: Mar 11 2013 14:57:54) . This can be tested via the simpletest UI, or via drush:

on 5.3:

jhedstrom@hyperion:~/☠ drush @undara php-eval "var_dump(filter_xss('<img src=\" &#14;  javascript:alert(0)\">', array('img')));"
string(20) "<img src="alert(0)">"

on 5.4:

jhedstrom@hyperion:~/☠ drush @d7 php-eval "var_dump(filter_xss('<img src=\" &#14;  javascript:alert(0)\">', array('img')));"
string(43) "<img src=" &amp;#14;  javascript:alert(0)">"

Comments

#1

Title:Regression in filter_xss on php 5.4» [D6, D7, D8] Regression in filter_xss on php 5.4
Status:Needs team response» Needs maintainer response

I guess this affects all versions of core?

#2

It certainly affects 7 and 8. I haven't really had any luck running 6 sites under php 5.4, but regardless, it does not appear to occur on Drupal 6:

jhedstrom@hyperion:~/w/c/utilities/drush (7.x-5.x) (@d6)☠ drush @md php-eval "var_dump(filter_xss('<img src=\" &#14;  javascript:alert(0)\">', array('img')));"
string(20) "<img src="alert(0)">"

#3

We have a public issue opened for that, for about a year: https://drupal.org/node/1210798

As far as I know, there are only two ways to exploit this:

  • As part of a <img src="javascript:alert('XSS');" />: as far as I know this only affects IE6 nowadays,
  • As part of a <iframe src="javascript:alert('XSS');"></iframe: but the <iframe> tag itself is filtered

So I would qualify this of "non-critical".

#4

Did a little research into this and that looks mostly true (http://ha.ckers.org/blog/20061014/xss-cheat-sheet-updated-for-ie70) - however, can't you still do <a href="javascript:alert('XSS')">evil link</a> in any browser and execute the JavaScript if you convlnce someone to click on it?

I don't have PHP 5.4 so I can't test if the vulnerability extends to that too, but I think it might.

#5

At a minimum it seems like https://drupal.org/node/1210798 should be critical (or at least a blocker for 8.0 since more and more folks will be hosting prod sites on 5.4 once 8 is released).

#6

Priority:Highly critical» Not critical

@David_Rothstein: For "a" tags the javascript part is filtered away by filter_xss() as expected on PHP 5.4.

The question is if there is a remaining security implication with the special character before "javascript:". Do we know a browser version that is vulnerable to that?

#7

So I did a bit more research: the test sample comes from https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet#Spaces_an...

I tested this on browserstack with Windows XP and IE6:

<html>
<IMG SRC=" &#14;  javascript:alert('XSS');">
</html>

This successfully triggers the javascript popup in IE6 :-(

I don't quite follow where in filter_xss() html_entity_decode() is called or where the code fails?

Do we still support IE6? If not we can handle this publicly. https://drupal.org/node/61509 says IE8+, but sounds more like a recommendation (although the page says "Browser requirements"). Drupal 7 was originally released with IE6 support, does that still apply?

#8

Priority:Not critical» Moderately critical

Great research, Klausi. I think from our perspective we have to support IE6. I think that requirements page is really about javascript/css and not something more fundamental like security. Open to other opinions on that.

#9

I agree that we should not let this slide, so I unpublished https://drupal.org/node/1210798 .

Any tips where to look in filter_xss()?

#10

I ported KSES to become filter_xss under the watchful eye and tutelage of Steven; I doubt I could've done otherwise. I am looking at this but so far I can only tell you that the first half, the one before return preg_replace_callback('% is not different. I also looked into _filter_xss_attributes case 2 and I do not see a difference in the matching either. My 5.3 is the testbot 5.3, my 5.4 is 5.4.15.

#11

Status:Needs maintainer response» Needs team response

Looking closer at this I made a mistake when testing this: the check_plain() call in filter_xss_bad_protocol() transforms &#14; to &amp;#14;, so the browser test sinppet should look like this:

<html>
<IMG SRC=" &amp;#14;  javascript:alert('XSS');">
</html>

And this does NOT trigger the javascript execution on IE6. Since we have now additional standard printable characters in the protocol/scheme part of the URL we can assume that no browser confuses that with just "javascript:" and no browser would execute that as javascript.

So we have an inconvenient bug here and we need to fix this somehow on PHP 5.4, but I think there is no security issue when using filter_xss() and friends on PHP 5.4.

So I'm going to re-publish the drupal.org issue in a couple of days, let me know if you disagree or have further input.

#12

What if there is an actual U+000E character in the stream (that's the character of &#14;)?

#13

That does not matter because drupal_strip_dangerous_protocols() only allows exact matches of allowed protocols. Example " http:" with a space as the first character gets stripped out because it does not match "http". The only exception are "?", "/" and "#" characters before the protocol, drupal_strip_dangerous_protocols() has the following comment:

<?php
     
// If a colon is preceded by a slash, question mark or hash, it cannot
      // possibly be part of the URL scheme. This must be a relative URL, which
      // inherits the (safe) protocol of the base document.
     
if (preg_match('![/?#]!', $protocol)) {
        break;
      }
?>

That's the reason why "&amp;#14;  " was allowed to slip through, because it contains a hash character.

For completeness here is a script to test this anyway:

<?php
// &#14; is the decimal notation, so we use "000e" as the hex representation in JSON.
$unicode = json_decode('"' . '\u000e' . '"');
print
$unicode . "\n";
$string = "<img src=\" $unicode  javascript:alert(0)\">";
print
$string . "\n";
var_dump(filter_xss($string, array('img')));
?>

Which returns <img src="alert(0)"> as we expect :-)

#14

Status:Needs team response» Closed (can be public)

Ok, thanks for completing the analysis, Klaus.

Let's deal with further improvements publicly.

#1

Edit issue settings
File attachments
The maximum upload size is 1 MB. Only files with the following extensions may be uploaded: jpg jpeg gif png pdf odt ods odp tar gz zip patch diff txt.
Restore previously entered data