Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Status: Needs review » Needs work

The last submitted patch failed testing.

swentel’s picture

description' +> t(''),

should be

description' => t(''),

dmitrig01’s picture

Status: Needs work » Needs review
FileSize
1.43 KB

Can you tell i didn't try it?

Status: Needs review » Needs work

The last submitted patch failed testing.

valthebald’s picture

Version: 7.x-dev » 8.x-dev
Status: Needs work » Needs review
FileSize
1.21 KB

Bumping version, rerolling for 8.x, marking for backporting

Heine’s picture

$this->assertIdentical(check_plain('"\''), ""'", t('HTML quotes are escaped'));

' Should be escaped as '

valthebald’s picture

Corrected and added another check - for non-HTML quotes

valthebald’s picture

0x20000 converted to string before passing to check_plain, thanks, PIFR.
Also, taken invalid unicode character according to Wikipedia

Status: Needs review » Needs work

The last submitted patch, check-plain-test-393538-8.patch, failed testing.

valthebald’s picture

Status: Needs work » Needs review
FileSize
1.23 KB

Interesting.
Should check_plain escape non-HTML quotation marks? Currently it does not.

Status: Needs review » Needs work

The last submitted patch, check-plain-test-393538-10.patch, failed testing.

valthebald’s picture

There is a catch here.
As a part of the test, we pass incorrect unicode sequence to check_plain(), and expect check_plain() to return empty string.
It does, and assertion works fine.
But the test fails due to PHP warning produced by htmlspecialchars() function.
Any idea how to overcome this?
I thought of 2 solutions:
1) Change check_plain() implementation to call @htmlspecialchars() - seems bad, or
2) place set_error_handler() inside test

xjm’s picture

xjm’s picture

Not sure the best way to proceed here. As several folks in IRC said, check_plain() should really just clean input silently. So do we stick a @ in front, or what? Is it actually cleaning the input, or failing to entirely?

Meanwhile, code style nitpicks:

+++ b/core/modules/simpletest/tests/bootstrap.testundefined
@@ -507,3 +507,26 @@ class BootstrapOverrideServerVariablesTestCase extends DrupalUnitTestCase {
+class BootStrapCheckPlainTestCase extends DrupalUnitTestCase {

This class should have a docblock. See http://drupal.org/node/1354#classes and http://drupal.org/node/1354#functions.

+++ b/core/modules/simpletest/tests/bootstrap.testundefined
@@ -507,3 +507,26 @@ class BootstrapOverrideServerVariablesTestCase extends DrupalUnitTestCase {
+   * Test check_plain().

Should be "Tests check_plain()."

+++ b/core/modules/simpletest/tests/bootstrap.testundefined
@@ -507,3 +507,26 @@ class BootstrapOverrideServerVariablesTestCase extends DrupalUnitTestCase {
+    // Test non-utf8 characters.
+    $this->assertIdentical(check_plain(chr(192)), '', t('Non-UTF8 characters don\'t show up'));
+    // Test normal text.
+    $plain = $this->randomName(5);
+    $this->assertIdentical(check_plain($plain), $plain, t('Plain data remains plain'));
+    // Test a few HTML entities.
+    $this->assertIdentical(check_plain('<script>&foo'), '&lt;script&gt;&amp;foo', t('HTML entities are escaped'));
+    // Test HTML quotes.
+    $this->assertIdentical(check_plain('"\''), "&quot;&#039;", t('HTML quotes are escaped'));

The last argument (message assertion texts for the test report) should not be translated. (Yes, many you will see in the codebase are currently wrapped in t(), but there is an issue to remove them.)

valthebald’s picture

Status: Needs work » Needs review
FileSize
1.75 KB

After some short discussion in IRC,
I have suppressed htmlspecialchars() output in check_plain().
In addition to making tests happy, it gives bonus of anti-flooding defense (massive input containing incorrect unicode sequences)
There is related PHP bug here - https://bugs.php.net/bug.php?id=47494

Niklas Fiekas’s picture

Status: Needs review » Needs work

Is it actually cleaning the input, or failing to entirely?

On invalid UTF8? It should drop the whole string and return an empty one, because there are security problems with trying to actually only correct a few bytes. See ENT_IGNORE flag on htmlspecialchars().

valthebald’s picture

valthebald’s picture

Status: Needs work » Needs review
FileSize
1.7 KB

Fixed #14 stuff: added docblock, spell fix, and removed t()

xjm’s picture

I'm not sure the @ is a good solution, because that would/could hide legitimate feedback on malformed data, no? As Crell put it in https://bugs.php.net/bug.php?id=47494:

I'm working on a project that has been stalled for over a week while we try to figure out what's wrong with the character encoding configuration on our production server, only to realize that the data is (probably) bad but we didn't know it because of this bug.

Wouldn't we then just be expanding the scope of the PHP bug?

Crell’s picture

Status: Needs review » Needs work

Suppressing errors is the wrong answer in all use cases except one: ldap_open() triggers a warning on invalid login rather than returning false, so you must use it there. I cannot accurately describe my opinion of that "feature" of PHP in a family-friendly forum.

So no, @ is not the answer here. Rather, we should document the underlying PHP critical bug (I don't care what the PHP devs say, it's a critical bug that they refuse to fix. See previous comment about family-friendly forums.) To keep tests working, probably swap out the error handler right around that line combined with extensive documentation about why PHP is broken.

Incidentally, this is why trigger_error() is never the right answer.

xjm’s picture

Well, I'm also not sure that adding a special workaround just for the test is a good idea. It seems to me that the test exposes something that could trigger a slough of unwanted notices in the wild, right? So would we want to make a change to check_plain() itself?

valthebald’s picture

Issue tags: +Needs security review

1. Simpletest explicitly assumes, that code under test must not throw messages. Otherwise, tests fail.
2. That means, that current implementation of check_plain() contains bug.
I hope we all agree on 1 and 2?

Main question that remains, what is desired behavior in case of detected invalid input.
Yes, it is possible to intercept error condition in check_plain().
For example, it is possible to

  static $error_reporting = error_reporting();
  error_reporting(0);
  $result = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
  error_reporting($error_reporting);
  return $result;

or

  set_error_handler('mysterious_error_handler');
  $result = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
  restore_error_handler();
  return $result;
...
function mysterios_error_handler() {
// What happens here?

instead of just

@htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

My answer to remaining question: valid behavior of mysterios_error_handler() is silent return, so let's not cheat ourselves and use @ suppression.
Any other mysterios_error_handler behavior, including attempt to log error (i.e. with watchdog()) opens vulnerability to flooding DoS attack by sending massive invalid requests. I've got explicit instruction not to log invalid user input during every security code review that I had, be it PHP or not, from different reviewers.

Note 1: error suppressing does not mean that caller of check_plain() cannot detect invalid input. In case of error check_plain() returns empty string.
Note 2: Don't get me wrong - in general, I am a strong opponent of error suppression, but not in case where error suppression is a goal.

valthebald’s picture

sun’s picture

Status: Needs work » Postponed (maintainer needs more info)
valthebald’s picture

Status: Postponed (maintainer needs more info) » Active

The following piece of code hides the problem instead of solving it (common.test, lines 372-376):

     // Ignore PHP 5.3+ invalid multibyte sequence warning.
     $text = @check_plain("Foo\xC0barbaz");
     $this->assertEqual($text, '', 'check_plain() rejects invalid sequence "Foo\xC0barbaz"');
     // Ignore PHP 5.3+ invalid multibyte sequence warning.
     $text = @check_plain("\xc2\"");

check_plain() messages are suppressed in test, but not in real usage.
So yes, there's no need in a new test, yet the problem with incorrect check_plain() behavior remains

sun’s picture

check_plain() messages are suppressed in test, but not in real usage.

The issued warnings are expected behavior, documented on php.net, nothing wrong with it.

incorrect check_plain() behavior

Which incorrect behavior? This issue is categorized as a bug, but I'm missing a concrete bug report in here?

We can certainly extend the tests to verify more expected behavior of htmlspecialchars() to ensure it treats the most important characters as intended. But I fail to see where incorrect behavior is involved?

Lastly, note that code dealing with arbitrary user input normally invokes drupal_validate_utf8() on it, before attempting to process or output it (thus, before check_plain() and related functions are getting involved). That said, the primary source for arbitrary user input does not perform this validation yet; see #943722: Form API accepts values containing invalid UTF-8.

valthebald’s picture

Title: Check Plain Tests » check_plain() issues PHP messages on invalid UTF-8 input

Valid code does not produce PHP messages.
check_plain() produces PHP messages.
Therefore - check_plain() contains invalid code. I call that incorrect behavior :)

htmlspecialchars() policy on messages is controversial, there is open bug on it (https://bugs.php.net/bug.php?id=47494), and finally, it's behavior is addressed with the new helpful parameters in PHP 5.4. Documentation of the problem does not solve the problem.

droplet’s picture

a soultion from #1090290: admin/people/permissions/roles: htmlspecialchars(): Invalid multibyte sequence in argument in check_plain()

 if (drupal_validate_utf8($text)) return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
  return htmlspecialchars($text, ENT_QUOTES); 
smartango’s picture

Damien Tournoud’s picture

Status: Active » Closed (won't fix)

I've got explicit instruction not to log invalid user input during every security code review that I had, be it PHP or not, from different reviewers.

Suppressing errors hides bugs, here or elsewhere. Your logging policy in production is your problem. It is completely independent of this: it happens down the chain, when the error has been caught and sent to the watchdog implementation.

cweagans’s picture

abedzoubi25’s picture

Status: Closed (won't fix) » Needs review
Issue tags: -Simpletest, -Needs security review, -Needs backport to D7

#18: check-plain-test-393538-18.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +Simpletest, +Needs security review, +Needs backport to D7

The last submitted patch, check-plain-test-393538-18.patch, failed testing.

swentel’s picture

Status: Needs work » Closed (won't fix)
Niklas Fiekas’s picture

mattjbondi’s picture

I use rawurlencode() and decode for my input and works...

asPagurus’s picture

Status: Closed (won't fix) » Needs review

I works with Drupal 6.28 and have same problem.
And I Have problem with sequrity. One of bots that walks through my site, has error in its engine (imho) and does not recognize construction as <a href="http://address" class="something">. And it trying to get page with address:
http://address"%20class="something
And I have many errors in my log about incorrect byte sequences...
But my site is ok :)
And what have I to do?
IMHO it is strong sequrity issue... May be not today but tomorrow....
Sorry for my english :)

Crell’s picture

Status: Needs review » Closed (won't fix)

Please don't reopen irrelevant issue threads. That should be filed as a new support request.

int_ua’s picture

Hey, are you aware that it's kind of critical? The whole site dies after just trying to add an empty node with UTF-8 title. Is there a different issue for D7 that I can't find?

Liam Morland’s picture

Status: Closed (won't fix) » Needs review
FileSize
697 bytes
431 bytes

The documentation for check_plain() says that it returns,

An HTML safe version of $text, or an empty string if $text is not valid UTF-8.

But that is not true. If $text is not valid UTF-8, it raises a warning message (and returns empty string). At the very least, the documentation has to be changed to reflect what the function actually does.

But I think the documentation describes what people want it to do. Here are two patches. The first makes check_plain() do exactly what the documentation says. The second has it return FALSE if the UTF-8 is invalid and updates the documentation to match.

valthebald’s picture

I like approach from #40. We have an indication of incorrect UTF-8 (format_string() returns FALSE), but no PHP-level warning. Nice!

sun’s picture

If I understand https://bugs.php.net/bug.php?id=47494 and http://php.net/manual/en/function.htmlspecialchars.php correctly, then all you have to do is to update to PHP 5.4.

There are new flags like ENT_SUBSTITUTE, but for now, we still want the empty string behavior.

In any case, the warnings seem to be gone with PHP >=5.4.


Off-topic here: I think we actually want to add a PHP version condition that triggers two different code paths there:

static $is_php_54;

if (!isset($is_php_54)) {
  $is_php_54 = version_compare('PHP_VERSION', '5.4', '>=');
}
if ($is_php_54) {
  return htmlspecialchars($text, ENT_QUOTES | ENT_HTML5);
}
else {
  return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

But that's for another issue.

Liam Morland’s picture

Until PHP 5.4 is the minimum required, one of the patches in #40 should go in. Perhaps the version test that you suggest could bypass the call to drupal_validate_utf8() for those using 5.4 or above.

Liam Morland’s picture

Version: 8.x-dev » 7.x-dev
Issue summary: View changes
FileSize
677 bytes
411 bytes

This is no longer an issue for D8. Reroll for D7.

mgifford’s picture

latin_miracle’s picture

#44 worked like a charm on 7.26.

Thanks Liam Morland!

valthebald’s picture

Status: Needs review » Reviewed & tested by the community

Worked for me too

BBC’s picture

Same here. #44 seems to have eliminated the error for me on 7.27.

Thanks!

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 44: drupal_393538_check_plain.patch, failed testing.

David_Rothstein’s picture

David_Rothstein’s picture

Status: Needs work » Needs review
valthebald’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: -Needs backport to D7

Still looks good

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 44: drupal_393538_check_plain.patch, failed testing.

David_Rothstein’s picture

Status: Needs work » Needs review
David_Rothstein’s picture

Looks like a testbot fluke... but which patch from #44 is RTBC anyway? (On first glance returning an empty string makes more sense to me.)

Also wondering about the performance implications of this. Didn't we deliberately choose a minimum PHP version for Drupal 7 so that (unlike Drupal 6) check_plain() could just be a simple wrapper around htmlspecialchars() with no extra regular expressions or anything like that? See this code from system_requirements():

  // Check that htmlspecialchars() is secure if the site is running any PHP
  // version older than 5.2.5. We don't simply require 5.2.5, because Ubuntu
  // 8.04 ships with PHP 5.2.4, but includes the necessary security patch.
  elseif (version_compare($phpversion, '5.2.5') < 0 && strlen(@htmlspecialchars(chr(0xC0) . chr(0xAF), ENT_QUOTES, 'UTF-8'))) {
    $requirements['php']['description'] = $t('Your PHP installation is too old. Drupal requires at least PHP 5.2.5, or PHP @version with the htmlspecialchars security patch backported.', array('@version' => DRUPAL_MINIMUM_PHP));
    $requirements['php']['severity'] = REQUIREMENT_ERROR;
    // If PHP is old, it's not safe to continue with the requirements check.
    return $requirements;
  }

Status: Needs review » Needs work

The last submitted patch, 44: drupal_393538_check_plain.patch, failed testing.

Liam Morland’s picture

It could be either patch. Since "(string) FALSE" is empty string, it is very unlikely to make a difference. However, the one called drupal_393538_check_plain.patch just fixes the bug instead of making the very minor change to the API documentation.

Liam Morland’s picture

Having said that, I do like sun's suggestion in #42. That would also be an API change, since it removes invalid characters instead of returning empty string.

I think we should commit drupal_393538_check_plain.patch since that is the minimal change to fix the bug, making behavior match documentation. A follow-up issue could look at API improvements.

Liam Morland’s picture

Status: Needs work » Needs review
FileSize
1.39 KB

The attached patch is based on sun's idea from #42 combined with my patch from #44. I understand that the warning is only generated on PHP < 5.4, so the check is only done on the older version. That should address the performance concern.

I like sun's other ideas. I think they should go in a separate issue.

Liam Morland’s picture

Sorry, use this one.

The last submitted patch, 59: drupal_393538_check_plain.patch, failed testing.

Liam Morland’s picture

Reroll.

Status: Needs review » Needs work

The last submitted patch, 62: core-check_plain-393538-62.patch, failed testing.

Status: Needs work » Needs review
Liam Morland’s picture

Fix silly typo.

mgifford’s picture

Looks like a pretty simple patch. Not sure how it would cause a security problem.

The issue summary could use some work.

It should have unit tests...

I'd also like to have a simple description of how to replicate the problem so we can verify it's been fixed.

Liam Morland’s picture

Assigned: dmitrig01 » Unassigned
Issue summary: View changes
Issue tags: -Needs issue summary update
Liam Morland’s picture

Issue tags: -Needs tests
FileSize
1.92 KB

Patch with tests. The only difference in the test is that the error suppression is no longer needed.

Status: Needs review » Needs work

The last submitted patch, 68: core-check_plain-393538-68.patch, failed testing.

Liam Morland’s picture

I don't think the failure is related to the patch.

Status: Needs work » Needs review
mgifford’s picture

Patch applies nicely on SimplyTest.me....

Drupal 7: PHP 5.2.5 or higher (5.3 recommended). - https://www.drupal.org/requirements

So don't we still need the check? It was there in #59.

Liam Morland’s picture

#68 still does the PHP version check. The only difference with #65 is that it removes the unneeded error suppression from the tests.

#59 had the bug that it would never run version_compare() because $php_lt_54 was always set.

mgifford’s picture

Status: Needs review » Reviewed & tested by the community

Sorry, I must have been looking at an earlier version.. It's definitely there in #68.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 68: core-check_plain-393538-68.patch, failed testing.

Status: Needs work » Needs review
Liam Morland’s picture

Status: Needs review » Reviewed & tested by the community

Restoring previous status.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 68: core-check_plain-393538-68.patch, failed testing.

Status: Needs work » Needs review
mgifford’s picture

Status: Needs review » Reviewed & tested by the community

bad bot...

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 68: core-check_plain-393538-68.patch, failed testing.

David_Rothstein’s picture

This is still a performance hit on the majority of Drupal sites (which aren't running PHP 5.4 yet)... how much of one I'm not sure, but a new function call which itself calls preg_match() on every single piece of text that passes through check_plain() is not obviously negligible impact.

As others have mentioned above, this warning is documented PHP behavior and we should just be able to add it to the check_plain() documentation also... right?

Liam Morland’s picture

Status: Needs work » Needs review
FileSize
1.16 KB

Time is healing this wound as PHP 5.3 is no longer supported. This patch just updates the documentation.

Liam Morland’s picture

FileSize
750 bytes

Sorry, use this one.

The last submitted patch, 83: core-check_plain-393538-83.patch, failed testing.

Crell’s picture

Status: Needs review » Reviewed & tested by the community

I think this is reasonable. Drupal 7 still supports PHP 5.2+ (as in, runs on) but that doesn't mean we need to bend over backward for people running completely unsupported software.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 84: core-check_plain-393538-84.patch, failed testing.

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 84: core-check_plain-393538-84.patch, failed testing.

Liam Morland’s picture

Status: Needs work » Reviewed & tested by the community

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 84: core-check_plain-393538-84.patch, failed testing.

Status: Needs work » Needs review
Liam Morland’s picture

I love it when a comment-only patch fails testing. ;-)

mgifford’s picture

Status: Needs review » Reviewed & tested by the community

indeed.. wonder how many times we'll have to retest this...

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 84: core-check_plain-393538-84.patch, failed testing.

Status: Needs work » Needs review
Liam Morland’s picture

Status: Needs review » Reviewed & tested by the community
David_Rothstein’s picture

Status: Reviewed & tested by the community » Needs work

I tested this on PHP 5.3 and it did not behave as described in the code comment - I did not get a PHP warning. Played around with the PHP configuration a bit and https://bugs.php.net/bug.php?id=47494#1306953388 is an accurate description of how it works (backwards as that is...)

We don't have to go into that much detail in the comment, but we shouldn't imply that there will always be a warning either.

Regarding PHP 5.3, I do think it's important we still pay attention to that since it's still by far the most used version of PHP. PHP 5.4 finally just caught up to PHP 5.2's usage (sadly) and is nowhere near PHP 5.3 yet (24.5% vs. 47.4%, as of today - http://w3techs.com/technologies/details/pl-php/5/all). Also, various Linux distros, etc., will still be providing security support for PHP 5.3 for years to come.

David_Rothstein’s picture

Status: Needs work » Needs review
FileSize
853 bytes

Maybe just something like this...?

Liam Morland’s picture

Status: Needs review » Reviewed & tested by the community

Make sense to me.

David_Rothstein’s picture

Title: check_plain() issues PHP messages on invalid UTF-8 input » Document that check_plain() can issue PHP messages on invalid UTF-8 input
Component: base system » documentation
Status: Reviewed & tested by the community » Fixed

Committed to 7.x - thanks!

  • David_Rothstein committed b73583f on 7.x
    Issue #393538 by Liam Morland, valthebald, dmitrig01, David_Rothstein:...
David_Rothstein’s picture

+ * @param $text string

Actually, not sure how I missed this but that definitely should be the other way around ("string" before the variable).

Fixing now via a quick followup commit.

  • David_Rothstein committed 28c34e9 on 7.x
    Issue #393538 followup by David_Rothstein: Fixed code style error in...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

bmateus’s picture

This error just appeared on my D7.38 installation, with PHP 5.3.

Had to solve it by patching includes/bootstrap.inc on line 1566.

function check_plain($text) {
-  return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
+  return @htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
 }

Thought the patch was already applyed to the new versions of core. Isn't it?

David_Rothstein’s picture

@bmateus, the documentation patch is the one that was committed to core - we didn't commit anything that overrides PHP 5.3's behavior.