When I was running the test suite on PHP 5.2.11 I came across 2 warnings in the Core Filter tests that used check_plain. The warning reads "htmlspecialchars(): Invalid multibyte sequence in argument".

See http://img.skitch.com/20100329-gethwb38jmkx4847yq3bih8hdd.png

Following http://insomanic.me.uk/post/191397106/php-htmlspecialchars-htmlentities-... I turned display_errors on and the messages went away. This is not a good thing for production sites but provides some insight into the problem. This seems to be strange problem in some versions of PHP.

This, also, makes me wonder what version of PHP is on the test bots.

Comments

andypost’s picture

This change is described in http://api.drupal.org/api/function/check_plain/7

Why it's critical if this is standard behavior to throw warning?

EDIT: introduced in #523058: Optimize check_plain()

mfer’s picture

@andypost Drupal core should not throw any warnings. There is a notable performance hit when the the system in PHP that managers, displays, deals with notices, warnings, errors, etc comes into play. This is a performance issue.

andypost’s picture

Issue tags: +Performance

Do you this issue should be marked as duplicate of #523058: Optimize check_plain()
Or continue here? There's some benchmarks in [523058]

mfer’s picture

@andypost #523058: Optimize check_plain() is already in for D7. The warnings are still being thrown and I do not see a follow-up addressing them. It is related but I was not put it on that post since it is now about back porting the performance bump.

With performance, we need to remove all notices so that part of PHP is not invoked.

gábor hojtsy’s picture

Is this also applicable to the yet uncommitted version of the check_plain() optimization patch for D6?

heine’s picture

Priority: Critical » Normal

We cannot prevent the invocation of the error handler. It is only triggered when 1) display_errors is off and 2) an invalid utf-8 string is passed to check_plain.

As passing an invalid string to check_plain is likely a rare event, the performance penalty of handling the warning is dwarfed by the performance gain of having htmlspecialchars do the validation vs. having to call drupal_validate_utf8 (the only way to prevent the warning).

mfer’s picture

Priority: Normal » Critical

@Heine The core tests are causing the error to be thrown in both PHP 5.2.11 and 5.3 on single quotes.

heine’s picture

Priority: Critical » Normal

mfer, the warning precedes the testresult (those with invalid utf8) afaik. A warning on a single quote would not make sense.

mfb’s picture

FYI the patch at #348448: Always report E_STRICT errors allows the test suite to pass cleanly on PHP 5.3. I worked around the htmlspecialchars() warnings when testing invalid UTF-8 by preceding the function call with "@" error suppressor.

jrchamp’s picture

It's difficult to claim that the test suite is a valid source of "performance" issues as well. After all, the memory requirements of the test suite is very large. Last I checked, it was hemorrhaging several megabytes each time it loaded a test section within a test group. I never could identify where it was coming from, but I was pretty sure that it was loading duplicate items into memory.

It would make sense to look at mfb's patch on #348448: Always report E_STRICT errors because that needs to get solved before launch anyway.

mfb’s picture

Status: Active » Closed (duplicate)

It's a good thing that check_plain throws warnings, it alerts you of a serious problem. You could lose a piece of content because it contains some invalid UTF-8 and has been rendered as an empty string. Unfortunately these warnings are disabled if display_errors is on -- PHP WTF!

anyways now that the tests pass maybe we can mark this as a duplicate?