In PHP 5.2, we now have filter_var() and related functions that did not exist previously.

These functions _may_ be able to replace the following Drupal 6 functions.

filter_xss*()
check_plain()
check_url()
check_markup()
valid_email_address()
valid_url()

See http://us.php.net/manual/en/book.filter.php.

We need new functions and tests to confirm that we can use the PHP standard functions, which will improve DX by removing Drupal-specific code that is no longer needed.

Comments

sun’s picture

Issue tags: +DrupalWTF

Adding to the mixture of function names:

- drupal_validate_utf8()
- drupal_valid_http_host()

Similar, so potentially also:

- drupal_query_string_encode()
- _fix_gpc_magic()
- _fix_gpc_magic_files()
- drupal_urlencode()
- drupal_valid_token()

Note that

- check_file()

does not really belong into that list, but shares the same prefix.

sun’s picture

Also note that we're already using filter_var() in http://api.drupal.org/api/function/valid_email_address/7

agentrickard’s picture

Weird.

So my first question is: Do we want Drupal-specific wrapper functions in future?

sun’s picture

I'm inclined to say "yes", because

- if all validation and sanitation functions use a common prefix and function name pattern = better DX

- if we rely on PHP's own filter functions for some, but not for others = bad DX

- PHP's own filter functions require developers to know and use cryptic flags and options, so there is no guaranteed standard = bad for security.

catch’s picture

Subscribe. check_plain() and drupal_validate_utf8() take up a lot of time on some page requests (more than 10% of the entire request sometimes). Good to explore other options.

damien tournoud’s picture

Status: Active » Closed (won't fix)

Let's study this quickly and close this issue:

  • filter_xss*(): there is no XSS-filtering in core PHP
  • check_plain(): there is no direct equivalent (our function also prevent invalid UTF-8 from slipping thru), the encoding part would be filter_var($var, FILTER_SANITIZE_SPECIAL_CHARS), which has no added value compared to the htmlentities() equivalent... (by the way, I have no idea which kind of encoding filter_var() supports... maybe it is local dependent? It doesn't seems to be documented)
  • check_url(): no equivalent (there is a FILTER_SANITIZE_URL, but it doesn't protect against evil URLs)
  • check_markup(): no equivalent (obviously), this function is misnamed on Drupal end
  • valid_email_address(): #308138: Make valid_email_address() support IDNs changed that to use FILTER_VALIDATE_EMAIL, but it hasn't proven to be such a good idea, see the issue for discussion
  • valid_url(): was studied in #308977: Replace valid_url with filter_var and FILTER_VALIDATE_URL, and rejected

About others pointed out by sun:

  • drupal_validate_utf8(): I'm not aware of any other way to do this, even if the hack that we use right now is not really full-proof
  • drupal_valid_http_host(): PHP has a FILTER_VALIDATE_IP, but no FILTER_VALIDATE_HOST
  • drupal_query_string_encode(): this one is a smart, recursive query string encoder, no equivalent in PHP
  • _fix_gpc_magic() / _fix_gpc_magic_files(): I think we should drop that completely. But of course that's a separate issue.
  • drupal_urlencode(): needs discussion. Was required at a time for the path part of the URL, but we could probably drop that now (see #298498: Drupal assumes mod_rewrite bugs which are not there and #310139: drupal_query_string_encode() should not call drupal_urlencode())
  • drupal_valid_token(): this is not an input validation function.

All in one, I see no patterns there. Closing this issue, let's discuss some potential refactoring separately.

agentrickard’s picture

Nice to have this all in one place. sun and I were having a separate IRC conversation and were not aware.

sun’s picture

joshmiller’s picture

Issue tags: -DrupalWTF

Cleaning up DrupalWTF list... Since this is a duplicate, removing tag...