On /admin, 14% of page execution time is taken up by check_plain() (inclusive - since that also calls drupal_validate_utf8(), 50% of these calls are from drupal_attributes(), many of which are running check plain on string literals added as classes.
This is a proof of concept patch which adds a PASS_THROUGH option to drupal_attributes similar to drupal_set_title(), and converts theme_links() to use it for the classes on the list item tags which are all generated internally with no user-supplied data.
drupal_attributes() is very conservative, only setting PASS_THROUGH to TRUE if every single attribute has PASS_THROUGH set.
On a node with 300 comments, this reduces check_plain() calls from 8,000 to 6,909, and reduces check_plain's share of page execution from 14% to 12% (drupal_attributes() goes up by .5%).
Obviously there's more places we can convert, but we also need to make sure we're only using PASS_THROUGH for literals, not for anything else, and it wouldn't hurt to get an early review from Heine/chx on whether this is won't fix or not for security reasons.
| Comment | File | Size | Author |
|---|---|---|---|
| #10 | 522716_drupal_attributes_10.patch | 2.52 KB | scor |
| #8 | 522716_drupal_attributes_8.patch | 1.48 KB | scor |
| #1 | head.png | 246.18 KB | catch |
| #1 | patch.png | 245.17 KB | catch |
| attributes.patch | 1.8 KB | catch |
Comments
Comment #1
catchObligatory webgrind screenshots.
Also there's no API change here for anything calling drupal_attributes(), if you don't use this property, it'll still continue to behave as before.
Comment #2
moshe weitzman commentedIt is going to be rare to pass a user supplied text as a value. i would not mind removing check_plain() from this function and expecting caller to do it.
Comment #3
catchThat'd make this considerably less ugly, I think it's mainly 'title' that gets user supplied text, we could also consider check_plain() on 'title' but expecting the caller to do the rest.
Comment #4
catchIf #523058: Optimize check_plain() gets in this becomes a bit less critical, but it doesn't make it cheap, just less expensive. I may look at removing the check_plain() here and see what we get, but that'd need to be hammered in the upgrade documentation if we do that.
Comment #5
catchComment #6
catchToo late for this one.
Comment #7
scor commentedlet's re-evaluate this for D7 as a partial solution for #687712: Optimize rdf.module for displaying multiple comments. The RDF module calls drupal_attributes() a lot (esp. when displaying comments), and does not deal with any user-input data.
Comment #8
scor commentedrerolling. benchmarks coming soon.
Comment #10
scor commentedhow about this approach more in line with drupal_set_title? less overhead since we do not check for the pass_through flag for each attribute, but at the beginning of drupal_attributes(). I can measure a 0.4% positive performance impact when displaying 50 comments.
Comment #11
catchThat looks much neater to me.
Comment #12
scor commentedHeine's concerned about this approach:
[10:03]
output .= '<li' . drupal_attributes(array('class' => $class), PASS_THROUGH) . '>';[10:03] Why use drupal_attributes here?
[10:06] Heine: there is no user input in $class
[10:06] Heine: PASS_THROUGH speeds up the function, as long as you can ensure all its data has been sanitized
[10:06] scor, right, but why not just add this as a string literal in such cases?
[10:09] scor, so what happens when I add a preprocess hook that adds to the attributes variable?
[10:10] scor, then I have to lookup what flag is used on the final call to see whether I have to check_plain myself.
[10:12] Heine: yes, unless you're not dealing with user input at all
[10:12] scor, ampersands and quotes also need to be escaped
Comment #13
gerhard killesreiter commentedTo me it seems as if Heine's concerns would make this a "won't fix".
Comment #14
damien tournoud commentedWon't fix. HTML-encoding attributes is *not* optional. It doesn't matter if the attribute comes from user-supplied data or not. It's an HTML context, attributes are plaintext: you have to encode them. It's not a question of security, but a bare question of HTML correctness.