On a page with 90 comments, 437 times to be precise, with time spent in self at 20658 (total 50,000) on one request according to cachegrind - putting it into our top 20 expensive functions.
Damien found the bad call to check_url() and dereine patched it at #272854: Allowing '#' as a link with the l() function, but I'm splitting it out to here since it deserves it's own issue and a backport.
Quick benchmarks on a taxonomy term page with 30 nodes.
HEAD:
6.38 reqs/sec
Patch:
6.44 reqs/sec
Comment | File | Size | Author |
---|---|---|---|
#32 | check_url_D6.patch | 822 bytes | smk-ka |
#26 | check_url_7-2.patch | 2.05 KB | mfer |
#23 | check_url.patch | 2.05 KB | catch |
#19 | check_url.patch | 11.35 KB | catch |
#17 | check_url.patch | 11.3 KB | catch |
Comments
Comment #3
catchI should mention that with the patch, the function calls on that same page go down to 2. Here's kcachegrind screenshots.
Comment #4
catchProper status.
Comment #5
catchWe don't have proper tests for l() and url() yet, unless I've missed them. Could do with some reviews on the logic though as well.
Comment #6
catchThis would've been a security hole when $options['external'] was set. But here's a slightly modified version with some initial tests for l() - which as far as I can see looks quite solid. Could do with some serious review from someone with better XSS chops than me.
Comment #7
neclimdulinterest
Comment #8
catchDiscussed this with chx and lyricnz in #drupal - I think we should centralise all the XSS examples from http://ha.ckers.org/xss.html in either common.test or drupalWebTestCase so we can do foreach ($this->xss_examples as $xss) wherever we need to do unit tests with functions which are supposed to filter XSS.
That should reduce a lot of duplication. Will try to get a patch started later tonight, but of course anyone else is welcome to.
Comment #9
neclimdulThrew together a quick script that builds a test case that contains an array of all the test cases using the xml version of the ha.ckers.org list. It is fairly simple and expects the xml to be downloaded into the same directory the script is in and outputs the code to stdout not a file.
Its not really tested much past "makes valid php" but it should be a useful starting point.
Comment #10
Dries CreditAttribution: Dries commentedGreat catch, catch. This looks fine to me.
Comment #11
catchHere it is with a better test using neclimdul's script. Both HEAD and the patch pass all tests. Adds CommonXSSTestCase() so we can use it for other functions.
$options['external'] isn't a valid option for l() - although it's used in core twice (powered by Drupal and a link to OpenID) which probably needs tidying up, so this is the same patch as #1 + the tests.
Comment #12
Dries CreditAttribution: Dries commentedPatch no longer applies.
Comment #13
catchRe-rolled.
Comment #17
catchgrrr.
Comment #18
catchComment #19
catchDiscussed this with neclimidul in IRC - since some of these attacks assume they'll be partially filtered, just checking for the same string might not be sufficient. So instead I've modified the test so that instead we confirm that the output of check_url(url($path)) is found inside the output of l(). This pretty much ensures that the output from l() is identical to what we'd get in the existing code - and means that since the test still passes we really, really don't need the extra call.
Comment #20
Dries CreditAttribution: Dries commentedIt seems to me that the new test is a bit brain dead. Even when the escaping is broken, they would return the same result.
Comment #21
catchWell the idea there was to confirm that whether we wrap the url() call in check_url() or not, we get the same result - confirming that the call is redundant. I've got my eye on some drupal_attributes() calls as well since we do that even if we've only added 'active' as a class - and again it'd be good to have a sanity check just to make sure there's no over-optimisation there.
We obviously need full unit tests for check_plain() check_url(), filter_xss_admin() etc. to ensure they actually escape properly - but while I'm hoping the examples in this one will be useful for those tests, I don't think they're needed for this patch.
Comment #22
Dries CreditAttribution: Dries commentedIf the only goal is to validate whether we wrap the url() call in check_url(), we don't need to run all of the XSS tests. It would suffice to run one -- not?
Comment #23
catchRe-rolled with only one example. Will switch the XSSTestCase into a new issue so we can add it along with proper tests.
Comment #25
mfer CreditAttribution: mfer commentedIf someone doesn't beat me to it I'll review this on Thursday.
Comment #26
mfer CreditAttribution: mfer commentedUpdated to follow head. Exact same patch as #23 just different offsets. Ran tests and manually tested change. Looks good.
Comment #27
Dries CreditAttribution: Dries commentedCommitted to CVS HEAD. Thanks!
Comment #29
Dave ReidMarking to be ported to 6.x
Comment #30
mr.baileysThis change caused an issue in HEAD, since filter_xss_bad_protocol was also check_plaining its argument. By implementing this, check_plain is no longer called, and the output can contain invalid markup (see #399488: Invalid markup generated by l().).
Comment #31
mr.baileysSorry, I shouldn't have changed the status since there is no patch yet.
Comment #32
smk-ka CreditAttribution: smk-ka commentedD6 backport combines this patch with #399488: Invalid markup generated by l()..
Comment #33
dpearcefl CreditAttribution: dpearcefl commentedHas this been fixed in the latest D6?
Comment #34
dpearcefl CreditAttribution: dpearcefl commentedComment #35
neclimdulDon't know why #354812-32: filter_xss_bad_protocol is called hundreds of times on some pages. can't be reviewed. patch visually looks correct and applies.
Comment #36
dpearcefl CreditAttribution: dpearcefl commentedSometimes the testbot ignores the patch if the filename is non-standard.
[description]-[issue-number]-[comment-number].patch
http://drupal.org/node/707484
You will need to resubmit the patch and change the status to needs review.
Comment #37
neclimdul@dpearceMN I'm pretty sure there are no d6 tests for core so testbot won't run on it.
Comment #38
c960657 CreditAttribution: c960657 commentedI think this optimization breaks the current behaviour of D6 in cases like this:
l('foo', 'baz://foo.bar', array('external' => TRUE))
According to #354812-11: filter_xss_bad_protocol is called hundreds of times on some pages., $options['external'] is not a valid option for l(), but according to the documentation of l() in D6 the function accepts all options supported by url().
So even though I think this is a good optimization for D7, I'm afraid we cannot use the same approach in D6 without breaking the existing behaviour in a way that may have some security impact.
Comment #39
Anonymous (not verified) CreditAttribution: Anonymous commentedI'm going to guess this is a won't fix for D6, since D6 commits focus on security fixes at this point.
Comment #40
Anonymous (not verified) CreditAttribution: Anonymous commentedChanging issue status to reflect that it was fixed in 7.x.