This information is from the code coverage report (see http://coverage.cwgordon.com/coverage).

We need to test:

1) 'Escape all HTML' filter in action.
2) URL filter with rel=nofollow enabled.
3) HTML corrector with unclosed tags.
4) URL filter with insanely long links.
5) Filtering of XSS in attributes (e.g. onClick=steal your cookie)

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

floretan’s picture

Title: Tests needed: filter.module » TestingParty08: filter.module
catch’s picture

URL filter is broken so probably not a good idea to include those additions in the testing party.

So...

1) 'Escape all HTML' filter in action.
2) HTML corrector with unclosed tags.
3) Filtering of XSS in attributes (e.g. onClick=steal your cookie)
And I'll add:
4) HTML corrector with redundant closing tags.

jcnventura’s picture

The patch for the HTML corrector filter in #222926: HTML Corrector filter escapes HTML comments is pending on the existence of filter module tests.

Please add the following to the list to verify the need for that patch and to validate it once it is merged:

5) Multi-line HTML comment with some broken HTML inside. The test should verify that the contents of the comment are not changed nor corrected in any manner. With the current code, this test will fail as the <!-- tag will be converted to & lt;-- and the contents of the comment will be corrected.

João

jhedstrom’s picture

Assigned: Unassigned » jhedstrom

I should have some time for this this week.

jhedstrom’s picture

Status: Active » Needs work
FileSize
10.29 KB

Attached is my first attempt at this.

I've written unit tests for following filter functions, for a variety of inputs. The XSS examples come from http://ha.ckers.org/xss.html

* _filter_html
* trim(check_plain()) as used for the remove all html filter.
* _filter_htmlcorrector
* _filter_url_trim
* _filter_url

Higher level tests may be needed, for testing the glue that binds all these functions together (e.g., filter_filter is not tested). Also, it would be nice if somebody could review the expected output of the XSS.

As mentioned in comment #3, 2 of the tests currently fail since the HTML comment handling is broken.

wrwrwr’s picture

Assigned: jhedstrom » wrwrwr
Status: Needs work » Needs review
FileSize
21.61 KB

I've somehow managed to take it over, sorry for that ;) Started with your code but ended with almost a total rewrite:

  • made all tests short, should be easier to see what they check,
  • restructured it a bit, tests are now directly under appropriate test* functions; on the other hand a new assert (looking for a substring in a lower case, entity decoded input) proved to be rather useful,
  • left (actually added a test for each filter), but commented out, tests for comment handling -- I believe these should go with a patch for the bug,
  • added over test 100 cases, mostly for all those different script injection vectors -- (collected from publicly available sources and by looking through CVEs, security announcements etc.)

There are a couple of TODOs:

  • I would separate the filter_filter tests as a 'defaults' test and maybe add a similar one for check_markup(). Does this sound like a good approach?
  • Firstly two XSS vectors may be too general (does </0script or o/0nerror work in some browser?). Secondly, I don't know how to really use two more taken from security announcements (invalid unicode doesn't get stored in the database (some specific MySQL version only?) and no idea how to use the <script>> thingy), so the tests are a bit virtual in these cases. If someone knows that a test is ineffective or knows some example I've missed, please let me know. (Of course the assumption here is to test the filter as it is, so the tests wouldn't be effective with a different implementation; and of course it's not possible to fully test such filter automatically; still these tests should catch many typical mistakes).
  • In two cases the HTML filter behaves differently than what I would expect it to, however not in an exploitable way.
  • The URL filter doesn't seem to work with some addresses like: "\"\\()[]\;:,<>@ "!#$%&'*+-/=?^_`.{|}~@example.com, but I'll leave this for another issue (if anyone really cares :)
  • Some more tests could be added for the URL filter, corrector and line breaker.

This is some reasonable amount of work already and could use a review already.

catch’s picture

Component: tests » filter.module
Category: bug » task
Issue tags: +Needs tests

wrwrwr, thanks for the work. I only scanned through briefly but a couple of things -

If any tests fail, as you say they should go into their own issue - we don't currently have a way to flag tests which are expected to fail for known bugs within core, so currently they're attached to the patches which fix them. So any failing tests which are commented out should simply be removed from the patch.

Also, partially for this reason, there's already tests for the URL filter over at #161217: URL filter breaks generated href tags - that patch has been stalled for some time, but no need to add extra ones here.

in terms of the patch itself, we don't usually abbreviate variable names - so it'd be breat if $f could be expanded to a real word. Also on first glance it looks like there could be more inline comments - although some of the assertions look fairly self-documenting.

wrwrwr’s picture

Thank you for the review. Here's my take two:

  • Removed failing and commented out tests -- attaching these separately, for reference. Now no test fails and none are coerced into passing.
  • Confirmed two vectors that needed confirmation, commented invalid UTF-8 example as only working as reflected XSS. (If you want to check this don't forget to remove the validity check in check_plain() in bootstrap.inc also -- protocol filtering passes attributes values through this function. :) Showing a stored example stands open.
  • Decided that the <tag>> thingy is only an inconsistency; the test doesn't hurt anyway and stays.
  • Split filter_filter tests into a separate function. This one could use some work -- should include checks for enforcing filter access, proper cache handling etc. (idea: all the stuff that is not directly focuced on processing text).
  • Turned $f into $filtered. Added some inline comments (I'd rather not describe XSS vectors -- this could be a bit involved in some cases).

Failing tests -- this is the more interesting part -- some of the deficiencies already have an issue, but some are without one:

  • URL filter, corrector and line breaker should skip comments (part of this is http://drupal.org/node/222926).
  • URL filter and square brackets: http://drupal.org/node/190466.
  • General improvement of the URL filter (http://drupal.org/node/161217). These are meant as a replacement for the tests in a patch there, some of the cases went into working tests (domain in title, links in code or script, not breaking existing links, dot at the end of an address); tried to make them more concise and still capture all the interesting cases. I'd rather have all tests use one 'style' and avoid repetition also, hopefully authors of the original test wouldn't mind :)
  • URL filter misses some e-mail addresses ("\"\\()[]\;:,<>@ "!#$%&'*+-/=?^_`.{|}~@example.com). This is purely academic.
  • HTML filter breaks markup sometimes (<p attr=">" />). Fixing might be tricky.
  • HTML filter doesn't recognize one cipher hex entities (&x#D;). This seems very easy.
  • HTML filter shouldn't accept some tags (script, iframe, style, etc.) as allowed tags. I would say it should throw an exception.
  • HTML filter should remove more attributes (id, name, class & xmlns probably). Use a whitelist editable by user maybe (just like the tag list)? Maybe also let the user edit the set of allowed protocols?
  • HTML filter allows <p style />: http://drupal.org/node/327331.

One thing that doesn't belong here (discovered during testing though and about which I've almost forgot already), is the installation path disclosure due to the database layer throwing an exception when fed with invalid UTF-8 (7.x only).

Any comments, including code style, language corrections and concerning failing tests are welcome.

sun’s picture

Subscribing

Damien Tournoud’s picture

That's an impressive work. Studying this a while back, I quickly convinced myself that most filters were broken beyond repair. I'm glad that someone came over this initial impression.

Aka; subscribe.

Status: Needs review » Needs work

The last submitted patch failed testing.

jcnventura’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
5.01 KB
24.01 KB

Hi,

I have re-rolled the patch from #8 to make it applicable to the latest 7.x-dev. I have executed the tests and they all pass at the moment. I have reviewed them individually and couldn't find a problem in them.

I am removing two tests from the failures and including them in my newest fix for #222926: HTML Corrector filter escapes HTML comments.

João

sun’s picture

Status: Reviewed & tested by the community » Needs review

I really like this patch. However, either someone smarter than me or myself needs to have a really thorough look at this patch. I'm not sure whether I agree with all test methods, but I have to think more about it.

catch’s picture

Status: Needs review » Reviewed & tested by the community

No one has looked at this in three months, and jcventura did a thorough review already.

If Dries or webchick need more reviews before commit that's fine - but probably we need to find/ask people individually.

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

webchick’s picture

Status: Needs work » Reviewed & tested by the community

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

lilou’s picture

Status: Needs work » Needs review
FileSize
24.37 KB

Reroll jcnventura's patch #12 (passed).

pp’s picture

pp’s picture

Status: Needs review » Reviewed & tested by the community

It work's fine.

pp

webchick’s picture

Holy frigging crap, these tests are AWESOME!&*^!*@ I'm really, really sorry this has stayed off of my radar for so long. The timing of RTBC queue sprints to testing bot screwups must have been pretty bad. :( Thanks so much for writing these, what a tremendously valuable thing to have!!

There were a few issues with the comments, such as wrapping at 80 chars, being on the same line rather than line above it, and there was a small novel in the middle there about trailing dots which I shortened up, since tests aren't to ask "what if?" questions, but rather to test what's actually there. I'm not about to tell someone who waited like 100 years for a core committer to finally stumble into their patch to fix their comment spacing. ;) So I did that myself.

However, there were enough of these changes and I have enough sleep deprivation that I want to re-upload it and let the testing bot have one last crack at it before committing.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Committed to HEAD! Woohoo!!! :)

sun’s picture

Status: Fixed » Reviewed & tested by the community
FileSize
9.46 KB

Thank you! I obviously didn't manage to get back to this issue.

Can we commit a quick follow-up cleaning the coding-style?

webchick’s picture

Status: Reviewed & tested by the community » Fixed

We can!

Status: Fixed » Closed (fixed)
Issue tags: -Needs tests

Automatically closed -- issue fixed for 2 weeks with no activity.