The l() call does not correctly handle links when the protocol is anything but lower case. Calling

l('display text', 'HTTP://www.mywebsite.com')

Results in

http://drupalsite.org?q=HTTP://www.mywebsite.com

The problem is in filter.module, which is only checking for the lower case variant of protocol strings. A simple patch is attached.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

agentrickard’s picture

Status: Active » Reviewed & tested by the community
agentrickard’s picture

Status: Reviewed & tested by the community » Patch (to be ported)

The HTTP specification does say that scheme checks should be case insensitive (http://www.ietf.org/rfc/rfc2616.txt).

If approved, this bug also applies to 5.x and HEAD. The affected function is filter_xss_bad_protocol().

RobRoy’s picture

Title: l() creates bad links if protocol contains anything but lower case text » Allow
Version: 4.7.4 » 5.x-dev
Priority: Normal » Minor
Status: Patch (to be ported) » Needs work

I'd recommend submitting a patch for 5.x-dev to be applied then backported to get more visibility. The (to be ported) status should only be used after a patch has been applied to a development version and it needs to be re-rolled to apply to previous versions.

RobRoy’s picture

Title: Allow » Allow uppercase protocol in filter_xss_bad_protocol() as per RFC 2616

Oops, finishing my title change. Not sure if this should be won't fixed or not, can someone find the appropriate quote from the RFC?

agentrickard’s picture

From http://www.ietf.org/rfc/rfc2616.txt

3.2.3 URI Comparison

When comparing two URIs to decide if they match or not, a client
SHOULD use a case-sensitive octet-by-octet comparison of the entire
URIs, with these exceptions:

- A port that is empty or not given is equivalent to the default
port for that URI-reference;

- Comparisons of host names MUST be case-insensitive;

- Comparisons of scheme names MUST be case-insensitive;

- An empty abs_path is equivalent to an abs_path of "/".

This is the closest passage that I found regarding the use of http: or HTTP: The standard says to treat both as equals.

scott.mclewin’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
636 bytes

I've rolled the patch for Drupal 5.

drumm’s picture

Version: 5.x-dev » 6.x-dev
Dries’s picture

Imo, this is worth documenting. Can we add a code comment referring to RFC 2616?

scott.mclewin’s picture

FileSize
733 bytes

Dries - is this sufficient, or were you looking for a block comment with full details?

Dries’s picture

Status: Reviewed & tested by the community » Needs work

The code comment is sufficient, but this patch no longer applies against CVS HEAD.

Also, looking at the code some more, it's probably best to use strripos rather than strpos + strtolower. If you insist on using strtolower, it's probably best to lowercase the input string outside of the do-while-loop.

We're almost there with this patch -- one more go? Thanks. :-)

Dries’s picture

I obviously meant stripos (1 r) and not strripos (2 r's).

scott.mclewin’s picture

Dries,

I factored out the extra "r". :) I knew what you meant.

In hindsight I have no clue why I didn't just use stripos() in the first place. I'm not bound to strtolower(). I think I must have bought a few extra strtolower() licenses and just needed to use them somewhere.

For what is now essentially a one letter change, we've certainly worked hard at it. :)

I'm not running Drupal HEAD anywhere and my test cases (which happen to be the wishlist.module I maintain) are on D5 rather than HEAD/D6. For such a simplistic change would you accept it against D5 and then port it into HEAD on my behalf? I'm not going to supply a patch that I've not tested - no matter how simple. Right now with a newborn child and a two year old at home, I get very little time to play with anything other than diapers, and I just don't see myself getting a HEAD + test case environment set up for so small a delta in the near term.

I'll roll this change for D5 no matter what the outcome on the D6/HEAD question.

Scott

scott.mclewin’s picture

Dries,

In settling in to remind myself of the details, I know why I used strtolower and not stripos.

$protocol is being used as an index into $allowed_protocols[] and itself needs to be lower case to be a valid index on that array, where all the keys are lower case.

$allowed_protocols = array_flip(variable_get('filter_allowed_protocols', array('http', 'https', 'ftp', 'news', 'nntp', 'telnet', 'mailto', 'irc', 'ssh', 'sftp', 'webcal')
...
$protocol = strtolower(substr($string, 0, $colonpos));
if (!isset($allowed_protocols[$protocol])) {

I don't see a way to use stripos here that is an improvement over the existing logic, since the problem is not in finding the substring to extract, it is in keying the $allowed_protocols array. $protocol will still need to be converted to lower case. Does PHP have a case-insensitive array index operation?

On your other request, of doing the strtolower outside of the loop, I'll decline that one for the following reason: While the protocol component of the URI is supposed to be case insensitive, the rest of the URI is case sensitive. Because this function returns $string, I don't want to alter the full URI to lower case. I aimed to keep the conversion to just the protocol portion of the URI and to leave the rest of the string untouched. Without adding quite a bit of complexity, I don't see a clean way to achieve this without the strtolower() within the loop on $protocol.

Replacing soft hypens and running it through check_plain() are appropriate transformations. I don't think setting the full URI to lower case is an appropriate transformation because it would break the path portion of URIs when they use mixed case. IIS won't care, but Apache/*nix will.

If I am being dense please do articulate how, don't hold back. I want the right patch for Drupal, one that is effective, simple and as efficient as it can be. I think I've done that with this very minor patch, but I'm hardly as expert in PHP tricks as I am with C/C++ and may be overlooking a language trick.

Scott

RobRoy’s picture

We can't use stripos as it's PHP5 only...http://php.net/stripos. That's probably why you used strtolower().

scott.mclewin’s picture

Status: Needs work » Reviewed & tested by the community

@RobRoy - an excellent point that I'd not even considered.

Dries’s picture

Thanks for the further analysis. Looks like this is RTBC then. I'll do that later when I'm back home. :)

Steven’s picture

Status: Reviewed & tested by the community » Fixed

This code has been affected by a recent security update. In Drupal 5, the strtolower() call was already added correctly... I updated HEAD to match.

Steven’s picture

Version: 6.x-dev » 5.x-dev

Wait no, my CVS lost its tag again *grmbl*. Committed to 5 too.

scott.mclewin’s picture

Thanks. This marks my first patch to core, I'm glad it made it in.

RobRoy’s picture

@Scott It's a good feeling isn't it!

/me pats Scott on the back. :D

Anonymous’s picture

Status: Fixed » Closed (fixed)