Searching for a date, e.g. "14/05/2005", results in a 404 error.

search.module redirects to http://drupal.org/search/node/14%2F05%2F2005, which isn't handled correctly by mod_rewrite.

The attached patch replaces all occurrences of '%2F' in the query with '/'.

A test query for '1/2 1~2' results in a redirect to http://drupalhead.local/search/node/1/2+1%7E2 which is handled correctly by mod_rewrite.
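The replacement the patch describes can be sketched as follows. This is only an illustration in Python, not the patch itself: `quote_plus` stands in for PHP's urlencode() (the two differ on a few characters such as '~'), and the helper name `encode_search_keys` is hypothetical.

```python
from urllib.parse import quote_plus


def encode_search_keys(keys: str) -> str:
    """Hypothetical helper: URL-encode search keys, then restore slashes."""
    # quote_plus encodes '/' as '%2F' and spaces as '+'.
    encoded = quote_plus(keys)
    # Undo the slash escaping so the web server never sees '%2F' in the path.
    return encoded.replace("%2F", "/")


print(quote_plus("14/05/2005"))          # form that triggers the 404
print(encode_search_keys("14/05/2005"))  # form after the replacement
```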

(Solution found via comment on http://ccca.nctu.edu.tw/~hlb/tavi/index.php?page=URL+Rewriting)

Attachments:
Comment  File                 Size       Author
#5       urlencode.diff       4.81 KB    Steven
         search.module.patch  740 bytes  wulff

Comments

Dries:

Waiting for Steven's feedback on this. He knows what the best fix is.

Steven:

Actually, it is not at all clear to me why a URL-encoded slash would be a problem. It is even weirder that mod_rewrite can deal with unescaped slashes but returns a 404 on encoded ones. In fact, if the slash weren't URL-encoded, the non-clean URL would be invalid (as far as I know, slashes are not allowed in GET query data).

The page you linked to doesn't make me any wiser as to the cause of the problem. Does this problem occur with more than just slash characters?

Even if it is limited to slashes, we'd need a cleaner fix than this. Someone with more mod_rewrite skills should look into this problem, as it really has nothing to do with searching. If it is a mod_rewrite bug that cannot be fixed, then a wrapper around urlencode() sounds like a good idea.

wulff:

I have done some more digging. It seems that this is not really related to mod_rewrite. According to http://137.113.100.11/manual/mod/core.html#allowencodedslashes, Apache versions prior to 2.0.46 automatically return a 404 error when they encounter '%2F' in the URL.

Apparently, the AllowEncodedSlashes functionality may be backported to 1.3.x (http://www.mail-archive.com/dev@httpd.apache.org/msg25089.html).

drumm:

Status: Needs review » Needs work
Steven:

Assigned: Unassigned » Steven
Status: Needs work » Needs review
Attachment: urlencode.diff (4.81 KB)

This patch adds a wrapper around urlencode() that restores the escaped slashes to real slashes when clean URLs are turned on. This doesn't introduce any problems or break any standards. When mod_rewrite rewrites the query on the server side, it re-escapes the slash for us.

Before:
site.com/search/node/query+with+a+slash+%2F+in+it

After:
site.com/search/node/query+with+a+slash+/+in+it

So, it simply makes our fake directories look more like real directories in case there's a slash in there. The non-clean URL situation doesn't change:

site.com/?q=search/node/query+with+a+slash+%2F+in+it
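A minimal sketch of the wrapper's behaviour, written in Python purely for illustration. It is not the committed code: `urlencode_wrapper`, `search_url`, and the `clean_urls` flag are hypothetical names, and `quote_plus` stands in for PHP's urlencode().

```python
from urllib.parse import quote_plus


def urlencode_wrapper(text: str, clean_urls: bool) -> str:
    """Hypothetical wrapper: restore slashes only when clean URLs are on."""
    encoded = quote_plus(text)
    if clean_urls:
        # The path goes through the web server, so keep slashes literal.
        encoded = encoded.replace("%2F", "/")
    return encoded


def search_url(keys: str, clean_urls: bool) -> str:
    """Build the search URL in either the clean or the ?q= form."""
    path = "search/node/" + urlencode_wrapper(keys, clean_urls)
    return "site.com/" + path if clean_urls else "site.com/?q=" + path


print(search_url("query with a slash / in it", clean_urls=True))
print(search_url("query with a slash / in it", clean_urls=False))
```

Keeping the escaping in the non-clean case matters because there the keys travel inside the query string rather than the path.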

drumm:

Tested and works. Code looks good too. +1

Steven:

Status: Needs review » Fixed

Committed to HEAD.

Anonymous:

Status: Fixed » Closed (fixed)