From what I understand, Clean URL support is achieved through mod_rewrite using the following rule:



^(.*)$ index.php?q=$1



What that means is that URLs of the form http://somesite.com/node/view/1 are rewritten as http://somesite.com/index.php?q=node/view/1.
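To make the mapping concrete, here is a rough Python sketch that mimics what the rule does to the path. It is only an illustration: the names RULE_PATTERN and rewrite are made up for this example, and the real work is of course done by mod_rewrite, not Python.

```python
import re

# Stand-in for the pattern half of the rule: ^(.*)$
RULE_PATTERN = re.compile(r"^(.*)$")


def rewrite(path: str) -> str:
    """Mimic the substitution index.php?q=$1 for a single-line path."""
    match = RULE_PATTERN.match(path)
    if match is None:
        # Only happens for input with embedded newlines; pass it through untouched.
        return path
    # $1 in the rule corresponds to the first capture group here.
    return "index.php?q=" + match.group(1)


print(rewrite("node/view/1"))  # index.php?q=node/view/1
```

Note that the captured path is copied into the query string as-is, slashes included.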



According to RFC 2396 - Uniform Resource Identifiers (URI): Generic Syntax, Page 14, section 3.4,


3.4. Query Component

   The query component is a string of information to be interpreted by
   the resource.

      query         = *uric

   Within a query component, the characters ";", "/", "?", ":", "@",
   "&", "=", "+", ",", and "$" are reserved.



Which means that the "/" character, when it appears in the query component, has to be escaped!
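As a quick illustration of what escaping would look like, the following Python sketch checks a value against the reserved set quoted above and percent-encodes it. RESERVED_IN_QUERY and escape_query_value are names invented for this example; quote() is from Python's standard urllib.parse module.

```python
from urllib.parse import quote

# Characters listed as reserved within a query component in RFC 2396, section 3.4
RESERVED_IN_QUERY = set(';/?:@&=+,$')


def escape_query_value(value: str) -> str:
    # quote() leaves "/" alone by default; safe="" makes it encode "/" as well.
    return quote(value, safe="")


value = "node/view/1"
print(sorted(set(value) & RESERVED_IN_QUERY))  # ['/']
print(escape_query_value(value))               # node%2Fview%2F1
```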



I read somewhere that Mozilla used to complain when it encountered an unescaped "/" character in the query component, but since IE did not seem to mind, they eventually turned a blind eye toward it too. So, having an unescaped "/" in the query component does not create an immediate problem.



That aside, if my interpretation is correct, wouldn't the standards-compliant thing be to escape the "/" character in the query component? That is, shouldn't url() in includes/common.inc escape $url?
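For what it's worth, here is a hedged sketch of the idea in Python rather than PHP. build_url is a hypothetical stand-in for url(), not the actual Drupal function. It escapes the path when building the query string and then shows that the receiving side gets the original path back after decoding.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_url(base: str, path: str) -> str:
    """Hypothetical analogue of url(): put the path in q=, percent-encoded."""
    # urlencode() percent-encodes reserved characters, including "/".
    return base + "/index.php?" + urlencode({"q": path})


url = build_url("http://somesite.com", "node/view/1")
print(url)  # http://somesite.com/index.php?q=node%2Fview%2F1

# The server side decodes the query and recovers the original path.
recovered = parse_qs(urlsplit(url).query)["q"][0]
print(recovered)  # node/view/1
```

So escaping on output should not change what the application sees in q, as long as the query string is decoded in the usual way on input.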

Comments

sandeep-1

To translate special characters in the query to hex-encodings, the following directive could be used:

RewriteMap  esc  int:escape

along with,

RewriteRule ^(.*)$ index.php?q=${esc:$1}

Haven't tested this though!