In my brief testing it's impossible to create a URL alias that includes characters which should be allowed.

On IRC, UnConeD also pointed out that "core is broken" in this regard.

Comment  File       Size     Author
#1       iri.patch  2.15 KB  Steven

Comments

Steven

Title: support IRI » Remove restrictions on path aliases (support IRIs)
Version: 6.x-dev » 5.x-dev
Status: Active » Needs review
File: iri.patch (2.15 KB)

Things to know:

  • All menu paths are urlencoded on output (by Drupal) when placed in the GET query.
  • All GET values (including the menu path) are urldecoded on input (by PHP).

This means that the URLs resulting from user-defined menu paths and aliases will always be valid, even for menu paths that use punctuation like "#" or "!" or even random Unicode characters.

e.g.

Path/Alias = blog/Bunnies are made of people!?
Resulting URI = http://example.com/base-path/?q=blog/Bunnies+are+made+of+people%21%3F

Path/Alias = blog/My résumé
Resulting URI = http://example.com/base-path/?q=blog/My+r%C3%A9sum%C3%A9

Path/Alias = blog/アニメ
Resulting URI = http://example.com/base-path/?q=blog/%E3%82%A2%E3%83%8B%E3%83%A1
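The round trip behind these examples can be sketched in a few lines. This is only an illustration in Python, not Drupal's PHP code; the helper names `encode_path` and `decode_path` are made up for this sketch, and it assumes that "/" is left unescaped and that spaces are written as "+", as in the URIs above.

```python
from urllib.parse import quote_plus, unquote_plus


def encode_path(path: str) -> str:
    """Escape a menu path or alias for use as the q= value (hypothetical helper)."""
    # "/" stays readable, spaces become "+", and every other unsafe
    # character is written as %XX escapes of its UTF-8 bytes.
    return quote_plus(path, safe="/")


def decode_path(value: str) -> str:
    """Undo encode_path(), as happens on input (hypothetical helper)."""
    return unquote_plus(value)


for alias in ["blog/Bunnies are made of people!?", "blog/My résumé", "blog/アニメ"]:
    encoded = encode_path(alias)
    # Decoding always gives back the original alias, whatever it contains.
    assert decode_path(encoded) == alias
    print("http://example.com/base-path/?q=" + encoded)
```

Because decoding restores the original string exactly, the alias itself does not need to be limited to URL-safe characters.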

In spite of this, path.module requires that path aliases contain only characters valid in relative URLs. This makes no sense. The attached patch removes this restriction.

This is a necessary step towards allowing e.g. pathauto to support arbitrary languages. The current practice of transliterating letters to ASCII and removing accents is a hack: it produces 'prettier' URLs, but they are less meaningful to search engines. It is also useless for languages which do not use the Latin script.

Note that the 'odd' escapes for the Unicode characters above are perfectly normal. This is the standard used for IRIs (the i18n'd form of URIs, see RFC 3987) and is supported by all the major browsers and search engines.

However, because of phishing abuse, some browsers will not show the Unicode characters in some or all IRIs in the address bar and/or status bar. e.g. Japanese Wikipedia on Google.
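As a rough illustration of how an IRI relates to the escaped URI that is actually sent, here is a minimal Python sketch. The function names and the `URI_SAFE` set are assumptions made for this example; it only handles the path and query part and ignores internationalized host names, so it is not a complete RFC 3987 implementation.

```python
from urllib.parse import quote, unquote

# Characters that already carry meaning in a URI and are left untouched
# (the RFC 3986 reserved characters, plus "%" so that existing escapes
# are not escaped a second time). This set is an assumption of the sketch.
URI_SAFE = ":/?#[]@!$&'()*+,;=%"


def iri_to_uri(iri: str) -> str:
    """Write every non-ASCII character as %XX escapes of its UTF-8 bytes."""
    return quote(iri, safe=URI_SAFE)


def uri_to_display(uri: str) -> str:
    """Show the escapes as Unicode again, as a browser may do in its address bar."""
    return unquote(uri)


iri = "http://example.com/base-path/?q=blog/アニメ"
uri = iri_to_uri(iri)
print(uri)
print(uri_to_display(uri))
```

A browser that distrusts an address would simply skip the second step and show the escaped form.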

chx

Status: Needs review » Reviewed & tested by the community

Lovely patch. Less restrictions, more features, less code, more comments.

Dries

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD! :)

Anonymous

Status: Fixed » Closed (fixed)