Now that we have all links running through common l() functions, we should implement a pretty URL convention where savvy users can get right to their blog, bug report, etc. without typing strange characters like '?' and '&'.

Another benefit of this URL structure is that some search engines will only index pages whose URLs lack querystrings.

Use this feature request to discuss a good convention for building up pretty URLs.

Comments

ax’s picture

a while ago, J C Lawrence started implementing this under the title "Sane URL patch". unfortunately, he hasn't been seen on the list for 4 months now :( (anyone?).

as he had been working on this feature quite some time, we may still consider some of his ideas / findings. for a start, check http://list.drupal.org/search/?q=sane+url+patch&ps=76&o=0&m=all&wm=wrd&u... (drupal-devel search results for "Sane URL patch").

among them, there is the following suggestion for a clean url scheme (http://mail.zind.net/pipermail/drupal-devel/2001-December/005872.html):

Note that going recursive descent like this offers many other
advantages:

Bubba's Blog:
http://site/mod/blog/name/Bubba/

Bubba's blog on date XXXXX:
http://site/mod/blog/name/Bubba/date/XXXXX

All blogs from day YYYYY:
http://site/mod/blog/date/YYYYY

Bubba's comments:
http://site/mod/comment/name/Bubba

Bubba's comments on blog Foo:
http://site/mod/comment/name/Bubba/node/Foo

The 5th comment from that set:
http://site/mod/comment/name/Bubba/blog/Foo/num/5

Bubba's comments written on date QQQQQ:
http://site/mod/comment/date/QQQQQ

The 8th comment Bubba wrote that day:
http://site/mod/comment/date/QQQQQ/num/8

It can be done, and without incredible difficulty. Mainly it just
needs a standardised set of variable names and the supports to
easily construct arbitrary queries from such name sets.

Note: We're going to need result paging support sooner rather than
later if we go down that road. PEAR has some fairly nice stuff
for that.
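
The name/value scheme in the quoted mail can be sketched in a few lines. This is only an illustration in Python, not JC's patch and not Drupal code: the function name parse_descent_url and the error handling are assumptions made for the example, and the URLs are the ones from the mail above.

```python
from typing import Dict, Tuple


def parse_descent_url(path: str) -> Tuple[str, Dict[str, str]]:
    """Split '/mod/<module>/<name>/<value>/...' into a module and its filters.

    Hypothetical helper: the '/mod/' prefix and alternating name/value
    segments follow the examples in the quoted mail.
    """
    segments = [s for s in path.split("/") if s]
    if len(segments) < 2 or segments[0] != "mod":
        raise ValueError(f"not a module URL: {path!r}")
    module, rest = segments[1], segments[2:]
    if len(rest) % 2 != 0:
        raise ValueError(f"name without a value in {path!r}")
    # Pair up alternating segments: name, Bubba, date, XXXXX, ...
    filters = dict(zip(rest[0::2], rest[1::2]))
    return module, filters


if __name__ == "__main__":
    print(parse_descent_url("/mod/blog/name/Bubba/"))
    print(parse_descent_url("/mod/comment/date/QQQQQ/num/8"))
```

The resulting dictionary is the "standardised set of variable names" the mail asks for; a module would still have to turn it into an actual query.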

moshe weitzman’s picture

JC's suggestion is a good one from a code perspective. But from a user perspective I really want my blog to be /blog/weitzman for example. I think that /mod/blog/name/weitzman is not short enough.

We should be able to generally support "recursive descent" in the engine but hopefully modules can implement additional schemes.

In my example above, we would parse the url and notice that blog is the first item in the path and thus hand the rest of the processing to the blog_page or blog_url function.

just thinking aloud here. still fuzzy.
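
A rough sketch of the dispatch idea above, written in Python for illustration only. The registry, the register decorator and the return strings are assumptions; only the blog_page name and the /blog/weitzman URL come from the comment.

```python
from typing import Callable, Dict, List

# Hypothetical registry: first path segment -> handler for the rest of the path.
HANDLERS: Dict[str, Callable[[List[str]], str]] = {}


def register(prefix: str):
    """Let a module claim a first path segment such as 'blog'."""
    def decorator(func: Callable[[List[str]], str]):
        HANDLERS[prefix] = func
        return func
    return decorator


@register("blog")
def blog_page(rest: List[str]) -> str:
    # Module-specific short scheme: /blog/<name>
    if len(rest) == 1:
        return f"blog of {rest[0]}"
    return "blog listing"


def dispatch(path: str) -> str:
    """Look at the first segment and hand the remainder to its module."""
    segments = [s for s in path.split("/") if s]
    if not segments or segments[0] not in HANDLERS:
        return "404"
    return HANDLERS[segments[0]](segments[1:])


if __name__ == "__main__":
    print(dispatch("/blog/weitzman"))
```

The engine only knows about the first segment; everything after it is the module's business, so a module is free to use recursive descent or something shorter.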

moshe weitzman’s picture

Component: Code » Other

I looked at this a bit more, but decided not to pursue further. I'm hoping others will give it a go. Here are some of my findings.

ABOUT IIS
The proposed pretty URL techniques may have problems on IIS. That is because an URL like this

http://foo/blog/moshe

will cause IIS to look for the default document in the 'moshe' directory. If that file or directory doesn't exist, you get a 404. The 404 logic may be overridden with some success, but in my experiments the PATH_INFO and querystring arguments are slightly changed as a result. Also, the error logs now have an entry for this URL. So you're dirtying your error logs by using the 404 mechanism.

It is possible to use an ISAPI filter to do URL rewriting. A free one that seems to work well is called URL Replacer. It is found at http://www.pstruh.cz/.

No URL rewriting should be needed in Apache because of its "lookback" functionality (see
http://mail.zind.net/pipermail/drupal-devel/2001-December/002598.html).

SEE BOTTOM OF THIS ARTICLE FOR LINKS TO MORE ON THIS TOPIC
http://www.searchtools.com/robots/goodurls.html

HIGHLIGHTS OF SaneURL DISCUSSION

JC EXPLAINS THE PROPOSED TECHNIQUE
http://mail.zind.net/pipermail/drupal-devel/2001-December/002598.html

JC advocates "recursive descent"
http://mail.zind.net/pipermail/drupal-devel/2001-December/005872.html

JC on URL Handler system design
http://mail.zind.net/pipermail/drupal-devel/2001-December/005876.html

Dries lays out a few system requirements
http://mail.zind.net/pipermail/drupal-devel/2001-December/005878.html

JC posts some code in response
http://mail.zind.net/pipermail/drupal-devel/2001-December/005882.html

Dries on the security benefits of an URL Handler
http://mail.zind.net/pipermail/drupal-devel/2001-December/005895.html

JC responds to Unconed asking - "why bother?"
http://mail.zind.net/pipermail/drupal-devel/2001-December/005906.html

Scoop's URL Handling
http://mail.zind.net/pipermail/drupal-devel/2001-December/005880.html

Dries proposes a far simpler but still useful approach
http://mail.zind.net/pipermail/drupal-devel/2001-December/005914.html

moshe weitzman’s picture

Looks like Marco has uploaded a beta patch for this to his sandbox. nice. He says in README:
-------------------------------

The idea is to have an url like
http://www.drupal.org/node/id/662
instead of
http://www.drupal.org/node.php?id=662

this is prettier both for humans and spiders. there are probably better ways to display an url, but this way at least spiders will index your site.

[snip - install instructions]

a caveat is that you can't use drupal in a subdirectory anymore (www.example.com/drupal)

as you can see this patch is very beta. a problem I had is that on my pc it doesn't work (win2000+apache+php as cgi). it returns a 404 error, because somehow apache doesn't find the scripts :/

anyway it has been tested quite a lot on my server. it is backward compatible -> old urls still work

----------------------------
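
For illustration, the mapping the README describes (/node/id/662 versus node.php?id=662) could look roughly like this. It is a Python sketch under assumptions, not Marco's patch: the function names are made up, and it assumes the first segment names the script and the rest are name/value pairs.

```python
from urllib.parse import parse_qsl, urlencode


def pretty_to_legacy(path: str) -> str:
    """Turn '/node/id/662' into 'node.php?id=662' (hypothetical helper)."""
    segments = [s for s in path.split("/") if s]
    if not segments:
        return "index.php"
    script, rest = segments[0] + ".php", segments[1:]
    if len(rest) % 2 != 0:
        raise ValueError(f"name without a value in {path!r}")
    pairs = list(zip(rest[0::2], rest[1::2]))
    return script + ("?" + urlencode(pairs) if pairs else "")


def legacy_to_pretty(url: str) -> str:
    """Turn 'node.php?id=662' into '/node/id/662' (hypothetical helper).

    Values containing '/' would break this simple scheme.
    """
    script, _, query = url.partition("?")
    name = script[:-4] if script.endswith(".php") else script
    parts = [name]
    for key, value in parse_qsl(query):
        parts += [key, value]
    return "/" + "/".join(parts)


if __name__ == "__main__":
    print(pretty_to_legacy("/node/id/662"))
    print(legacy_to_pretty("node.php?id=662"))
```

Keeping both directions means old querystring URLs can still be accepted while new links are written in the pretty form.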

I suspect that the free ISAPI plug-in 'URL Replacer' will provide enough mod_rewrite-like capability that this patch can work on IIS. See http://www.pstruh.cz/

I use drupal in a subdirectory all the time. it is quite convenient. would be nice to make that work somehow.

happy to have this one back in the conversation.

kika’s picture

I'd suggest checking the PostNuke ShortURL RFC.
http://centre.ics.uci.edu/~grape/modules.php?op=modload&name=Wiki&file=index&pagename=RFC-23%20Short%20URL%20Support:

In order to generate a short URL for a particular set of module parameters, pnModURL() will call the function <module>_<type>_encode_shorturl, pass it the module function and parameters to be encoded, and expect back a virtual path for the short URL. If no path is returned, this means this particular combination of parameters doesn't have an equivalent short URL, e.g. because there is no reasonable meaningful short URL for it.

Example : selecting articles that are in a particular category could easily be given a short URL in the style /articles/category, but selecting articles that belong to several categories at the same time may not really have a meaningful equivalent short URL.

In the other direction, pnGetRequestInfo() will call the module-specific <module>_<type>_decode_shorturl() function, pass it the virtual path and expect back an array with the corresponding module parameters. Again, there may not be a reasonable equivalent, e.g. if a user tries to play a bit with the short URL.

Interesting idea - module-based custom URL encoding/decoding via hooks.
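
A minimal sketch of what such a pair of hooks might look like, in Python for illustration. These are not PostNuke's actual functions: the names only mirror the <module>_<type>_encode_shorturl pattern from the RFC, and the parameter layout and the fallback URL articles.php are assumptions.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode

Params = Dict[str, List[str]]


def articles_encode_shorturl(params: Params) -> Optional[str]:
    """Return a short path, or None when no meaningful one exists."""
    categories = params.get("category", [])
    # Only "exactly one category, nothing else" has a short form here.
    if set(params) == {"category"} and len(categories) == 1:
        return "/articles/" + categories[0]
    return None


def articles_decode_shorturl(path: str) -> Optional[Params]:
    """Return module parameters for a short path, or None if it is unknown."""
    segments = [s for s in path.split("/") if s]
    if len(segments) == 2 and segments[0] == "articles":
        return {"category": [segments[1]]}
    return None


def build_url(params: Params) -> str:
    """Prefer the short URL; fall back to a querystring (hypothetical script)."""
    short = articles_encode_shorturl(params)
    if short is not None:
        return short
    return "articles.php?" + urlencode(params, doseq=True)


if __name__ == "__main__":
    print(build_url({"category": ["news"]}))
    print(build_url({"category": ["news", "sports"]}))
```

Returning None in both directions is what lets the core fall back cleanly: to the long URL when encoding, and to an error or default page when a user plays with the short URL.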

moshe weitzman’s picture

+10 to kika for proposing a hook where modules may define their own short URLs.

moshe weitzman’s picture

just read the PostNuke spec. they state that the spec is not only for apache webservers. i don't see how they do this in IIS - seems like it would 404 when receiving a pretty URL request. perhaps they mean that apache is required but mod_rewrite is not.

can anyone explain this further?

moshe weitzman’s picture

thanks to dries, this one is marked as CLOSED

currently requires apache and mod_rewrite, but that may be improved one day. open another feature request if you need that.

ax’s picture

Priority: Major » Normal


> currently requires apache and mod_rewrite

it doesn't really require mod_rewrite. i made my site axel.kollmorgen.net completely clean url with just some modifications to .htaccess and common.inc and an added "handler" (this one includes index.php - alternatively, you could just make index.php the "handler" by renaming it to drupal (or index) and include the $script_name = ...; defines ...; and $q = ... from drupal.txt).