I have a site that has been through a few versions, but, so far, I have managed to keep urls unchanged. I would also like to have hierarchical urls, and to generate some index pages.

The current urls are of the form:

http://example.com/mine/

From what I have read, Drupal seems to have a problem with trailing slashes. Can this be worked around? Obviously mod_rewrite can ensure the right page is shown, but I also want trailing slashes from links in menus.

I also want to use the book module, but here the URLs would get more complicated. What I want there is:

http://example.com/bookname/bookpagename/

This is going to be new content so the trailing slash is not essential - but it would be nice to be consistent.

The site also has alphabetical index pages. These have the title of most (not all) pages, and a short summary of each. This is not taken from the page text, but is separately written. I could maintain these manually, but it would be nice for them to be automatically updated. The Views module looks like it could generate these, but I cannot see where I would store the additional data.

The site is currently static pages generated from scripts (my own AND hyperlatex: the former call the latter).

I have an existing solution to this using WordPress (for historical reasons: the site started off as a static section of a mostly chronologically ordered site). I could also do this using something like CMS Made Simple. However, I would prefer to use Drupal, because:

1) It is a long term solution. I am unlikely to have to move again if I want new features: for example, to add a forum.
2) It is well designed (compare the database schema with CMS Made Simple for example), robust, performs well etc.
3) It has a static HTML import module.
4) There is a module that can show pages linking to a page.

Comments

Gurpartap Singh

Having or not having trailing slashes in URLs makes no difference to:
1. Google, Yahoo, MSN, and all other search engines, crawlers, systems.
2. Drupal -- Drupal trims any slashes around the URL for its internal processing (server-side).
3. Browsers.
4. etc.

In other words, your current links (from sites, bookmarks, Google, etc.) will point to the correct paths even if they have a trailing slash (Drupal trims them, see #2 above). So you should be able to live without slashes.

If you have doubts, please ask :)

graemep

A lot of what I have read says that search engines can treat a url with and without slash as separate urls. Not good for ranking.

I will use redirects which will solve this problem, but there will be a period of adjustment when I will lose rank.

As this is my revenue generating site, this means I lose money, for an indeterminate period. Being able to retain the URLs as they are would be the safe option.

Of course, I could use rewrites to keep the slashes, but then I have the problem that navigation URLs generated by Drupal will always get redirected, which does not sound right to me.

mikeschinkel

Actually, according to RFC 3986 it does make a difference. Any two URLs that are not a character-by-character match (ignoring URL encoding, which does not apply in this case) are considered to indicate different resources. This means that client agents such as Google, Yahoo, MSN, and all other search engines, crawlers, systems and browsers are violating the specification if they assume the two URLs are the same. On the other hand, Drupal, as the infrastructure for a URL authority, is perfectly within the spec to serve the same content when different URLs are requested.

Yes, Google and others may have some optimizations that violate the spec, but if possible it is better not to assume that Google will violate the spec. The upshot of all this is that it really makes good sense to ensure that all Drupal implementations choose to either use the trailing slash or not use the trailing slash as a canonical form and that they issue a 301 redirect when the non-canonical form is requested.
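
To make the canonical-form idea concrete, here is a minimal sketch in Python. It is purely illustrative and not Drupal code: the function name `canonical_redirect` and the `prefer_slash` flag are invented for this example, and a real site would apply the same decision in its web server or CMS.

```python
from typing import Optional, Tuple


def canonical_redirect(path: str, prefer_slash: bool = True) -> Optional[Tuple[int, str]]:
    """Return (301, canonical_path) if `path` is the non-canonical form, else None.

    Hypothetical helper: `prefer_slash` selects which of the two forms
    the site treats as canonical.
    """
    if path == "/":
        return None  # the site root only has one form
    has_slash = path.endswith("/")
    if prefer_slash and not has_slash:
        return (301, path + "/")
    if not prefer_slash and has_slash:
        # Fall back to "/" if the path consisted only of slashes.
        return (301, path.rstrip("/") or "/")
    return None


print(canonical_redirect("/about"))                       # (301, '/about/')
print(canonical_redirect("/about/"))                      # None
print(canonical_redirect("/about/", prefer_slash=False))  # (301, '/about')
```

Whichever form is chosen, the point is that only one of the two URLs ever answers with a 200; the other always answers with a permanent redirect.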

What's unfortunate is how few professional web developers understand the specs. Heck, it's unfortunate how few professional web developers actually ever read the specs.

Personally, I strongly feel that the trailing slash is preferred, and plan to blog about why at http://blog.welldesignedurls.org at some point in the future.

dman

Makes a big difference for relative links!

Having spent over a decade migrating and moving sites of all types through different platforms, hosts, subsites and naming schemes, I choose relative links wherever possible to link related sections together.
However, Drupal and most CMSs now push you into using more absolute links, and you have to work hard to implement relative ones.

But for this, a link found in
/my/page
pointing to 'pic.gif' will return
'/my/pic.gif'
while the same link in
/my/page/
will look for
'/my/page/pic.gif'

:(
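
The resolution difference described above can be reproduced with Python's standard `urllib.parse.urljoin`, which follows the usual relative-reference rules. This is only a sketch; the `example.com` host is a placeholder.

```python
from urllib.parse import urljoin

# The same relative link, resolved against two base URLs that differ
# only by the trailing slash.
without_slash = urljoin("http://example.com/my/page", "pic.gif")
with_slash = urljoin("http://example.com/my/page/", "pic.gif")

print(without_slash)  # http://example.com/my/pic.gif
print(with_slash)     # http://example.com/my/page/pic.gif
```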

There are many reasons why relative links just won't work in Drupal (teaser lists, views) so it's pretty much a lost cause BUT
- those two URLs are significantly different when parsing.

.dan.
How to troubleshoot Drupal | http://www.coders.co.nz/

Gurpartap Singh

It's the "trailing" slash we're talking about, not the one you are pointing at :)

dman

Sorry, what are you saying?

Gurpartap Singh

A trailing slash is the slash found at the end of URLs, especially in the case of WordPress, like http://site.com/about/
but with Drupal's default behavior it would be http://site.com/about

That's all this is ;-)

dman

I know exactly what a trailing slash is, and that's what I was illustrating when comparing the two paths above.
/my/page
/my/page/

And my point is that they produce different URL-resolution results.

What is your point?

mark_r

So how can I get the trailing slash into my Drupal setup?

edit//
OK, after searching over and over, I did my own Drupal hack, and here it is:

===
goto File /includes/path.inc
[FIND]
$_GET['q'] = drupal_get_normal_path(trim($_GET['q'], '/'));

[REPLACE WITH]
// $_GET['q'] = drupal_get_normal_path(trim($_GET['q'], '/'));
$_GET['q'] = drupal_get_normal_path($_GET['q']);

===
// if installed pathauto
goto File /modules/pathauto/pathauto.module
[FIND]
// Trim any leading or trailing slashes
$alias = preg_replace("/^\/|\/+$/", "", $alias);

[REPLACE WITH]
// OLD: Trim any leading or trailing slashes
// $alias = preg_replace("/^\/|\/+$/", "", $alias);
// NEW: Trim only leading slashes
$alias = ltrim($alias, "/");

===
//if installed globalredirect
goto file /modules/globalredirect/globalredirect.module
[FIND]
// Get the request string (minus trailing slash e.g. node/123/)
$request = preg_replace('/\/$/', '', $_REQUEST['q']);

[REPLACE WITH]
// OLD: Get the request string (minus trailing slash e.g. node/123/)
// $request = preg_replace('/\/$/', '', $_REQUEST['q']);
// NEW: Get the request string
$request = $_REQUEST['q'];

#############################################

Additional changes:

// change the url-alias input field
goto File /modules/path/path.module
[FIND]
$form['dst'] = array(
'#type' => 'textfield',
'#default_value' => $edit['dst'],
'#maxlength' => 64,
'#size' => 64,

[REPLACE WITH]
$form['dst'] = array(
'#type' => 'textfield',
'#default_value' => $edit['dst'],
'#maxlength' => 255,
'#size' => 80,

################################

That's all. I accept no responsibility for any errors that occur through these changes! Change this code at your own risk.

darumaki

I actually prefer the trailing slash, if for no other reason than that it looks cool and also closes the URL, like /my-page/ instead of /my-page.
I think a CMS should be flexible enough to let the admin decide how he wants his site's pages to display. I wish I could do this without a hack.

Bilalx

Thanks Mark_r

Here is a small improvement for pathauto.inc

===

// if installed pathauto
goto File /modules/pathauto/pathauto.module
[FIND]
// Trim any leading or trailing slashes
$alias = preg_replace("/^\/|\/+$/", "", $alias);

[REPLACE WITH]
// OLD: Trim any leading or trailing slashes
// $alias = preg_replace("/^\/|\/+$/", "", $alias);
// NEW: Trim any leading slashes and collapse two or more trailing slashes into a single one
$alias = preg_replace('/\/{2,}$/', '/', ltrim($alias, '/'));
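
For comparison, the trimming behaviours discussed in this thread can be tried out in Python. This is an illustrative sketch, not pathauto code: the function names are made up, and Python's `re.sub`, `str.lstrip` stand in for PHP's `preg_replace` and `ltrim`.

```python
import re


def trim_both(alias: str) -> str:
    # Mirrors the original pattern: drop one leading slash and all trailing slashes.
    return re.sub(r"^/|/+$", "", alias)


def trim_leading_only(alias: str) -> str:
    # Mirrors ltrim($alias, "/"): leading slashes go, trailing ones are untouched.
    return alias.lstrip("/")


def keep_one_trailing(alias: str) -> str:
    # Drop leading slashes and collapse a run of trailing slashes into one.
    return re.sub(r"/{2,}$", "/", alias.lstrip("/"))


for alias in ("/about/", "about//", "about"):
    print(repr(alias), trim_both(alias), trim_leading_only(alias), keep_one_trailing(alias))
```

Note that replacing a run of two or more trailing slashes with an empty string would remove the whole run rather than leave one slash behind, which is why the last helper substitutes a single "/" instead.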

The URLs will have a trailing slash, and trying to access them without a trailing slash returns a 404.

To fix this you can add the following code to your .htaccess to redirect URLs without slashes to the correct ones:

# Custom Fix
# Only if url does not contain already a trailing slash "/" or a file extension ".xxxxx" at the end
# trailing slash fix
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/)$
RewriteRule (.*)$ http://example.com/$1/ [R=301,L]

Found here: http://trif3cta.com/blog/entry/add-a-trailing-slash-via-htaccess-to-prev...
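
To check which requests that third RewriteCond would let through, the same regular expression can be tested in Python. This is a sketch only: the helper name `needs_trailing_slash` is invented, and the two file-system checks (`!-f`, `!-d`) are not modelled.

```python
import re

# Same pattern as the RewriteCond: the URI already ends in "/" or in
# something that looks like a file extension (a dot plus 1-5 alphanumerics).
ALREADY_FINAL = re.compile(r"(\.[a-zA-Z0-9]{1,5}|/)$")


def needs_trailing_slash(request_uri: str) -> bool:
    """True if the rule would redirect this URI to its slashed form."""
    return ALREADY_FINAL.search(request_uri) is None


for uri in ("/about", "/about/", "/files/logo.png", "/v1.2"):
    print(uri, needs_trailing_slash(uri))
```

One caveat the sketch makes visible: a path such as /v1.2 is treated as if it ended in a file extension, so it would not be redirected.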

AgaPe

It's an old topic, but it worked for me on Drupal 6.
The problem is that the node then has the path "something/" and you can't reach the page using "something": there is a page-not-found error, which I guess is not good. It should still be possible to visit the page at that URL.

rogoff123

I've just set up a test version of Drupal on my server and, so far, have been really impressed. That's until I came across this issue. I'm thinking of migrating my entire site into Drupal and I have lots of existing URLs that have trailing slashes. Hundreds of hours have been invested in getting these URLs ranked well in search engines. However, these rankings would be degraded if I had to change the URLs by losing the trailing slash (even if I use a 301 permanent redirect).

I think this is a major omission and I will really have to think hard about whether or not to use Drupal and take the ranking hit. Not to mention well-formed URLs, which I think someone else referred to. I don't think it's strictly correct or good practice to omit a trailing slash in a URL unless you're linking to a file with an extension, e.g. www.example.com/directory/index.html. As far as I know, www.example.com/directory does not comply with standards.

What a shame - I was just getting to like Drupal. Is there a work-around (instead of the hack) or module that fixes this?

Jeff Burnz

I have been dismayed by this also, and have wondered if this could be made an option in Pathauto. I've tried doing it in Pathauto but haven't had time to work on it properly.

frames

Having played with Drupal for only one month ... I don't think Pathauto would be the answer as, to my knowledge, it does not handle each and every URL that can be generated in a Drupal install. Does it?

I also agree this is indeed something that could make a lot of people stop thinking about moving to Drupal.

There must be a reason why removing trailing slashes is hardcoded in Drupal and its modules. I just can't find one.

I don't know if it would be possible to use mod_rewrite to add a slash at the end of every URL that does not end in a slash, .html or the like (and I mean every URL "used" on every page). But I don't think that's a solution. For instance, not all sites are using Apache. As I see it now, there should be a module or an option in the core code that would do that, in the same way one can simply turn on 'nice URLs'.

Just an opinion.

dman

(If this is double-post, blame the server)

Mark_r's fix above is fine and safe. You only need to do the first step - a one-line change from trim() to ltrim(). There's nothing hacky or work-around about it, it's just a change to the existing behaviour.

The reason for the path to be trimmed by default is primarily consistency. There's no need to remember the difference between the URL /user/ and /user.
URL resolution falls back correctly from /user/ to give you the /user page, but the reverse does not happen. Redirecting from /user to add a slash and serve /user/ may be possible, but is not done. Webservers do that, of course, but via proper redirects and only for directories.
The behaviour being asked for here would require slashed URLs to be typed in full, which is a usability pain ... unless a redirect was used as well, which is flaky and would put a bit more strain on the path resolution phase.
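
The difference between trimming both ends of the requested path and trimming only the leading slash can be sketched in Python. This is illustrative only: the `ALIASES` table and the `resolve` function are invented for the example, and `strip`/`lstrip` stand in for PHP's `trim`/`ltrim`.

```python
# Hypothetical alias table: one alias stored without a slash, one with.
ALIASES = {"user": "node/1", "about/": "node/2"}


def resolve(request_path, trim_trailing=True):
    """Look up a request path after normalising its slashes."""
    if trim_trailing:
        key = request_path.strip("/")   # default behaviour: both ends trimmed
    else:
        key = request_path.lstrip("/")  # modified behaviour: leading slash only
    return ALIASES.get(key)


print(resolve("/user/"))                        # node/1 (falls back to /user)
print(resolve("/about/"))                       # None (a slashed alias can never match)
print(resolve("/about/", trim_trailing=False))  # node/2
print(resolve("/about", trim_trailing=False))   # None (must be typed in full)
```

With trimming left on, slashed aliases are unreachable; with it turned off, every alias has to be requested exactly as stored unless a redirect or fallback is added.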

In my eyes, /user/ represents a directory full of stuff, and I'd expect to be seeing an index sort of page. /user represents a destination content page.
Admin pages and sections may go both ways, but as most nodes are content pages, not containers, it would seem misleading to represent them as directories by putting a slash after them.

However, if you want to structure your URLs in such a way, I think the choice is yours, and you can make that change easily. Go ahead. I can't imagine why a "lot" of people would drop Drupal over this tweak tho.

Why would they want to do this anyway? The only practical reason is to try to capture old page ranks when emulating or migrating from an existing site. Seeing as Google often detects 'significant' changes to websites, you may see resets anyway. And seeing as it's surely aware that URLs with and without a trailing slash are almost always the same (spec or not, it's true), this exercise is probably not worth it.

But adding trailing slashes to new node items, or via pathauto, is not very logical. Go ahead and do it if you want, but the option is not in core because it's not very sensible for most sites.

KingMoore

This problem is a bother for me as well. My thinking about it may be a bit different. Here is a typical example of why I would want to be able to specify paths ending in /... It's not that I ALWAYS want them to be like that, but here is a situation where I do.

I have an About us page on my site, with 3 sub pages:

About Us (overview page)
- Who We Are (sub page)
- What We Do (sub page)
- Why We Do It (sub page)

Now I want to set my site up in a logical, hierarchical order for various reasons. My sub pages will have paths like this:

about/who-we-are
about/what-we-do
about/why-we-do-it

I don't have any problem with there being no trailing slash here, but I do wish that I COULD enter a trailing slash for my paths, because I want my main About Us page to reside here:

about/

And act as the main page for that section. Currently my options are as follows:

about - I don't like this as I want my about us page to appear to be the index of the subdirectory my sub pages are in, not as a content page off the root of the site

about/about - I don't like this because it is redundant and then I have nothing at just about/

So that's my thinking. Mainly I just wish I had the option to define my URLs how I want in this regard.

dman

"about - I don't like this as I want my about us page to appear to be the index of the subdirectory my sub pages are in, not as a content page off the root of the site"

I agree with your point, and it's similar to what I was saying. It can be appropriate to have about/ when it is an index-type page.

A question is ... 'appear' as an index ... to who?
No difference to search engines and spiders. No big difference to users, unless they (like me) are developers who try to look under the hood of site structures. And we already know that the two URLs mean basically the same thing ... for all practical purposes (save only the relative URL niggles I mentioned earlier).

But yes, just make that change in your core files. Your choice.

KingMoore

I see your point about ... 'to who'? I suppose my perception is skewed a bit by being a developer and having read loads on optimum usability. I like to structure my sites in a logical order.

Quite often when I get to a page via Google, if it isn't exactly what I am looking for I will look for an index page for the current 'directory.'

However, I realize this wouldn't even affect this situation, because if a Drupal user were to go to 'about/' they would get the same content as if going to 'about'.

Basically... I wish that Pathauto would work when I save menu paths as 'about/'. Admin->build->menus will let me save items with the trailing slash, and they work correctly. But when you try to create content using a trailing slash in the alias, the alias does not work, either as 'about' or 'about/'.

And I have been instructed to never hack core, so perhaps I will look into writing a patch :).

Cheers for the discussion.

webel

I have read this discussion with great interest.
I am migrating from a JSP site where the trailing slash has a clear role.

www.example.com/about/ would run www.example.com/about/index.jsp

And lots of other .jsp pages would be under about/

I would really like to be able to specify aliases ending in /
for about 100 cases like this (in combination with books).

I am however hesitant to hack the modules to achieve it.
I would really like a Drupal standard switch (like clean URLs).

Many users "know" that when they enter a URL ending in '/' they get a "folder".

Webel, "Elements of the Web", Scientific IT Consultancy,
For UML, UML Parsing Analysis, SysML, XML, Java, symbolics
http://www.webel.com.au (under migration to Drupal)
"See a need, fill a need.", Bigweld (from "Robots")

harkonnen

So I heard, but I'm not sure whether it's applicable to URL Rewrite.

Source:
http://www.alistapart.com/articles/slashforward/

dman

That is totally true for /index.html built sites, yes, good practice.
Not so for CMSs

KingMoore

just a quick note, the code above:

// if installed pathauto
goto File /modules/pathauto/pathauto.module
[FIND]
// Trim any leading or trailing slashes
$alias = preg_replace("/^\/|\/+$/", "", $alias);

[REPLACE WITH]
// OLD: Trim any leading or trailing slashes
// $alias = preg_replace("/^\/|\/+$/", "", $alias);
// NEW: Trim only leading slashes
$alias = ltrim($alias, "/");

is in pathauto.inc, not .module, at least in the version I am running (5.x-2.1)

KingMoore

Here is a patch for path.inc. When drupal_lookup_path is called, if it tries to look up $path and fails and $path does not end in a '/' it then looks again for $path."/"
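
The fallback described above can be sketched roughly as follows. This is a Python illustration, not the actual patch: the `lookup_path` function and the alias dictionary are invented stand-ins for the database lookup that drupal_lookup_path performs.

```python
from typing import Dict, Optional


def lookup_path(path: str, aliases: Dict[str, str]) -> Optional[str]:
    """Look up `path`; if that fails and it has no trailing slash, retry with one."""
    found = aliases.get(path)
    if found is None and not path.endswith("/"):
        found = aliases.get(path + "/")
    return found


# Hypothetical alias table for the example.
aliases = {"about/": "node/2", "user": "node/1"}

print(lookup_path("about", aliases))   # node/2 (found via the "about/" retry)
print(lookup_path("about/", aliases))  # node/2
print(lookup_path("user/", aliases))   # None (no retry in the other direction)
```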

Path.inc Patch:
http://drupal.org/node/250525

I have also uploaded a pathauto patch here:
http://drupal.org/node/203632

Not sure if this format will work for everyone, as I have generated it in Zend/Eclipse. Let me know.

mark_r

Thx! Whatever the others say, I will use the trailing slash on Drupal 6 too. Can you post the patch for path.inc again? It looks like it was removed.

Simple question: how do you decide what's worth it and what's not?

j0hn-smith

The patch isn't there, can you post it again please?

osviweb

I have a problem on our Drupal website http://www.popvision.com/agenzietui
As you can see, if you request the Drupal site without a final slash in the URL, you are redirected to the same URL but without the www. Why?

I've tried to:

- double-check the settings.php file: $base_url = 'http://www.popvision.com/agenzietui';
- delete the .htaccess files
- disable the Global Redirect module

It happens anyhow. Can you tell me why?

thanks

Osvaldo Mauro
Agenzie Viaggi Tui

dman

wget -v  --save-headers http://www.popvision.com/agenzietui
--17:27:16--  http://www.popvision.com/agenzietui
           => `agenzietui'
Resolving www.popvision.com... 213.92.110.149
Connecting to www.popvision.com|213.92.110.149|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://popvision.com/agenzietui/ [following]
--17:27:17--  http://popvision.com/agenzietui/
           => `index.html'
Resolving popvision.com... 213.92.110.149
Connecting to popvision.com|213.92.110.149|:80... connected.
HTTP request sent, awaiting response... 200 OK

You'll see the server is sending a 301 Moved Permanently to do this for you.
It's happening before it hits Drupal, but may be affected by the .htaccess.
Did you read the .htaccess instructions for configuring redirects with and without www?
If that still doesn't work, it's happening at a higher level. It's probably to do with configuration on your host. See if the control panel has an option for 'preferred' or 'primary' domain that you can change. Otherwise talk to your sysadmin.
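
When reading output like the wget trace above, it can help to compare the requested URL with the Location header piece by piece. A small Python sketch (the `describe_redirect` helper is invented for this example and makes no network requests) shows that two separate things change in this redirect:

```python
from urllib.parse import urlsplit


def describe_redirect(requested: str, location: str) -> list:
    """List what changed between the requested URL and the Location header."""
    old, new = urlsplit(requested), urlsplit(location)
    changes = []
    if old.netloc != new.netloc:
        changes.append(f"host: {old.netloc} -> {new.netloc}")
    if old.path != new.path:
        if new.path == old.path + "/":
            changes.append("path: trailing slash added")
        else:
            changes.append(f"path: {old.path} -> {new.path}")
    return changes


print(describe_redirect("http://www.popvision.com/agenzietui",
                        "http://popvision.com/agenzietui/"))
# ['host: www.popvision.com -> popvision.com', 'path: trailing slash added']
```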

KingMoore

Sorry this took so long. Not sure why the other patch is no longer there?

Anyways, I have posted a new patch here:

http://drupal.org/node/250525#comment-913436

osviweb

Thank you very much for your kind help.
I've checked the site without any .htaccess and .redirect and it happens anyhow. I cannot say if there is a misconfiguration in DNS, and I'm double-checking my control panel as well.

I don't know where a host file doing this redirect could be stored... could it be a file, a Drupal module (Global Redirect) or something I can delete?

thanks again

dman

It's not DNS - it's Apache. Probably in the Apache conf file for the virtual host, or more globally.
It's an additional config, and not standard, so it should be pretty easy FOR YOUR SYSADMIN to spot and turn off, if it's not accessible in a control panel.

Brigadier

In case anyone's still referring to this thread, the Global Redirect Module should handle redirects from a url with a trailing slash to the equivalent url without the slash (at least it does as of Drupal 6). I know that's not exactly what the OP requested but it's a good solution for most people.

dutchie76

It's in the pipeline: http://drupal.org/node/300100

j0hn-smith

I've written a simple module to provide a 301 redirect from domain.com/folder to domain.com/folder/. It won't be suitable for all situations, but it's a good start. See http://drupal.org/node/540348

Akaoni

It's taken a few years, but:
http://drupal.org/project/trailing_slash