If you use the path module for URL aliases, the aliases are case insensitive. This is bad for SEO because search engines treat URLs that differ only in case as separate pages, so the same content is indexed as duplicate content.

Example:
http://www.google.com/index.html
http://www.google.com/index.HTML

Example:
drupal.org/handbook
drupal.org/handBOOK

[copy and paste the above URLs; I don't want to create links]

URLs should be case sensitive.

Comments

J. Cohen’s picture

Lines 223 - 231 in Drupal 5.7 Path module:

      case 'load':
        $path = "node/$node->nid";
        // We don't use drupal_get_path_alias() to avoid custom rewrite functions.
        // We only care about exact aliases.
        $result = db_query("SELECT dst FROM {url_alias} WHERE src = '%s'", $path);
        if (db_num_rows($result)) {
          $node->path = db_result($result);
        }
        break;

Is it a MySQL problem?
http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html
http://www.delphifaq.com/faq/databases/mysql/f801.shtml
http://mysqldatabaseadministration.blogspot.com/2006/09/case-sensitive-m...

J. Cohen’s picture

I've written a longer article about this problem here with some images.

Jean-Philippe Fleury’s picture

subscription

irakli’s picture

This could be solved if url_alias declared the src and dst fields with a binary string type (in MySQL, for example, VARBINARY or a varchar with a binary collation) instead of plain varchar. I don't think that is directly possible with the current DB abstraction layer in Drupal. However, an update hook in the .install file could alter the table and change the column type when db_type is mysql.

Just a thought.
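
The column change suggested above might look roughly like the following. This is a minimal sketch under stated assumptions, not Drupal's actual update code: the helper name url_alias_binary_sql(), the varchar length of 128, and the utf8_bin collation are all assumptions made for illustration. It only builds the SQL strings, so it can run outside Drupal.

```php
<?php
/**
 * Builds ALTER TABLE statements that give the alias columns a binary
 * (case-sensitive) collation. Hypothetical helper, not part of Drupal.
 *
 * @param string $db_type Database driver name, e.g. 'mysql' or 'pgsql'.
 * @param string $prefix  Optional table prefix.
 * @return string[] SQL statements; empty for non-MySQL databases.
 */
function url_alias_binary_sql($db_type, $prefix = '') {
  // Only MySQL is handled here; other engines are left untouched.
  if ($db_type !== 'mysql' && $db_type !== 'mysqli') {
    return array();
  }
  $statements = array();
  // Column names come from the query quoted earlier in this thread;
  // the length (128) and character set (utf8) are assumptions.
  foreach (array('src', 'dst') as $column) {
    $statements[] = sprintf(
      "ALTER TABLE %surl_alias MODIFY %s VARCHAR(128) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT ''",
      $prefix,
      $column
    );
  }
  return $statements;
}

// Print the statements an update hook could run on MySQL.
foreach (url_alias_binary_sql('mysql') as $sql) {
  echo $sql, "\n";
}
```

In a real .install file these statements would be run through Drupal's own query functions inside an update hook, and existing aliases that differ only in case would need checking first.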

nw’s picture

BTW the path module is not necessary for this problem to manifest itself. Disabling said module, we have:

/user == /USER

Disabling friendly URLs, we have:

?q=user == ?q=USER

As discussed above, the root cause of the problem is collation. In addition to {url_alias}, the {menu_router} table and its associated queries need consideration.
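
To make the collation point concrete, here is a small sketch that imitates the two matching behaviours in plain PHP. It does not touch a database: the $aliases array is a hypothetical stand-in for rows of {url_alias}, and lookup_ci() and lookup_bin() are illustrative names, not Drupal functions. Note that strcasecmp() only folds ASCII letters, so this only approximates a real case-insensitive collation.

```php
<?php
// Hypothetical stand-in for rows of {url_alias}: alias => system path.
$aliases = array('handbook' => 'node/1');

// Imitates a case-insensitive collation: letter case is ignored.
function lookup_ci(array $aliases, $requested) {
  foreach ($aliases as $alias => $src) {
    if (strcasecmp((string) $alias, $requested) === 0) {
      return $src;
    }
  }
  return NULL;
}

// Imitates a binary collation: only an exact match counts.
function lookup_bin(array $aliases, $requested) {
  return isset($aliases[$requested]) ? $aliases[$requested] : NULL;
}

var_dump(lookup_ci($aliases, 'handBOOK'));   // finds node/1
var_dump(lookup_bin($aliases, 'handBOOK'));  // finds nothing
```

With case-insensitive matching both handbook and handBOOK resolve to the same node, which is the behaviour reported in this issue. With binary matching only the stored spelling resolves.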

kpm’s picture

subscribe

sanduhrs’s picture

Project: Path » Drupal core
Version: » 8.x-dev
Component: Code » path.module

Some examples of the current behavior on a fresh drupal 8 dev install:

* Requesting /node/1 delivers node 1, displays the Breadcrumb 'Home'
* Requesting /NODE/1 delivers node 1, displays the Breadcrumb 'Home » Node 1 Title'

* Requesting admin/structure/block delivers Block admin page, displays system help message
* Requesting admin/structure/BLOCK delivers Block admin page, does not display system help message

* Requesting /admin/appearance displays Appearance admin page, switches to admin theme
* Requesting /ADMIN/appearance displays Appearance admin page, does not switch to admin theme

* Requesting #overlay=admin/structure displays Structure admin page in Overlay and switches to admin theme
* Requesting #overlay=ADMIN/structure displays Structure admin page without Overlay and does not switch to admin theme

* Requesting admin/reports/status displays Status admin page, displays system help, switches to admin theme
* Requesting ADMIN/reports/status displays Status admin page, displays system help, does not switch to admin theme

* Setting a path alias in node 1 to path/article works as expected
* Setting a path alias in node 2 to PATH/article displays an error message 'The alias is already in use.'
* Requesting path/article delivers node/1
* Requesting PATH/article delivers node/1

* Setting a block to be visible only on path node/1
* Requesting node/1 displays the block
* Requesting NODE/1 displays the block
* Requesting path/article displays the block
* Requesting PATH/article displays the block

There are probably more inconsistencies.
From my point of view, it would be desirable to have case-sensitive URLs. If that is not possible for menu entries, then at least URL aliases should be case sensitive.

cuebix’s picture

The problem doesn't seem to be in the path module. The problem seems to be that the column types for the menu_router.path and url_alias.alias columns are NOT set to binary (case-sensitive), when they probably should be.

A patch to set it might make sense in 8.x since it's still in development. For 7.x, I've resolved it (pending further testing) by just changing the structure of those two columns.

GaëlG’s picture

For anyone reading, I worked on a D7 solution here: #2238389: Case sensitivity of unaliased paths

NB: This allows case sensitivity for base/machine/unaliased paths (node/42, for example). For aliases (Path, Pathauto), the Global Redirect module can already handle case redirection, so there is no SEO duplicate content problem.
BUT this does not make it possible to give one node an alias and another node an alias that differs only in case. That is currently impossible.

azinck’s picture

cuebix seems to have the best solution overall, but I don't know if it's going to fly: D7 has behaved this way for a long time, and I don't think we can suddenly have it return 404s when paths don't match by case.

Global Redirect has a decent solution to the problem: redirect to the "properly" cased version of the path. However, it is sometimes stymied in its efforts when contrib modules attempt to call drupal_get_path_alias() too early; calling it in hook_boot can sometimes have the effect of polluting the path cache such that Global Redirect can't retrieve the correctly-cased version of the alias without clearing the path cache.

I've opened an issue (with patch) that attempts to solve this for D7 in a way that will allow Global Redirect to work properly in all cases: #2400539: Path cache won't always contain the canonical alias, depending on order of calls to drupal_lookup_path()
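
The redirect approach described above (send the visitor to the "properly" cased alias) can be sketched as follows. This is an illustration only, not Global Redirect's actual implementation: canonical_case_redirect() is a hypothetical helper, and the list of stored aliases is passed in as a plain array rather than read from the path cache.

```php
<?php
/**
 * Returns the stored alias to redirect to, or NULL if no redirect is needed.
 * Hypothetical helper for illustration; not taken from Global Redirect.
 */
function canonical_case_redirect(array $stored_aliases, $requested) {
  // The request already uses the stored spelling: nothing to do.
  if (in_array($requested, $stored_aliases, TRUE)) {
    return NULL;
  }
  // Otherwise look for an alias that differs only in letter case.
  foreach ($stored_aliases as $alias) {
    if (strcasecmp($alias, $requested) === 0) {
      return $alias;
    }
  }
  // No alias matches at all.
  return NULL;
}

var_dump(canonical_case_redirect(array('path/article'), 'PATH/article'));
```

A caller would issue a permanent redirect to the returned alias. The sketch depends on the lookup seeing the canonical spelling, which is exactly what a polluted path cache prevents.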

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

catch’s picture

Status: Active » Closed (duplicate)

Marking this as duplicate of #2075889: Make Drupal handle incoming paths in a case-insensitive fashion for routing. Also we already use rel=canonical which should cover any SEO aspect.