Introduced here: http://drupal.org/node/99970
The issue noted above introduced case sensitive path caching. Drupal 6 removes this case sensitive patch caching. Are paths case sensitive? Should this be part of Drupal 6?

And if this shouldn't be part of Drupal 6, then an update() needs to happen to update Drupal 5 page_cache tables

Comments

m3avrck’s picture

Subscribing...

gábor hojtsy’s picture

Title: Page Cache Binary » Page cache should be case sensitive (regression)

Oh, ok, this looks like a regression, since #99970: Page cache matching is case insensitive is indeed not reflected in the Drupal 6 schema.

damien tournoud’s picture

Version: 6.x-dev » 7.x-dev

That's a fun issue: the schema API introduced in Drupal 6 doesn't know anything about collation, and can't set the "BINARY" flag. Bumping to 7. This is a critical issue to solve and backport to 6.

Scott Reynolds’s picture

Wow never thought this would ever get any attention. So I never explain a bit more.

With the new menu system, not sure that u can have case sensitive paths any more. For instance in Drupal 6, /NODE and /node are the same on MySQL. It is actually a lot more involved then just adding a feature to schema. Menu system needs to updated first, before changing the page_cache table. Otherwise, u end up caching NoDE and node but both line up with the same callback and result. This then puts two records in the page cache that are redundant. There is no difference in Drupal between NoDE and node, so caching two sets is useless.

Now this behaviour I believe will be different on Postgres as it is case sensitive ? Thus the menu system will behave like it does in D5( NoDE and node will be different).

c960657’s picture

Issue tags: +caching
c960657’s picture

StatusFileSize
new57.1 KB

This patch makes all varchar columns case-sensitive. AFAIK this makes MySQL behave like PostgreSQL. Amazingly enough all tests still pass (I have only tested with MySQL).

So far the patch does not convert existing tables. In fact, making URLs case-insentive may break some non-standard URLs when upgrading from D6->D7 (e.g. http://example.com/NODE/1), unless some other trickery is done.

Also, I have no idea about the consequences for other parts of Drupal.

c960657’s picture

StatusFileSize
new1.13 KB

Sorry, wrong patch.

c960657’s picture

Status: Active » Needs review
StatusFileSize
new7.87 KB

This also makes existing tables case-sensitive during upgrade from D6.

I chose to upgrade tables from D6 core modules. I couldn't find a way to get the tables belonging to the currently installed modules only - drupal_get_schema() reflects the table names after the upgrade process has completed, but I think it's better to modify tables belonging to external modules before their hook_update_N has been invoked (so they have a chance to change back the collation if actually really want the case-insensitive behaviour and are MySQL-only). Another option is to modify all tables in the database, but that is probably a bit to bold.

The system_update_N function was added as 0012 in order to appear prior to any table renames.

Note that to test this, you may need to comment out system_update_7019 after having applied the patch (see #363262: Missing index on url_alias table).

dries’s picture

At some point, it is probably useful to have a bit more control over this at the API level but right now, this patch improves the current situation. This patch looks good to me, but

1. Let's add a code comment to:
+ $table['mysql_suffix'] = 'DEFAULT CHARACTER SET UTF8 COLLATE utf8_bin';
That code comment should capture some of the knowledge in this issue.

2. Please also add a test case for this problem to support future database backends; e.g. it helps make sure that new backends works as expected.

Thanks!

Scott Reynolds’s picture

Do we really want to make every VARCHAR case sensitive ? Kindof a scary statement to me. I think it probably warrants investigation into how it affects other things.

My first reaction to this issue was to make the menu router and the page cache case sensitive, not all VARCHARs.

c960657’s picture

I became aware that at least one place in core (user_load_multiple()) currently relies on LIKE being case-insensitive. With this patch, this is no longer the case. This can be fixed (I think) by adding COLLATE utf8_general_ci to mapConditionOperator() for LIKE, though this prevents index usage. But do we want to keep LIKE case-insensitive? If we make it case-sensitive, people needing to do a case-insensitive match will need to resort to e.g. LOWER(username) = LOWER('foo') - is that acceptable? If there is a great need to this, possibly the query builder could offer support for ILIKE.

Also note that SQLite does not support proper case-insensitive comparison for strings containing non-ASCII characters.

Do we need to do something to catch old non-standard URLs and redirect them to the proper URL (e.g. from /nOdE/1 to /node/1) to prevent the D6->D7 upgrade from breaking existing URLs?

@Scott Reynolds: I agree that it's scary. But on PostgreSQL all VARCHARs are case-sensitive, so we have to move in that direction in order to keep consistency across databases.

Scott Reynolds’s picture

Ok couple interesting questions that not prepared to answer myself at this time.

  • In PostgreSQL, can I make a user with name Kathy and another with kathy? Is this desirable and intended?
  • Performance impact? Easy case to examine user name auto complete.
  • does the PostgreSQL behave the same as MySQL? Does PostgreSQL on some fields case insensitive and can we emulate that behavior in MySQL?

Ok so thats all I got, if that makes sense to anybody else then /me shakes ur hand. I will take a look at these issues this week and come up with answers. But if anyone else beats me too it plz post.

c960657’s picture

StatusFileSize
new21.12 KB

Usernames are made case-insensitive using LOWER(), also on PostgreSQL, so you cannot have both Kathy and kathy. This behaviour is consistent between MySQL and PostgreSQL, and the patch doesn't change this.

I changed user_load_multiple() to use LOWER() like elsewhere in core. This will have a negative performance impact on MySQL. This may be reduced by #279851: Replace LOWER() with db_select() and LIKE() where possible. Autocomplete already uses LOWER(), so no performance impact there.

I also added some tests as requested, and removed the explicit use of mysql_type in locale.install.

dries’s picture

The last version of this patch looks good and addresses my issues. I will commit it tomorrow if no one objects. Thanks.

Status: Needs review » Needs work

The last submitted patch failed testing.

c960657’s picture

Status: Needs work » Needs review
StatusFileSize
new21.15 KB

Reroll.

dries’s picture

Status: Needs review » Fixed

Committed to CVS HEAD. Thanks!

catch’s picture

Title: Page cache should be case sensitive (regression) » Make all varchar columns case sensitive (regression) (was make page cache case sensitive)
Status: Fixed » Needs work

Seems like I missed the boat on voicing objections, but I've got to re-open this:

I became aware that at least one place in core (user_load_multiple()) currently relies on LIKE being case-insensitive. With this patch, this is no longer the case. This can be fixed (I think) by adding COLLATE utf8_general_ci to mapConditionOperator() for LIKE, though this prevents index usage. But do we want to keep LIKE case-insensitive? If we make it case-sensitive, people needing to do a case-insensitive match will need to resort to e.g. LOWER(username) = LOWER('foo') - is that acceptable? If there is a great need to this, possibly the query builder could offer support for ILIKE.

Also note that SQLite does not support proper case-insensitive comparison for strings containing non-ASCII characters.

I changed user_load_multiple() to use LOWER() like elsewhere in core. This will have a negative performance impact on MySQL. This may be reduced by #279851: Performance: Add column 'username' to {user}, remove LOWER(name). Autocomplete already uses LOWER(), so no performance impact there.

It's not acceptable to use LOWER(foo) like we currently do - this is the cause of slow queries on a lot of Drupal sites and there's only two ways to fix it.

1. Add an extra column
--- previously this was the only cross-database compatible way to deal with this issue. You'll note that #279851: Replace LOWER() with db_select() and LIKE() where possible is a spin-off of at least 3 similar issues which tried to do the same thing - all have so far been rejected by Dries or stalled for months or years. I don't think that issue is ever going to be committed. If Dries is prepared to look at the approach on there and +1 it for at least users and taxonomy terms, then let's close this one again, but given previous feedback on adding extra columns I don't have high hopes.

2. Map LIKE() to case-insensitive operators (ILIKE in postgresql) - which was removed in this patch, and only made possible by dbtng.
--- doing it that way gives us indexable queries on MySQL - something we've never had before, without requiring more than a few lines additional logic in database drivers. It also isn't any penalty in other database systems. I much prefer this approach to #1 personally, this patch makes it impossible to achieve.

SQLITE not having case-insensitive LIKE() for non-ASCII characters is a red-herring, it doesn't do LOWER() for non-ascii characters either unless you load the icu extension. http://www.sqlite.org/lang_corefunc.html - so it's going to be equally broken either way.

We can fix the page cache issue without completely hamstringing our only real opportunity to remove a major cause of slow queries in core. I don't see why it's necessary to conflate the two especially when c960657 already picked up potential issues with it.

catch’s picture

Here's at least one of the taxonomy issues which is now unfixable without an additional column #277209: Remove lower from taxonomy autocomplete

damien tournoud’s picture

Oh my... this is a big change, nearly not tested. No input was taken from any of the database maintainers. I believe we should revert that, and think about it a little bit more.

catch’s picture

Status: Needs work » Needs review
StatusFileSize
new14.14 KB

Here's a patch to revert the change.

catch’s picture

And a pending patch with tests which now fail due to the change - #343788: Taxonomy module doesn't use it's own APIs (yes we could use LOWER() there if we really have to, but can you tell I was hoping not...)

dries’s picture

I agree that we should revert this patch and discuss it more. It's broken indeed -- too bad the tests didn't catch this regression. The patch in #21 that reverts the change should be rolled back.

Status: Needs review » Needs work

The last submitted patch failed testing.

c960657’s picture

Status: Needs work » Needs review
StatusFileSize
new18.99 KB

Updated patch for reverting the change.

catch’s picture

Status: Needs review » Reviewed & tested by the community

Since this is a straight revert and Dries suggested rolling this back, marking to RTBC.

webchick’s picture

Status: Reviewed & tested by the community » Active

Ok, committed the rollback patch. Setting back to active, since it sounds like we're back to the drawing board on this. What a thorny issue. :\

Paul Natsuo Kishimoto’s picture

#299433: Sort out query operator mapping may be a duplicate. It's older but has less discussion.

Crell’s picture

Oh my, somehow this never ended up on my radar.

I agree with catch above that we need case-insensitive behavior more often than not. That's why the Postgres driver forces all built queries (we can't do anything about static queries) to be case-insensitive.

If we use a built query (don't we already?) does this issue go away?

I'm marking the other issue as a dupe, since it never got any traction.

sun.core’s picture

Version: 7.x-dev » 8.x-dev

I guess this is D8 stuff now.

damien tournoud’s picture

Version: 8.x-dev » 7.x-dev

Only feature requests and tasks can be assigned to D8 for now, because the Drupal 8 tree is not opened yet.

This is marked as a critical bug, we should assess if we still consider it critical.

mustanggb’s picture

Could you force it to lowercase when caching then retrieve with a case insensitive LIKE statement

Bojhan’s picture

Version: 7.x-dev » 8.x-dev
damien tournoud’s picture

Version: 8.x-dev » 7.x-dev

See #31.

webchick’s picture

Version: 7.x-dev » 8.x-dev
Category: bug » task
Priority: Critical » Major

I'm not sure how this could possibly be a critical bug given that 19 point releases of Drupal 6 have shipped with it.

pwolanin’s picture

It seems like we never addressed the bug described in #4 - that the cache CID may be different for 2 pages that map to the same callback (node vs. Node). Should we open a new issue for that?

Also, in general it seems like we may have Drupal behaving inconsistently based on the database and its settings or collation. While in general URLs are case sensitive, should Drupal be explicit that path ($_GET['q']) handling is case insensitive? Perhaps within http://api.drupal.org/api/drupal/includes--bootstrap.inc/function/reques... we should lowercase it?

tim.plunkett’s picture

Priority: Major » Normal

Echoing #36, this has been the status quo for the entire lifetime of both current releases of Drupal.

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

  • Dries committed 27d7455 on 8.3.x
    - Patch #333054 by c960657: page cache should be case sensitive.
    
    
  • webchick committed d30a41e on 8.3.x
    Roll-back of #333054; needs more discussion.
    
    

  • Dries committed 27d7455 on 8.3.x
    - Patch #333054 by c960657: page cache should be case sensitive.
    
    
  • webchick committed d30a41e on 8.3.x
    Roll-back of #333054; needs more discussion.
    
    

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

  • Dries committed 27d7455 on 8.4.x
    - Patch #333054 by c960657: page cache should be case sensitive.
    
    
  • webchick committed d30a41e on 8.4.x
    Roll-back of #333054; needs more discussion.
    
    

  • Dries committed 27d7455 on 8.4.x
    - Patch #333054 by c960657: page cache should be case sensitive.
    
    
  • webchick committed d30a41e on 8.4.x
    Roll-back of #333054; needs more discussion.
    
    

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.6 was released on August 2, 2017 and is the final full bugfix release for the Drupal 8.3.x series. Drupal 8.3.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.4.0 on October 4, 2017. (Drupal 8.4.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.4.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.4 was released on January 3, 2018 and is the final full bugfix release for the Drupal 8.4.x series. Drupal 8.4.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.5.0 on March 7, 2018. (Drupal 8.5.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.5.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.6 was released on August 1, 2018 and is the final bugfix release for the Drupal 8.5.x series. Drupal 8.5.x will not receive any further development aside from security fixes. Sites should prepare to update to 8.6.0 on September 5, 2018. (Drupal 8.6.0-rc1 is available for testing.)

Bug reports should be targeted against the 8.6.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.8.x-dev

Drupal 8.6.x will not receive any further development aside from security fixes. Bug reports should be targeted against the 8.8.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.9.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

  • Dries committed 27d7455 on 9.1.x
    - Patch #333054 by c960657: page cache should be case sensitive.
    
    
  • webchick committed d30a41e on 9.1.x
    Roll-back of #333054; needs more discussion.
    
    

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.7 was released on June 3, 2020 and is the final full bugfix release for the Drupal 8.8.x series. Drupal 8.8.x will not receive any further development aside from security fixes. Sites should prepare to update to Drupal 8.9.0 or Drupal 9.0.0 for ongoing support.

Bug reports should be targeted against the 8.9.x-dev branch from now on, and new development or disruptive changes should be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.2.x-dev

Drupal 8 is end-of-life as of November 17, 2021. There will not be further changes made to Drupal 8. Bugfixes are now made to the 9.3.x and higher branches only. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.15 was released on June 1st, 2022 and is the final full bugfix release for the Drupal 9.3.x series. Drupal 9.3.x will not receive any further development aside from security fixes. Drupal 9 bug reports should be targeted for the 9.4.x-dev branch from now on, and new development or disruptive changes should be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.9 was released on December 7, 2022 and is the final full bugfix release for the Drupal 9.4.x series. Drupal 9.4.x will not receive any further development aside from security fixes. Drupal 9 bug reports should be targeted for the 9.5.x-dev branch from now on, and new development or disruptive changes should be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.