PostgreSQL: deal with case insensitivity [#2464481]

Comment	File	Size	Author
#82	2464481-nr-bot.txt	172 bytes	needs-review-queue-bot
#71	postgres-case-insensitivity_2464481-71.patch	5.08 KB	kalpaitch
8.9.x: PHP 7.4 & MySQL 5.7 28,604 pass
#57	2464481-57.patch	5.13 KB	stefan.r
#32	test-results-for-31.txt	21.41 KB	mradcliffe
#31	drupal-2464481-pgsql-case-sensitive-31.patch	7.24 KB	mradcliffe
#31	interdiff-2464481-29-31.txt	5.76 KB	mradcliffe
#30	node-taxonomy-entity-database-test-results-30.txt	18.49 KB	mradcliffe
#29	drupal-2464481-pgsql-case-sensitive-29.patch	7 KB	mradcliffe
#27	drupal-2464481-citext-types-27.patch	5.36 KB	mradcliffe
#8	citex-run.txt	195.58 KB	bzrudi71

Comment #1

andypost commented 2 April 2015 at 13:59

According to http://www.postgresql.org/docs/9.3/static/citext.html, citext still depends on the LC_* settings.
Another catch: CITEXT was introduced in 8.3, but only as a compile-time option, so we probably need to update the requirements at https://www.drupal.org/requirements to state that we need at least 8.4.

Comment #2

mradcliffe commented 2 April 2015 at 14:13

Issue summary:

View changes

Added some questions.

It's easy to install the postgresql-contrib package on dedicated and managed CentOS and Debian servers, but what about Drupal hosting companies that offer PostgreSQL, from the largest to the smallest?

Comment #3

bzrudi71 commented 2 April 2015 at 14:18

@andypost: I'm not sure if we will ever have docker testing for PG 8.4 and IMHO we should require PG 9.1 (released 2011). I think to finish this task we need to:

Make the installer check that the CITEXT extension is enabled
Write a patch to map TEXT/VARCHAR/CHAR->CITEXT
Write patch for the docker containers
Update all documentation and requirements for PostgreSQL
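
The type-mapping step in the list above could look roughly like the sketch below. It is written in Python purely for illustration (the real driver is PHP). The function name `pg_column_type` is hypothetical, and the rule that columns flagged `binary` stay case sensitive is an assumption based on the `binary` schema flag discussed elsewhere in this issue.

```python
# Hypothetical sketch: choose a PostgreSQL column type from a Drupal-style
# schema spec. This is not the actual pgsql driver code.
CASE_INSENSITIVE_CANDIDATES = {"text", "varchar", "char"}

def pg_column_type(spec):
    """Return 'citext' for non-binary string columns, else the declared type."""
    declared = spec["type"]
    if declared in CASE_INSENSITIVE_CANDIDATES and not spec.get("binary", False):
        return "citext"
    return declared

print(pg_column_type({"type": "varchar", "length": 255}))   # citext
print(pg_column_type({"type": "varchar", "binary": True}))  # varchar
print(pg_column_type({"type": "int"}))                      # int
```

Keeping the decision in one small function means the installer check and the schema code would agree on which columns become citext.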

@mradcliffe: Yes, I already thought about this. But I don't think there are many shared servers around that support PG at all?

Comment #4

mradcliffe commented 2 April 2015 at 14:35

I know one smaller web host, and I'm not sure if Acquia's clients that use postgres are using DevCloud.

I'm not sure if everything should be case insensitive.

The issue with LC_* may be relevant for a Drupal 8 Multilingual install where a site builder has enabled multiple languages but the database collation is the same. Does it matter for UTF-8? Gabor may have an answer for that.

Comment #5

bzrudi71 commented 2 April 2015 at 14:58

@mradcliffe: I'm with you regarding the requirement for case insensitivity. But I think that will lead to a DrupalWTF in possible future tests if only PG does it the other way? I'm totally open to closing this issue and making use of #2459745: Allow the database driver to skip test classes ;-)
The LC_* settings should not matter for UTF-8 IMO, and we force UTF-8 on install and connection.

[EDIT] I'm currently doing a full test run with citext and will post it here later to help us decide whether it's worth it or not :-) Interesting BTW, it seems that some of the migrate tests now pass with citext enabled!?!?

Comment #6

chx at MongoDB commented 2 April 2015 at 15:45

Debian Jessie drops this month https://lists.debian.org/debian-devel-announce/2015/03/msg00016.html making wheezy oldstable which contains 9.1.15. Ubuntu 10.04 also ends in a month, Ubuntu 12.04 defaults to 9.1.15 as well. http://packages.ubuntu.com/precise/postgresql We can't deal with the 10 year cycle of RHEL/CentOS, RHEL5 EOL is extended to March 31, 2017 -- released in 2007. PostgreSQL has a http://yum.postgresql.org/repopackages.php repo for them.

http://www.postgresql.org/docs/9.1/static/release-9-1-2.html 9.1.2 was released more than three years ago and given the citext related upgrade problem that's the release where I would put the requirement. Also, http://www.postgresql.org/support/versioning/ 9.0 is EOL this September so putting a requirement on it makes little sense.

tl;dr: raise to 9.1.2 and be done.

Comment #7

jaredsmith commented 2 April 2015 at 18:05

It seems to me like this is approaching the problem backward -- trying to fix something with a workaround in PostgreSQL to "fix" something that is arguably a problem (or simple difference of opinion) with regards to text comparisons in MySQL.

Here are my reasons for leaning away from using the CITEXT module:

The CITEXT extension certainly adds further complication, and isn't typically available in most shared hosting environments. (Full disclosure... I work for one of the largest shared hosting companies in the world. I'm speaking from experience here.) I'm concerned that this makes using Drupal on PostgreSQL even more complicated than it already is.
Using CITEXT assumes that Drupal is *always* going to do case-insensitive comparisons. Is that the case? Will there ever be a time when we *do* want to do case-sensitive comparisons?
Unicode/multi-lingual support. To quote the PostgreSQL manual:

CITEXT's case-folding behavior depends on the LC_CTYPE setting of your database. How it compares values is therefore determined when the database is created. It is not truly case-insensitive in the terms defined by the Unicode standard. Effectively, what this means is that, as long as you're happy with your collation, you should be happy with citext's comparisons. But if you have data in different languages stored in your database, users of one language may find their query results are not as expected if the collation is for another language. (PostgreSQL Manual)
This is likely to expose a large number of additional failures in the testing, which will be automatically attributed to PostgreSQL -- just at a time when the number of tests failing due to PostgreSQL is at a historic low.

Comment #8

bzrudi71 commented 2 April 2015 at 18:10

File	Size
citex-run.txt	195.58 KB

Test run finally completed and attached. Here is the list of new failing tests for the impatient:

Drupal\comment\Tests\CommentAdminTest 1 exceptions
Drupal\comment\Tests\CommentCacheTagsTest 1 exceptions
Drupal\config\Tests\ConfigImportRecreateTest 1 fails
Drupal\system\Tests\Database\CaseSensitivityTest 1 exceptions
Drupal\system\Tests\Entity\EntityQueryTest 2 fails
Drupal\system\Tests\Entity\FieldSqlStorageTest 1 fails
Drupal\entity_reference\Tests\Views\EntityReferenceRelations 4 fails
Drupal\field_ui\Tests\EntityFormDisplayTest 1 exceptions
Drupal\field_ui\Tests\EntityDisplayTest 1 exceptions
Drupal\file\Tests\SaveTest 1 exceptions
Drupal\views\Tests\Handler\FieldGroupRowsTest 2 fails
Drupal\views\Tests\Handler\FieldGroupRowsWebTest 1 fails
Drupal\views\Tests\Handler\FilterStringTest 1 exceptions

@jaredsmith: Thanks for the feedback! Now that we have a first impression of what will fail, I lean more and more towards a no-go for CITEXT ;-)

Comment #9

chx at MongoDB commented 2 April 2015 at 18:54

> Using CITEXT assumes that Drupal is *always* going to do case-insensitive comparisons. Is that the case? Will there ever be a time when we *do* want to do case-sensitive comparisons?

We have a field and a schema setting for case sensitive collations. Please refer to #2068655: Entity fields do not support case sensitive queries

Comment #10

stefan.r commented 4 April 2015 at 22:44

Would using UPPER()/LOWER() on comparisons with binary columns be a valid workaround?

Comment #11

stefan.r commented 5 April 2015 at 11:41

Hmm it looks like that is what the CITEXT extension does anyway. Only problem there seems to be that not all upper case letters have a matching lower case and vice versa so results won't 100% match MySQL.
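
A few concrete cases of this mismatch, shown with Python's built-in Unicode case mapping as a stand-in. PostgreSQL's lower() depends on the database locale and may behave differently, so treat this only as an illustration of why upper/lower round trips are not always symmetric.

```python
# Illustration only: Python's Unicode case mapping, not PostgreSQL's lower().
sharp_s = "ß"             # German sharp s
print(sharp_s.upper())    # SS: the uppercase form is two letters

# Lowercasing both sides does not make these two spellings equal...
print("STRASSE".lower() == "straße".lower())        # False
# ...while full Unicode case folding does.
print("STRASSE".casefold() == "straße".casefold())  # True

dotted_i = "İ"                 # U+0130, capital I with dot above
print(len(dotted_i.lower()))   # 2: lowercases to "i" plus a combining dot
```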

So just to clarify, I was suggesting we update the pgsql query classes so that comparisons on case-insensitive columns turn into LOWER('String') = LOWER('STRING') .. or if we want to be brave we parse the final query and do the LOWER() wrapping there. We'd also have to update the Schema class and create indexes on LOWER(case_insensitive_column).
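
A minimal sketch of the two pieces suggested here, written in Python for illustration only. The helper names, the index naming scheme and `example_table` are all hypothetical; the only SQL features assumed are LOWER() and PostgreSQL expression indexes.

```python
# Hypothetical helpers; names and SQL shapes are assumptions for illustration.
def ci_equals(column, placeholder):
    """Build a case-insensitive equality condition for one column."""
    return f"LOWER({column}) = LOWER({placeholder})"

def ci_index_sql(table, column):
    """Build an expression index so LOWER(column) lookups can use an index."""
    return f"CREATE INDEX {table}_{column}_lower_idx ON {table} (LOWER({column}))"

print(ci_equals("name", ":name"))
# LOWER(name) = LOWER(:name)
print(ci_index_sql("example_table", "name"))
# CREATE INDEX example_table_name_lower_idx ON example_table (LOWER(name))
```

Wrapping the placeholder rather than an inlined value keeps user data out of the rewritten SQL string.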

Comment #12

chx at MongoDB commented 5 April 2015 at 11:54

You'd need to ORDER BY on LOWER() as well. SQL string parsing from PHP is a questionable choice I am afraid although of course using prepared statements as we have it helps tremendously because you do not need to account for user data in there.

Also, you'll have some problems knowing on the SQL level which columns are case sensitive.

Edit: I looked up the citext source, it's very short and understandable and yes, it does lowercase everything before comparison and doesn't do anything else.

Comment #13

stefan.r commented 5 April 2015 at 15:10

> Also, you'll have some problems knowing on the SQL level which columns are case sensitive.

@chx what do you mean by that, can you explain why this is a problem? Are we talking about unique constraints etc? Aren't most of the issues covered by sending the right query? Thanks to the schema we know in PHP which columns are case sensitive...

@jaredsmith:

> Using CITEXT assumes that Drupal is *always* going to do case-insensitive comparisons. Is that the case?

Not necessarily, it looks like with CITEXT you can define case insensitivity on a column level. Only columns which have binary=TRUE in the schema need have case sensitivity.

> The CITEXT extension certainly adds further complication, and isn't typically available in most shared hosting environments. (Full disclosure... I work for one of the largest shared hosting companies in the world. I'm speaking from experience here.) I'm concerned that this makes using Drupal on PostgreSQL even more complicated than it already is.

Not only that, it could also be a showstopper on large enterprise deployments where there are rigid constraints on customization. Some of those still use 8.4, by the way, although with it being EOL'd that may change by the time D8 comes out.

Comment #14

chx at MongoDB commented 5 April 2015 at 18:34

The DBTNG driver will face an almost insurmountable challenge of figuring out which database columns are case sensitive and which are not. This information is available to, say, an entity field storage driver but not the DB driver.

What you can do instead of parsing SQL string is to override various storage drivers in D8 and set up the queries as you need them. This will be quite an endeavour -- there are a lot.

Comment #15

stefan.r commented 5 April 2015 at 19:02

chx yes that was what I suggested earlier, the drawback is we would have no support for case insensitivity in raw db_query statements.

As to figuring out case insensitive fields, couldn't we cache this information and pass it to the DB driver? It's only a handful of table.column combinations, which won't really change that often.

Comment #16

chx at MongoDB commented 5 April 2015 at 19:35

Well, the Schema class can certainly capture those columns at create and alter time and store them somewhere.

Comment #17

stefan.r commented 5 April 2015 at 21:09

Title:	PostgreSQL: Require CITEXT extension	» PostgreSQL: deal with case insenstivity

Do we know how well the LOWER() trick works though? Does it work well with the full UTF8 charset?

I looked into this a bit, and it's a long shot as I don't know how this will work with UTF8 data (if at all), but the cs_CZ.ISO8859-2 collation is case insensitive, and PostgreSQL 9.1 allows "per column" collations: http://michael.otacoo.com/postgresql-2/collation-in-postgresql-9-1/

$ sort test 
A
B
C
F
a
b
d
e
g
$ export LC_COLLATE="cs_CZ.ISO8859-2"
$ sort test
A
a
B
b
C
d
e
F
g
$ export LC_COLLATE="cs_CZ.UTF8"
$ sort test 
A
B
C
F
a
b
d
e
g

/edit: but seems pretty limited http://collation-charts.org/fbsd54/fbsd54.cs_CZ.ISO8859-2.html
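
For comparison, the two orderings in the terminal output above can be reproduced in a few lines of Python. This only mimics the result with explicit sort keys; it does not use the system locales, and breaking ties with the uppercase letter first is an assumption chosen to match the cs_CZ.ISO8859-2 output shown.

```python
letters = ["A", "B", "C", "F", "a", "b", "d", "e", "g"]

# Code point order: every uppercase letter sorts before any lowercase letter.
print(sorted(letters))
# ['A', 'B', 'C', 'F', 'a', 'b', 'd', 'e', 'g']

# Case-insensitive order, with the uppercase letter first within a tie.
print(sorted(letters, key=lambda s: (s.lower(), s)))
# ['A', 'a', 'B', 'b', 'C', 'd', 'e', 'F', 'g']
```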

Comment #18

stefan.r commented 11 April 2015 at 16:20

So as a first step, would the following make sense as far as the pgsql driver is concerned?

In Schema::createFieldSql() we store whether a varchar/character/text column is binary or not somewhere (maybe in a separate drupal_pgsql_case_sensitive_fields table in the database?), so this information gets saved on field creation/alteration.
In the Connection object, we load this information from the database and put it into a property $caseSensitiveFields = array('tablename.fieldname', 'tablename2.fieldname2');
We then wrap the field names that are case insensitive with LOWER() on indexes, orderBy, on field names in conditions and in Select::__toString()
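
The three steps above can be sketched as follows, in Python for illustration only. The registry `STRING_FIELDS`, the helper names and `example_table` are hypothetical. In the proposal the data would be saved by Schema::createFieldSql() and loaded by the Connection object; here it is a plain dictionary.

```python
# Hypothetical registry, filled when the schema creates or alters a field.
# Maps "table.field" to its 'binary' flag (True means case sensitive).
STRING_FIELDS = {
    "example_table.name": False,
    "example_table.token": True,
}

def wrap_field(table, field):
    """Wrap a case-insensitive string field in LOWER(); leave others as-is."""
    qualified = f"{table}.{field}"
    if STRING_FIELDS.get(qualified) is False:
        return f"LOWER({qualified})"
    return qualified

def order_by(table, field, direction="ASC"):
    """Build an ORDER BY clause using the wrapped field."""
    return f"ORDER BY {wrap_field(table, field)} {direction}"

print(wrap_field("example_table", "name"))    # LOWER(example_table.name)
print(wrap_field("example_table", "token"))   # example_table.token
print(wrap_field("example_table", "weight"))  # example_table.weight
print(order_by("example_table", "name"))      # ORDER BY LOWER(example_table.name) ASC
```

Fields missing from the registry (numeric columns, for example) are left untouched, so only known case-insensitive string columns get rewritten.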

Comment #19

stefan.r commented 21 April 2015 at 21:12

Just to give an update, I discussed this with @amateescu and he pointed me to CREATE FUNCTION, however I haven't been able to find any functionality that is smart enough to alter incoming queries the way we need to. The only thing I can think of is adding another "copy" column for every case insensitive field with a trigger that lowers the case for the "copied" value, which would essentially double the storage usage of the database. Not really a workable solution :)

He also pointed to the table name parsing in the Select class, which would make db_select etc easy to process but it seems we can't use that for regular db_query() statements. These would be more challenging as without CITEXT we'd inevitably need to do some sort of query parsing. However from an initial look at the db_query statements used in core and contrib, covering 100% of queries in core and ~99.9% of queries in the most popular contrib module looks doable and shouldn't pose a huge performance penalty as 99% of db_query() statements will be simple anyway.

We could do smarter parsing for the other 0.9% of queries, and ask people to rewrite the other 0.1% that are too complicated to process in a simple parser (that wouldn't try to cover any possible MySQL query), offering a pgsql_citext driver as a module in contrib for those who need 100% case sensitivity accuracy as opposed to >99.9%.

I will have a look at this over the next week. If we don't manage to implement a workable LOWER() solution, I think we'll have to require the postgresql citext module...

Comment #20

chx at MongoDB commented 21 April 2015 at 21:49

My big concern here is security. Are there places where a case sensitive query versus a case insensitive one can lead to information disclosure or worse? I do not think so... but I haven't really gone through the possible cases.

Comment #21

stefan.r commented 21 April 2015 at 22:08

Well the opposite might be true (doing case insensitive checks where we really want case sensitive checks), but it's hard to imagine accidental case sensitivity to pose a problem, at least in core. We usually don't really care about case.

In any case this would need a security audit, I guess wrapping placeholders in LOWER() would be possible to implement safely but I'd be very wary of doing this in queries with user input where the placeholders have already been applied.

Comment #22

mradcliffe commented 21 April 2015 at 22:08

Password columns shouldn't be stored as case-insensitive text (citext), and those columns are in schema as type "text" (varchar), right?

Users table is still generated by hook_schema()? Or does it use TypedData Data Type plugins? PasswordItem exists if we can use the latter.

Comment #23

stefan.r commented 24 April 2015 at 10:23

The password column is actually case insensitive right now ;)

Technically it shouldn't be but it's not like we store the actual password, just a hash

Created #2475539: Make password entity fields case sensitive by default anyway

Comment #24

xjm at Acquia commented 24 April 2015 at 18:36

Also see: #2477413: Increase minimum version requirement for Postgres to 9.1.2

Comment #25

bzrudi71 commented 26 April 2015 at 15:00

If we could cover 90% of all cases with our own implementation, that would be lovely. I just think it will require more work than we expect right now, as the citext module does a bit more magic than just a simple LOWER() ;-) On the other hand, citext doesn't seem to work out of the box and also requires a lot of work. I'm curious about the results if @stefan.r takes up the challenge :-)

Comment #26

mradcliffe commented 16 May 2015 at 01:17

Issue tags:	+Needs issue summary update

I'm going to tackle/discuss this tomorrow with people, but need to organize the comments from the past month.

Comment #27

mradcliffe at Kosada commented 17 May 2015 at 01:00

File	Size
drupal-2464481-citext-types-27.patch	5.36 KB

This is a citext patch:

1. Requires super user to create extension citext.
2. Fixes TermTest
3. Does not fix GlossaryTest because Views does a SUBSTRING which compares string to citext = fail.
4. I think number of additional fails may be low, but need to run a test.

Tomorrow I think I will not do this and look at a suggestion from @jaredsmith regarding trying to change Conditions into Having statements. I tried to add LOWER in Condition, but ran into too many regular expression hacks in the driver. I could also have been low on blood sugar because lunch was really really really late.

Comment #28

mradcliffe at Kosada commented 17 May 2015 at 01:34

Status:	Active	» Needs work

I couldn't quickly get the drupalci testbot working with citext because the postgres user must create the extension in a specific schema, but the web container does not have access to do this and the database container does not know the database.

Comment #29

mradcliffe at Kosada commented 17 May 2015 at 21:26

File	Size
drupal-2464481-pgsql-case-sensitive-29.patch	7 KB

Here is a different approach that fixes the 2 term tests, but breaks taxonomy create/delete and possibly lots of other tests.

Comment #30

mradcliffe at Kosada commented 17 May 2015 at 21:59

File	Size
node-taxonomy-entity-database-test-results-30.txt	18.49 KB

Okay, this gives me an idea of the test fails that the patch above created.

It looks like langcode in condition needs to be handled and that entity delete is not working in tests (I tried taxonomy term delete page manually and it worked). However I didn't try multiple terms.

Comment #31

mradcliffe at Kosada commented 17 May 2015 at 23:01

File	Size
interdiff-2464481-29-31.txt	5.76 KB
drupal-2464481-pgsql-case-sensitive-31.patch	7.24 KB

This patch fixes some bad query construction in the previous patch, which probably caused many of the exceptions in the test run for #30.

Edit: There are still many fails.

Comment #32

mradcliffe at Kosada commented 17 May 2015 at 23:34

File	Size
test-results-for-31.txt	21.41 KB

Okay, yeah, that's much better... 1 exception, 1 fatal error, and 3 fails.

Comment #33

mradcliffe at Kosada commented 18 May 2015 at 06:00

VocabularyPermissionsTest passes for me on my local development environment.

Comment #34

chx commented 18 May 2015 at 13:48

So then we would only support entity queries?

Comment #35

stefan.r commented 18 May 2015 at 14:05

@mradcliffe wouldn't we have to do something similar for db_query() etc?

Comment #36

mradcliffe at Kosada commented 18 May 2015 at 16:05

I wanted to focus on the queries inside the failing tests, but yes, it's possible that something similar needs to be done for the query builder, but I also think it will be very messy per @chx's comment in #14 after I tried to do something similar for entity queries and modify escapeField to handle it.

Entity Query and Entity Manager will be what developers use to fetch entities, but it does use more memory and is not as efficient as fetching one column. Maybe also Views queries?

Other things that need to happen:

- @jaredsmith mentioned indexes need to be created as case insensitive, but this should be done for any database system.

Comment #37

david_garcia commented 18 May 2015 at 22:27

The locales_source table is using specific MySQL storage type "BLOB" but the locale() translation algorithm is designed to work with case sensitive data.

The result is that locale() internal caching is half broken on non MySQL engines under some circumstances.

Maybe I'm wrong, but this means a big bunch of queries on every t() call because some strings will not get cached.

#2490976: Locale caching algorithm is broken on Non MySQL/PostgreSQL databases

Comment #38

stefan.r commented 18 May 2015 at 22:55

@david_garcia BLOBs are binary data and as such they should always be case sensitive anyway, right?

Comment #39

david_garcia commented 18 May 2015 at 23:39

That is exactly the issue, take a look at this field definition:

  'source' => array(
    'type' => 'text',
    'mysql_type' => 'blob',
    'not null' => TRUE,
    'description' => 'The original string in English.',
  ),

This field is going to be case sensitive on MySQL but case insensitive on any other database engine.

For this field definition to be portable it should be something like this:

  'source' => array(
    'type' => 'text',
    'mysql_type' => 'blob',
    'not null' => TRUE,
    'binary' => TRUE,
    'description' => 'The original string in English.',
  ),

'binary': A boolean indicating that MySQL should force 'char', 'varchar' or 'text' fields to use case-sensitive binary collation. This has no effect on other database types for which case sensitivity is already the default behavior.

Or is the default behaviour of a text field supposed to be 'binary' => TRUE? All the usages of 'binary' I have found set 'binary' => TRUE, which hints at 'binary' => FALSE being the default behaviour.

Or maybe this has changed from D7 to D8 and I'm totally confused :(

Comment #40

david_garcia commented 20 May 2015 at 15:08

Comment #41

erik.erskine as a volunteer commented 20 May 2015 at 16:39

Just tried the citext patch in #27

+++ b/core/lib/Drupal/Core/Database/Driver/pgsql/Install/Tasks.php
@@ -131,6 +135,24 @@ protected function checkEncoding() {
+      $version = db_query("SELECT installed_version FROM pg_available_extensions WHERE name = %name", array('%name' => 'citext'))->fetchField();

Getting an error during installation - should this be :name instead of %name?

Comment #42

bzrudi71 commented 29 May 2015 at 06:57

Comment #43

daffie commented 11 June 2015 at 18:08

Status:	Needs work	» Needs review

4 files were hidden/shown/deleted

File	Size
citex-run.txt	195.58 KB
drupal-2464481-citext-types-27.patch	5.36 KB
drupal-2464481-pgsql-case-sensitive-29.patch	7 KB
node-taxonomy-entity-database-test-results-30.txt	18.49 KB

For the testbot.

Comment #44

11 June 2015 at 20:56

Status:	Needs review	» Needs work

The last submitted patch, 31: drupal-2464481-pgsql-case-sensitive-31.patch, failed testing.

Comment #45

11 June 2015 at 21:28

daffie queued 31: drupal-2464481-pgsql-case-sensitive-31.patch for re-testing.

Comment #46

daffie commented 12 June 2015 at 08:19

Issue summary:

View changes

Comment #47

daffie commented 12 June 2015 at 08:27

Title:	PostgreSQL: deal with case insenstivity	» PostgreSQL: deal with case insensitivity
Status:	Needs work	» Reviewed & tested by the community
Issue tags:	-Needs issue summary update

The patch from comment #31 fixes the tests from #2443679: PostgreSQL: Fix taxonomy\Tests\TermTest.

I think that we should create a follow-up issue for fixing the case insensitivity for db_query().

The patch looks good to me. It gets a RTBC from me.

Comment #48

bzrudi71 commented 12 June 2015 at 08:38

Status:	Reviewed & tested by the community	» Needs review

Setting back to needs review for now ;) I think we need at least a full testbot run to see what else will fail. Also, the problem of lowercased indexes does not seem to be addressed in this patch. I will do a full bot run within the next few hours and report...

Comment #49

bzrudi71 commented 12 June 2015 at 10:42

Just an idea while the bot is running. If we decide to go for PG 9.1.x as the minimum requirement, there is a new feature that could help us here: as of 9.1 there is support for per-column collations! So I wonder if we could implement something like what amateescu did for SQLite in #2454733: Add a user-space case-insensitive collation to the SQLite driver, and get rid of all these workarounds by implementing a fully case-insensitive collation?

Comment #50

stefan.r commented 12 June 2015 at 10:49

I mentioned this in #17 as well but PostgreSQL uses the collations from the system -- and there aren't any case insensitive collations for UTF8 that I'm aware of.

Comment #51

bzrudi71 commented 12 June 2015 at 11:05

Argh, no ;) Thanks @stefan.r!
I read a bit and it seems my idea is a no-go. Hopefully we will get good results from bot to move forward with the current approach :)

Comment #52

bzrudi71 commented 12 June 2015 at 11:48

Nice, bot run just completed and here is what fails with patch attached:

Drupal\aggregator\Tests\ImportOpmlTest 63 passes   3 fails
Drupal\comment\Tests\CommentLanguageTest  40 passes   2 fails   7 exceptions
Drupal\system\Tests\Entity\EntityDefinitionUpdateTest 501 passes                                      

Fatal error: Call to a member function getTranslation() on a non-object in /var/www/html/core/modules/system/src/Tests/Entity/EntityTranslationTest.php on line 281
FATAL Drupal\system\Tests\Entity\EntityTranslationTest: test runner returned a non-zero error code (255).
- Found database prefix 'simpletest310119' for test ID 305.
[12-Jun-2015 08:55:23 UTC] PHP Fatal error:  Call to a member function getTranslation() on a non-object in /var/www/html/core/modules/system/src/Tests/Entity/EntityTranslationTest.php on line 281

Drupal\system\Tests\Entity\EntityQueryTest 138 passes   3 fails

Great, I expected more fails :)

Comment #53

daffie commented 13 June 2015 at 14:45

Status:	Needs review	» Needs work

As bzrudi71 stated in the previous comment, the patch from #31 does not pass all the tests with a PostgreSQL backend. So back to needs work!

Comment #54

bzrudi71 commented 13 June 2015 at 15:49

Status:	Needs work	» Postponed

We have a new patch based on this work to just fix the TermTest part over in #2443679: PostgreSQL: Fix taxonomy\Tests\TermTest. As this patch causes new fails and exceptions which isn't a good idea at this point let's postpone this one for now and do a re-roll as soon as #2443679: PostgreSQL: Fix taxonomy\Tests\TermTest is in.

Comment #55

daffie commented 1 July 2015 at 15:22

Status:	Postponed	» Needs work
Issue tags:		+Needs reroll

#2443679: PostgreSQL: Fix taxonomy\Tests\TermTest is in.

Comment #56

stefan.r commented 1 July 2015 at 16:37

seems chx is still working on that issue so we may want to hold off on the reroll?

Comment #57

stefan.r commented 1 July 2015 at 16:52

Status:	Needs work	» Needs review
Issue tags:	-Needs reroll

File	Size
2464481-57.patch	5.13 KB

attempt at a reroll anyway, I assume this will need further work once @chx finishes working on the other issue

Comment #58

stefan.r commented 17 August 2015 at 20:50

cross-posting from @Damien Tournoud in #2477413: Increase minimum version requirement for Postgres to 9.1.2:

> (..) I'm not optimistic about citext. It would make more sense to me to switch to case-sensitive, collation-sensitive *by default everywhere* (including on MySQL), and handle case and collation sensitivity manually in PHP. This is the only way to have a well-defined, consistent and efficient behavior on all database engines.

...which would be quite an invasive change but may actually be a good idea...

Comment #59

jaredsmith as a volunteer commented 20 September 2015 at 13:12

I have to agree with @stefan.r and @Damien Tournoud.

I don't think using citext is the right answer either... I'd rather we switch to handling case-sensitive, collation-sensitive everywhere as well.

It may be an invasive change, but I think it's worthwhile.

Comment #60

20 September 2015 at 13:12

Version:	8.0.x-dev	» 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #61

20 September 2015 at 13:12

Version:	8.1.x-dev	» 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #62

20 September 2015 at 13:12

Version:	8.2.x-dev	» 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #63

20 September 2015 at 13:12

Version:	8.3.x-dev	» 8.4.x-dev

Drupal 8.3.6 was released on August 2, 2017 and is the final full bugfix release for the Drupal 8.3.x series. Drupal 8.3.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.4.0 on October 4, 2017. (Drupal 8.4.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.4.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #64

CreditAttribution: andypost commented 12 December 2017 at 01:20

By the way, CITEXT could be exposed as a capability at the driver level, and drivers should take control of collation.
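
For readers unfamiliar with the type, here is a rough sketch of how a CITEXT column behaves on the PostgreSQL side. It is only an illustration: the table and column names are hypothetical and are not taken from any patch in this issue, and it assumes the citext extension from postgresql-contrib is installable.

```sql
-- Assumes the citext extension (postgresql-contrib) is available.
CREATE EXTENSION IF NOT EXISTS citext;

-- Hypothetical table, for illustration only.
CREATE TEMPORARY TABLE example_users (
  uid  serial PRIMARY KEY,
  name citext NOT NULL UNIQUE
);

INSERT INTO example_users (name) VALUES ('Admin');

-- Comparisons on a citext column ignore case, so this matches the row above.
SELECT uid, name FROM example_users WHERE name = 'ADMIN';

-- The UNIQUE constraint is case-insensitive as well, so this insert is
-- rejected with a unique violation.
INSERT INTO example_users (name) VALUES ('admin');
```

Case folding for citext still depends on the database's LC_CTYPE setting, which is the locale dependency raised earlier in this issue.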

Comment #65

12 December 2017 at 01:20

Version:	8.4.x-dev	» 8.5.x-dev

Drupal 8.4.4 was released on January 3, 2018 and is the final full bugfix release for the Drupal 8.4.x series. Drupal 8.4.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.5.0 on March 7, 2018. (Drupal 8.5.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.5.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #66

12 December 2017 at 01:20

Version:	8.5.x-dev	» 8.6.x-dev

Drupal 8.5.6 was released on August 1, 2018 and is the final bugfix release for the Drupal 8.5.x series. Drupal 8.5.x will not receive any further development aside from security fixes. Sites should prepare to update to 8.6.0 on September 5, 2018. (Drupal 8.6.0-rc1 is available for testing.)

Bug reports should be targeted against the 8.6.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #67

CreditAttribution: andypost commented 21 January 2020 at 12:13

Version:	8.6.x-dev	» 8.9.x-dev

PostgreSQL 12 compatibility has been fixed.

Comment #68

CreditAttribution: andypost as a volunteer and at Skilld commented 10 March 2020 at 21:19

It was decided to use pg_trgm in #2988018: [PP-1] Performance issues with path alias generated queries on PostgreSQL.
This issue is probably a duplicate.

Comment #69

daffie CreditAttribution: daffie commented 10 March 2020 at 21:41

Version:	8.9.x-dev	» 9.1.x-dev

I do not think this is a duplicate, but it is related. Maybe we can fix this issue in Drupal 9 by requiring the pg_trgm extension.

Comment #70

johnwebdev CreditAttribution: johnwebdev commented 28 June 2020 at 18:00

Yeah, path_alias is definitely slow, but multiple queries are affected by this. This problem is a huge performance bottleneck, and I think our site was around 100% faster after a test migration to MySQL.

Comment #71

kalpaitch CreditAttribution: kalpaitch as a volunteer commented 5 October 2020 at 12:56

File	Size
postgres-case-insensitivity_2464481-71.patch	5.08 KB
8.9.x: PHP 7.4 & MySQL 5.7 28,604 pass

1 file was hidden/shown/deleted

File	Size
2464481-57.patch	5.13 KB

Re-rolling #2464481-57: PostgreSQL: deal with case insensitivity to be compatible with the latest 8.9.x. We're still using this fix.

Comment #72

5 October 2020 at 12:56

Version:	9.1.x-dev	» 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Comment #73

luisnicg CreditAttribution: luisnicg as a volunteer commented 22 October 2020 at 03:42

The patch in #71 conflicts with Pathauto. After applying it I got this error:

The alias [alias] is already in use in this language.

The query changed from this:

SELECT base_table.revision_id AS revision_id, base_table.id AS id
FROM
{path_alias} base_table
INNER JOIN {path_alias} path_alias ON path_alias.id = base_table.id
WHERE (path_alias.alias LIKE :db_condition_placeholder_0 ESCAPE '\\') AND (path_alias.langcode = :db_condition_placeholder_1) AND (path_alias.path NOT LIKE :db_condition_placeholder_2 ESCAPE '\\')
LIMIT 1 OFFSET 0

to this:

SELECT base_table.revision_id AS revision_id, base_table.id AS id
FROM
{path_alias} base_table
INNER JOIN {path_alias} path_alias ON path_alias.id = base_table.id
WHERE ((LOWER(path_alias.alias) = LOWER(:value))) AND (path_alias.langcode = :db_condition_placeholder_0) AND ((LOWER(path_alias.path) <> LOWER(:value)))
LIMIT 1 OFFSET 0

The new query returns results in core/lib/Drupal/Core/Path/Plugin/Validation/Constraint/UniquePathAliasConstraintValidator.php at this line:

if ($result = $query->range(0, 1)->execute()) { }

Is anyone else experiencing this problem?
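
One thing visible in the rewritten query is that both LOWER() conditions reference the same placeholder name, :value, whereas the original query used two distinct placeholders for the alias and the path. Whether that is the actual cause is an assumption and is not confirmed anywhere in this issue. The effect of binding one value to both conditions can be sketched in plain PostgreSQL; the table and literal values below are made up for illustration:

```sql
-- Hypothetical minimal stand-in for the path_alias table.
CREATE TEMPORARY TABLE example_path_alias (
  id       serial PRIMARY KEY,
  path     varchar(255),
  alias    varchar(255),
  langcode varchar(12)
);

INSERT INTO example_path_alias (path, alias, langcode)
VALUES ('/node/1', '/About-Us', 'en');

-- Distinct values for alias and path: the row that belongs to /node/1 is
-- excluded, so re-saving /node/1 with its own alias finds no conflict.
SELECT id FROM example_path_alias
WHERE LOWER(alias) = LOWER('/about-us')
  AND langcode = 'en'
  AND LOWER(path) <> LOWER('/node/1');

-- Same value used in both conditions (what a shared :value placeholder
-- would amount to): the path check no longer excludes the entity's own
-- row, so its existing alias is reported as already in use.
SELECT id FROM example_path_alias
WHERE LOWER(alias) = LOWER('/about-us')
  AND langcode = 'en'
  AND LOWER(path) <> LOWER('/about-us');
```

If this reading is right, the first query returns no rows and the second returns the entity's own alias row, which would match the "already in use" validation error.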

Comment #74

22 October 2020 at 03:42

Version:	9.2.x-dev	» 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #75

22 October 2020 at 03:42

Version:	9.3.x-dev	» 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #76

kalpaitch CreditAttribution: kalpaitch as a volunteer commented 17 March 2022 at 12:05

Yes, I see the same issue with my patch in #71: it doesn't work, as you indicated in #73.

Comment #77

17 March 2022 at 12:05

Version:	9.4.x-dev	» 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #78

CreditAttribution: VladimirAus at Tomato Elephant Studio commented 2 June 2022 at 09:13

Status:	Needs review	» Needs work

As per #73 and #76.

Comment #79

CreditAttribution: VladimirAus at Tomato Elephant Studio for Terem commented 4 June 2022 at 00:15

Status:	Needs work	» Needs review

See the patch here, which should solve the condition issue: https://www.drupal.org/project/drupal/issues/2490294#comment-14546671

Comment #80

daffie CreditAttribution: daffie commented 4 June 2022 at 08:36

We are not going to use the LOWER() function on PostgreSQL. The decision has been made to use GIST indexes instead. For that, the pg_trgm extension is required for Drupal 10. See: https://www.drupal.org/docs/system-requirements/database-server-requirem....

The biggest problem is the performance of the path alias queries. The patch from that issue also adds the code to create those indexes. See: #2988018-71: [PP-1] Performance issues with path alias generated queries on PostgreSQL. When that has landed, we can use that solution in other places, such as #2490294: User email should not be case sensitive.
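
To give an idea of what a trigram GIST index looks like, here is a minimal sketch. It is not the DDL from the patch in #2988018; the table and index names are hypothetical, and it assumes the pg_trgm extension from postgresql-contrib can be installed.

```sql
-- Assumes the pg_trgm extension (postgresql-contrib) is available.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical table standing in for a real alias table.
CREATE TABLE example_alias (
  id    serial PRIMARY KEY,
  alias varchar(255) NOT NULL
);

-- Trigram GiST index; pg_trgm indexes can serve LIKE and ILIKE searches.
CREATE INDEX example_alias_alias_trgm_idx
  ON example_alias USING gist (alias gist_trgm_ops);

-- Case-insensitive match without wrapping the column in LOWER().
-- Whether the planner picks the index depends on table size and statistics.
SELECT id FROM example_alias WHERE alias ILIKE '/about-us';
```

Note that ILIKE treats % and _ in the search value as wildcards, so values that come from user input need escaping.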

Comment #81

4 June 2022 at 08:36

Version:	9.5.x-dev	» 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #82

needs-review-queue-bot CreditAttribution: needs-review-queue-bot as a volunteer commented 30 January 2023 at 23:08

Status:	Needs review	» Needs work

File	Size
2464481-nr-bot.txt	172 bytes

The Needs Review Queue Bot tested this issue. It either no longer applies to Drupal core, or fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".

Apart from a re-roll or rebase, this issue may need more work to address feedback in the issue or MR comments. To progress an issue, incorporate this feedback as part of the process of updating the issue. This helps other contributors to know what is outstanding.

Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

Comment #83

Chi CreditAttribution: Chi commented 20 May 2023 at 06:45

Added a simple benchmark for the current implementation of case-insensitive conditions in Postgres.
#3361618: Postgres forcing cases case-insensitivity causes serious performance degradation.
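
For anyone who wants to reproduce the general effect locally, a comparison along these lines can be run in psql. This is not the benchmark from the linked issue, only a sketch with a hypothetical table and an arbitrary row count:

```sql
-- Hypothetical table with generated data; the row count is arbitrary.
CREATE TEMPORARY TABLE example_bench (
  id   serial PRIMARY KEY,
  name varchar(255) NOT NULL
);

INSERT INTO example_bench (name)
SELECT 'user' || n FROM generate_series(1, 200000) AS n;

CREATE INDEX example_bench_name_idx ON example_bench (name);
ANALYZE example_bench;

-- Case-sensitive equality can use the plain btree index.
EXPLAIN ANALYZE SELECT id FROM example_bench WHERE name = 'user12345';

-- The case-insensitive variants are not expected to use that index;
-- compare the plans and timings reported for each statement.
EXPLAIN ANALYZE SELECT id FROM example_bench WHERE LOWER(name) = LOWER('User12345');
EXPLAIN ANALYZE SELECT id FROM example_bench WHERE name ILIKE 'User12345';
```

Actual timings vary with hardware, PostgreSQL version and configuration, so treat the output as indicative only.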

Comment #84

20 May 2023 at 06:45

Version:	10.1.x-dev	» 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.
