Splitting off part of the work from #2075889: Make Drupal handle incoming paths in a case-insensitive fashion for routing
Problem/Motivation
Whilst working on #1987898: Convert user_view_page() to a new style controller and #2075889: Make Drupal handle incoming paths in a case-insensitive fashion for routing we note that path aliases handling will be case insensitive on mysql but case sensitive on pgsql and others.
So on some sites:
/Path/Alias and /path/alias will resolve the same, on other sites one will be a 404
Proposed resolution
we agreed to fix this by keeping most of the existing behavior from Drupal 7.
/Path/Alias- not 404. Same as /path/alias on all supported databases
Remaining tasks
User interface changes
none
API changes
Change the query to the select builder so we take advantage of the ILIKE on pgsql
Data model changes
none
Beta phase evaluation
Issue category | Bug because links in D6/D7 sites can break during an upgrade to D8. |
---|---|
Issue priority | Major because previously working links can break during a D8 upgrade. Not critical because:
See #23 for more details. |
Disruption | Disruptive for sites running on beta versions that added case-sensitive aliases in Drupal 6-7 or in Drupal 8. |
Comment | File | Size | Author |
---|---|---|---|
#53 | 2584243-53.patch | 20.59 KB | stefan.r |
#53 | interdiff-50-53.patch | 2.92 KB | stefan.r |
| |||
#50 | increment.txt | 4.24 KB | pwolanin |
#50 | 2584243-50.patch | 21.04 KB | pwolanin |
#42 | 2584243-42.patch | 21.33 KB | stefan.r |
Comments
Comment #2
pwolanin CreditAttribution: pwolanin as a volunteer and at Acquia commentedThis has just has the 3 hunks I think are needed from the other patch https://www.drupal.org/files/issues/2075889-197_0.patch
Comment #3
pwolanin CreditAttribution: pwolanin as a volunteer and at Acquia commentedHere's just the test hunk to verify it fails somewhere.
Comment #4
pwolanin CreditAttribution: pwolanin as a volunteer and at Acquia commentedunused use
Comment #5
YesCT CreditAttribution: YesCT commentedis this correcting incorrect docs, or is it changing the param?
Comment #6
pwolanin CreditAttribution: pwolanin as a volunteer and at Acquia commented@YesCt - it's fixing the docs since the code actually just takes a string, not an array of strings.
Comment #7
dawehnerSo i'm wondering whether we really need to switch from simple queries to an object. Is this a requirement to fix the issue?
Comment #8
Crell CreditAttribution: Crell as a volunteer commentedAnother alternative is to just have separate MySQL, Postgres, and SQLite implementations of the AliasStorage class, each optimized accordingly. That's fully supported in core and something we should take more advantage of. :-)
Comment #9
kgoel CreditAttribution: kgoel as a volunteer commented+1
I like the idea of separate implementation. It's sounds very straight forward but don't know how easy/difficult it would be to implement this.
Comment #10
pwolanin CreditAttribution: pwolanin as a volunteer and at Acquia commented@dawehner - I thought we needed to, but maybe not - it seems the pgsql driver also replaces queries in strings.
\Drupal\Core\Database\Driver\pgsql\Connection::prepareQuery()
So you are right, we could just make them LIKE instead of select objects
Comment #11
Crell CreditAttribution: Crell as a volunteer commentedkgoel: Quite simple: https://www.palantir.net/blog/d8ftw-customizing-your-back-end
(That's assuming the challenge is cross-DB SQL. If that's not the issue that won't help.)
Comment #12
pwolanin CreditAttribution: pwolanin as a volunteer and at Acquia commentedI don't think it's worth the complexity when we already have a way to make it DB agnostic using LIKE and it's already got the backend_overridable tag.
Should probably include an alias in the test that has _ and/or % and one that does not that would match?
This puts pack the simple query() calls but escapes all the values for LIKE which was missing before.
Comment #13
catchLooks like real fails on postgres and sqlite.
Comment #14
pwolanin CreditAttribution: pwolanin as a volunteer and at Acquia commentedHmm, should I not have escaped the values?
Comment #15
effulgentsia CreditAttribution: effulgentsia at Acquia commentedI discussed this with the other committers, and we decided to raise this to Critical for now.
Because, the upgrade path might require data loss on case-sensitive databases such as Sqlite. For example, on Sqlite, you can in both 7.x and 8.0.0-rc2 create "foo" and "Foo" as aliases to different paths. If we fix this issue in a way that starts disallowing that, then figuring out how to migrate such 7.x databases will be Migrate's responsibility, which is still flagged as an experimental module, so that's ok. But updating existing 8.x databases will be hook_update_N()'s responsibility. And making such an update function delete semi-duplicate-but-not-really aliases might be more palatable during RC than after people deploy sites on 8.0.0. For example, a similar kind of deletion of same-except-for-case records in #1260938: d6 to d7 update fails on file duplicates '#7061 Integrity constraint violation' ended up taking a very long time to land.
That said, if we don't solve this relatively quickly, we may choose to downgrade it rather than hold up 8.0.0's release, but that's a discussion/decision for a later date. Hopefully, it won't come to that.
Comment #16
kgoel CreditAttribution: kgoel as a volunteer commentedComment #17
dawehnerJust a general comment, it would be nice to also test all the various changed methods like save() / delete() etc ... lookupPathSource and aliasExists
Comment #18
catchDoesn't need to be assigned to me.
Comment #19
dawehnerJust to clarify my previous comment, the amount of additional tests does not really match yet the amount of changed functionality.
Comment #20
catchDo we actually need the case-insensitivity when looking up alias from source? That's not used for incoming paths.
Comment #21
dawehnerThis is a valid question.
So yeah that example is certainly not needed ... ideally we would distinct between what you use on a more storage API level
and what you use on runtime, which should have the case insensitive behaviour.
Comment #22
pwolanin CreditAttribution: pwolanin as a volunteer commentedI'm not sure I understand. IF you don't make aliasExists behave consistently with the other function, that will lead to inconsistent behavior in the UI. That function is used in form validation.
Needs work sicne clearly it's still failing on postgres and sqlite
Comment #23
catchThe alias lookup needs to be case insensitive, but the source lookup implies we're storing case insensitive source. Do we actually allow people to enter the source path in random case in 7.x? If we do then maybe that's needed.
Comment #24
pwolanin CreditAttribution: pwolanin as a volunteer commentedOh, hmm - well we plan to handle routing in a case-insensitive way, then I think source lookup should be case insensitive?
Comment #25
mradcliffeWe allow to add aliases in mix case in 6.x.
Comment #26
catch@pwolanin wouldn't we do that in #2075889: Make Drupal handle incoming paths in a case-insensitive fashion for routing ?
Comment #27
catchTagging minor version target again, just to make it clear we can't fix this in a patch release.
Comment #28
pwolanin CreditAttribution: pwolanin as a volunteer commented@catch - we are not talking to the routing system at all here.
Comment #29
dawehnerLooking into the issue for a while
Comment #30
dawehnerI think its fine to convert all API functions, its more consistent, though a better help text for the validation would be nice
Also adding documentation about the behaviour isn't a bad idea, as well as some general tests.
Comment #32
swentel CreditAttribution: swentel commentedHmm, this sounds a bit verbose, and maybe a bit unclear for some people ? Don't have a better proposal atm though.
Comment #33
dawehnerYeah I certainly don't think its at all a good sentence, maybe someone has a better idea?
Fixed the failure.
Comment #34
stefan.r CreditAttribution: stefan.r commentedReviewing this and having a look at the sqlite/pgsql test failures
Comment #35
stefan.r CreditAttribution: stefan.r commentedComment #36
stefan.r CreditAttribution: stefan.r for Drupal Association commentedSome small changes, test coverage and an attempt at fixing the SQLite/PostgreSQL test failure.
I changed the db_query into a dynamic query as SQLite requires an
ESCAPE '\'
after the LIKE, which dynamic queries do automatically. I tried to hardcode the the ESCAPE '/' into the db_query but didn't succeed as I kept getting anunknown token
error. We don't do LIKE x ESCAPE y anywhere else in core either so the dynamic query may be the best way to go?I also suspect SQLite has a collation issue as it can't find the upper-case version of some non-ASCII characters (#1518506: Normalize how case sensitivity is handled across database engines) so skipping those in the test when using SQLite for now...
Also what about preloadPathAlias, we skip the case insensitivity there?
Comment #37
swentel CreditAttribution: swentel commentedthe the
The form error sounds a lot clearer to me already!
Comment #38
stefan.r CreditAttribution: stefan.r for Drupal Association commentedThis adds case insensitivity to preloadPathAlias (do we need it though?), and adds a todo pointing to a followup I just opened at #2607432: SQLite driver does not allow for case insensitive LIKE comparisons on non-ASCII characters. It also cleans up the dynamic queries a bit.
I just noticed we already tried and then reverted using the query builder in #12 - does the approach in this patch pose a significant performance slowdown? We could go back to hardcoded SQL queries but SQLite has a problem with wildcards in the LIKE there (% and _ need to be escaped, which might require unnecessary changes), and in #12 we said doing database-specific implementations was not worth the complexity.
Comment #39
mradcliffeShould we use $langcode_undetermined instead of LanguageInterface::LANGCODE_NOT_SPECIFIED for consistency?
Comment #40
stefan.r CreditAttribution: stefan.r for Drupal Association commented@mradcliffe the original code didn't do so either despite having the variable available and I wanted to avoid unrelated changes, but other than that I don't see why not...
Comment #42
stefan.r CreditAttribution: stefan.r for Drupal Association commentedComment #44
stefan.r CreditAttribution: stefan.r for Drupal Association commentedComment #47
dawehnerSo yeah just those 2 queries potentially runs often, on the other hand due to our caching, I guess we need to care less about it
Comment #48
stefan.r CreditAttribution: stefan.r for Drupal Association commented@dawehner the difference between the dynamic query in lookupPathSource vs the db_query is about .3 ms on average on my slow laptop.
For the routing we only do the database query on the first page visit, right? Most pages have a few calls to Url::fromUri() as well, which goes through PathValidator::getPathAttributes() and then does a PathProcessorAlias::getPathByAlias(). So altogether I see some 3-4 calls to these newly introduced dynamic queries per page on average.
As to how to proceed here while still fixing SQLite, we could either use this patch, take the ~1ms performance hit and optimize in another release, or alternatively we could find a way to make
LIKE x ESCAPE y
work in a db_query.Comment #49
pwolanin CreditAttribution: pwolanin as a volunteer commentedIf we cared most about performance, we'd have used the original approach of forcing aliases to lower case.
I had a version of this earlier on using dynamic queries that radically cut the different, confusing, conditions and repeated code. See patch at #4. Can we go back to something like that plus the missing LIKE escaping?
Comment #50
pwolanin CreditAttribution: pwolanin as a volunteer commentedOk, applying that pattern. Also, the fixes all select objects to be
$select
, which is important to reduce confusion when looking at those methods.Comment #51
stefan.r CreditAttribution: stefan.r commentedInterdiff looks great, does anything still needs to happen to the current patch?
Comment #52
chx CreditAttribution: chx commentedWhy on earth did we add local variables for constants??
is this expected to change? Is there a tax on the number of characters? Is there some incredible PHP bug we are coding around? I am so baffled.
Comment #53
stefan.r CreditAttribution: stefan.r commentedComment #54
catchThis looks good to me. If we're really worried about the query building, then switching to database-specific implementations would do it, but then we should reconsider the LIKE/ILIKE optimization we have in the database layer, which is used for a lot of things other than path aliases like user names.
If I mark this RTBC I can't really commit it, but no complaints.
Comment #55
plachEasily solved :)
As a side note, it seems we are applying the same logic multiple times, what about creating a follow-up to factor it out to an helper method and reuse it?
Comment #56
dawehnerThank you peter and stefan
Comment #61
catchRemoving tags now this is both critical and RTBC.
Comment #62
catchCommitted/pushed to 8.0.x, thanks!