Problem/Motivation
In #1314214: MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols), the MySQL schema class added the ability to normalize index length. However, this only works when the table is first created. When calling Schema::addIndex()
, indexes do not get normalized. Furthermore, the code calling Schema::addIndex()
should not attempt to perform the work of Schema::getNormalizedIndexes()
, since that implementation is MySQL-specific.
This will be an issue for the automatic entity schema update system when new indexes are added that should be normalized in MySQL. In SqlContentEntityStorageSchema::onEntityUpdateType
:
// Create new indexes and unique keys.
$entity_schema = $this->getEntitySchema($entity_type, TRUE);
foreach ($this->getEntitySchemaData($entity_type, $entity_schema) as $table_name => $schema) {
if (!empty($schema['indexes'])) {
foreach ($schema['indexes'] as $name => $specifier) {
$schema_handler->addIndex($table_name, $name, $specifier);
}
}
will throw an uncaught exception if the index is too long in MySQL (for instance, see #2507201-3: Upgrade path for MySQL switch to multibyte UTF8)
Proposed resolution
- Expand the
Schema::addIndex()
method to get the full table schema as 4th argument
Alternative proposal which have been out:
- Replace $fields with $spec: This makes it less obvious for code to update AND it makes the code really hard to read because the index fields are not directly visible
- Add a new method
Schema::addNormalizedIndex()
which takes$table, $index_index_name, $spec
(Same problems as before)
Remaining tasks
Commit.
User interface changes
None.
API changes
New required table schema:
Before
db_add_index('test_table', 'test_field', array('test_field'));
After
db_add_index('test_table', 'test_field', array('test_field'), $table_specification);
Data model changes
None.
Comment | File | Size | Author |
---|---|---|---|
#105 | 2537928.104.patch | 25.43 KB | alexpott |
#105 | 103-104-interdiff.txt | 1.83 KB | alexpott |
#103 | 2537928-103.patch | 24.65 KB | dawehner |
#99 | 2537928-97.patch | 21.28 KB | dawehner |
#99 | interdiff.txt | 5.2 KB | dawehner |
Comments
Comment #1
stefan.r CreditAttribution: stefan.r commentedWhat other use cases can we think of where this is potentially a problem, is this isolated to yet to be developed contrib modules?
If as far as core is concerned the problem only shows up in onEntityUpdateType() I wonder if we actually need to be normalizing in addIndex() or whether it would be enough to just update onEntityUpdateType() to do $indexes = getNormalizedIndexes($schema);
Otherwise catching the exception with a more helpful error message might be another possible solution.
Comment #2
jhedstrom$indexes = getNormalizedIndexes($schema)
this is protected though, and currently MySQL-specific. We could add that method to the other drivers and have the current implementation in sqlite and pgsql do nothing.Comment #3
jhedstromI would think anytime a new index needs to be added to an existing table, this would be a potential issue. The current implementation is MySQL-specific, so callers should not attempt to shorted indexes themselves.
Comment #4
jhedstromUpdating IS with approach from #2.
Comment #5
stefan.r CreditAttribution: stefan.r commentedHmm OK so doing this in onEntityUpdateType would still be a bit ugly given the index normalization is likely to remain mysql-specific.
In order to do the normalization in addIndex() we'd need to have the full table schema somehow (not just the index specification).
Comment #6
jhedstromThis does seem a bit ugly. However, it would allow sqlite or pgsql to adjust indexes in the future if need be (they do have some finite length from what I can tell, it just seems much larger than mysql).
Comment #7
jhedstromComment #8
stefan.r CreditAttribution: stefan.r commentedYes that should work. Still, as you mentioned we may run into the same problem elsewhere, any time contrib/custom/core code were to call addIndex() directly it'd also need to call getNormalizedIndexes().
So it could be good to either mention that in the docblock for addIndex(), or maybe we could deprecate addIndex() and add an addNormalizedIndex() method that takes the full table spec as a parameter, what do you think?
Comment #9
jhedstromI like adding
addNormalizedIndex
and markingaddIndex
deprecated. It would be really nice if we could keepaddIndex
as the name in the long run though. Perhaps deprecate `addIndex` for 8.0, then remove in 8.1, then markaddNormalizedIndex
deprecated and re-addaddIndex
in 8.2?Comment #10
jhedstromOr the new method could just be
addIndexes()
Comment #12
stefan.r CreditAttribution: stefan.r commentedI could go with either... if it's just addIndexes() maybe we could still deprecate addIndex().
Comment #13
catchComment #14
catchBumping to critical - this means fatal errors on some entity auto-updates, and they're also fatal errors which the current testbot configuration doesn't catch because it has long index support enabled, so they won't necessarily be caught when writing/testing the updates.
Comment #15
dawehnerSounds like this could us ethe parent method
Comment #16
jhedstromUpdating the IS to reflect more recent direction.
Comment #17
stefan.r CreditAttribution: stefan.r commentedJust a draft of a patch, anyone feel free to work further on this
Comment #19
jhedstromThis won't give MySQL everything it needs--can we just copy
$schema
and then set$schema_copy['indexes'][$name] = $real_columns
?Comment #20
catchWe discussed this on the 8.x critical triage call with webchick, xjm, alexbronstein and alexpott and all agreed that it's release blocking - due to fatal errors for straightforward updates that would have worked as table creations, and also the unpredictability between environments.
Adding triaged tag.
Comment #21
stefan.r CreditAttribution: stefan.r commentedComment #22
stefan.r CreditAttribution: stefan.r commentedComment #23
dawehnerQuick note: we don't translate exceptions ... is there a reason why we can't share all the common code between the implementations?
Comment #24
stefan.r CreditAttribution: stefan.r commentedopened #2538814: Do not use t() in exceptions for that.
I guess we could share the exception code, would you suggest a separate method? Something like
validateNormalizedIndexes($indexes, $table, $name)
Comment #25
stefan.r CreditAttribution: stefan.r commentedComment #26
jhedstromAs an aside, by 9.0.0, we could reclaim the
addIndex()
name and simply change the arguments.This exception doesn't exist yet from what I can tell.
Comment #28
dawehner@stefan.r
Well there is no reason to introduce new ones.
Comment #29
stefan.r CreditAttribution: stefan.r commentedAdded test & exception
Comment #32
jhedstromWorking on fixing the failing tests.
Comment #33
jhedstromThis should address the remaining failing tests.
Comment #34
jhedstromComment #35
jhedstromSince contrib drivers are going to need updating anyway, thoughts on just changing the arguments to
addIndex()
?Comment #36
stefan.r CreditAttribution: stefan.r commentedYes, if we think we can make an exception to the API freeze here, addIndex() does make more sense as a method name.
The worry would be that maintainers forget to upgrade their modules but considering the number of D8 modules is currently manageable maybe we could grep through all existing contrib modules (both now and before D8 release) and personally chase the maintainers? Which contrib modules currently call this method, is it limited to database drivers?
Also if we do change the 3rd parameter we'd need to do some slightly stricter checking of the 3rd argument so we can distinguish between a table spec and a list of index fields and throw a specific exception in case someone still uses $fields instead of $spec.
Comment #37
jhedstromUpdating IS to reflect #35 and #36.
Comment #38
dawehnerSo to be clear, I think naming it
addIndex()
is the right thing to do, given that this is what people expect to exist.The current signature of the patch looks like the following:
add(normalized)Index($table, $name, $fields)
What about doing the following:
addIndex($table, $name, $fields, array $full_table_spec)
This makes it a) really clear how your index looks like while reading the code. This is not really the case IMHO with the current proposed API.
On top of it b) existing calls to that method will break immediately which is IMHO both okay (given we solve a critical) but at the same time makes it really visible, so contrib/custom code can be adapted. With that signature I would suggest merging in the index fields into the full table spec, so you don't need to specify it twice.
Curious why we can't take into account #23 beside laziness :)
Comment #39
stefan.r CreditAttribution: stefan.r commentedsounds fine to me, do we have an idea about which contrib modules call addIndex() so we know what exactly we're breaking?
Comment #40
dawehnerI bet the only common usecase are in
hook_update_N()
functions.Comment #41
dawehnerCool, let me give that a quick try.
Comment #42
dawehnerComment #44
dawehnerLet's fix the failures.
Comment #45
dawehnerUpdated the issue summary
Comment #46
stefan.r CreditAttribution: stefan.r commentedNice work, patch looks good! This will probably need a change record?
Comment #47
dawehnerHere is one.
Comment #48
dawehnerUpdated the IS
Comment #49
stefan.r CreditAttribution: stefan.r commentedComment #50
dawehner+1 Thank you stefan.r
So I think we just need someone who RTBCs this patch.
Comment #51
benjy CreditAttribution: benjy at CodeDrop commentedPatch looks OK, one small thing:
We don't normally use abbreviated variable names?
Comment #52
stefan.r CreditAttribution: stefan.r commentedIf we decide to unabbreviate that we may want to update the following too as they use the abbreviated name
$spec
as well:Schema::createKeysSql(),
Schema::createFieldSql(),
Schema::addField(),
Schema::getNormalizedIndexes()
Schema::changeField()
db_add_field()
db_change_field()
Comment #53
dawehnerYeah this is why I think this is fine. Spec is now also not a horrible name for a specification.
Comment #54
jhedstrom+1 for RTBC here. Very glad we are keeping
addIndex()
:)Comment #55
benjy CreditAttribution: benjy at CodeDrop commentedOK, +1 from me.
Comment #56
alexpottSchemaTest passes in both mysql and pgsql. But pgsql is broken cause...
in
core/lib/Drupal/Core/Database/Driver/pgsql/Schema.php
the call to addIndex is broken.Adding this unnecessary for the fix. I think we don't need to do this. Less change is nice.
Comment #57
dawehnerThere we go.
Comment #58
dawehnerLet's add some comment.
Comment #61
dawehnerUps
Comment #62
jibran#56 is addressed so back to RTBC.
Comment #63
alexpottI've asked @Crell to have a look as sub system maintainer.
Comment #64
Crell CreditAttribution: Crell at Palantir.net commentedThanks for the tag, Alex!
Specification of what table? $table's specfication? That's unclear.
I talked through this in IRC with dawehner and catch. I am very skeptical about the DX of requiring the full table definition to be passed in each time, as there's no common place to get that information. It requires out-of-band data, retrieved from... somewhere context sensitive. Could be hook_schema(), could be derived from somewhere like Entity API or cache API if it's a dynamic table... you have to "just know" where to get this big array from. That's the terrible DX we've been trying to move away from.
Unfortunately, the only alternative we could come up with is to add schema introspection to the DB API, so that it could return a table spec of a table in the database already. Then the schema object could just look at the table itself and know what it needs to do. That, however, is a not currently a feature of the Database API and is a heck of a lot of work to add, in part "because MySQL". *sigh* That is out of scope at this time.
So if we're going to move forward here, we should:
1) Do a much better job of documenting where to get the value of $spec. Even if that's just "depending on the use case it could be from X, or Y, or Z". Give developers some sort of indication of how to use this method.
2) Leave a @todo on the method that if we ever do manage to add schema introspection later then the $spec array can be removed entirely.
3) catch pointed out in IRC that $spec technically doesn't need to be the full table, just the fields that are being used by the index. Any others are irrelevant. I am torn on if we should say that or not; it could make it easier to use if someone is adding a field and then an index on it, since they're almost guaranteed to have the spec for that field handy already, but it could also add complications as you're passing in a table definition that is not a full table definition but works anyway, and unstructured inputs lead to fun-to-find bugs. Open to thoughts on this point.
Comment #65
stefan.r CreditAttribution: stefan.r commentedComment #66
dawehnerMh, that can't work ... you need to add $spect to db_add_index as well
empty whitespace
+1
Comment #67
Crell CreditAttribution: Crell at Palantir.net commentedWhy undo this change? db_add_index() is deprecated but it still needs to work.
Nit: Stray whitespace at the end of line 423.
The revised comment looks good to me otherwise.
Comment #69
stefan.r CreditAttribution: stefan.r commentedHmm that was an oversight, $spec was in the docblock and inside the function but not in the parameters.
Comment #70
stefan.r CreditAttribution: stefan.r commentedComment #71
stefan.r CreditAttribution: stefan.r commentedwith the parameter restored
Comment #73
Crell CreditAttribution: Crell at Palantir.net commentedThanks, stefan.r! Back up to RTBC for Alex.
Comment #74
dawehnerRemoving the assignment
Comment #75
alexpottThis introduces SQLite regressions.
Drupal\system\Tests\Entity\FieldSqlStorageTest
fails on SQLite with an exception after applying this patch.Comment #76
alexpottWe can just make less change. Also I don't think we need to pollute the base Schema namespace with getNormalisedIndexes until we have a use case.
Comment #77
Gábor HojtsySeems to be failing less on SQLite :) More on PostreSQL :(
Comment #78
alexpott@Gábor Hojtsy the problem is the postgres and sqlite Drupalci runs are not stable (or passing) - so manual testing is required :(
Local test runs for me:
SQLite
Drupal\field\Tests\Number\NumberFieldTest fails on HEAD and PATCH
Drupal\image\Tests\ImageEffectsTest passes on HEAD and PATCH
Drupal\page_cache\Tests\PageCacheTest passes on HEAD and PATCH
Drupal\toolbar\Tests\ToolbarAdminMenuTest passes on HEAD and PATCH
Drupal\views_ui\Tests\NewViewConfigSchemaTest passes on HEAD and PATCH
No regression for SQLite
PostgreSQL
@todo
Comment #79
stefan.r CreditAttribution: stefan.r commentedJust from a manual review the current fails seem unlikely to be caused by this patch.
As apparently drupalci for pgsql/sqlite is not stable, could anyone run the patch vs HEAD on pgsql/sqlite locally?
Comment #80
alexpottPostgres
The suspicious tests are:
All the rest are DrupalCI set issues due to the clear down of a previous test or file system
Drupal\node\Tests\Views\RevisionRelationshipsTest fails on HEAD and PATCH
Drupal\views\Tests\Handler\FieldFieldTest fails on HEAD and PATCH
No regression for Postgres
Comment #81
bzrudi71 CreditAttribution: bzrudi71 commented@alexpott, Yep, both tests fail since some days and are not related to this issue. (as seen on PG-Bot) I think it's just another missing order by problem. So no showstopper for this issue :)
Comment #82
stefan.r CreditAttribution: stefan.r commentedinterdiff in #76 looks good and no regressions, so +1 from me
Comment #83
alexpottCreated:
None of these issues are related to this issue.
Comment #84
dawehnerSo the patch is green again.
Doing less in order to fix an issue is often a great idea
Comment #85
Gábor HojtsyUpdated the change notice draft significantly to make it easier to understand: https://www.drupal.org/node/2540310/revisions/view/8715856/8725758
Updating issue summary too.
Comment #86
neclimdul+1, thanks guys!
Comment #87
jhedstromJust one more +1 here. Patch is working wonderfully on a beta 7 upgrade!
Comment #88
webflo CreditAttribution: webflo at UEBERBIT GmbH commentedWe have to pass along the
fields
schema information toaddIndex
because the MySQL Driver needs the information to perform normalization on long indexes.Comment #89
webflo CreditAttribution: webflo at UEBERBIT GmbH commentedAnd the interdiff ...
Comment #90
webflo CreditAttribution: webflo at UEBERBIT GmbH commentedSorry uploaded the wrong patch. Ignore #88.
Comment #92
dawehnerGood catch!
Comment #93
alexpottFor #90
Comment #94
Gábor HojtsyBTW the fails in #90 on DrupalCI are to be expected:
Comment #95
dawehnerTrying to write some unit test for #90 atm.
Comment #96
alexpottHere are some tests. Found another bug introduce my this patch in the entity auto updater.
Comment #97
alexpottDrupal\system\Tests\Entity\EntityDefinitionUpdateTest still passes on both pgsql and sqlite.
Comment #98
alexpottLet's throw an exception if we don;t have the field spec.
Comment #99
dawehnerWrote a unit test in the meantime.
Comment #103
dawehnerMy patch lacked of #96
Comment #105
alexpottThrowing the exception found another bug.
Comment #106
Crell CreditAttribution: Crell at Palantir.net commentedWe've usually been using sprintf() in these cases. I looked at the string and tried to figure out if we were going to interpolate that variable or not, and got very confused. :-)
I'm otherwise satisfied with this patch. Queuing for SQLite and Pgsql just to be on the safe side.
Comment #107
alexpottWe have no standards on how to concatenate variables in an exception message.
sprintf()
is totally unnecessary, PHP is very good constructing strings from variables and hard coded strings - let's use the language features. An added bonus is that it reads way better - you actually have to think less. Especially when the number of variables increases.Comment #108
dawehnerI'm curious here, why do we not use an assertion? On runtime vs. development time this will never be different. Either code passes it along or not. I still try to wrap my head around the right usecases of assertions though
Comment #109
catchSo I also think that could be an assertion but seems like good material for a follow-up.
Committed/pushed to 8.0.x, thanks!
Comment #111
stefan.r CreditAttribution: stefan.r commentedfollow-up about the assertion at #2545312: Turn the SchemaException in mysql\Schema::getNormalizedIndexes() into an assertion
Comment #112
Gábor HojtsyPublished change record at https://www.drupal.org/node/2540310
Comment #113
Dave ReidOof, providing the full spec is likely going to be painful in contrib module update hooks, but I understand why.