On installation of the current 8.x I get the following error when using MyISAM:
Syntax error or access violation: 1071 Specified key was too long; max key length is 1000 bytes: CREATE TABLE {taxonomy_term_data} ( `tid` INT unsigned NOT NULL auto_increment COMMENT 'Primary Key: Unique term ID.', `uuid` VARCHAR(128) NULL DEFAULT NULL COMMENT 'Unique Key: Universally unique identifier for this entity.', `vid` VARCHAR(255) NOT NULL DEFAULT '' COMMENT 'The ID of the vocabulary to which the term is assigned.', `langcode` VARCHAR(12) NOT NULL DEFAULT '' COMMENT 'The language.langcode of this term.', `name` VARCHAR(255) NOT NULL DEFAULT '' COMMENT 'The term name.', `description` LONGTEXT NULL DEFAULT NULL COMMENT 'A description of the term.', `format` VARCHAR(255) NULL DEFAULT NULL COMMENT 'The filter_format.format of the description.', `weight` INT NOT NULL DEFAULT 0 COMMENT 'The weight of this term in relation to other terms.', PRIMARY KEY (`tid`), UNIQUE KEY `uuid` (`uuid`), INDEX `taxonomy_tree` (`vid`, `weight`, `name`), INDEX `vid_name` (`vid`, `name`), INDEX `name` (`name`) ) ENGINE = InnoDB DEFAULT CHARACTER SET utf8 COMMENT 'Stores term information.'; Array ( )
Specified key was too long of the table taxonomy_term_data
Reproducible on http://simplytest.me/project/drupal/8.x
About the server:
- MySQL 5.5 (5.5.24-0ubuntu0.12.04.1) using MyISAM
- PHP 5.3 (5.3.10-1ubuntu3.4)
Related issue
#1852896: Throw an exception if a schema defines a key that would be over 1000 bytes in MySQL
Comment | File | Size | Author |
---|---|---|---|
#36 | 1871032-vid-index-36.patch | 1.48 KB | andypost |
#25 | 1871032-vid-index-25.patch | 1.04 KB | andypost |
#11 | 1871032-vid-index-11.patch | 595 bytes | andypost |
#4 | drupal8.taxonomy-vid-length.4.patch | 1.52 KB | sun |
Comments
Comment #1
patrickd CreditAttribution: patrickd commentedTo reproduce this locally you have to make sure to use MyISAM instead of the default InnoDB engine
1. Edit your mysqld settings-file (on ubuntu it's
/etc/mysql/my.cnf
)2. Goto [mysql] and add the following lines
3. Restart the mysqld by
service mysql restart
or/etc/init.d/mysql restart
Note that your not able to use your old InnoDB databases anymore with this configuration, you either need to convert them or create a new Database with this config for installing drupal 8.
On simplytest.me I'm using MyISAM because it turned out to be 20%-30% faster on installations, as there are more installations than actual site usage on the service I'd like to keep it that way.
Comment #2
BerdirSomething similar happened recently in #347988: Move $user->data into own table, see comment #117.
Sounds like this could use some sort of test, loop over all hook_schema() implementations and calculate the index length?
Comment #3
catchComment #4
sunThis bug needs to be fixed independently, but FYI, we're trying to introduce an automated validation to prevent this from happening in the future: #1852896: Throw an exception if a schema defines a key that would be over 1000 bytes in MySQL
A) We can either limit the maximum field length of 'vid' (and thus ALL vocabulary IDs/machine names) to say, 64 characters.
B) Or we can limit the schema index lengths only.
C) Both.
I'd personally prefer to do A), and would actually like to start to think about introducing a common/globally accepted maximum length for config entity IDs/machine names of 64 chars (or if absolutely needed, 128 chars). FWIW,
#type 'machine_name'
defaults to a #maxlength of 64 since D7 already. That looks appropriate to me.Comment #5
Damien Tournoud CreditAttribution: Damien Tournoud commentedAlternatively, is there any reason not to make vid a varbinary? That would also make the indexes more compact, which is a really good thing.
Comment #6
patrickd CreditAttribution: patrickd commented#4
Your removing
- '#maxlength' => 255,
shouldn't there be a
+ '#maxlength' => 64,
?
Anyway, patch works and solves the issue on MyISAM! :)
thanks
Comment #7
sunI looked into @Damien's suggestion in #5 - I did not verify them, but the user comments on http://dev.mysql.com/doc/refman/5.0/en/binary-varbinary.html are rather concerning.
Comment #8
sunBetter issue title.
AFAICS, the only remaining part would be to double-check the D7 machine_name column of vocabularies, and if needed, perform some automated truncation of machine names (which would have consequences elsewhere).
And sadly, yes, D7's taxonomy_schema() defines a length of 255...
We will have to truncate the maximum length in one way or the other at some point, since #1186034: Length of configuration object name can be too small is still also on the table.
Comment #9
BerdirSee also #1852454: Restrict module and theme name lengths to 50 characters. I think 64 sounds like a good default length for most of these identifiers, be it modules, machine names, identifiers.
Comment #10
sunRelated: #1701014: Validate config object names is about to add a validation for the maximum config object name length.
We will need the identical solution for #1779026: Convert Text Formats to Configuration System — the filter format machine name also had a maximum of 255 chars in D7. :-/
I guess we'll need a update.inc helper function that allows to check existing machine names and potentially truncates them, though still ensuring uniqueness (hence, requiring to pass-in all existing names).
Something like this should do it:
Comment #11
andypostSuppose while we have no clue about length of machine name let's just make a index shorter
Comment #13
andypost#11: 1871032-vid-index-11.patch queued for re-testing.
Comment #14
xjmI agree with sun; I'd much prefer to enforce a maximum length for the values rather than #11.
Comment #15
BerdirSo do I.
But that might require more discussion, bigger changes, update functions, tests.. And it does not prevent the index length problem completely (Nothing stops you from adding an index over 5 machine name columns which might be a problem again).
What about getting the stop-gap in, so that we can get the index length validation in and then discuss cutting the max length down (64 sounds fine to me)
Comment #16
xjmYeah this is probably a duplicate of #1709960: declare a maximum length for entity and bundle machine names in the longrun, so I guess this patch is okay as a stopgap fix.
Comment #17
thedavidmeister CreditAttribution: thedavidmeister commentedSo you're saying to apply #11 now so that people can at least install the 8.x branch for dev but leave this issue open? or close this issue and immediately re-open a new critical issue about the data loss that will happen for people upgrading their D7 sites?
Comment #18
sunI think it makes sense to move forward with #11, in order to unblock related issues.
But in the end, we have to do #4/#10 anyway at some point, and it doesn't make sense for me to push the problem to yet another new issue, so I'd suggest to commit #11 but revert this issue to active/needs review/work after committing, so we can investigate and do #4/#10 afterwards.
Net result: Other issues unblocked. This issue still critical. To figure out how to resolve the upgrade path for vocabulary machine names, and ideally, supply a generic helper facility + pattern for all other instances of the problem.
Comment #19
thedavidmeister CreditAttribution: thedavidmeister commentedSeems fair. Pretty nasty that you can't install D8 at all right now on some servers.
Once this whole D8 thing is done, maybe we could try to think of a more sophisticated way to handle relations, dependencies and progress in issues. It really gets confusing to come into epic core threads that is already months old that is "fixed with a bandaid patch but left open for X".
Anyways, how does limiting the length of columns affect exporting/importing objects from Drupal 7 sites that were upgraded to D8 and presumably could have longer machine names already in the db than "fresh" D8 sites. I assume we need to handle changing the schema in the update script as well, not just on install?
Also, is it really a good idea to start numbering truncated machine name sequentially? doesn't that re-introduce the main problems that machine names were implemented to fix?
Ie.
Site A has in the database:
vocab 1 = long_name_foo
vocab 2 = long_name_bar
Site B has in the database:
vocab 1 = long_name_bar
vocab_2 = long_name_baz
vocab 3 = long_name_foo
Totally fine to deploy between the two currently using machine names. Run an update script that starts renaming them sequentially based on some query and things could start to get hairy. Even if you do something like order the machine names alphabetically it doesn't help if you have vocabs mid-alphabet that only exist in one instance of the site.
Would it be fair to say that a hash of the long name would be safer as it preserves the "identity" of the machine name, even if it is a little less human friendly at that point?
Comment #20
xjm@thedavidmeister, probably best to discuss over in #1709960: declare a maximum length for entity and bundle machine names, but the idea is not that we would always truncate machine names. We would only do it for machine names that are being upgraded from D7 and have more than the maximum allowed characters. Starting in D8, it will become impossible to even create a machine name length over the maximum length.
Comment #21
thedavidmeister CreditAttribution: thedavidmeister commented@xjm - I understand that there is no problem for brand new D8 sites. I'm talking about two D7 sites that currently sync content or configuration or both based on machine name. When both are upgraded to D8 they may start "syncing" the wrong data if the truncation is based on a sequential algorithm, depending on the current states of their database.
Not a problem if one of the sites is just a dev instance of the other and can have the database overwritten with the prod site. If you have two prod sites syncing data by machine name then the sequential approach could really break things.
Comment #22
xjm@thedavidmeister, Ah, now I understand what you are saying. I've added that to the thread at #1709960: declare a maximum length for entity and bundle machine names.
Comment #23
thedavidmeister CreditAttribution: thedavidmeister commentedAlso, IMO this issue should be closed when the patch is applied and the upgrade path discussion continued at #1709960: declare a maximum length for entity and bundle machine names (as well as bumping that issue to critical).
The title of this bug implies it is really just about the installation process, that other issue has some more discussion on how best to handle the upgrade process.
Comment #24
catchNo upgrade path for the index change?
Comment #25
andypostAdded update function
Comment #26
thedavidmeister CreditAttribution: thedavidmeister commentedNobody thinks we shouldn't prioritise an upgrade path... at the moment there's just two parallel discussions about the same issue and it really doesn't matter where our efforts go but let's not have extensive meta discussions about where the final patch should live and just focus on what it should look like.
Catch, are you advocating that we should *not* push the patch in #11 that prevents some people from installing D8 right now because we don't have an upgrade path for D7 users yet? or are you just saying that this situation "needs work" overall?
Is an md5 hash of existing machine names that are too long ok? Is there a case to not use hashes, other than human readability? (I do know that I used to swear at Panels every time it did that to me until somebody told me why it was doing that...)
The numeric thing is obviously a far more user friendly option but there's at least one easy way to confound it so I'm really worried it will just ruin somebody's day.
Also, is there somewhere where should be documenting changes like this as we go so they can be picked up by Coder or similar when the contrib migrations begin?
Comment #27
thedavidmeister CreditAttribution: thedavidmeister commentedaaah looking at andypost's patch makes sense to me. I know what you're talking about now catch :)
Comment #28
catch@thedavidmeister I'm saying we should update the index change for 7.x sites so that 8.x and 7.x upgraded sites are kept in sync at all times.
As soon as we make schema changes in 8.x that aren't reflected in 7.x we introduce hidden bugs for the 7-8 upgrade path that might never get fixed, since there is no test coverage for that.
#25 looks good if it comes back green.
Comment #29
thedavidmeister CreditAttribution: thedavidmeister commentedyeah, i got that when I looked in the patch. I think i just misunderstood what you were saying.
Comment #30
thedavidmeister CreditAttribution: thedavidmeister commentedlol:
so, that would be andypost?
Comment #31
catchNo we already have some upgrade path tests fortunately, that one is only to have generic testing for schema changes.
Comment #32
andypostPatch is green, needs RTBC
Comment #33
sunThanks!
Comment #34
catchThat should happen in taxonomy_update_8006() where the field is changed.
That should have been done anyway in fact in the original commit, but I missed it, see comment on db_change_field():
vs. the update:
Comment #35
catchComment #36
andypostLearning from #1879022: Enforced dependencies errors updating to recent versions of Drupal 7
Comment #37
webchickCommitted and pushed to 8.x. Thanks!
Do we also need this change in 7.x?
Comment #38
andypost@webchick D7 is not affected by this
Comment #39
xjmHere are the indexes from taxonomy in D7:
Since only the
name
field is a non-integer, they're all safely below 333 characters. Converting to a (very long) machine name for vocabulary is what caused the issue in D8.Comment #40
webchickGotcha, thanks. :)
Comment #41.0
(not verified) CreditAttribution: commentedUpdated issue summary.