MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols)

added summary of additional scope this issue needs in order to avoid a follow-up critical.

Comment #54

yesct commented 7 February 2013 at 03:32

Title:	UTF-8: 4 bytes characters bug using MySQL backend	» UTF-8: fix data loss in 4 bytes characters bug using MySQL backend and limit to charsets that start with utf8
Status:	Needs review	» Needs work

needs work to avoid critical follow-up issue.
updated issue summary.
updated issue title.

Comment #55

ergophobe commented 7 February 2013 at 06:18

I wrote the documentation to reflect the code, which is currently very permissive. Upon reflection, it would even accept typos, would it not?

If it's the case that it absolutely must be utf8 or utf8mb4 then it should test specifically for those, not just strings that begin with 'utf8'. That way it would catch the case where someone accidentally types 'utf8mb' (e.g. as Crell did in #48).

So it seems that it needs to
- test specifically for utf8 or utf8mb4.
- if it's not one of those, throw an exception and default to utf8
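
To illustrate the difference, here is a minimal sketch comparing a prefix check with an exact check. It is not code from any patch in this issue, and both function names are made up for the example. The point is only that a prefix test lets the 'utf8mb' typo through while an exact test rejects it.

```php
<?php
// Hypothetical helpers for illustration; neither name exists in the patch.

// Permissive check: accepts anything that starts with 'utf8'.
function charset_has_utf8_prefix($charset) {
  return strpos($charset, 'utf8') === 0;
}

// Strict check: accepts only the two charsets named in this comment.
function charset_is_allowed($charset) {
  return in_array($charset, array('utf8', 'utf8mb4'), TRUE);
}

foreach (array('utf8', 'utf8mb4', 'utf8mb', 'latin1') as $charset) {
  printf("%-8s prefix: %s  strict: %s\n",
    $charset,
    charset_has_utf8_prefix($charset) ? 'yes' : 'no',
    charset_is_allowed($charset) ? 'yes' : 'no');
}
```

The only input on which the two checks disagree here is the typo 'utf8mb'.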

Comment #56

chx commented 8 February 2013 at 18:54

I am fine with that -- once again, for the default connection. The reason I emphasize default is that any-charset-goes is an *excellent* opportunity for (careful) migration from legacy databases.

Comment #56.0

chx commented 8 February 2013 at 18:54

Issue summary:

Updated issue summary, point out need to avoid critical follow-up

Comment #57

damienwhaley commented 9 February 2013 at 00:09

I've updated the issue summary to make it easy to figure what's left to do.

Comment #58

damienwhaley commented 9 February 2013 at 05:56

Status	File	Size
new	interdiff-51-81.txt	6.19 KB
new	database-1313214-58.patch	8.43 KB

I've started work on checking the charset setting for MySQL, and it's pretty much right, I hope. The only concern I had is about throwing the exception. I've left the code commented out where I think the throw could happen, but if you let it throw there, you end up with a white screen, as the object is not returned and the error handler does not like null objects.

Perhaps the throw should be further up the chain (in the cache fetch)? Also the exception I've chosen is probably not quite right.

My thought after playing with this is that we should probably log it via watchdog, as the connection should always work.

This probably needs a bit more thought, as I'm not 100% familiar with the new config stuff and I'm worried I have not respected @chx's desire to only override the default setting.

I'd like some feedback, please.

Comment #58.0

damienwhaley commented 9 February 2013 at 05:56

Issue summary:

src: http://docs.oracle.com/cd/E17952_01/refman-5.5-en/charset-unicode.html

Updated the issue summary

Comment #59

yesct commented 9 February 2013 at 06:53

+++ b/core/lib/Drupal/Core/Database/Connection.php
@@ -135,20 +135,6 @@
-  /**
-   * The character set of this connection.
-   *
-   * @var string
-   */

I think that the interdiff was made a bit ... well, I don't think these were really removed in the patch in comment 81. Wait, there is no 81! :) Probably a typo.
If a branch with the patch from #51 were called utf-51, and a branch with the patch from #58 were called utf-58,

then the interdiff could be made with: git diff utf-51 utf-58 > interdiff-51-58.txt (maybe the interdiff above reversed the arguments).

Comment #60

yesct commented 9 February 2013 at 06:56

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Connection.php
@@ -69,14 +68,29 @@ public function __construct(array $connection_options = array()) {
+      if (strcmp($connection_options['charset'], 'utf8') == 0
+        || strcmp($connection_options['charset'], 'utf8mb4') ==0) {

I think any charset starting with utf8 would be ok. That would be more general than just adding two specific allowed ones.

Comment #61

chx commented 10 February 2013 at 16:41

Status	File	Size
new	interdiff.txt	5.79 KB

Thanks everyone for moving this forward.

> I think any charset starting with utf8 would be ok.

See #55 for the reasoning -- checking specifically for utf8 / utf8mb4 is a good idea to avoid typos.

Re the new patch (I attached a real interdiff for easy review): we do not need a new allowedCharset property and method, as a disallowed charset should terminate immediately. strcmp() is needlessly complicated; it can simply be ==. If the error handling is not yet set up, then just print and die. This should be super rare: you can't hit that error message without misediting settings.php, so there is no need to be nice. However, moving the charset method up to the base Connection class is good.

As for default connection, just add self::$databaseInfo[$key][$target]['key'] = $key into Database::openConnection before calling the constructor and then the constructor can easily test for the default key.

Comment #62

ergophobe commented 11 February 2013 at 16:04

I'll try to make a patch tomorrow morning unless someone beats me to it.

The final patch needs/should/could...

1. Add the $key to the connectionOptions array. I have already done this and just need to merge it into the new patch.
2. Limit to utf8 or utf8mb4 for the default connection and be more permissive for all other connections. So for the default connection only it needs to test for the charset, and otherwise it's up to the user to get it right. I think this is reasonable given that anyone who is defining multiple connections is or should be an advanced user and, as chx said, it offers possibilities for connecting to legacy databases that may be non-Drupal DBs (useful for migration, for pulling data from some other dataset, etc.).
3. Move the MySQL-specific comments out of the base Connection class.
4. Re the "print and die" solution: I got hung up the other day on exception handling, and if that's not truly necessary, then that simplifies things.
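
As a rough sketch of the policy in this list (restrictive for the default connection, permissive for everything else), something like the following could work. The function name is hypothetical, the 'key' element is assumed to be injected into the options as chx suggested in #61, and the exception class is only a placeholder since the error-handling strategy is still open.

```php
<?php
// Illustrative only: resolve_connection_charset() is a hypothetical helper,
// and the 'key' element is assumed to be injected as suggested in #61.
function resolve_connection_charset(array $connection_options) {
  $charset = isset($connection_options['charset']) ? $connection_options['charset'] : 'utf8';
  $is_default = isset($connection_options['key']) && $connection_options['key'] === 'default';

  // Non-default connections may use any charset, e.g. for legacy databases.
  if (!$is_default) {
    return $charset;
  }

  // The default connection is limited to a fixed list.
  $allowed = array('utf8', 'utf8mb4');
  if (!in_array($charset, $allowed, TRUE)) {
    // The exception class is a placeholder; the thread has not settled on
    // how this error should be reported.
    throw new InvalidArgumentException("Unsupported charset for the default connection: $charset");
  }
  return $charset;
}
```

With this shape, array('key' => 'legacy', 'charset' => 'latin1') passes through untouched, while array('key' => 'default', 'charset' => 'utf8mb') is rejected.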

One comment on item #2: there are only two charsets that begin with 'utf8' currently available in MySQL. There are many collations, and if we were going to test for collations it would make sense to accept anything that begins with 'utf8', but I don't think it does for charsets. Consider that utf8mb4 requires special considerations. Let's say MySQL introduces a third utf8 charset: we can't know at this point what special considerations will be required to handle it correctly, so even if it's a valid charset, it should probably be disallowed until the implications are understood. Maybe that's wrong and it's up to the user to test it.

There is, however, an alias for the classic utf8 charset, which can now be referred to as utf8mb3. So even though there are only two charsets with utf8 in the name, there are three valid handles for those two charsets. So perhaps utf8mb3 should be allowed too.

> In a future version of MySQL, it is possible that utf8 will become the 4-byte utf8, and that users who want to indicate 3-byte utf8 will have to say utf8mb3. To avoid some future problems which might occur with replication when master and slave servers have different MySQL versions, it is possible as of MySQL 5.5.3 for users to specify utf8mb3 in CHARACTER SET clauses, and utf8mb3_collation_substring in COLLATE clauses, where collation_substring is bin, czech_ci, danish_ci, esperanto_ci, estonian_ci, and so forth.

Comment #63

chx commented 11 February 2013 at 05:46

Sure, let's go with that too. And yes, if exception handling is broken then print and die -- it's just a failsafe and not an error anyone will ever see.

Comment #64

ergophobe commented 11 February 2013 at 19:17

Status	File	Size
new	interdiff-49-64.txt	6.5 KB
new	database-1314214-64.patch	7.66 KB

One more try. Since this is both my first core patch and I'm just getting started with D8, I hope this is moving things in the right direction. But I'm adding a lot of notes here to hopefully make it easier for people to decipher.

The main principles are the same
- restrictive for default connection
- permissive for all other connections
- added a $connection_options array element 'index' which is the first index of the $databases array. Sometimes this is called 'key' so perhaps that would be a better nomenclature, but I was just following advice from chx on IRC.
- probably other stuff

Exception Handling
The big thing still problematic is agreeing on an exception/error handling strategy. I understand why print and die might make the most sense, but I expect there are a lot of cases where just defaulting to UTF8 and warning the user would save some headaches.

Obviously, the best thing would be a try/throw/catch, but I could not get that to work. If I try to throw an exception (even just using \Exception), I get a fatal error:

( ! ) Fatal error: Call to a member function get() on a non-object in path/to/core/includes/theme.maintenance.inc on line 59

This is true even if I use the base \Exception handler or if I try to use one of the custom ones. There's too much I don't know about exception handling in D8!

That said, for now I have created a drupal_set_message() warning with the message

'The default connnection charset must be "utf8", "utf8mb3" or "utf8mb4". Currently defined as: ' . $connection_options['charset'] . '. Reverting to "utf8". Please check your charset definition in settings.php'

I know that's not the way this should be, but hopefully it moves this patch forward.

Relation to Patch 58

Sorry - I worked off my previous patch mostly and that's what's reflected in the interdiff. Unless I've been misunderstanding all along, I think damienwhaley misconstrued what chx was driving at with respect to the default connection. The point was not to allow a change only for the default connection, it was to be very restrictive on the changes allowed on the default connection and permissive on all other connections.

I did move the charset function and property to base connection class as in #58.

I removed the MySQL-specific comments and refer the user to documentation in default.settings.php. I think it's best for the general documentation for this problem to reside in one place anyway and not be repeated all over. That way if it changes (for example the reference URL) it only needs to be updated once.

I also did not include the allowedCharset() method and deleted the commented code and comments regarding exceptions in /core/lib/Drupal/Core/Database/Database.php.

Other Notes

In my patch, the code in /mysql/Connection.php lines 78-83 is unnecessarily repetitive, but given the long discussion here ( http://drupal.org/node/935284 ) I gather that, especially for core, it is preferred to keep lines short at the expense of inefficient code (and it does make it more readable, but it feels "wrong" to have two short conditionals that execute the same statement).

I have also added perhaps too much documentation in default.settings.php. I'm not sure how much is too much. Comments welcome.

Comment #65

yesct commented 11 February 2013 at 21:15

Status:	Needs work	» Needs review

changing to needs review to let the testbot try it.

Comment #66

ergophobe commented 11 February 2013 at 22:54

Status:	Needs review	» Needs work

Oops, I forgot to change that. But I was looking over the code and realize it still needs work. As is, the patch has this code in mysql/Install/Tasks.php:

    // If we are using utf8mb4 charset, make sure the database supports it.
    if (isset($database['charset']) && $database['charset'] == 'utf8mb4') {
      if (!db_query("SHOW CHARACTER SET WHERE Charset = 'utf8mb4'")->rowCount()) {
        $errors['mysql_charset'] = st('Your database does not support the utf8mb4 character set');
      }
    }

This should be generalized, because for non-default connections we're now allowing any charset.

It also raises another question: should the charset only be validated on install? If so, then the other validation code should be moved here. If not, this check should become part of the code that validates the connection. Also, though it probably won't stay, the drupal_set_message() call does not wrap its message in t().

Comment #67

ergophobe commented 16 February 2013 at 23:22

Status:	Needs review	» Needs work

I just realized this can't be used on any indexed column longer than 191 characters, because InnoDB only allows 767 bytes on an index and utf8mb4 needs up to 4 bytes per character. See phayes's comments above in #10.
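
For reference, the 191 figure is just the byte limit divided by the maximum bytes per character, rounded down. A small sketch of the arithmetic (the function name is made up, and 767 is the limit quoted in this comment):

```php
<?php
// Hypothetical helper: how many characters fit in a byte-limited index
// prefix when every character may need $bytes_per_char bytes.
function max_index_chars($limit_bytes, $bytes_per_char) {
  return (int) floor($limit_bytes / $bytes_per_char);
}

// 767 is the InnoDB limit quoted in this comment.
echo max_index_chars(767, 3), "\n"; // utf8 (up to 3 bytes per character)
echo max_index_chars(767, 4), "\n"; // utf8mb4 (up to 4 bytes per character)
```

That is why a 255-character VARCHAR index fits under utf8 (255 x 3 = 765 bytes) but not under utf8mb4 (255 x 4 = 1020 bytes).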

Comment #68

ergophobe commented 12 February 2013 at 03:49

Status	File	Size
new	interdiff-49-68.txt	10.49 KB
new	database-1314214-68.patch	9.55 KB

I think there's still a fair bit of work to be done here. Most importantly, any patch needs to be tested to make sure that it actually works with an interactive install and a custom charset/collation in settings.php.

I'm not sure any of the patches submitted so far did, because the default collation was getting set to the custom collation and that doesn't work because you get a mismatch on VARCHAR columns. If you change it to use utf8mb4 on VARCHAR cols, then you get lots of crashes because several indexes exceed the 191 character limit (see comment above).

Also there remains the question of how to check that a given charset is in fact supported. This can be done on install, but if someone changes the settings.php file post-install, there's no test in here yet.

So this is probably the best effort I'm going to make without some guidance or someone else to step in and straighten things out. I hope I've made more fixes than problems.

Comment #69

ergophobe commented 12 February 2013 at 04:01

Status:	Needs work	» Needs review

Status	File	Size
new	interdiff-49-69.txt	10.54 KB
new	database-1314214-69.patch	9.62 KB

Remove whitespace. Queued for review so it gets tested, but really it's "needs work" - see comments in previous post.

Comment #70

yesct commented 12 February 2013 at 08:24

Status	File	Size
new	interdiff-69-70.txt	5.92 KB
new	database-1314214-70.patch	9.71 KB

Caution, this is standards stuff, forgive me for distracting here. No need to worry about getting code perfect before posting. Really. I just started on some white space stuff and got carried away.

Patch attached.

+++ b/core/lib/Drupal/Core/Database/Connection.php
@@ -135,6 +135,14 @@
+  protected $charset = 'utf8';
+
+
   function __construct($dsn, $username, $password, $driver_options = array()) {

@@ -1172,4 +1180,17 @@ public function commit() {
+  }
 }
+
+

+++ b/sites/default/default.settings.php
@@ -165,6 +165,40 @@
+ * 'utf8mb4' character set on all text columns. More information on 'utf8mb4'
+ *  can be found here:
+ * http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html

extra whitespace

+++ b/core/lib/Drupal/Core/Database/Connection.php
@@ -1172,4 +1180,17 @@ public function commit() {
+   * Fetch the current character set for this connection.

function one line summaries should begin with third person verbs, like: Fetches

+++ b/core/lib/Drupal/Core/Database/Connection.php
@@ -1172,4 +1180,17 @@ public function commit() {
+   * documentation in sites/default/default.settings.php for more information.
+   * @return string

@param lines have no blank lines between them, but @return always has a blank line before it. http://drupal.org/node/1354#order

And has a description on the second line.

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Connection.php
@@ -69,14 +69,35 @@ public function __construct(array $connection_options = array()) {
+// Default to 'utf8', but allow user to change this to
+    //  - any user-defined charset if this is not the default connection
+    //  - any allowed charset if this is the default connection
+    // See sites/default/default.settings.php for full documentation

missing spaces in the Default line, to align it.

missing periods at end of sentences.

I'm not sure of the format for lists inside inline comments.

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Connection.php
@@ -69,14 +69,35 @@ public function __construct(array $connection_options = array()) {
+    $default_connection_charsets = array('utf8', 'utf8mb4', 'utf8mb3'); //utf8mb3 is an alias for utf8

it's in core in some places, but http://drupal.org/node/1354#inline says to put in-line comments on its own line.

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Connection.php
@@ -69,14 +69,35 @@ public function __construct(array $connection_options = array()) {
+    if (isset($connection_options['charset']) && $connection_options['charset'] != 'utf8' ) {

expressions don't use space before the parens (). http://drupal.org/coding-standards#controlstruct

Also, I don't know the order of precedence between && and !=, so maybe an extra set of parens () would help make the meaning clearer. [I did not fix this one.]

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Connection.php
@@ -69,14 +69,35 @@ public function __construct(array $connection_options = array()) {
+        drupal_set_message(t('The default connnection charset must be "utf8", "utf8mb3" or "utf8mb4". Currently defined as: ' . $connection_options['charset'] . '. Reverting to "utf8". Please check your charset definition in settings.php'), 'warning');

use arguments to t() to make it easier on translators.

http://api.drupal.org/api/drupal/core%21includes%21bootstrap.inc/functio...

for example: $text = t("@name's blog", array('@name' => user_format_name($account)));
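
To show why placeholders help, here is a simplified stand-in. This is not Drupal's t(): the demo_t() name is made up, it does no translation lookup, and it handles only '@' placeholders by escaping and substituting them. The point is that the source string stays constant, so translators see one string with a placeholder instead of a new string for every charset value.

```php
<?php
// Simplified stand-in for illustration only. This is NOT Drupal's t(): it
// does no translation lookup and handles only '@' placeholders.
function demo_t($string, array $args = array()) {
  foreach ($args as $placeholder => $value) {
    // '@' placeholders are escaped before substitution.
    $args[$placeholder] = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
  }
  return strtr($string, $args);
}

$charset = 'latin1';
echo demo_t('The default connection charset is currently defined as: @name.', array('@name' => $charset)), "\n";
```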

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Install/Tasks.php
@@ -82,4 +82,21 @@ protected function connect() {
+        $errors['mysql_charset'] = st('Your database does not support the '.$database['charset'].' character set');

I'm not sure, but I think st() might be the same as t() in that it should use args for variables instead of string concatenation. http://api.drupal.org/api/drupal/core%21includes%21install.inc/function/...

+++ b/sites/default/default.settings.php
@@ -165,6 +165,40 @@
+ * Finally, it is possible to use 'utf8mb3' which is currently simply an alias of

more than 80 chars.

+++ b/sites/default/default.settings.php
@@ -165,6 +165,40 @@
+ * 'utf8', but MySQL reserves the right at a future date to make 'utf8' default
+ * to a 4-byte character set at which point 'utf8mb3' would specifically
+ * indicate the legacy 3-byte version.

broke up this sentence into two, and added comma to separate phrase to add clarity.

+++ b/sites/default/default.settings.php
@@ -165,6 +165,40 @@
+ * utf8mb3. Other charsets are allowed on other connections at the users risk

missing period.

Comment #71

ergophobe commented 12 February 2013 at 16:40

Sorry about that and thanks so much for doing all that cleanup. I'll try to be more careful in the future.

>>I dont know the order of precedence between && and !=

The != takes precedence, so this works as expected, but perhaps the extra parens would help for clarity.
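
A quick way to convince yourself, as a sketch with made-up function names: both forms below give the same result for a missing charset, for 'utf8', and for 'utf8mb4'.

```php
<?php
// In PHP, comparison operators such as != bind more tightly than &&,
// so the two expressions below are evaluated identically.
function without_parens(array $options) {
  return isset($options['charset']) && $options['charset'] != 'utf8';
}

function with_parens(array $options) {
  return isset($options['charset']) && ($options['charset'] != 'utf8');
}

$cases = array(array(), array('charset' => 'utf8'), array('charset' => 'utf8mb4'));
foreach ($cases as $options) {
  var_dump(without_parens($options) === with_parens($options));
}
```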

Comment #72

yesct commented 12 February 2013 at 16:47

If the approach is OK, then a little cleaning up of the comments will be needed; there are some debug and other notes in there.

Comment #73

ergophobe commented 12 February 2013 at 22:26

My feeling is that it needs more than cleaning up comments. The comments are long and have debug type stuff in there because I'm not sure of several things.

I stopped where I did because I had a headache and had reached the point where I felt like I was making things worse, not better. I think it needs a fair bit of massaging, but I can't really get back to it until next week.

It also needs either a very robust test or it needs to be tested by installing from scratch in a variety of scenarios (or both) - a small change here or there in this patch can easily cause a crash on install. If you get a column mismatch of any sort, MySQL will die.

Comment #74

Crell commented 16 February 2013 at 19:24

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Connection.php
@@ -69,14 +69,36 @@ public function __construct(array $connection_options = array()) {
+        drupal_set_message(t('The default connnection charset must be "utf8", "utf8mb3" or "utf8mb4". Currently defined as: @name. Reverting to "utf8". Please check your charset definition in settings.php', array('@name' => $connection_options['charset'])), 'warning');

drupal_set_message() inside the Connection class is a no-go. Just throw an exception and let the global error handler deal with it.

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Schema.php
@@ -156,6 +162,15 @@ protected function createFieldSql($name, $spec) {
+    // If it's a text field, check to see if we should use utf8mb4 (4-byte UTF8)
+    // as the character set.
+    // InnoDB indexes have a max of 767 bytes. This means we can't use 4-byte
+    // charsets on VARCHAR because there are VARCHAR-based indexes of 255 chars.
+    if (in_array($spec['mysql_type'], array('TINYTEXT', 'MEDIUMTEXT', 'LONGTEXT', 'TEXT')) &&  Database::getConnection()->charset() == 'utf8mb4') {
+          //    isset($info['charset']) && $info['charset'] != 'utf8') {
+      $sql .= ' CHARACTER SET '.$info['charset'].' COLLATE ' . $info['collation'];

I don't fully understand the implications of this comment. Does that mean utf8mb4 + InnoDB + Varchar == kaboom? That would be... very very bad. I hope I'm misunderstanding that.

Have I mentioned how much I hate SQL databases?

Comment #75

Crell commented 16 February 2013 at 20:10

Note we may want to change the SET NAMES call to something that the driver supports natively:

http://www.php.net/manual/en/ref.pdo-mysql.connection.php

(Yes, it's deeply buried in the docs. I didn't even know about it until today.)

Comment #76

ergophobe commented 16 February 2013 at 23:29

Status:	Needs work	» Needs review

> Does that mean utf8mb4 + InnoDB + Varchar == kaboom?

If the column has more than 191 characters, yes, it goes kaboom. See phayes's comments in #10.

> That would be... very very bad.

Yes, it would be. Maybe it's just because I don't have the chops for this, but the more I look into this, the more utf8mb4 support seems like a morass.

Comment #77

chx commented 20 February 2013 at 00:13

> because there are VARCHAR-based indexes of 255 chars.

Then we should ban that and establish sensible indexes. There's no way we need more than, say, 32 chars to index and every db and Drupal totally supports index lengths. Might be a followup.

Comment #78

ergophobe commented 20 February 2013 at 04:07

I was not reading my own comments - I was thinking the column length was the problem, but yes, it's just the index prefix length at issue and certainly 191 characters should be enough for that.

Comment #79

chx commented 20 February 2013 at 06:23

Re #75: it was not buried, rather it didn't exist: "Prior to PHP 5.3.6, this element was silently ignored." Note that Drupal's minimum PHP version at this moment is 5.3.5, so using charset in the DSN is not yet possible. Plans are to go to 5.3.10 in March.
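
For anyone who has not seen it, the DSN form looks roughly like this. A minimal sketch only: build_mysql_dsn() is a made-up helper, and the host and database names are placeholders. The charset element itself is the documented PDO_MYSQL DSN option referred to above.

```php
<?php
// Hypothetical helper that builds a PDO_MYSQL DSN including the charset
// element. Host and database names below are placeholders.
function build_mysql_dsn($host, $dbname, $charset) {
  return 'mysql:host=' . $host . ';dbname=' . $dbname . ';charset=' . $charset;
}

$dsn = build_mysql_dsn('localhost', 'drupal', 'utf8mb4');
echo $dsn, "\n";

// Usage (requires a running MySQL server, so it is left commented out):
// $pdo = new PDO($dsn, $username, $password);
```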

Comment #80

ergophobe commented 20 February 2013 at 20:26

Roughly speaking:

- 135-140 varchar cols that are indexed (see * below for how I arrived at this)
- 82 with lengths of 128 or more (so failure in a two-col index with any col over 63)
- 41 with lengths of 191 or more (so failure all by themselves)
- 0 currently specify prefix lengths

That's the rough scope of it, so it's not hard, though it does end up being a patch that affects a very large number of files. It will also require contrib authors to update their modules. I can try to come up with another patch, but before I go changing all those files, I just want confirmation that approach is sensible.

I don't find any indexes that are specified with a length currently, but as chx says, it's built into Drupal and is a simple matter of changing, for example, the key_value table

'primary key' => array('collection', 'name')

to

'primary key' => array(array('collection', 60), array('name', 128))

and choosing sensible values. In a default install of D8, the largest value for key_value.collection is 13 chars ("system.schema") so it seems to make sense there, for example, to leave name at 128 and then use what's left. There may be other cases where both columns need truncating. I haven't looked that carefully yet and wanted a reaction from others following the thread before going down that road.

PS I don't think just exempting the VARCHAR cols from utf8mb4 support as phayes suggested in #10 really makes sense. Some of these columns are things like users.signature and it seems like supporting a character set for user names is rather important.
---------

*I arrived at this very roughly by writing a small script that does a SHOW TABLES and then iterates through the tables and grabs all columns that are of type VARCHAR and which have a Key (as per the SHOW COLUMNS output) and, in the latter two cases, that have a length above the given size. I also just grepped the codebase using ^[^\*\n\r]+'(primary key|indexes|unique keys)' =>.*?[;,] and got a second count that way.
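
As a sketch of how that grep pattern behaves, here it is applied with preg_match_all() to a made-up snippet (the sample text is invented for the example, not taken from core). Lines with a '*' before the key name, i.e. doc comments, are skipped.

```php
<?php
// The pattern from this comment, applied with preg_match_all(). The sample
// source text is made up for illustration.
$pattern = "/^[^\\*\\n\\r]+'(primary key|indexes|unique keys)' =>.*?[;,]/m";

$source = <<<'EOT'
 * 'indexes' => array('documented' => array('name')),
$schema['key_value'] = array(
  'primary key' => array('collection', 'name'),
  'indexes' => array('all' => array('name', 'collection')),
);
EOT;

// Lines containing '*' before the key name (doc comments) are skipped.
$count = preg_match_all($pattern, $source, $matches);
echo $count, "\n";
```

Note that this counts key definitions, not indexed columns, so it is only a cross-check against the SHOW COLUMNS approach.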

Comment #81

damien tournoud commented 20 February 2013 at 21:28

Do we have an actual use case for this? I could not see one by quickly skimming through the comments.

Until we have, it seems like the drawbacks clearly outweigh the benefits here.

Comment #82

damien tournoud commented 20 February 2013 at 21:34

Also, a primary key cannot use length prefixing. (Or at least I hope that MySQL doesn't allow that, because it doesn't make any sense.)

It's probably worth tidying up our schema (for example: machine name-type keys should probably use an ASCII character set so as to reduce the size of the index), but that belongs in another issue.

Comment #83

ergophobe commented 21 February 2013 at 01:16

> a primary key cannot use length prefixing

Actually, it can: http://dev.mysql.com/doc/refman/5.5/en/alter-table.html
I tested it and had no problem altering a table to use a 100 char PK based on a 255 char VARCHAR column.

And this is all related to #1852896: Throw an exception if a schema defines a key that would be over 1000 bytes in MySQL - it seems one way or another some sort of length checking on MySQL keys might be needed.

> Do we have an actual use case for this?

I'm not sure what the OP's use case was, nor that I understand the internationalization issues well enough, but I believe the use case goes something like this:

User wants to use characters not included in the Basic Multilingual Plane but included in the supplementary planes, such as the Supplementary Multilingual Plane. This would include some CJK Unified Ideographs, historic scripts (Egyptian hieroglyphs), and some modern character sets (emoji, emoticons).

- http://en.wikipedia.org/wiki/Plane_%28Unicode%29#Supplementary_Multiling...
- http://mzsanford.wordpress.com/2010/12/28/mysql-and-unicode/
- http://mathiasbynens.be/notes/mysql-utf8mb4
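
To make "4-byte character" concrete, here is a small sketch (the helper name is made up) that detects any code point above U+FFFF, which is exactly the set that UTF-8 encodes in 4 bytes and that MySQL's 3-byte utf8 cannot store.

```php
<?php
// Hypothetical helper: TRUE if the UTF-8 string contains any code point
// above U+FFFF, i.e. a character that UTF-8 encodes in 4 bytes.
function contains_four_byte_chars($text) {
  return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}

$bmp = "\xE2\x82\xAC";       // U+20AC EURO SIGN, 3 bytes in UTF-8.
$emoji = "\xF0\x9F\x98\x80"; // U+1F600 GRINNING FACE, 4 bytes in UTF-8.

echo strlen($bmp), ' ', strlen($emoji), "\n";
var_dump(contains_four_byte_chars($bmp));
var_dump(contains_four_byte_chars($emoji));
```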

> it seems like the drawbacks clearly outweigh the benefits here.

That's sort of what I was driving at in #76: this seems like a lot of trouble, but as it stands now (as I understand it), Drupal is not really compatible with 4-byte character sets.

It looks like over in the WordPress community they're fighting with the same thing.

Comment #84

phayes commented 21 February 2013 at 05:48

My use case is that we host a lot of mathematical scientific journals, and a lot of high-order characters are used in this content to display math properly. I've been running the patch in #12 for about a year now in production with no problems.

Edit: Is there a reason why we don't just say we only support utf8mb4 on text fields (not varchar) and skip the whole key-length problem? It's not a 100% ideal solution, but it's better than providing no support for high-order characters.

Comment #85

Crell commented 21 February 2013 at 06:09

phayes: Mixed character set databases are a nightmare to manage and lead to all sorts of creative bugs unless you really know what you're doing. You may really know what you're doing, but I wager most people using MySQL don't know what a character set is, much less what happens when you try to mix them. :-)

Comment #86

damien tournoud commented 21 February 2013 at 10:04

Title:	UTF-8: fix data loss in 4 bytes characters bug using MySQL backend and limit to charsets that start with utf8	» UTF-8: support 4 bytes characters bug in MySQL
Category:	bug	» feature
Priority:	Major	» Normal
Status:	Needs review	» Needs work

> I tested it and had no problem altering a table to use a 100 char PK based on a 255 char VARCHAR column.

I don't know what it does exactly, but it's probably not desirable.

I'm going to reclassify this as a feature request. There is no data loss related to this since Drupal 7, so it is not a bug per se.

In addition, I don't think we can support this until we introduce support for an ASCII or binary charset and use it to tidy up our indexes and primary keys (especially machine names and UUIDs that have popped up all over the place in Drupal 8).

Comment #87

damien tournoud commented 21 February 2013 at 10:04

Title:	UTF-8: support 4 bytes characters bug in MySQL	» UTF-8: support 4 bytes characters in MySQL

Comment #88

sun commented 21 February 2013 at 12:19

#1923406: Use ASCII character set on alphanumeric fields so we can index all 255 characters

Comment #89

ergophobe commented 21 February 2013 at 19:59

Before I answer this I want to reemphasize - much of this is over my head in terms of both Drupal 8 internals and in terms of character sets (I only actually use French and English, so these issues don't affect me, I just ended up taking this on as somewhere to try to help with D8).
That said...

> Edit: Is there a reason why we don't just say we only support utf8mb4 on text fields (not varchar)

Not using any Asian languages myself, I don't know how common the extensions beyond the Basic Multilingual Plane are. Mostly what you see in the SMP are special uses (Math, emoticons, alchemical symbols), dead languages (Lycian, Gothic, Cuneiform) and the unified Han characters.

But it seems to me that if you're supporting these characters in TEXT, they should be supported in VARCHAR for the cases where

- someone wants to use such a character in a node title (could there be a situation where a mathematical symbol would appear in the title too?)
- someone wants to use one of the Asian characters in a username (e.g. it's part of their real name)

Wouldn't it be confusing to tell a user that if they enter text into the body of a post, it's all good, but if they copy and paste it into the title of that post, it will truncate their data as soon as it encounters the first 4-byte character?

BTW - support for these character sets is shaky at best in the world at large. The emoji, for example, are supported in Win8 fully in IE, Firefox and Opera, but not in Chrome or Safari. Support on the Mac seems to be a little less. My Ubuntu is busy upgrading, so I can't tell.

Also, I came across one article using a 4-byte character in the title - it rendered fine in the Google search results, but the link was not clickable.

Comment #90

pancho

UTC+2 🇪🇺 EU

commented 12 May 2013 at 12:14

Category:

feature

» task

#75, #79:
#1800122: Bump minimum version of php required to 5.3.10 has been committed, so we can use DSN now, instead of 'SET NAMES'.

#80:
From what I understand, the 767-byte (= 191 chars) InnoDB restriction isn't about index length but about the length of a (single) prefix.
Otherwise, key length may be up to 1000 B (= 250 chars) as restricted by MyISAM, see #1852896: Throw an exception if a schema defines a key that would be over 1000 bytes in MySQL.
We might need to test this again, unless someone can clarify.

#81:
Another use case is LTR tags, which allow English text to be correctly embedded in an RTL environment. We have bugs that can't be solved correctly without this, see #1165476: if t() string has no translation or fallback language, text should have lang attribute.

#86:
Agreed that we should avoid length prefixing on PKs; that would open up one more can of worms. All the more reason to go and shorten keys.
Plus: while it's certainly true that there is no data loss, that is the criterion for a critical bug, not for a bug in general. Still, I'm recategorizing only as a task, because it's not as if everybody would expect us to support 4-byte chars. But a "feature" this isn't either.

Comment #91

mgifford

he/him

English

commented 13 May 2013 at 11:52

Status:

Needs work

» Needs review

#70: database-1314214-70.patch queued for re-testing.

Comment #92

damien tournoud commented 13 May 2013 at 12:04

@Pancho: LTR and RTL are both 3-byte UTF-8 characters. The linked issue describes the use of language tags from U+E0000 to U+E007F, which are totally deprecated anyway.
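
The byte counts mentioned here are easy to check. A minimal sketch, assuming "LTR and RTL" refers to the directional marks U+200E and U+200F; the helper names are made up for illustration:

```python
# Illustrative only: encoded sizes of the code points discussed above.
LRM = "\u200e"               # LEFT-TO-RIGHT MARK (assumed to be the "LTR" character meant)
RLM = "\u200f"               # RIGHT-TO-LEFT MARK (assumed to be the "RTL" character meant)
LANGUAGE_TAG = "\U000e0001"  # a code point from the deprecated tag block U+E0000..U+E007F


def utf8_length(char: str) -> int:
    """Return the number of bytes needed to encode one character as UTF-8."""
    return len(char.encode("utf-8"))


def fits_three_byte_utf8(text: str) -> bool:
    """True if no character in the text needs more than 3 bytes in UTF-8."""
    return all(utf8_length(ch) <= 3 for ch in text)
```

The directional marks fit in 3 bytes, so they store fine in MySQL's 3-byte utf8; only the tag characters (like emoji) need the fourth byte.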

Comment #93

ergophobe commented 21 May 2013 at 19:46

@Pancho

1. key length versus prefix length.

My mistake - you are correct

By default, an index key for a single-column index can be up to 767 bytes. The same length limit applies to any index key prefix.

The InnoDB internal maximum key length is 3500 bytes, but MySQL itself restricts this to 3072 bytes. This limit applies to the length of the combined index key in a multi-column index.

-- http://dev.mysql.com/doc/refman/5.5/en/innodb-restrictions.html
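
To make the arithmetic behind the 255 and 191 character figures explicit, here is a small sketch. The byte limits come from the manual excerpt above; the function and constant names are hypothetical, and the per-character sizes are the worst-case widths of each charset:

```python
# Byte limits quoted from the MySQL 5.5 manual excerpt above.
SINGLE_COLUMN_LIMIT = 767
COMBINED_KEY_LIMIT = 3072

# Worst-case bytes per character for each charset.
MAX_BYTES_PER_CHAR = {"ascii": 1, "utf8": 3, "utf8mb4": 4}


def max_indexable_chars(charset: str, limit: int = SINGLE_COLUMN_LIMIT) -> int:
    """Return how many characters of the given charset fit within a byte limit."""
    return limit // MAX_BYTES_PER_CHAR[charset]


if __name__ == "__main__":
    for name in MAX_BYTES_PER_CHAR:
        print(name, max_indexable_chars(name))
```

So a VARCHAR(255) key fits under utf8 (255 x 3 = 765 bytes) but not under utf8mb4 (255 x 4 = 1020 bytes), which is where the 191 figure comes from.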

Comment #94

hanno commented 22 May 2013 at 11:53

Great work here. This patch could fix this issue.

It might also need a follow up. I think we should reconsider this as a bug as Drupal isn't fully utf-8 compliant.

This issue is not only theory, but also relevant for
- emoji, these emoticons are supported by Google and Apple on mobile phones and use the 4 bytes utf-8. Some applications even use emoji as part of the username
http://blog.manbolo.com/2011/12/12/supporting-ios-5-new-emoji-encoding
- mathematics used in scientific journals: http://drupal.stackexchange.com/questions/50868/configuring-drupal-to-us...

Some questions
- could we require MySQL >5.5.28 for Drupal 8?
- There is no workaround if people can't or don't use utf8mb4:
- The actual behavior with normal utf8 for user content is that your text is lost and no user feedback is given. I think we should reconsider this as a bug and we need to give user feedback.
- when importing data from external sources the import fails. Instead we could skip, escape, or replace these characters when sending to the database. This needs further research. See for example this bug when importing Tweet data: #1824506: When importing Tweets: SQLSTATE[HY000]: General error: 1366 Incorrect string value
- we could clarify in the system report which character set is used, and if utf8 is used with MySQL, show a warning that it isn't fully compatible.
- could we test if this bug also exists with other database systems like PostgreSQL, SQLite and Oracle?
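
As a rough illustration of the "skip, escape, or replace" idea for imports, the following sketch handles characters outside the Basic Multilingual Plane before they reach a 3-byte utf8 column. It is not taken from any patch in this issue; the function names and the choice of U+FFFD or numeric entities are assumptions:

```python
import re

# Code points above U+FFFF are the ones that need 4 bytes in UTF-8.
_SUPPLEMENTARY = re.compile("[\U00010000-\U0010FFFF]")


def replace_supplementary(text: str, replacement: str = "\ufffd") -> str:
    """Replace every 4-byte character; pass "" as replacement to skip them."""
    return _SUPPLEMENTARY.sub(lambda _match: replacement, text)


def escape_supplementary(text: str) -> str:
    """Escape every 4-byte character as a hexadecimal numeric entity."""
    return _SUPPLEMENTARY.sub(
        lambda match: "&#x{:X};".format(ord(match.group())), text
    )
```

Escaping keeps the information recoverable for HTML output, while replacing or skipping is lossy, so which one is acceptable depends on the field.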

Comment #95

damien tournoud commented 22 May 2013 at 16:53

As I said in #86:

In addition, I don't think we can support this until we introduce support for an ascii or binary charset and use it to tidy up our indexes and primary keys (especially machine names and UUIDs that have popped up all over the place in Drupal 8).

Increasing the size of all our indexes by 33% is not something we can afford. Let's open a separate issue for the tidying up and postpone this one on it.

- The actual behavior with normal utf8 for user content is that your text is lost and no user feedback is given. I think we should reconsider this as a bug and we need to give user feedback.

This should not happen. If it does, it's a bug elsewhere. The database layer triggers an exception, that's all that it does. Catching it and processing it belongs in the upper layers. If they don't do that correctly, it's a bug there, not in the database layer.

Comment #96

hanno commented 23 May 2013 at 21:16

This should not happen. If it does, it's a bug elsewhere.

Just checked with a fresh install, and pasting a 4-byte Unicode character into a post causes a crash. Will post the bug in a separate issue.
EDIT: #2002100: pasting text with 4 byte UTF-8 characters leads to broken screen

Comment #97

hanno commented 6 June 2013 at 08:10

Similar ticket and discussion in Wordpress:
http://core.trac.wordpress.org/ticket/21212 MySQL tables should use utf8mb4 character set
Is the solution Pento mentions in that ticket, using the innodb_large_prefix option, something to consider?

The 767 byte limit won't be increased any time soon, due to http://bugs.mysql.com/bug.php?id=32915

It turns out this is possible, thanks to the innodb_large_prefix option, introduced in 5.5.14:
http://dev.mysql.com/doc/refman/5.5/en/innodb-parameters.html#sysvar_innodb_large_prefix

So, the requirements are:

MyISAM tables:
MySQL >= 5.5.3 (Assuming we don't add any indexes larger than 250 characters.)

InnoDB tables:
MySQL >= 5.5.14
innodb_file_format=barracuda
innodb_file_per_table=true
innodb_large_prefix=true
All tables with ROW_FORMAT=(DYNAMIC|COMPRESSED)

Any other table formats:
*head explode*

We can certainly detect these settings, it's just a question of whether that kind of complex (and edge-case-y) test should be in core.
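
To give an idea of how small such a detection could be, here is a sketch of the InnoDB requirements listed above as a plain check. It assumes the variables have already been read into a dict (for example from SHOW VARIABLES output); the function name and the accepted value spellings are assumptions, not anything from core:

```python
# Settings listed in the quoted WordPress comment. Value spellings follow what
# SHOW VARIABLES typically reports; treat them as assumptions.
REQUIRED_VARIABLES = {
    "innodb_file_format": {"barracuda"},
    "innodb_file_per_table": {"on", "1", "true"},
    "innodb_large_prefix": {"on", "1", "true"},
}
MINIMUM_VERSION = (5, 5, 14)
LARGE_PREFIX_ROW_FORMATS = {"DYNAMIC", "COMPRESSED"}


def large_prefix_problems(server_version, variables, row_format):
    """Return the unmet requirements; an empty list means all are met."""
    problems = []
    if tuple(server_version) < MINIMUM_VERSION:
        problems.append("MySQL >= 5.5.14 required")
    for name, accepted in REQUIRED_VARIABLES.items():
        value = str(variables.get(name, "")).strip().lower()
        if value not in accepted:
            problems.append(name + " has value " + repr(value))
    if row_format.upper() not in LARGE_PREFIX_ROW_FORMATS:
        problems.append("ROW_FORMAT must be DYNAMIC or COMPRESSED")
    return problems
```

Returning the list of problems rather than a boolean would let a status report say which setting is missing.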

Comment #98

pfrenssen

Sofia

commented 17 October 2013 at 09:03

Component:

mysql database

» database system

Status	File	Size
new	database-1314214-98.patch	9.68 KB

Rerolled, wanted to try this out. These emoji characters are not going away any time soon it seems.

Comment #99

pfrenssen

Sofia

commented 17 October 2013 at 09:04

Component:

database system

» mysql db driver

Comment #101

pfrenssen

Sofia

commented 17 October 2013 at 09:15

This doesn't work any more after the database connections have become static in #1953800: Make the database connection serializable.

Comment #102

mgifford

he/him

English

commented 18 October 2013 at 17:55

So is there another way round this?

Comment #102.0

mgifford

he/him

English

commented 18 October 2013 at 17:55

Issue summary:

Updated issue summary, took out duplicate and integrated the added problem motivation

Comment #103

morgantocker commented 14 November 2013 at 14:38

For prior art: the way this problem was handled in MediaWiki was to store strings in VARBINARY and let the application handle character sets. This was a pre-MySQL 5.5 decision, and it's important to point out that it means no collation, which is the bigger deal.

With collation:
'Montréal' == 'MONTREAL'
+ a consistent sorting order is provided with multi-byte characters.

It may be an option to 'degrade' to VARBINARY for versions older than MySQL 5.5. I'll leave this one for discussion.
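
For readers unfamiliar with what is lost without collation, this sketch contrasts a byte-wise comparison (what VARBINARY gives) with a rough accent- and case-insensitive comparison. The folding below only approximates a "_ci" collation and is not MySQL's actual algorithm; the function names are made up:

```python
import unicodedata


def binary_equal(a: str, b: str) -> bool:
    """Compare the way a binary column would: byte for byte."""
    return a.encode("utf-8") == b.encode("utf-8")


def fold(text: str) -> str:
    """Strip combining accents and case as a crude stand-in for a collation."""
    decomposed = unicodedata.normalize("NFD", text)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.casefold()


def collation_like_equal(a: str, b: str) -> bool:
    """Compare after folding accents and case."""
    return fold(a) == fold(b)
```

With VARBINARY, the application would have to do this kind of folding itself for every comparison and sort.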

RE: Index size limited to 191 characters discussion
There is actually a workaround to this problem via prefix indexes. i.e.
ALTER TABLE my_table ADD INDEX text_col_index (text_col(190));

This isn't suitable for PRIMARY or UNIQUE indexes since it breaks a constraint, but it should work fine for others.

Comment #104

marco commented 8 May 2014 at 15:43

Status	File	Size
new	mysql_utf8mb4_support-1314214-104-drupal7.28-backport.patch.txt	6.37 KB

I've updated the patch for Drupal 7 from #30:
- to Drupal 7.28
- applying all the remarks, but limited to utf8 and utf8mb4 (no custom encodings)

To apply:
1) backup the database
2) apply the patch
3) add the settings to your settings.php file
4) run this to upgrade the text fields:

  // Re-apply the schema definition of every text field to upgrade it.
  $schema = drupal_get_schema();
  foreach ($schema as $table_name => $table) {
    $repair_needed = FALSE;
    foreach ($table['fields'] as $field_name => $field) {
      if ($field['type'] == 'text') {
        db_change_field($table_name, $field_name, $field_name, $field);
        $repair_needed = TRUE;
      }
    }
    // According to http://mathiasbynens.be/notes/mysql-utf8mb4 when upgrading
    // it's also necessary to repair and optimize the tables.
    if ($repair_needed) {
      // Wrap the table name in curly braces so that db_query() applies the
      // configured table prefix.
      db_query('REPAIR TABLE {' . $table_name . '}');
      db_query('OPTIMIZE TABLE {' . $table_name . '}');
    }
  }

Comment #105

cdeepan commented 27 May 2014 at 05:16

Dear All,

Thanks for the patch that changes the character set to utf8mb4 for columns with the Text data type. I believe this would allow supplementary characters to be added to the node body. However, I believe we will have issues if the user enters these supplementary characters in Menu Title, Node Title, Tags etc. Do we have any solution to take care of these fields as well?

Thanks & Regards,
Deepan Choudhary

Comment #106

chx commented 27 May 2014 at 09:12

We do not have a solution. See the patch:

+    // InnoDB indexes have a max of 767 bytes. This means we can't use 4-byte
+    // charsets on VARCHAR because there are VARCHAR-based indexes of 255 chars.

Comment #107

Antti J. Salminen commented 14 June 2014 at 11:10

What about the innodb_large_prefix and associated settings available in MySQL >5.5.14 mentioned in #97? Could this just be a manually enabled option and no autodetection or anything?

Comment #108

chx commented 15 June 2014 at 00:18

That would be swell. MySQL 5.1 was EOL'd on December 31, 2013. Maria will EOL 5.1 on 1 Feb 2015. Are we ready to raise the requirements to 5.5?

Edit: don't forget you need the Barracuda format for that, plus innodb_file_per_table. Perhaps we need to test creating a table with a long index? Put Barracuda in the table options?

Comment #109

Crell commented 15 June 2014 at 22:44

Given the number of other legacy versions we're dropping in Drupal 8 (PHP, IE, etc.) I would be OK with moving to MySQL 5.5 as the minimum if it bought us something. Would switching to MySQL 5.5 make this problem effectively go away? That's what it sounds like but I want to confirm...

Comment #110

chx commented 16 June 2014 at 00:59

Weeeeeell. If your MySQL 5.5 InnoDB is set to use innodb_file_per_table and uses the Barracuda innodb_file_format http://dev.mysql.com/doc/refman/5.5/en/innodb-parameters.html#sysvar_inn... then your problem goes away. Fun.

This is the default file format in MySQL 5.6 though (and also the default between MariaDB/MySQL 5.5.0 and 5.5.6 but since 5.5.7 has reverted back to Antelope).

Comment #111

morgantocker commented 16 June 2014 at 16:15

@Crell: Yes, it would allow you to use utf8mb4 safely, with a couple of other issues to solve. To summarize the now quite long comment thread:

In comment #10 phayes noted that using utf8mb4 creates a secondary issue - it restricts the maximum index size of InnoDB tables to 191 characters. In comment #97 Hanno noted that there is an option, innodb_large_prefix, to support larger indexes, but we should not assume all 5.5 users will be able to turn this on (InnoDB does not enable it by default for file-format backward compatibility).

There is a workaround suggested by ergophobe in comments #78 and #80, which is to use prefix indexes (index on the first 191 characters). There are few downsides to doing this: most strings will have enough selectivity in the first 191 characters. It may prevent 'covering index' optimizations, but I am not sure how many of these Drupal has.

Also as chx notes in comment #108, upstream has dropped support for MySQL 5.1 (released November 2008). The latest Ubuntu release supports MySQL 5.5 (2010) & 5.6 (2013). Red Hat 7.0 ships MariaDB 5.5. So I would +1 a suggestion to make 5.5 the minimum.

Comment #112

ergophobe commented 16 June 2014 at 02:22

@morgantocker - thanks for the concise summary of a super long discussion.

There is the question you mention of whether 191 chars is enough to provide specificity (it's certainly conceivable that, say, some ridiculously long numeric series could have all the significant information at the end).

But in addition to that, I can't figure out from the documents what happens if I have a 191-char prefix and I search for a 200-char string. Does it return if the first 191 match?
http://dev.mysql.com/doc/refman/5.5/en/mysql-indexes.html

Comment #113

damien tournoud commented 16 June 2014 at 04:17

Please keep in mind #95. Long indexes are bad, they use more memory, storage and are less efficient. The limit in InnoDB is there for a reason.

What needs to happen is that we need to stop using Unicode columns (and indexes) where they are not necessary. This is a comprehensive change that cannot be worked around by simply bumping database version requirements.

Comment #114

morgantocker commented 17 June 2014 at 05:56

@ergophobe: I have tested prefix length on various string types and various datasets before. If we take IMDB movie title names, the first 15 characters provide around 99% of the cardinality of an index on the full length. An edge case where this is less true is URLs, where uniqueness starts after the first 11 characters (http://www.). If the URLs all start with the same domain, this could be the first 50 characters... but I would say that's a less common case.
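
The kind of measurement described here can be sketched in a few lines: compare the number of distinct prefixes with the number of distinct full values. This is only an illustration with a hypothetical function name, not the tooling actually used for the IMDB test:

```python
def prefix_selectivity(values, prefix_length):
    """Return distinct prefixes divided by distinct full values (0.0 to 1.0)."""
    distinct_full = len(set(values))
    if distinct_full == 0:
        return 1.0  # An empty column loses nothing by being prefixed.
    distinct_prefixes = len({value[:prefix_length] for value in values})
    return distinct_prefixes / distinct_full


if __name__ == "__main__":
    urls = ["http://www.a.com", "http://www.b.com", "http://www.c.com"]
    for length in (11, 12, 191):
        print(length, prefix_selectivity(urls, length))
```

A ratio close to 1.0 means the prefix index distinguishes rows almost as well as an index on the full column.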

To answer your question:
In the event that MySQL can not filter rows via the index, it will filter at the row-level, so there is no risk of wrong results. It is simply a risk of following too many pointers from an index to a row, only to eliminate the need for it there (performance).
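
A toy model of that behaviour, for anyone who wants to see why results stay correct: the index only narrows the search to candidate rows, and the full value is re-checked on each row. This is a simplified simulation in plain Python, not how InnoDB is implemented; the class and method names are invented:

```python
from collections import defaultdict


class PrefixIndex:
    """Toy prefix index: buckets of row ids keyed by the first N characters."""

    def __init__(self, prefix_length):
        self.prefix_length = prefix_length
        self.rows = []                    # the "table": row id -> full value
        self.buckets = defaultdict(list)  # the "index": prefix -> row ids

    def insert(self, value):
        row_id = len(self.rows)
        self.rows.append(value)
        self.buckets[value[: self.prefix_length]].append(row_id)
        return row_id

    def find(self, needle):
        """Return (matching row ids, number of candidate rows examined)."""
        candidates = self.buckets.get(needle[: self.prefix_length], [])
        # Row-level filter: compare the full stored value, not just the prefix.
        matches = [rid for rid in candidates if self.rows[rid] == needle]
        return matches, len(candidates)
```

The second element of the return value is the cost: a poorly selective prefix means more rows are fetched only to be thrown away.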

@Damien: I find mixing and matching character sets to be a micro-optimization. I have done it before, and no longer recommend it. InnoDB internally uses variable-length storage for the utf8 character set, so from a storage perspective there is no added cost. In-memory temporary tables cannot do variable length, and may take more memory or perhaps convert to disk more frequently, but in the typical case I've found that Drupal queries are well optimized here, and I would be surprised to see a measurable difference.

The problem with shortening index prefixes is that you need to fully examine the dataset to determine the prefix length. And in many cases, Drupal users will have varying datasets. Long indexes are worst in the case of an index scan (select avg(x), count(x) etc.), so more attention could be paid to optimization here. In other cases, InnoDB will usually only load the index pages into memory as required, so it largely becomes a storage problem. The worst situation I could see is a micro-optimization that shortens the prefix length, after which some installations of Drupal cannot use the index because it is not selective.

Comment #115

ergophobe commented 16 June 2014 at 20:18

Thanks for the clarification - that's reassuring

Comment #116

mgifford

he/him

English

commented 1 October 2014 at 13:07

Comment #117

he/him

English

Vancouver

commented 28 November 2014 at 18:53

Contrib related issue, saving tweets with Emoji as mentioned in #98. Adding to related issues.

Comment #118

https://www.drupal.org/project/strip_utf8mb4

he/him

English

Vancouver

commented 1 December 2014 at 21:27

Adding another contrib related issue with emojis

Comment #119

morgantocker commented 21 January 2015 at 14:32

A little bit more MySQL version context on innodb_large_prefix:
- It was created so that MySQL downgrades would be possible (i.e. 5.5 -> 5.1)
- This purpose is now obsolete, since 5.5+ support large prefix (and 5.1 is no longer officially supported).
- We are proposing[1] to enable innodb_large_prefix by default in MySQL 5.7, since it is now a safe change.

So I realize it's a while out for many people, but the story should be simpler with MySQL 5.7.

[1] I have the call out on my blog here:
http://www.tocker.ca/2015/01/05/what-defaults-would-you-like-to-see-chan...

Comment #120

rajab natshah

he/him

commented 25 February 2015 at 10:49

Hi

Now we do have a workaround module as a fix for this issue.

Strip 4-byte UTF8

We do have an interface for it too.
Strip 4-byte UTF8 back-end configuration page.

Rewarding time working on this fix :)

Comment #121

yesct commented 2 April 2015 at 19:49

Issue tags:		+affects drupal.org
Related issues:		+#2463607: Drupal.org chokes on emojis

.

Comment #122

stefan.r commented 4 April 2015 at 00:04

Issue summary:	View changes
Status:	Needs work	» Postponed

Maybe relevant: Wordpress managed to fix this 2 months ago: https://core.trac.wordpress.org/ticket/21212#comment:15

Also, the next development release for MySQL 5.7 will be a Release Candidate and will likely have the following changes, supporting large enough indexes on utf8mb4 out of the box:

Important Change; InnoDB: The following changes were made to InnoDB configuration parameter default values:

The innodb_file_format default value was changed to Barracuda. The previous default value was Antelope. This change allows tables to use Compressed or Dynamic row formats.

The innodb_large_prefix default value was changed to ON. The previous default was OFF. When innodb_file_format is set to Barracuda, innodb_large_prefix=ON allows index key prefixes longer than 767 bytes (up to 3072 bytes) for tables that use a Compressed or Dynamic row format.

https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-7.html

Before we can support this though, if I can quote Damien Tournoud:

It's probably worth tidying up our schema (for example: machine name-type keys should probably use an ascii character set so as to reduce the size of the index), but that belongs in another issue.

Long indexes are bad, they use more memory, storage and are less efficient. The limit in InnoDB is there for a reason.

What needs to happen is that we need to stop using Unicode columns (and indexes) where they are not necessary. This is a comprehensive change that cannot be worked around by simply bumping database version requirements.

In addition, I don't think we can support this until we introduce support for an ascii or binary charset and use it to tidy up our indexes and primary keys (especially machine names and UUIDs that have popped up all over the place in Drupal 8).

Increasing the size of all our indexes by 33% is not something we can afford. Let's open a separate issue for the tidying up and postpone this one on it.

So anyone correct me if I'm wrong but I think this would have to be postponed on #1923406: Use ASCII character set on alphanumeric fields so we can index all 255 characters.

Comment #123

stefan.r commented 3 April 2015 at 23:48

Issue summary:

Further updating issue summary

Comment #124

stefan.r commented 4 April 2015 at 13:57

Issue summary:

Comment #125

stefan.r commented 4 April 2015 at 14:04

Issue summary:

We now have a patch over at #1923406: Use ASCII character set on alphanumeric fields so we can index all 255 characters so if anyone could chime in it might unblock this issue :)

Comment #126

stefan.r commented 16 April 2015 at 17:53

Title:	UTF-8: support 4 bytes characters in MySQL	» MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols)
Category:	Task	» Bug report
Issue summary:	View changes
Priority:	Normal	» Major

Considering how prevalent UTF-8 characters from the additional character planes are these days (emojis, mathematical characters etc) shouldn't this actually be considered a bug?

Now that we have a workable patch in the blocking issue, would something like this work for a patch for this issue:

- If using MySQL, detect in the installer whether MySQL supports utf8mb4, innodb_large_prefix=on and file format=Barracuda. If so:
- SET NAMES utf8mb4 in the connection object and use utf8mb4 instead of utf8 in the schema, as well as row format = DYNAMIC.
- If MySQL has other settings or an override was provided in settings.php, use utf8 instead of utf8mb4.

Comment #127

stefan.r commented 17 April 2015 at 14:23

Issue tags:

+drupaldevdays

So I've discussed this at the Dev Days and @Damien Tournoud and @pwolanin mentioned that after we get the ASCII issue in, this would actually be easily solved by raising the MySQL requirement to 5.5.3 and using 191 characters on the few remaining UTF-8 indexes that are left. We can then just use utf8mb4 by default in the MySQL driver, without worrying about InnoDB/row settings.

Comment #128

yannickoo

Berlin

commented 17 April 2015 at 14:31

I have created a new issue to increase the minimum required version of MySQL.

Comment #129

yannickoo

Berlin

commented 17 April 2015 at 14:59

Status:

Postponed

» Needs review

Status	File	Size
new	drupal-utf8mb4-1314214-129.patch	6.12 KB

This patch replaces all occurrences of utf8 with utf8mb4 when this is related to database encoding.

Comment #130

morgantocker commented 17 April 2015 at 15:03

Increasing the size of all our indexes by 33% is not something we can afford. Let's open a separate issue for the tidying up and postpone this one on it.

I've clarified in #1923406 where the increase in size will be (it's not a direct 33% increase). I will comment in #2473301 on raising the minimum version - it's quite a good solution imho.

Comment #133

stefan.r commented 17 April 2015 at 16:06

#129 seems close to what we want... After we fix:

#2473301: Raise MySQL requirement to 5.5.3
#1923406: Use ASCII character set on alphanumeric fields so we can index all 255 characters
Reducing index size to 191 characters on remaining non-binary/ascii indexed fields, or remove the index if it wasn't actually necessary.

Comment #134

stefan.r commented 20 April 2015 at 23:38

Status:	Needs review	» Needs work
Issue tags:		+Needs tests

1. This still needs a test to verify we store/display utf8mb4 characters correctly.

2. As soon as #1923406: Use ASCII character set on alphanumeric fields so we can index all 255 characters goes in it will also need to test for the standard utf8mb4 collation in SchemaTest.

3.

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Connection.php
@@ -85,27 +85,27 @@ public static function open(array &$connection_options = array()) {
 
     // Force MySQL to use the UTF-8 character set. Also set the collation, if a
-    // certain one has been set; otherwise, MySQL defaults to 'utf8_general_ci'
+    // certain one has been set; otherwise, MySQL defaults to 'utf8mb4_general_ci'
     // for UTF-8.

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Schema.php
@@ -101,22 +101,22 @@ protected function createTableSql($name, $table) {
     // By default, MySQL uses the default collation for new tables, which is

80 cols, s/UTF-8/utf8mb4/

Comment #135

hass commented 21 April 2015 at 15:12

Issue summary:

Added note about Drupal system requirements as todo

Comment #136

stefan.r commented 21 April 2015 at 21:35

That note can actually be removed, after #1923406: Use ASCII character set on alphanumeric fields so we can index all 255 characters goes in we can limit the remaining few UTF8 keys (if any) to 190 characters, that way we can also support installs without innodb_large_prefix.

Comment #137

stefan.r commented 21 April 2015 at 21:28

Issue summary:

Updated issue summary.

Comment #138

stefan.r commented 21 April 2015 at 21:29

Issue summary:

Comment #139

stefan.r commented 21 April 2015 at 21:35

Issue summary:	View changes
Issue tags:		+D8 upgrade path

Comment #140

he/him

English

Vancouver

commented 26 April 2015 at 06:19

This looks like a better option for D7: #2382707: UTF8MB4 for MySQL. Maybe we can re-open it to get some support there too?

Comment #141

he/him

English

Vancouver

commented 26 April 2015 at 06:21

The reason I say that is that here we are setting the default to utf8mb4, whereas there the default stays utf8 but a site has the option to change it if it wants to go through with it.

Comment #142

stefan.r commented 4 May 2015 at 14:28

Status:	Needs work	» Needs review
Issue tags:	-Needs tests

Status	File	Size
new	drupal-utf8mb4-1314214-142.patch	9.66 KB
new	interdiff-129-142.txt	6.8 KB

This patch adds a test, addresses comments in #134 and limits the following fields with unique constraints to 191 characters:

title/name of website providing a feed
description of block content field (widget is updated automatically)
URI of file field
username field (which is limited to 60 characters anyway)

If we have a problem with the URI field only allowing for 191 characters, we could also get rid of the database-level constraint and add it on the application level instead.

In core we don't use any non-ASCII primary keys anymore, nor do we need to explicitly define length for regular indexes in PHP, as MySQL will take care of that for us anyway; it will cut off regular indexes at 191 characters without innodb_large_prefix and at 255 characters with innodb_large_prefix.

Comment #144

stefan.r commented 4 May 2015 at 15:55

Status:

Needs work

» Needs review

Status	File	Size
new	interdiff-142-144.txt	2.45 KB
new	drupal-utf8mb4-1314214-144.patch	12.68 KB

This should fix some failing tests as we had skipped the test-only tables in #1923406: Use ASCII character set on alphanumeric fields so we can index all 255 characters

Comment #146

stefan.r commented 5 May 2015 at 10:45

Status:

Needs work

» Needs review

Status	File	Size
new	drupal-utf8mb4-1314214-146.patch	15.12 KB
new	interdiff-142-146.txt	4.56 KB

This should further decrease test failures by marking some more fields as ASCII.

Comment #148

stefan.r commented 5 May 2015 at 11:49

Status:

Needs work

» Needs review

Status	File	Size
new	drupal-utf8mb4-1314214-148.patch	15.44 KB
new	interdiff-142-148.txt	4.88 KB

Comment #150

stefan.r commented 5 May 2015 at 12:30

Status:

Needs work

» Needs review

Status	File	Size
new	interdiff-148-150.txt	552 bytes
new	interdiff-142-148.txt	4.88 KB

Missed another test field.

@joelpittet as to the D7 version of this patch, @catch had proposed in #2473301: Raise MySQL requirement to 5.5.3 that we allow people to use either utf8 or utf8mb4 in D7, if @David_Rothstein is OK with that.

Comment #151

stefan.r commented 5 May 2015 at 12:32

Status	File	Size
new	drupal-utf8mb4-1314214-150.patch	15.72 KB

Comment #153

stefan.r commented 5 May 2015 at 14:59

Status:

Needs work

» Needs review

Status	File	Size
new	interdiff-150-153.txt	5.33 KB

This removes the unique keys from feed title and block content info as reducing field size to 191 could break D6/D7 upgrades.

Comment #154

stefan.r commented 5 May 2015 at 14:58

Status	File	Size
new	drupal-utf8mb4-1314214-153.patch	18.77 KB

Comment #156

stefan.r commented 5 May 2015 at 16:20

Looks like this test run failed because #1923406: Use ASCII character set on alphanumeric fields so we can index all 255 characters had been reverted.

Whenever that gets back in I'll also remove the length cutoff from the URI field on File entities. Instead we can make it an ASCII field and urlencode it, as both removing the unique constraint and cutting off /all/ URLs at 191 characters seem like even worse options.

After we do that I think the patch is close to where it needs to be, this will just need some feedback from @benjy regarding how this will fit in with migrate.

Comment #157

stefan.r commented 5 May 2015 at 20:10

Status:	Needs work	» Needs review

Status	File	Size
new	interdiff-153-157.txt	3.75 KB
new	drupal-utf8mb4-1314214-157.patch	20.72 KB

This implements urlencoding on the storage layer for non-ASCII URLs, as it got too ugly trying to do this on the entity layer. DownloadTest has coverage for non-ASCII URLs.

This still looks a bit ugly to me. Maybe it would be better to do a query looking for file entities with the same URI in File::preSave() and we just lose the unique constraint?

Comment #158

stefan.r commented 5 May 2015 at 20:22

Actually I think it may be better if we add a uri hash field and put a unique constraint on that field instead?

Comment #160

stefan.r commented 5 May 2015 at 23:24

Status:	Needs work	» Needs review

Status	File	Size
new	drupal-utf8mb4-1314214-159.patch	22.89 KB
new	interdiff-157-159.txt	7.17 KB

This adds a hash as well as a test for the hash and the uniqueness constraint on the hash. The hashing still feels like a better solution than the urlencoding as it doesn't change what is stored in the URI field, as well as allowing us to save URLs longer than 255 characters (see #193954: {file}_uri and {file}_filename length limitations)

I think the new hash field would need a beta to beta upgrade hook though.

Comment #162

stefan.r commented 6 May 2015 at 07:07

Status:	Needs work	» Needs review

Status	File	Size
new	drupal-utf8mb4-1314214-162.patch	23 KB

Comment #163

fietserwin commented 6 May 2015 at 07:38

If we go for URI encoding, it should be done in the storage layer; the application layer should not have to be bothered with this kind of technical detail. So #157 was an improvement.

However, as URIs can indeed become incredibly long and only differ in some query parameter values after position 191, I think that using a hash might even be better. I suppose that performance wise, computing a hash is not that expensive. So continue with #160.
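
As a rough illustration of that point, here is a minimal PHP sketch. The two URIs and the `uri_hash()` helper are made up for this example; the helper only imitates a URL-safe base64-encoded SHA-256 hash and is not taken from the patch.

```php
<?php
// Hypothetical helper: URL-safe base64 of the raw SHA-256 digest.
function uri_hash(string $uri): string {
  $hash = base64_encode(hash('sha256', $uri, TRUE));
  return strtr(rtrim($hash, '='), '+/', '-_');
}

// Two made-up URIs that share their first 200 characters.
$common = 'public://' . str_repeat('a', 191);
$uri_a = $common . '?page=1';
$uri_b = $common . '?page=2';

// A value truncated to 191 characters cannot tell them apart.
var_dump(substr($uri_a, 0, 191) === substr($uri_b, 0, 191)); // bool(true)
// The hashes differ, so a unique key on the hash still separates them.
var_dump(uri_hash($uri_a) === uri_hash($uri_b)); // bool(false)
```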

+++ b/core/modules/migrate_drupal/src/Tests/Table/d6/System.php
@@ -6,7 +6,7 @@
  *
  * THIS IS A GENERATED FILE. DO NOT EDIT.
  *
- * @see cores/scripts/dump-database-d6.sh
+ * @see core/scripts/dump-database-d6.sh
  * @see https://www.drupal.org/sandbox/benjy/2405029
  */
 
@@ -26,7 +26,7 @@ public function load() {
       ),
       'fields' => array(
         'filename' => array(
-          'type' => 'varchar',
+          'type' => 'varchar_ascii',
           'not null' => TRUE,
           'length' => '255',
           'default' => '',

- Don't bother changing generated files.
- This seems like a D6 table, so it should not be changed anyway.
- The typo should be corrected in core\scripts\migrate-dump-d6.sh, but not as part of this issue.

Comment #165

stefan.r commented 6 May 2015 at 08:43

@fietserwin, thanks. The D6 table has to be changed as well, as it is being created as a utf8mb4 table with a primary key on the filename field. If we don't change it, it only allows 191 characters. This is being changed both in the generated script (so tests can pass) and in the generating script, but indeed the typo won't be picked up :)

Not sure why this test run is marked as failed, in the test log it just keeps terminating in the middle of the node tests, but it doesn't say which specific test is breaking. I'll log a ticket in the testbot queue.

Comment #166

stefan.r commented 6 May 2015 at 14:48

Status:	Needs work	» Needs review

Status	File	Size
new	drupal-utf8mb4-1314214-165.patch	21.55 KB

A few nitpicks in the meantime.

Comment #168

6 May 2015 at 12:11

Status:	Needs work	» Needs review

stefan.r queued 166: drupal-utf8mb4-1314214-165.patch for re-testing.

Comment #169

stefan.r commented 6 May 2015 at 13:24

Apparently this may be a testbot problem as opposed to a problem in the patch: #2477583: Unexpected timeouts on testbot runs

I'll just keep re-queueing this patch then

Comment #170

stefan.r commented 6 May 2015 at 14:44

Issue summary:

3 files were hidden/shown/deleted

Status	File	Size
hidden	drupal-utf8mb4-1314214-159.patch	22.89 KB
hidden	interdiff-157-159.txt	7.17 KB
hidden	drupal-utf8mb4-1314214-162.patch	23 KB

Ah tests are green now, #166 is not that different to the previous patches so this may have been a testbot problem after all.

Updating issue summary.

Comment #171

stefan.r commented 6 May 2015 at 18:06

Issue summary:

5 files were hidden/shown/deleted

Status	File	Size
hidden	drupal-utf8mb4-1314214-150.patch	15.72 KB
hidden	interdiff-150-153.txt	5.33 KB
hidden	drupal-utf8mb4-1314214-153.patch	18.77 KB
hidden	interdiff-153-157.txt	3.75 KB
hidden	drupal-utf8mb4-1314214-157.patch	20.72 KB

Updating issue summary to clarify why we can lose the DB-level unique constraints on block description and feed title.

The uniqueness validation is just there because there's no way to distinguish between blocks/feeds in the UI other than by description, and we already have entity validators for this, nor is it a big deal if we still get two blocks with the same description in some edge case. All it'd be is an inconvenience, and workarounds exist (such as editing the description).

Just to add to this, the patch is MySQL only, but there are two tests which may affect PostgreSQL / SQLite: NodeViewTest and SaveTest. For now @amateescu has confirmed that SaveTest works with SQLite; the other 3 cases are yet to be confirmed.

Comment #172

amateescu commented 6 May 2015 at 19:20

Just tested now and both tests are fine on SQLite and Postgres.

Comment #173

stefan.r commented 7 May 2015 at 10:29

Status	File	Size
new	interdiff-165-173.txt	513 bytes
new	1314214-173.patch	22.23 KB

Thanks @amateescu! That also proves the PostgreSQL and SQLite drivers aren't affected by this issue and already support full UTF-8.

Having done another review I'm happy with this patch now. Tiny coding standards fix attached.

Just waiting for one of the database people to have a look at this and for @benjy to check this will play well with migrate.

Comment #174

stefan.r commented 6 May 2015 at 20:14

Status	File	Size
new	1314214-174.patch	21.56 KB
new	interdiff-165-174.txt	610 bytes

3 files were hidden/shown/deleted

Status	File	Size
hidden	drupal-utf8mb4-1314214-165.patch	21.55 KB
hidden	interdiff-165-173.txt	513 bytes
hidden	1314214-173.patch	22.23 KB

Oops

Comment #176

benjy commented 7 May 2015 at 01:42

This all looks fine to me from a migrate POV.

Comment #177

fietserwin commented 7 May 2015 at 09:29

Issue summary:	View changes
Status:	Needs review	» Needs work

Added "font icons" to the list of lacking support and in text references to prerequisite issues.

+++ b/core/modules/aggregator/src/FeedStorageSchema.php
@@ -33,7 +33,7 @@ protected function getSharedTableFieldSchema(FieldStorageDefinitionInterface $st
-          $this->addSharedTableFieldUniqueKey($storage_definition, $schema);
+          $this->addSharedTableFieldIndex($storage_definition, $schema);
           break;

1. Pass TRUE for the 3rd param (not_null)? (Unique fields are not null by default, but for other fields you have to pass that explicitly.)

+++ b/core/modules/file/src/FileStorageSchema.php
@@ -30,6 +30,9 @@ protected function getSharedTableFieldSchema(FieldStorageDefinitionInterface $st
+          $this->addSharedTableFieldIndex($storage_definition, $schema, TRUE);
+          break;
+        case 'uri_hash':
           $this->addSharedTableFieldUniqueKey($storage_definition, $schema, TRUE);
           break;

2. addSharedTableFieldUniqueKey() does not accept a 3rd param (it is always not null).

+++ b/core/modules/file/src/FileStorageSchema.php
@@ -30,6 +30,9 @@ protected function getSharedTableFieldSchema(FieldStorageDefinitionInterface $st
+        case 'uri_hash':
           $this->addSharedTableFieldUniqueKey($storage_definition, $schema, TRUE);
           break;
       }

3. Idem.

+++ b/core/modules/file/src/Tests/SaveTest.php
@@ -67,10 +70,25 @@ function testFileSave() {
+      $this->assertTrue($exception_triggered, 'SQL uniqueness constraint is triggered');
+    }

4. Warning: Variable 'exception_triggered' might not have been defined.
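
A minimal sketch of how such a warning is usually avoided: define the flag before the try block, so that it exists even when nothing is thrown. The `save_duplicate()` function is a hypothetical stand-in for the second save in the test, not code from the patch.

```php
<?php
// Hypothetical stand-in for saving a second file with an existing URI.
function save_duplicate(): void {
  throw new RuntimeException('Duplicate entry');
}

// Initialise the flag first so it is defined even if nothing is thrown.
$exception_triggered = FALSE;
try {
  save_duplicate();
}
catch (RuntimeException $e) {
  $exception_triggered = TRUE;
}
var_dump($exception_triggered); // bool(true)
```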

Comment #178

stefan.r commented 7 May 2015 at 10:15

Status	File	Size
new	1314214-178.patch	21.75 KB
new	interdiff-174-178.txt	1.85 KB

Thanks for the review and the issue summary update @fietserwin.

Actually 2 and 3 may be referring to the same line of code? Or maybe to the code we are removing, as it is currently unnecessarily adding a 3rd parameter in HEAD as well:

- $this->addSharedTableFieldUniqueKey($storage_definition, $schema, TRUE);

As to 4, I have added a variable definition so we don't trigger a notice whenever the assertion fails.

Comment #179

stefan.r commented 7 May 2015 at 10:05

Status:	Needs work	» Needs review

Comment #180

stefan.r commented 7 May 2015 at 10:10

Status	File	Size
new	1314214-180.patch	21.75 KB
new	interdiff-178-180.txt	605 bytes

Had left an error in #178.

This is ready for further review now :)

Comment #182

fietserwin commented 8 May 2015 at 09:53

- I would have assigned FALSE to $constraint_triggered, so that it is a boolean and only a boolean, but I'm fine with this patch.
- 2 and 3 were indeed the same, my fault.

RTBC for me, but what about the 'D8 upgrade path' tag, does this patch need a hook_update?

Comment #183

stefan.r commented 8 May 2015 at 10:17

Well assertTrue will never consider NULL a test pass, but indeed FALSE would have been more readable. If this goes back to "Needs work" at some point, I'll try to remember to fix.

As this adds a field on the File entity and changes a few unique keys, indeed this will need an update hook at some point. But as we don't yet support beta to beta upgrades, that's out of scope for this issue :)

Comment #184

andypost commented 10 May 2015 at 12:31

5 files were hidden/shown/deleted

Status	File	Size
hidden	1314214-174.patch	21.56 KB
hidden	interdiff-165-174.txt	610 bytes
hidden	1314214-178.patch	21.75 KB
hidden	interdiff-174-178.txt	1.85 KB
hidden	interdiff-178-180.txt	605 bytes

Just 2 nits

+++ b/core/modules/block_content/src/Entity/BlockContent.php
@@ -22,7 +22,7 @@
- *     "storage_schema" = "Drupal\block_content\BlockContentStorageSchema",
+ *     "storage_schema" = "Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema",

1. This is a default, just remove the line.

+++ b/core/modules/file/src/Entity/File.php
@@ -194,7 +195,10 @@ public static function preCreate(EntityStorageInterface $storage, array &$values
+    // Save the hash of the URI so we can enforce uniqueness.
+    $this->get('uri_hash')->value = Crypt::hashBase64($uri);

@@ -252,6 +256,15 @@ public static function baseFieldDefinitions(EntityTypeInterface $entity_type) {
+      // As this is a base64-encoded SHA-256 hash, we can limit the length to
+      // 100 characters.
+      ->setSetting('max_length', 100)

2. Why 100? sessions.sid uses 128 but hash('sha256', $data) is 256 bits so char(64).

Comment #185

stefan.r commented 10 May 2015 at 14:30

As to 2, it's a base64'd hash which adds another ~33% to the 64 characters, so I just rounded up to 100.

strlen(hash('sha256', 'hello')) == 64;
strlen(base64_encode(hash('sha256', 'hello'))) == 88;
strlen(base64_encode(hash('sha256', 'hello', TRUE))) == 44;

But I just noticed hashBase64 uses raw binary data, so actually ~44 chars may have been enough in this case. Let's set it to 128 though just to be consistent with what we use for sessions.sid and as I think this discussion is out of scope for this patch.

Our max_lengths make little sense in general, usually we set it to 255 or 128 for no reason and we define our fixed-length columns as varchar, so maybe let's discuss that, along with limiting both columns to ~44 characters in a separate followup issue?
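
The lengths quoted above can be checked with plain PHP. The last line is an assumption on top of the comment: it imitates a URL-safe variant that strips the `=` padding, which would leave 43 characters rather than 44.

```php
<?php
$data = 'hello';
$hex = hash('sha256', $data);        // Hex digest.
$raw = hash('sha256', $data, TRUE);  // 32 raw bytes.

echo strlen($hex), "\n";                            // 64
echo strlen(base64_encode($hex)), "\n";             // 88
echo strlen(base64_encode($raw)), "\n";             // 44
echo strlen(rtrim(base64_encode($raw), '=')), "\n"; // 43
```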

Comment #186

stefan.r commented 10 May 2015 at 14:19

Status	File	Size
new	1314214-185.patch	21.57 KB
new	interdiff-180-185.txt	1.32 KB

Comment #187

13 May 2015 at 13:44

stefan.r queued 186: 1314214-185.patch for re-testing.

Comment #188

stovak commented 13 May 2015 at 18:37

Created a d7 backport issue. https://www.drupal.org/node/2488180

Comment #189

stovak commented 17 May 2015 at 00:07

Comment #190

fietserwin commented 18 May 2015 at 05:39

Status:	Needs review	» Reviewed & tested by the community

With the points raised in #177 and #184 solved or explained, this is OK now.

Comment #192

fietserwin commented 18 May 2015 at 07:25

Patch no longer applies, so needs a reroll. In that case: can you also cover my 1st point from #182?

Comment #193

anavarre commented 18 May 2015 at 07:36

Issue tags:	+Needs reroll

Comment #194

stefan.r commented 18 May 2015 at 09:42

Issue tags:	-Needs reroll

Status	File	Size
new	interdiff-185-194.txt	567 bytes
new	1314214-194.patch	21.51 KB

3 files were hidden/shown/deleted

Status	File	Size
hidden	1314214-180.patch	21.75 KB
hidden	1314214-185.patch	21.57 KB
hidden	interdiff-180-185.txt	1.32 KB

re-rolled

Comment #195

fietserwin commented 18 May 2015 at 10:20

Status:	Needs work	» Needs review

Comment #196

stefan.r commented 18 May 2015 at 11:25

Thanks @fietserwin for the review. Patch is green again.

Comment #197

fietserwin commented 18 May 2015 at 12:15

Status:	Needs review	» Reviewed & tested by the community

As per #190, as there are no real changes in this patch compared to #186.

Comment #198

catch commented 19 May 2015 at 11:15

I've given this a once over and I think it's OK.

Slightly concerned about the filename base64 encoding for uniqueness, but don't have a better idea (i.e. I think that's probably better than shortening the column). Want to think about that a little bit more before committing though.

Comment #199

stefan.r commented 19 May 2015 at 11:48

Thanks. What specifically is a worry about the hashing? Having a column on the database that really doesn't "mean" anything?

Just to clarify for anyone reading this, the hash is a base64-encoded sha256 hash -- not just base64-encoded.

Comment #200

catch commented 19 May 2015 at 12:16

Status:	Reviewed & tested by the community	» Needs review

Thanks. What specifically is a worry about the hashing? Having a column on the database that really doesn't "mean" anything?

Yes exactly. We're changing the schema for something that's an InnoDB-specific limitation.

For example we avoided doing similar in #83738: LOWER(name) queries perform badly; add column name_lower and index. and eventually went with #279851: Replace LOWER() with db_select() and LIKE() where possible. An example of where the additional column could go wrong is if someone directly updated a filename in the database: the hash would then no longer match. We don't support manually updating things in the database outside of update handlers, but it's not impossible that an update handler would need to update some filenames, and then it would additionally have to base64-hash those, which is easy to forget. Typing this out is increasing my unease, so back to CNR for more opinions I think.
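
A minimal sketch of the stale-hash scenario described above. The `$row` array is a made-up stand-in for a database row that stores a hash column next to the URI; the plain hex SHA-256 is used only to keep the example short.

```php
<?php
// Hypothetical row in a file table that stores a hash next to the URI.
$row = ['uri' => 'public://old-name.txt'];
$row['uri_hash'] = hash('sha256', $row['uri']);

// A direct update that changes the URI but forgets the hash column.
$row['uri'] = 'public://new-name.txt';

// The stored hash no longer matches the URI it is supposed to describe.
var_dump($row['uri_hash'] === hash('sha256', $row['uri'])); // bool(false)
```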

One option would be to not fix this particular column in this issue, and revive efforts to get transliteration for filenames into core again. Or at least to split it out into its own issue.

As well as that, we could use sqlite and postgres test runs before committing this.

Comment #201

stefan.r commented 19 May 2015 at 12:28

SQLite and PostgreSQL test runs were done by @amateescu in #172, the patch is not fundamentally different.

Maybe we could add a trigger that hashes the field instead of doing this on the PHP level?

"Not fixing" this column would imply we limit the length to 191 characters, which would mean migrations have that as a "known issue" until we get in transliteration. I had also tried implementing URL encoding in an earlier patch (#157) but didn't really like the looks of it; similar concerns would apply if people bypass core APIs and change the column directly.

Comment #202

morgantocker commented 19 May 2015 at 12:35

@stefan.r - I don't recommend using triggers. Prior to MySQL 5.7, there is one trigger per-event per-table (event: after insert, before insert etc.). Many DBAs require triggers to be able to perform schema changes and migrations. If the application also requires triggers, they won't be able to do this.

Triggers in Amazon RDS are also a pain:
https://www.percona.com/blog/2014/07/02/using-mysql-triggers-and-views-i...

Comment #203

stefan.r commented 19 May 2015 at 12:46

Then probably our best bet is to take out the hashing, limit to 191 and reopen the filename transliteration issue so we can increase the limit back to 255.

Alternatively, we take out the unique constraint (temporarily) on the database level. We could still have a Drupal-level unique constraint for filenames.

Comment #204

fietserwin commented 19 May 2015 at 14:58

Status:	Needs review	» Needs work

#200:

We don't support manually updating things in the database outside of update handlers

#203:

Alternatively, we take out the unique constraint (temporarily) on the database level. We could still have a Drupal-level unique constraint for filenames.

I think that, in terms of a solution, both of you are saying the same thing here ... so drop that column and only check on application level.

Comment #205

stefan.r commented 19 May 2015 at 17:31

Status:	Needs work	» Needs review

Status	File	Size
new	interdiff-194-205.txt	5.62 KB
new	1314214-205.patch	20.54 KB

This takes out the database-level unique constraint and adds a Symfony constraint on the URI field instead.

The database-level constraint seems useful though, so maybe we should have another look into transliterating filenames so we can bring it back as soon as that's in?

Comment #206

fietserwin commented 20 May 2015 at 07:35

Status:	Needs review	» Needs work

3 minors:

+++ b/core/modules/file/src/Entity/File.php
@@ -252,7 +252,8 @@ public static function baseFieldDefinitions(EntityTypeInterface $entity_type) {
+      ->addConstraint('FileUri', []);

I wouldn't pass in the 2nd param and rely on the default (being NULL, not an empty array).

```
+++ b/core/modules/file/src/Plugin/Validation/Constraint/FileUriConstraint.php
@@ -0,0 +1,31 @@
+class FileUriConstraint extends Constraint {
+
```
Looking a bit around regarding the naming of constraints, I found:
- UserNameUnique
- UserMailUnique
- UserNameConstraint
- LinkAccess

So it looks like the word Constraint is not added to the end when the constraint name is self-explanatory. This would suggest FileUriUnique as id and class name.
```
+++ b/core/modules/file/src/Plugin/Validation/Constraint/FileUriConstraint.php
@@ -0,0 +1,31 @@
+  public $message = 'A file with this URI %value already exists. Enter a unique file URI.';
+
```
... this URI %value ... does not sound correct to me, but I'm not a native English speaker. I would either:
- Change 'this' to 'the', or
- (my preference) Use something like (comparing to usernameunique and usermailunique): 'The file %value already exists. Enter a unique file URI.'.

If you change this, also change the test...

Comment #207

stefan.r commented 20 May 2015 at 14:43

Status:	Needs work	» Needs review

Status	File	Size
new	1314214-207.patch	20.27 KB
new	interdiff-205-207.txt	2.83 KB

Comment #209

stefan.r commented 20 May 2015 at 14:49

Status:	Needs work	» Needs review

Status	File	Size
new	1314214-208.patch	20.27 KB
new	interdiff-207-208.txt	539 bytes

Before RTBCing, can we confirm no new tests should fail in PostgreSQL/SQLite, either through a new test run or a manual code review comparing with the patch that had already been tested by @amateescu (drupal-utf8mb4-1314214-165.patch)?

This would also need a follow-up issue where we discuss transliteration in core so we can put back the database-level unique constraint as soon as that's back in.

Comment #210

stefan.r commented 20 May 2015 at 15:11

Follow-up issue created at #2492171: Provide options to sanitize filenames (transliterate, lowercase, replace whitespace, etc)

Comment #211

fietserwin commented 20 May 2015 at 16:29

I'm not sure about the id used for the FileUriUnique constraint (FileUri). Other constraints around in D8 seem to use the class name as id, only chopping off Constraint (but not Unique or Access). So that would be a final correction to the patch if the postgres and sqlite tests do not uncover anything else.

Leaving at NR, as the tests may run without that final change, and to invite others to give this patch a final review.

Comment #212

stefan.r commented 20 May 2015 at 16:41

Status	File	Size
new	interdiff-208-212.txt	1.01 KB
new	1314214-212.patch	20.28 KB

Let's be consistent with the rest of core then :)

Comment #214

21 May 2015 at 01:28

Status:	Needs work	» Needs review

isntall queued 212: 1314214-212.patch for re-testing.

Comment #215

fietserwin commented 21 May 2015 at 08:44

Thanks, perfect. The (transliteration) follow-up has also been created, so RTBC for me. @amateescu: are you also OK with this patch?

Comment #216

23 May 2015 at 14:02

bzrudi71 queued 212: 1314214-212.patch for re-testing.

Comment #217

23 May 2015 at 14:32

Status:	Needs review	» Needs work

The last submitted patch, 212: 1314214-212.patch, failed testing.

Comment #218

stefan.r commented 23 May 2015 at 17:52

Status:	Needs work	» Needs review

Status	File	Size
new	1314214-218.patch	20.4 KB
new	interdiff-212-218.txt	342 bytes

This fixes the latest test failure as the generated Migrate files now have an MD5 checksum. Just for the record, benjy had OK'ed this in #176.

Comment #219

stefan.r commented 23 May 2015 at 15:32

Issue summary:

Updating issue summary to reflect latest changes.

Comment #220

stefan.r commented 26 May 2015 at 08:40

@bzrudi71 did test runs on PostgreSQL/SQLite this weekend:

Hi Stefan,

good news! No problems with the patch for PG and SQLite :) The only fail is the MigrateTableDumpTest that now also fails for current patch on MySQL…

So, RTBC :)

Cheers Rudi

The MigrateTableDumpTest failure was fixed in #218

Comment #221

fietserwin commented 26 May 2015 at 18:21

Status:	Needs review	» Reviewed & tested by the community

So, RTBC :)

Comment #222

stefan.r commented 26 May 2015 at 18:40

Status	File	Size
new	interdiff-185-218.txt	5.93 KB

Thanks.

Just for reference, here's an interdiff with the previous RTBC patch that had been reviewed by @catch.

Comment #223

alexpott commented 1 June 2015 at 18:12

Assigned:	Unassigned	» catch

Afaics @catch's concerns have been addressed. I'm going to assign it to him rather than commit myself given @catch's prior involvement.

Comment #224

1 June 2015 at 23:24

Status:	Reviewed & tested by the community	» Needs work

The last submitted patch, 218: 1314214-218.patch, failed testing.

Comment #225

2 June 2015 at 07:36

stefan.r queued 218: 1314214-218.patch for re-testing.

Comment #226

stefan.r commented 2 June 2015 at 08:25

Status:	Needs work	» Reviewed & tested by the community

back to rtbc

Comment #227

alexpott commented 2 June 2015 at 08:26

@stefan.r when you requeue a patch because you think the fail is unrelated, it is important to comment on the issue to say what failed and why it is unrelated. This helps us to not introduce new random fails.

Comment #228

stefan.r commented 2 June 2015 at 08:38

The failure was "Repository checkout: failed to checkout from [git://git.drupal.org/project/drupal.git].", the same error as was appearing on other issues.

Comment #229

2 June 2015 at 09:14

catch committed b105158 on 8.0.x

Issue #1314214 by stefan.r, phayes, ergophobe, YesCT, damienwhaley, Tor...

Comment #230

catch commented 2 June 2015 at 09:15

Version:	8.0.x-dev	» 7.x-dev
Assigned:	catch	» Unassigned
Status:	Reviewed & tested by the community	» Patch (to be ported)

So dropping the unique database constraint isn't great, but given #2492171: Provide options to sanitize filenames (transliterate, lowercase, replace whitespace, etc) is the only viable way to add it back properly I think that's fine.

Committed/pushed to 8.0.x, thanks!

I'm not sure how viable this is to backport to 7.x, but, as with varchar_ascii, moving it there for discussion.

Comment #231

alexpott commented 2 June 2015 at 15:46

This patch is causing problems for me - I can't install Drupal 8 anymore.

Drupal\Core\Database\DatabaseExceptionWrapper: SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes: CREATE TABLE {router} (
`name` VARCHAR(255) CHARACTER SET ascii COLLATE ascii_general_ci NOT NULL DEFAULT '' COMMENT 'Primary Key: Machine name of this route',
`path` VARCHAR(255) NOT NULL DEFAULT '' COMMENT 'The path for this URI',
`pattern_outline` VARCHAR(255) NOT NULL DEFAULT '' COMMENT 'The pattern',
`fit` INT NOT NULL DEFAULT 0 COMMENT 'A numeric representation of how specific the path is.',
`route` LONGBLOB DEFAULT NULL COMMENT 'A serialized Route object',
`number_parts` SMALLINT NOT NULL DEFAULT 0 COMMENT 'Number of parts in this router path.',
PRIMARY KEY (`name`),
INDEX `pattern_outline_fit` (`pattern_outline`, `fit`)
) ENGINE = InnoDB DEFAULT CHARACTER SET utf8mb4 COMMENT 'Maps paths to various callbacks (access, page and title)'; Array
(
)
in db_create_table() (line 435 of /Volumes/devdisk/dev/sites/drupal8alt.dev/core/includes/database.inc).

Comment #232

arla commented 2 June 2015 at 16:02

I get the same error as @alexpott, with MySQL 5.6.24 (and 5.6.21).

Probably related (from https://news.ycombinator.com/item?id=7317519):

InnoDB limits index columns to 767 bytes. Why is this suddenly an issue? Because changing the charset also changes the number of bytes needed to store a given string. With MySQL’s utf8 charset, each character could use up to 3 bytes. With utf8mb4, that goes up to 4 bytes. If you have an index on a 255 character column, that would be 765 bytes with utf8. Just under the limit. Switching to utf8mb4 increases that index column to 1020 bytes (4 * 255).
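
The arithmetic in that quote can be reproduced with a short PHP sketch. The 767-byte limit and the bytes-per-character figures come from the quote above (ascii at 1 byte is added for comparison); the variable names are only illustrative. It also shows where the recurring figure of 191 characters comes from.

```php
<?php
// InnoDB's per-column index limit quoted above, in bytes.
const MAX_INDEX_BYTES = 767;
// Maximum bytes per character for each MySQL character set.
$bytes_per_char = ['ascii' => 1, 'utf8' => 3, 'utf8mb4' => 4];

foreach ($bytes_per_char as $charset => $bytes) {
  // Bytes needed to index a full VARCHAR(255) column.
  $needed = 255 * $bytes;
  // Longest prefix, in characters, that fits under the limit.
  $max_chars = intdiv(MAX_INDEX_BYTES, $bytes);
  printf("%s: VARCHAR(255) needs %d bytes, at most %d chars fit\n", $charset, $needed, $max_chars);
}
// ascii: VARCHAR(255) needs 255 bytes, at most 767 chars fit
// utf8: VARCHAR(255) needs 765 bytes, at most 255 chars fit
// utf8mb4: VARCHAR(255) needs 1020 bytes, at most 191 chars fit
```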

Comment #233

catch commented 2 June 2015 at 16:02

Version:	7.x-dev	» 8.0.x-dev
Status:	Patch (to be ported)	» Needs work

Reverted.

From irc this is affecting people who use actual Oracle MySQL. MariaDB seems fine.

Comment #234

2 June 2015 at 16:07

catch committed 537da55 on 8.0.x

Revert "Issue #1314214 by stefan.r, phayes, ergophobe, YesCT,...

Comment #235

morgantocker commented 2 June 2015 at 16:09

> From irc this is affecting people who use actual Oracle MySQL. MariaDB seems fine.

@catch: This probably has to do with default SQL mode selection. I would be surprised if with sql mode strict, they behave differently.

Comment #236

stefan.r commented 2 June 2015 at 16:57

Do we know what the problem here is yet?

I don't have access to any Oracle MySQL installs right now, maybe anyone could test whether the first statement gives an error and the second doesn't? Just in case the problem is the index.

CREATE TABLE test (
`name` VARCHAR(255) CHARACTER SET ascii COLLATE ascii_general_ci NOT NULL DEFAULT '' COMMENT 'Primary Key: Machine name of this route',
`path` VARCHAR(255) NOT NULL DEFAULT '' COMMENT 'The path for this URI',
`pattern_outline` VARCHAR(255) NOT NULL DEFAULT '' COMMENT 'The pattern',
`fit` INT NOT NULL DEFAULT 0 COMMENT 'A numeric representation of how specific the path is.',
`route` LONGBLOB DEFAULT NULL COMMENT 'A serialized Route object',
`number_parts` SMALLINT NOT NULL DEFAULT 0 COMMENT 'Number of parts in this router path.',
PRIMARY KEY (`name`),
INDEX `pattern_outline_fit` (`pattern_outline`, `fit`)
) ENGINE = InnoDB DEFAULT CHARACTER SET utf8mb4 COMMENT 'Maps paths to various callbacks (access, page and title)';

CREATE TABLE test2 (
`name` VARCHAR(255) CHARACTER SET ascii COLLATE ascii_general_ci NOT NULL DEFAULT '' COMMENT 'Primary Key: Machine name of this route',
`path` VARCHAR(255) NOT NULL DEFAULT '' COMMENT 'The path for this URI',
`pattern_outline` VARCHAR(255) NOT NULL DEFAULT '' COMMENT 'The pattern',
`fit` INT NOT NULL DEFAULT 0 COMMENT 'A numeric representation of how specific the path is.',
`route` LONGBLOB DEFAULT NULL COMMENT 'A serialized Route object',
`number_parts` SMALLINT NOT NULL DEFAULT 0 COMMENT 'Number of parts in this router path.',
PRIMARY KEY (`name`)
) ENGINE = InnoDB DEFAULT CHARACTER SET utf8mb4 COMMENT 'Maps paths to various callbacks (access, page and title)';

Comment #237

alexpott commented 2 June 2015 at 17:16

INDEX `pattern_outline_fit` (`pattern_outline`, `fit`) causes the problem

Comment #238

stefan.r commented 2 June 2015 at 17:44

@morgantocker there is no command we can send to Oracle MySQL to make it cut off indexes at 191 characters, like MariaDB does? I guess we'll just have to define index length explicitly?

Comment #239

morgantocker commented 2 June 2015 at 18:24

@stefan.r: yes. It is related to my earlier comment about differences in default sql mode. See this test case:

mysql56> ALTER TABLE test2 ADD INDEX `pattern_outline_fit` (`pattern_outline`, `fit`);
ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
mysql56> set sql_mode='';
Query OK, 0 rows affected (0.00 sec)

mysql56> ALTER TABLE test2 ADD INDEX `pattern_outline_fit` (`pattern_outline`, `fit`);
Query OK, 0 rows affected, 1 warning (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 1

mysql56> show warnings;
+---------+------+---------------------------------------------------------+
| Level   | Code | Message                                                 |
+---------+------+---------------------------------------------------------+
| Warning | 1071 | Specified key was too long; max key length is 767 bytes |
+---------+------+---------------------------------------------------------+
1 row in set (0.00 sec)

mysql56> show create table test2\G
*************************** 1. row ***************************
       Table: test2
Create Table: CREATE TABLE `test2` (
  `name` varchar(255) CHARACTER SET ascii NOT NULL DEFAULT '' COMMENT 'Primary Key: Machine name of this route',
  `path` varchar(255) NOT NULL DEFAULT '' COMMENT 'The path for this URI',
  `pattern_outline` varchar(255) NOT NULL DEFAULT '' COMMENT 'The pattern',
  `fit` int(11) NOT NULL DEFAULT '0' COMMENT 'A numeric representation of how specific the path is.',
  `route` longblob COMMENT 'A serialized Route object',
  `number_parts` smallint(6) NOT NULL DEFAULT '0' COMMENT 'Number of parts in this router path.',
  PRIMARY KEY (`name`),
  KEY `pattern_outline_fit` (`pattern_outline`(191),`fit`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Maps paths to various callbacks (access, page and title)'
1 row in set (0.00 sec)

MySQL 5.6 is strict by default in new installations, and 5.7 is strict for all installations. The silent truncation is probably not an issue for a non-unique index, with two exceptions:
- Index-only scans are no longer possible.
- Strings with low selectivity in the left-most 191 characters.

It may be a problem with UNIQUE indexes, because the constraint cannot be imposed on the full value. I am not a fan of the 'loose' behavior.

"@morgantocker there is no command we can send to Oracle MySQL to make it cut off indexes at 191 characters, like MariaDB does? I guess we'll just have to define index length explicitly?" (quoted from #238)

You could achieve this by disabling strict mode, making the schema change, and then re-enabling strict mode.
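
The 191 in the output above falls out of simple arithmetic: the limit reported in the error is 767 bytes, and utf8mb4 reserves up to 4 bytes per character. A minimal sketch of that calculation in Python follows; the function name and the bytes-per-character table are illustrative assumptions, not part of MySQL or Drupal.

```python
# Illustrative only: max_prefix_chars and BYTES_PER_CHAR are hypothetical names.
# Maximum bytes a single character can occupy in each character set.
BYTES_PER_CHAR = {"ascii": 1, "utf8": 3, "utf8mb4": 4}


def max_prefix_chars(charset, key_limit_bytes=767):
    """Largest character prefix of one indexed column that fits the byte limit."""
    return key_limit_bytes // BYTES_PER_CHAR[charset]


print(max_prefix_chars("utf8mb4"))  # 191
print(max_prefix_chars("utf8"))     # 255
print(max_prefix_chars("ascii"))    # 767
```

This also shows why a full VARCHAR(255) index fit under the old 3-byte utf8 charset but not under utf8mb4.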

Comment #240

stefan.r commented 2 June 2015 at 18:34

Status:	Needs work	» Needs review

Status	File	Size
new	1314214-239.patch	0 bytes
new	interdiff-218-239.txt	16.31 KB

So something like this... Though I'd rather have this taken care of by the database itself if possible (like MariaDB does)

Comment #241

stefan.r commented 2 June 2015 at 18:42

Status	File	Size
new	interdiff-218-240.patch	2.21 KB
new	1314214-240.patch	22.42 KB

Just a proof of concept, will work on this further over the course of the week... or anyone feel free to refactor / further comment.

Comment #242

morgantocker commented 2 June 2015 at 18:59

@stefan.r: To clarify, it's not quite true to say that this is being taken care of by the database. What the database is doing is providing you with something other than what you asked for. In many cases this is very dangerous, which is why this behavior is now disabled by default.

A more complete fix is available in 5.7, where the Barracuda file format and innodb_large_prefix are enabled by default. This allows index key prefixes of up to 3072 bytes.

Comment #243

berdir commented 2 June 2015 at 18:55

Yeah, I don't think we should rely on the database here. We have to explicitly define how those indexes should be shortened. Quite often it's not the last part of the index that should be shortened but the first part; I think the router table is an example.

We've done that before; we just had higher limits that we could rely on.

@morgantocker: Good to know, so based on our history of which MySQL version we support, we just have to wait 10 years or so until we can require 5.7 and our problem will be solved ;)

Comment #244

2 June 2015 at 19:14

Status:	Needs review	» Needs work

The last submitted patch, 241: 1314214-240.patch, failed testing.

Comment #245

2 June 2015 at 19:15

The last submitted patch, 241: interdiff-218-240.patch, failed testing.

Comment #246

stefan.r commented 2 June 2015 at 21:47

Status:	Needs work	» Needs review

Status	File	Size
new	1314214-246.patch	23.69 KB
new	interdiff-218-246.txt	3.35 KB

16 files were hidden/shown/deleted

Status	File	Size
deleted	1314214-239.patch	0 bytes
deleted	interdiff-218-239.txt	16.31 KB
deleted	interdiff-218-240.patch	2.21 KB
deleted	1314214-240.patch	22.42 KB
hidden	interdiff-185-194.txt	567 bytes
hidden	1314214-194.patch	21.51 KB
hidden	interdiff-194-205.txt	5.62 KB
hidden	1314214-205.patch	20.54 KB
hidden	1314214-207.patch	20.27 KB
hidden	interdiff-205-207.txt	2.83 KB
hidden	1314214-208.patch	20.27 KB
hidden	interdiff-207-208.txt	539 bytes
hidden	interdiff-208-212.txt	1.01 KB
hidden	1314214-212.patch	20.28 KB
hidden	interdiff-212-218.txt	342 bytes
hidden	interdiff-185-218.txt	5.93 KB

This cuts off all indexes at 191 characters on non-ASCII fields that are MySQL string types (and thus are utf8mb4 encoded).

The "non-last part" shortening should probably be dealt with in a separate issue, as the implicit behavior of the previously committed patch (at least on MariaDB) was to shorten the last part.
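
The shortening rule described above can be sketched as follows. This is an illustration in Python, not the patch's PHP code: the function name, the set of string types, and the tuple form for a (column, length) pair are assumptions made for the example, modelled loosely on how Drupal schema indexes list either a column name or a name/length pair.

```python
MAX_INDEX_CHARS = 191  # 767 bytes // 4 bytes per utf8mb4 character
STRING_TYPES = {"char", "varchar", "text"}  # assumed utf8mb4-encoded types


def shorten_index(fields, index):
    """Return a copy of `index` with long utf8mb4 string columns capped at 191."""
    shortened = []
    for column in index:
        # A column is either a bare name or a (name, length) pair.
        name, length = column if isinstance(column, tuple) else (column, None)
        field = fields[name]
        if field["type"] in STRING_TYPES:
            effective = length if length is not None else field.get("length")
            if effective is None or effective > MAX_INDEX_CHARS:
                shortened.append((name, MAX_INDEX_CHARS))
                continue
        # ASCII and non-string columns, and already-short prefixes, pass through.
        shortened.append(column)
    return shortened


router_fields = {
    "name": {"type": "varchar_ascii", "length": 255},
    "pattern_outline": {"type": "varchar", "length": 255},
    "fit": {"type": "int"},
}
print(shorten_index(router_fields, ["pattern_outline", "fit"]))
# [('pattern_outline', 191), 'fit']
```

Note that this caps every long string column in place, which matches shortening "the last part" only when the long column happens to be last; choosing which part to shorten is the separate question raised above.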

Comment #247

stefan.r commented 2 June 2015 at 22:00

Status	File	Size
new	interdiff-246-247.txt	2.28 KB
new	1314214-247.patch	23.78 KB

Nitpicks

Comment #248

alexpott commented 2 June 2015 at 22:16

Status:	Needs review	» Needs work

The current patch makes the router table look like this - and Drupal installs for me.

CREATE TABLE `router` (
  `name` varchar(255) CHARACTER SET ascii NOT NULL DEFAULT '' COMMENT 'Primary Key: Machine name of this route',
  `path` varchar(255) NOT NULL DEFAULT '' COMMENT 'The path for this URI',
  `pattern_outline` varchar(255) NOT NULL DEFAULT '' COMMENT 'The pattern',
  `fit` int(11) NOT NULL DEFAULT '0' COMMENT 'A numeric representation of how specific the path is.',
  `route` longblob COMMENT 'A serialized Route object',
  `number_parts` smallint(6) NOT NULL DEFAULT '0' COMMENT 'Number of parts in this router path.',
  PRIMARY KEY (`name`),
  KEY `pattern_outline_fit` (`pattern_outline`(191),`fit`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Maps paths to various callbacks (access, page and title)'

So looks good. I guess this new code could do with some tests considering that #218 was green and got committed.

Comment #249

stefan.r commented 3 June 2015 at 13:49

Status:	Needs work	» Needs review

Status	File	Size
new	1314214-249.patch	26.94 KB
new	interdiff-247-249.txt	3.37 KB

Added tests

Comment #250

stefan.r commented 10 June 2015 at 21:07

Just so we can get this back in, is anyone willing to review this? :)

Comment #251

stefan.r commented 10 June 2015 at 21:07

Issue summary:

Updating issue summary to reflect this latest change

Comment #252

Le Mont-Dore

commented 13 June 2015 at 07:59

Status:	Needs review	» Needs work

On reviewing, I found 3 minor issues. The added test looks OK. I did not test the install, but it looks like @alexpott did so in #248.

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Schema.php
@@ -279,6 +293,63 @@ protected function createKeysSql($spec) {
+   * @param $spec
+   *   The table specification.
+   *

Data types are required as of D8: https://www.drupal.org/coding-standards/docs#param

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Schema.php
@@ -279,6 +293,63 @@ protected function createKeysSql($spec) {
+            if (!(isset($mysql_field['type']) && $mysql_field['type'] == 'varchar_ascii') && !(isset($mysql_field['length']) && $mysql_field['length'] <= 191)) {
+              // Limit the index length to 191 characters.

I would have phrased this differently (using De Morgan's laws), but I guess that is also a matter of preference. So I will still RTBC if you don't change this.

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Schema.php
@@ -279,6 +293,63 @@ protected function createKeysSql($spec) {
+   * @param $index
+   *   The index array to be used in createKeySql.
+   *

Same as the 1st point.
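
For the second point, one possible De Morgan rewrite of the condition is sketched below in Python. Which phrasing the reviewer had in mind is not stated, so the `rewritten` form is an assumption; the sketch only checks that it agrees with the original condition on a handful of sample field definitions. Both function names are made up for the example.

```python
def original(field):
    """Direct transcription of the patch's condition: not A and not B."""
    return (not ("type" in field and field["type"] == "varchar_ascii")
            and not ("length" in field and field["length"] <= 191))


def rewritten(field):
    """De Morgan form: not (A or B), with the two tests named."""
    is_ascii = field.get("type") == "varchar_ascii"
    is_short = "length" in field and field["length"] <= 191
    return not (is_ascii or is_short)


cases = [
    {},
    {"length": 100},
    {"type": "varchar"},
    {"type": "varchar", "length": 191},
    {"type": "varchar", "length": 255},
    {"type": "varchar_ascii"},
    {"type": "varchar_ascii", "length": 255},
]
# The two forms agree on every sample field.
assert all(original(f) == rewritten(f) for f in cases)
```

Naming the two sub-conditions makes the intent readable as "shorten unless the field is ASCII or already short enough".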

Comment #253

stefan.r commented 13 June 2015 at 20:27

Status:	Needs work	» Needs review

Status	File	Size
new	interdiff-249-253.txt	1.56 KB
new	1314214-253.patch	27.27 KB

Comment #254

13 June 2015 at 20:50

Status:	Needs review	» Needs work

The last submitted patch, 253: 1314214-253.patch, failed testing.

Comment #255

13 June 2015 at 20:59

stefan.r queued 249: 1314214-249.patch for re-testing.

Comment #256

13 June 2015 at 21:49

The last submitted patch, 249: 1314214-249.patch, failed testing.

Comment #257

Le Mont-Dore

commented 14 June 2015 at 09:59

The failure is with a unique constraint, which is not in $spec['indexes'] but in $spec['unique keys']. But shortening a unique key index seems a no-go to me, as it would defeat the index's purpose as a constraint.

Is the test data erroneous? I can't find any hint of this unique constraint in the code, only in the test data file drupal\core\modules\system\tests\fixtures\update\drupal-8.beta11.bare.standard.php.gz.
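
The concern about shortening a unique key can be illustrated with a small Python sketch. It only models the comparison a prefix-limited UNIQUE index would make; the function name and the sample values are invented for the example.

```python
PREFIX_CHARS = 191


def conflicts_under_prefix(a, b, prefix_chars=PREFIX_CHARS):
    """True if two values would collide in a UNIQUE index limited to a prefix."""
    return a[:prefix_chars] == b[:prefix_chars]


# Two different values that share their first 191 characters.
first = "x" * 191 + "-first"
second = "x" * 191 + "-second"

print(first != second)                        # True: the full values differ
print(conflicts_under_prefix(first, second))  # True: the prefixes are identical
```

With the prefix in place, the second value would be rejected as a duplicate even though the full values differ, so the index no longer expresses the constraint that was asked for.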

Comment #258

stefan.r commented 14 June 2015 at 13:44

Status:	Needs work	» Needs review

Status	File	Size
new	interdiff-253-258.txt	172.7 KB
new	1314214-258.patch	199.7 KB

That's most likely where the error comes from then... This removes the unique constraint as a workaround.

Comment #260

stefan.r commented 14 June 2015 at 14:47

Just looked at #2498625 ("Write tests that ensure hook_update_N is properly run"), and manually removing the constraints from the beta11 dump is probably not the right way to go :)

Comment #261

stefan.r commented 15 June 2015 at 10:50

Status:	Needs work	» Needs review

Status	File	Size
new	interdiff-258-261.txt	111.49 KB
new	1314214-261.patch	253.18 KB
new	diff-beta11-HEAD.txt	743.32 KB

Discussed with @catch/@alexpott on IRC and this updates the dump instead.

Comment #263

stefan.r commented 15 June 2015 at 11:27

Status:	Needs work	» Needs review

Status	File	Size
new	diff-beta11-HEAD-2.txt	743.11 KB
new	1314214-263.patch	253.19 KB

This updates the site name in the dump, which must be "Site-Install" to pass tests.

Comment #264

15 June 2015 at 11:50

Status:	Needs review	» Needs work

The last submitted patch, 263: 1314214-263.patch, failed testing.

Comment #265

stefan.r commented 15 June 2015 at 12:11

Status	File	Size
new	1314214-265.patch	253.17 KB
new	diff-beta11-HEAD-3.txt	743.12 KB

Another try

Comment #266

daffie commented 15 June 2015 at 12:24

Status:	Needs work	» Needs review

For the testbot.

Comment #267

Le Mont-Dore

commented 15 June 2015 at 13:46

Do I understand correctly that the only actual change is that the unique constraint was removed from the dump?

Comment #268

stefan.r commented 15 June 2015 at 14:00

The change is that instead of taking the beta11 database dump, it takes the database dump of HEAD+the current patch using core/scripts/dump-database-d8-mysql.php.

So that includes removal of the unique constraint, as well as all other changes introduced in this patch.

Comment #269