Demo only records/dumps the charset for tables and columns, but not their collation.
Hence, when dumping and restoring from a snapshot, all tables and fields are recreated using MySQL's default collation (which is almost impossible to alter, as far as I can tell).
Especially when trying to use snapshots between MySQL 5.0 and 5.1 servers, this bug prevents successful importing of snapshots through other means (phpMyAdmin). To spit out some non-obvious details here, MySQL 5.1 contains an incompatible change in handling of keys/indexes for the utf8_general_ci collation, which apparently is only mentioned in a single, almost hidden sentence in MySQL's upgrade docs. When trying to import a snapshot from 5.0 in this collation, you'll get nasty duplicate key errors and similar weird errors. The solution is to switch the collation from utf8_general_ci to utf8_unicode_ci, which can be done in various ways, but doesn't matter for this issue.
What matters is that Demo snapshots do not contain any collation specs for table/column schema information when exporting on 5.0. Apparently, the information exists when exporting on 5.1.
Comment | File | Size | Author |
---|---|---|---|
#10 | demo.collation.10.patch | 9.5 KB | sun |
#9 | demo.collation.9.patch | 8.5 KB | sun |
#8 | demo-HEAD.collation.8.patch | 6.75 KB | sun |
#5 | demo.collation.5.patch | 5.14 KB | sun |
Comments
Comment #1
sunTagging.
Comment #2
sun.
Comment #3
sunBackground info:
List of default charsets and collations: http://dev.mysql.com/doc/refman/5.1/en/charset-charsets.html
The original bug report that lead to the backwards-incompatible collation change:
http://bugs.mysql.com/bug.php?id=43593
Source: http://dev.mysql.com/doc/refman/5.1/en/upgrading-from-previous-series.html
Source: http://dev.mysql.com/doc/refman/5.1/en/checking-table-incompatibilities....
CAST() is using the default collation of the connection.
http://dev.mysql.com/doc/refman/5.1/en/charset-convert.html
SET NAMES uses the default collation of the server (utf8_general_ci for charset utf8), only SET CHARACTER SET respects the default collation of the database.
http://dev.mysql.com/doc/refman/5.1/en/charset-connection.html
Both in D6 and D7, Drupal's MySQL driver initializes the connection with:
Comment #4
sunActually,
SHOW CREATE TABLE
contains collation information, but only if the table/column collation differs from the default collation.We therefore have separate issues:
We can skip the conditional comments, as D6 requires MySQL 4.1.
We perhaps want to use
SET CHARACTER SET
instead ofSET NAMES
. Not sure though.SHOW CREATE TABLE
only adds collation info for tables and columns in case they differ from the defaults, we need to include aCOLLATE
statement to theCREATE TABLE
statement:Note that
CHARSET
is always added, also if it is equal to the default.The MySQL reference manual does not state whether the collation specified via
COLLATE
for the table is taken over as default for table columns. Let's hope so, but needs manual testing.SET CHARACTER SET
instead ofSET NAMES
when initializing the database connection, so as to respect the database's default collation for new tables and columns, instead of using the servers/connections default collation.Comment #5
sunImplements 1) and 2), plus plenty of comments.
Comment #6
sunFor Drupal core, see #772678: Database default collation is not respected
Comment #7
sunOn Win32, PHP 5.2.6 PDO with MySQL 5.0.x also seems to append an InnoDB free space marker to all table comments, which we should manually remove, if existent:
Comment #8
sunCommitted attached patch to HEAD (D7).
Comment #9
sunSame patch for D6 (untested).
Last hunk needs further consideration. We likely need a new helper to generate that table info and status.
Comment #10
sunCommitted attached patch to D6.