I'm trying to migrate a hacky-patched version of Drupal 4.6 to a current released version and having some problems, and I think the collation mess is part of it. Does anybody have a script that will upgrade all instances of latin1_swedish_ci to utf8_general_ci?

Comments

rivena’s picture

I am interested in adding this info to the handbook, so I'm just posting here. :)

Anisa.

-----------------------------------------------------------
Kindness builds stronger bonds than necessity.

www.animecards.org - 18,000 card scans and counting!
-----------------------------------------------------------

ajwwong’s picture

Good luck in finding it... If I catch it I'll let you know here.

Albert
www.ithou.org

yelvington’s picture

... a quick and dirty collation updater.

header('Content-type: text/plain;');
mysql_connect('localhost','username','passwordl');
mysql_select_db('databasename');

$res = mysql_query("SHOW TABLES");
while ($r = mysql_fetch_array($res)){
  $tablename = $r[0];
  echo $sql = "ALTER TABLE $tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci";
  echo "\r\n";
 mysql_query($sql);
 
}

I don't know that this actually fixed anything; I was able to get a test upgrade working by doing this and also truncating the sessions table. Until I did that I couldn't even log on.

escoles’s picture

I'd made a bunch of semi-successful updates (was getting a weird post truncation problem that led me to keep trying). After finally getting an update to complete without serious error, I had the login problems you describe.

Fortunately, this code snipped was just the thing. Thank you so much for posting it.

patricksettle’s picture

This will certianly switch your charset and collate, but it doesn't actually "convert" anything to utf8. What you get after this (unless the update.php had already successfully converted the table) is latin1 characters in a utf8 charset field. Most cases you won't notice to much of a difference. But if there was any special latin1 characters used, such as Open Double-quote and Close Double-quote, all you'll get is strange characters. (“ or � to be exact)

So, keep an eye out for that... if you've not seen crazy characters in your nodes, you can ignore the fact they've not -actaully- been converted. But if you do see that sort of stuff, you'll need to go in and edit each node, or convert it properly by switching to the binary field type (varbinary, blob, etc ) before switching charset or collate