Site server crashed and was restored and now Drupal articles are displaying UTF-8 incorrectly.
This does not seem to be a MySQL issue - at least the data in the DB is the same now as it was before the DB restore.

Web browser is correctly set to view as UTF-8 encoding.

This site was displaying correctly before we lost the server, and now the admins restored from backup and words like Café (ends with eacute) in the Drupal articles are displaying incorrectly - Drupal pages show it as A with tilde followed by the Copyright symbol. In Latin1, C3 is A-tilde, and A9 is copyright.
Non-Drupal pages at the site work correctly with UTF-8.

The eacute character is E9 hex, and UTF-8 encoding is "C3A9" in hex.
In the database, I see it as "C383C2A9" now - just as it was in all my backups, even before the site server went down. This may suggest that at least the DB data is correct, and that either MySQL or Drupal undoes this double encoding.
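To make the byte values above concrete, here is a minimal Python sketch of how "C3A9" can turn into "C383C2A9". It assumes the double encoding happens when UTF-8 bytes are misread as Latin-1 and then encoded as UTF-8 a second time. The variable names are illustrative only.

```python
# The eacute character is code point U+00E9 (E9 hex).
e_acute = "\u00e9"

# Correct UTF-8 encoding: the two bytes C3 A9.
utf8_bytes = e_acute.encode("utf-8")

# Misreading those two bytes as Latin-1 yields two characters:
# U+00C3 (A with tilde) and U+00A9 (copyright sign).
misread = utf8_bytes.decode("latin-1")

# Encoding the misread text as UTF-8 again produces four bytes.
double_encoded = misread.encode("utf-8")

print(utf8_bytes.hex().upper())      # C3A9
print(double_encoded.hex().upper())  # C383C2A9
```

The four-byte form is what a browser set to UTF-8 renders as A-tilde followed by the copyright symbol.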

Is the Drupal node display code supposed to change the node content from C383C2A9 to C3A9? If not, how did my site work correctly in the past - is there some MySQL setting that may have been lost on restore?
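For reference, the conversion from C383C2A9 back to C3A9 asked about here can be sketched in Python as follows. This only illustrates the byte transformation; it is not Drupal or MySQL code, and the thread does not establish which layer, if any, did this on the old server. The function name `undo_double_encoding` is hypothetical.

```python
def undo_double_encoding(raw: bytes) -> bytes:
    """Reverse one round of UTF-8 -> Latin-1 -> UTF-8 double encoding.

    Returns the input unchanged when it does not look double-encoded.
    """
    try:
        # Decode the stored bytes, then map each character back to one byte.
        candidate = raw.decode("utf-8").encode("latin-1")
        # The result must itself be valid UTF-8, otherwise the input
        # was not double-encoded and should be left alone.
        candidate.decode("utf-8")
    except UnicodeError:
        return raw
    return candidate


print(undo_double_encoding(bytes.fromhex("C383C2A9")).hex().upper())  # C3A9
print(undo_double_encoding(bytes.fromhex("C3A9")).hex().upper())      # C3A9
```

One caveat: MySQL's latin1 character set is closer to Windows-1252 than to ISO 8859-1, so text containing characters in the 80-9F byte range may need the "cp1252" codec instead of "latin-1".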

Comments

quaestor’s picture

I had something very similar happen a few months ago. When I restored from a backup I found that certain characters appeared very strange indeed. In the end I found that the restored database had its collation set incorrectly. I think it was Italian or something. Anyway, when I changed the collation to utf8_general_ci and re-imported the problem was solved.

So maybe try setting the collation to utf8_general_ci or utf8_unicode_ci?

bwooster47’s picture

That sounds like it may be my problem - the collation now shows up as latin1_swedish_ci on all tables - presumably because it is the server default?

But you also seem to be saying that just changing the collation is not enough: I have to change it (I see utf8_general_ci in my phpMyAdmin page) and then restore all the data again... nothing is easy!

I searched around on the web and found this page, "MySQL 3.23 to 4.1 breaks table collation", which may apply to my situation.
Before the site crashed, MySQL was at 4.0, and now it is at 4.1.

Is the link above correct - does Drupal change its behavior if MySQL is at 4.1 or above? Maybe that is why my site was working fine with MySQL 4.0, even though the collation was the same - latin1.

quaestor’s picture

I honestly can't remember if I *had* to restore the data after the collation change. I guess try setting the collation and see if the problem is fixed. If not, go for the restore. :)

If I understand the 3.23 to 4.1 issue, the fix listed should solve your problem. It's just a fancy way of converting all the tables in the db to utf8_general_ci. If you don't have shell access you can use phpMyAdmin to set them.

BTW, sorry for the delay in responding.

skullJ’s picture

Hello, I had the same problem…

Three days ago I transferred a full backup of a Drupal 5.0 site.
My previous server’s phpMyAdmin didn’t show the collation, but the new server’s phpMyAdmin required both the tables’ collation and the MySQL connection collation!

I changed all collations to "utf8_general_ci" and all the problems were fixed! :D

wwwoliondorcom’s picture

I had the same problem; the collation must be changed before importing the database.
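One way to change the collation before importing is to edit the dump file itself. The following Python sketch assumes the dump declares table character sets in the form `DEFAULT CHARSET=latin1`, as mysqldump from MySQL 4.1 and later typically writes them. The function name `retarget_dump_charset` is hypothetical, and whether this alone is enough depends on how the text bytes were written into the dump.

```python
import re


def retarget_dump_charset(dump_sql: str,
                          charset: str = "utf8",
                          collation: str = "utf8_general_ci") -> str:
    """Rewrite table-level charset declarations in a SQL dump.

    Assumes declarations look like 'DEFAULT CHARSET=latin1', optionally
    followed by ' COLLATE=some_collation'.
    """
    pattern = re.compile(r"DEFAULT CHARSET=\w+(?: COLLATE=\w+)?")
    replacement = f"DEFAULT CHARSET={charset} COLLATE={collation}"
    return pattern.sub(replacement, dump_sql)


# Illustrative fragment of a dump, not taken from a real backup.
sample = "CREATE TABLE node (\n  nid int\n) ENGINE=MyISAM DEFAULT CHARSET=latin1;"
print(retarget_dump_charset(sample))
```

Note that the pattern is applied to the whole file, so it would also rewrite article text that happens to contain the same phrase. Work on a copy of the backup.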

jessZ’s picture

Apparently I am having the same problem (even though I have been on MySQL 5 for quite some time). I tried using the Operations tab to change the default encoding to UTF-8, and it reports success, but if I go back and look at the structure, all of the tables are still latin1. How does one change all the tables to UTF-8 using phpMyAdmin? I do not have shell access.
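Without shell access, one option is to generate one `ALTER TABLE ... CONVERT TO CHARACTER SET` statement per table (this syntax is available in MySQL 4.1 and later) and paste the result into phpMyAdmin's SQL tab. A minimal Python sketch follows. The function name `convert_statements` and the table names are illustrative assumptions, so substitute the list that SHOW TABLES returns for your database.

```python
def convert_statements(tables, charset="utf8", collation="utf8_general_ci"):
    """Build one ALTER TABLE ... CONVERT TO statement per table name.

    Table names are wrapped in backticks but not otherwise escaped.
    """
    return [
        f"ALTER TABLE `{name}` CONVERT TO CHARACTER SET {charset} "
        f"COLLATE {collation};"
        for name in tables
    ]


# Example table names only; replace with your own list.
for statement in convert_statements(["node", "node_revisions", "comments"]):
    print(statement)
```

CONVERT TO also transcodes the existing column data, so if the stored bytes are already mis-encoded the result may get worse rather than better. Take a backup first and check a sample row afterwards.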