By halfelven on
I've got an odd problem. I'm updating from an old 4.5.8 installation to 5.0 and many posts that included "curly quotes" and other non-standard characters are coming out with garbage in the new displays. I know I solved this same problem in 4.5.8 but for the life of me, I can't remember what I did! This involves almost 3000 posts, some of them in excess of 64k words, so it is a large problem.
Any help out there?
Thanks,
Erin
Comments
Database encoding
Most likely you were using UTF-8 encoding before, when Drupal was not expecting it. We only started telling the database server that our data was UTF-8 from 4.7 onwards.
There are two possibilities:
In both cases, the goal is to get valid UTF-8 data in UTF-8 encoded database tables and columns. The 'conversion' back to Latin1 is just a trick that should result in real UTF-8.
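As a rough illustration of that trick, here is a minimal Python sketch, not a Drupal or MySQL tool. It assumes the damage came from UTF-8 bytes being read as Latin1 and then stored as UTF-8 again, and it uses Python's `cp1252` codec as a stand-in for MySQL's latin1. The function name `undo_one_round` and the sample string are made up for the example.

```python
def undo_one_round(text: str) -> str:
    """Reverse one mistaken Latin1 -> UTF-8 conversion.

    Encoding back to cp1252 recovers the original bytes, which are
    then decoded as the UTF-8 they really were.
    """
    return text.encode("cp1252").decode("utf-8")


# A post containing a curly apostrophe (U+2019).
good = "It\u2019s"

# Simulate the damage: the UTF-8 bytes are misread as cp1252.
garbled = good.encode("utf-8").decode("cp1252")

print(garbled)                  # the apostrophe is now three junk characters
print(undo_one_round(garbled))  # the apostrophe is back
```

Python's `cp1252` codec rejects a few bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) that MySQL's latin1 accepts, so on real data some strings may raise an error here and need separate handling. Try it on a copy of the dump first.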
To get started, you need to be sure exactly what is in the database. Get a straight database dump and open it in an editor that understands encodings. Converting from Latin1 to UTF-8 turns each 1-byte non-ASCII character into 2-3 bytes, which show up as 2-3 jumbled characters when read as Latin1. Count the number of jumbled characters that you see in place of, e.g., a single quote, and you can tell whether the conversion was applied once or more than once.
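The counting can also be checked in a few lines of Python. This is only a sketch under assumptions: `cp1252` stands in for MySQL's latin1, and the helper names `garble` and `count_rounds` are invented for the example. The count is a heuristic, since text that legitimately contains sequences like "Ã©" would be miscounted.

```python
def garble(text: str, rounds: int) -> str:
    """Apply the mistaken Latin1 -> UTF-8 conversion `rounds` times."""
    for _ in range(rounds):
        text = text.encode("utf-8").decode("cp1252")
    return text


def count_rounds(text: str, limit: int = 5) -> tuple[int, str]:
    """Undo the conversion until it stops working; return (rounds, result)."""
    rounds = 0
    while rounds < limit:
        try:
            fixed = text.encode("cp1252").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            break  # the text is no longer reversible, so stop here
        if fixed == text:
            break  # pure ASCII: nothing left to undo
        text = fixed
        rounds += 1
    return rounds, text


quote = "\u2019"  # a single curly apostrophe
for n in range(3):
    # Length grows with each extra round of conversion.
    print(n, len(garble(quote, n)))

print(count_rounds(garble(quote, 2))[0])
```

For this particular character, one round leaves 3 jumbled characters and two rounds leave 8, which is the kind of difference to look for in the dump.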
--
If you have a problem, please search before posting a question.
Thanks
That's a good clue. Though I could wish that such a converter were considered a necessary item to provide, or at least suggest the need for, in upgrades.
- Erin