Problem/Motivation

Exports do not set the character encoding on the MySQL connection, so UTF-8 characters are exported incorrectly.

This is a critical issue for non-English / I18N sites that use UTF-8 or latin1 texts (Norwegian letters like æøå, for instance), and it significantly impacts any site using common extended characters such as curly quotes (“ ” ‘ ’).

The bug produces a broken backup export file, which irreversibly corrupts the site's content when it is restored.

Proposed resolution

Set the database connection to use 4-byte UTF-8 encoding (utf8mb4), similar to Drupal\Core\Database\Driver\mysql\Connection.

Remaining tasks

Tests, though these may be deferred because of the critical nature of this issue (data loss in the backups).

User interface / API / Data model changes

None

Original report by [username]

Hi,

When I save a backup and restore the database, all special characters in the text fields of nodes are replaced by question marks in the browser (Firefox). I opened the backup file in Notepad and found that the file was not saved in UTF-8. Changing the encoding to UTF-8 makes the special characters show up in the editor, but when I save this file and use it for restoring I still have the problem. It does not matter whether I use phpMyAdmin or the Backup and Migrate form in the administration to restore the database.

The problem does not appear with backups made with phpMyAdmin so it seems to be related to this module.

Best,
Tobias

Comments

tobiberlin created an issue. See original summary.

esolitos’s picture

Title: Problems with UTF8 » Encoding issues with non ASCIII texts

As I pointed out in [273190], we experienced the same issues with latin1 texts (Norwegian letters like æøå, for instance).

esolitos’s picture

Priority: Normal » Major

Also, IMHO this is major, because it makes the module unusable for any non-English language.

Pandelon’s picture

I confirm this problem with Hungarian characters as well. I experienced it on several sites. With a PMA export the characters are fine.
It is a major problem; the module is unusable with non-English sites. It ruins the content completely. It replaces all of the non-Latin-1 characters with question marks.

szeidler’s picture

Status: Active » Needs review

I was able to narrow the problem down in my local environment. The cause is that the MySQL statements do not appear to be run with the right character set (utf8).

The following patch ensures that UTF-8 is used. That fixes the issue for me and creates reliable backups again. I'm looking forward to reviews or alternative approaches.

olivier.br’s picture

Patch #5 worked for me on a website in French. Nice job.

szeidler’s picture

It might be better to set the charset directly when the database connection is established, as backup_migrate handles the database connection entirely on its own.

The new patch sets the charset to utf8mb4 by default, which is the default in Drupal 8. If utf8mb4 is not available, it falls back to utf8, as Drupal does.

What do you think?
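
The fallback described above can be sketched as follows. This is only an illustration of the logic in Python, not the module's PHP code and not the patch itself: `choose_charset` and `set_names_statement` are hypothetical helper names, and the set of supported charsets is assumed to have been read from the server beforehand (for example with `SHOW CHARACTER SET`).

```python
def choose_charset(supported_charsets):
    """Prefer utf8mb4 and fall back to utf8 (hypothetical helper)."""
    return "utf8mb4" if "utf8mb4" in supported_charsets else "utf8"


def set_names_statement(charset):
    """Build the statement that sets the connection character set."""
    return f"SET NAMES {charset}"


# Assumed example values, not taken from a real server.
modern_server = {"latin1", "utf8", "utf8mb4"}
older_server = {"latin1", "utf8"}

print(set_names_statement(choose_charset(modern_server)))
print(set_names_statement(choose_charset(older_server)))
```

Running the statement once, right after connecting and before any data is read, is what makes every later query on that connection use the chosen charset.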

Franz-m’s picture

#7 worked for me in German - many thanks!

Anonymous’s picture

Status: Needs review » Reviewed & tested by the community

I tested only patch #7, on a multilingual site (Russian, Armenian, Latvian, Czech, Spanish, Georgian, Lithuanian). Before the patch I had encoding problems in the dump with all of them. After applying #7, everything is fine! I changed the status to RTBC because it looks good and it helped #6 (patch #5 has a similar solution), #9, and #10. Thank you very much! I hope to see it fixed.

salaDDodger’s picture

Confirming that patch #7 solved the bad encoding after restore on 8.1.10.

stevieb’s picture

Confirming that patch #7 works.

awasson’s picture

I can also confirm that patch #7 does the job. I've used Backup & Migrate to move several Drupal 8 sites, and prior to patch #7, question marks replaced dashes, apostrophes, and other punctuation symbols.

Alan D.’s picture

Nice.

Note, if you are testing this, that old backups made before the patch are already broken.

This affects English sites too, most commonly the good old single and double curly quotes (“ ” ‘ ’).

awasson’s picture

Any thoughts on when this will be committed? It seems to be solid and definitely a necessary piece.

Anonymous’s picture

Priority: Major » Critical

OK. I have two thoughts on the reasons for this delay: it changes part of the vendor code, and we have no active committers for this module.

Proposed resolution:

(1): We could keep the vendor code inside the 1.x branch of the module (which keeps installation simple), and make a 2.x branch of the module with external vendor code (based on the original or a forked git repository).

(2): Wait for the committers to return (please, please, please). Search for a new hero (maybe someone here?). Push the module into core :)

I also think this is Critical due to the loss of data, because keeping data safe is the main goal of this module. Thanks!

Alan D.’s picture

Issue summary: View changes

tsotoodeh’s picture

Hi,
I was not aware of this critical issue! If I had not backed up my website with MySQL last night, I would have lost everything I have done.
This is a Critical issue, guys. The module does not back up the database for non-English websites.
I am very lucky that I did not rely only on this module and also backed up the database with phpMyAdmin.
Please fix this issue. Thanks.

Anonymous’s picture

@tsotoodeh, absolutely!
I think this is a case where the module can damage the reputation of Drupal as a whole. If no one can fix this issue, we must at least put a warning in red on the module page.

Alan D.’s picture

I placed an issue in the webmaster queues requesting a warning on the project page.

#2832836: Warning for Backup and Migrate 8.x users on project page

Has anyone attempted to contact the maintainers directly?

tsotoodeh’s picture

Thanks @Alan. Do you think he will see his messages on LinkedIn? I can send him a note then.

Alan D.’s picture

Personally, I think the contact forms are best.

Ronan - https://www.drupal.org/user/72815/contact
Drew Gorton (dgorton) - https://www.drupal.org/user/19044/contact

Remember, be nice, they are giving their time for free :)

tsotoodeh’s picture

@Alan, Oops, Access is denied!

awasson’s picture

@tsotoodeh, did you get an Access denied message after you tried to send the message or when you tried to access the PM pages?

Alan D.’s picture

You were not yet confirmed as a real user before. I clicked the green "Confirm" button yesterday, this should allow you access now, otherwise I can pop in. It is better to coordinate rather than having multiple users emailing :)

awasson’s picture

@Alan D., that's exactly what I was thinking... I thought rather than jump the gun and contact Ronan and Drew myself, that perhaps one person should be nominated to make contact rather than a cluster of messages.

tsotoodeh’s picture

Thanks Alan, I will send them a note and hope they look into this critical issue.
Drupal is a universal platform and many people rely on it.

vgutekunst’s picture

Hi guys,

I saved my database via Backup & Migrate. I have a German site. Am I broken now? After importing this file I didn't have any "äöß etc." on my site. The database backup seems to be utf8. What can I do now? :-(
best,

Alan D.’s picture

You need to get a real backup from the hosting provider, the export without this patch is already broken.

Sadly, manually editing the content is the only way to restore :(

vgutekunst’s picture

Shit, you're sure there is no other way? :((((

szeidler’s picture

Not from within Drupal itself. The backup is corrupted and the ?? cannot be converted back to their original characters. Your only chance is to check whether you have another backup source that you can use for restoring the database (from the hoster, a backup on your local hard drive, and so on).

There was also an attempt to get a warning onto the project page, but that does not seem to be possible without the original maintainers being involved or someone applying for co-maintainership (which usually also takes some time).
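
A minimal sketch of why the question marks cannot be converted back, assuming the export replaced every character that the connection charset could not represent with `?` (the thread does not confirm the exact mechanism). `lossy_export` is a hypothetical name, and Python stands in for the module's PHP only to show the effect.

```python
def lossy_export(text, charset="latin-1"):
    """Replace every character the target charset lacks with '?'."""
    return text.encode(charset, errors="replace").decode(charset)


double_quoted = "“quoted”"
single_quoted = "‘quoted’"

print(lossy_export(double_quoted))  # ?quoted?
print(lossy_export(single_quoted))  # ?quoted?
```

Both inputs end up as the same string, so nothing in the dump records which character each `?` used to be. That is why only an independent backup, or manual editing, can bring the content back.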

vgutekunst’s picture

Isn't it possible to get the correct characters back with an editor and save the file in the UTF-8 charset? I was very stupid to trust an alpha-status module :-(

szeidler’s picture

Title: Encoding issues with non ASCIII texts » Database export irreversibly corrupted after restore - Encoding issues with non ASCIII texts
Issue summary: View changes

szeidler’s picture

Title: Database export irreversibly corrupted after restore - Encoding issues with non ASCIII texts » Database irreversibly corrupted after export and restore - Encoding issues with non ASCIII texts

esolitos’s picture

Issue summary: View changes

  • ronan committed 995100e on 8.x-4.x
    Issue #2749885 by szeidler: Database irreversibly corrupted after export...
ronan’s picture

Status: Reviewed & tested by the community » Fixed

I applied this patch and updated the unit tests. Thanks everyone for finding, fixing and testing this.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Alan D.’s picture

I'd strongly recommend a new release, since 8.x-4.0-alpha1 is, well, totally bung ;)

gunwald’s picture

This is really discouraging! This bug has been known for more than eight months now, and the affected release is still active! It completely destroyed all the content of my entire site, and it took me days to find out what caused the corruption and to restore the corrupted content! Thank you for that!

If this module is not actively maintained anymore, please make that clear. It is highly destructive and irresponsible behavior not to publish a warning on the module page, particularly as this module used to be both essential and reliable.

There should be a way for the community to unpublish such releases!

vgutekunst’s picture

I had problems with the German "umlaute" and could restore my database with these steps: import the .mysql file from the Backup & Migrate module into a database using phpMyAdmin, then export it with "utf8" as an .sql file, then import that into the final database. I could save my whole data!!

best,
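
The thread does not establish why this round trip worked here when other backups were beyond repair. One possible explanation, offered only as an assumption, is that in this dump the UTF-8 bytes were still intact and had merely been read with the wrong charset, which is reversible, unlike replacement with `?`. The sketch below contrasts the two cases in Python; all function names are hypothetical.

```python
def misread_as_latin1(text):
    """UTF-8 bytes interpreted as latin-1: the bytes survive, the characters look wrong."""
    return text.encode("utf-8").decode("latin-1")


def repair_misread(garbled):
    """Undo the misreading: recover the original bytes, then decode them as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


def replace_unsupported(text):
    """Lossy case: characters outside ASCII are replaced with '?'."""
    return text.encode("ascii", errors="replace").decode("ascii")


umlauts = "äöß"
garbled = misread_as_latin1(umlauts)

print(repair_misread(garbled) == umlauts)  # True
print(replace_unsupported(umlauts))        # ???
```

If a dump shows two-character sequences in place of each umlaut, a re-encoding round trip has a chance of working. If it shows plain question marks, the information is gone.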

Mike Dodd’s picture

The patch works well. After an import to my QA site I lost all my £ signs; a fresh import after the patch and it's looking good. Thank you for the patch. I probably shouldn't be storing the values as UTF-8, but I guess in this day and age this is probably normal behaviour now.

I do think we need to get another release out, though, as this is probably going to cause a problem for a lot of people.

esolitos’s picture

@ronan: This is so critical that I think it would be nice to get an alpha-2, even if only for this commit!

stevieb’s picture

The patch works; it absolutely should be committed.

Arnaud01’s picture

Is this issue fixed in the stable release, 8.x-4.0?

https://www.drupal.org/project/backup_migrate/releases/8.x-4.0

Thank you :)