Problem/Motivation
Exports do not set the character encoding on the MySQL connection, so UTF-8 characters are exported incorrectly.
This is a critical issue for non-English / I18N sites that use UTF-8 or latin1 text (Norwegian letters such as æøå, for instance), and it significantly impacts any site that uses common extended characters such as curly quotes (“ ” ‘ ’).
Proposed resolution
Set the database connection to use 4-byte UTF-8 encoding (utf8mb4), similar to Drupal\Core\Database\Driver\mysql\Connection.
Remaining tasks
Tests, although these may be deferred given the critical nature of this issue (data loss in the backups).
User interface / API / Data model changes
None
Original report by [username]
Hi,
when I save a backup and restore the database, all special characters in the text fields of nodes are replaced by question marks in the browser (Firefox). I opened the backup file in Notepad and found that it was not saved as UTF-8. Changing the encoding to UTF-8 makes the special characters show up in the editor, but when I save this file and use it for restoring, I still have the problem. It does not matter whether I use phpMyAdmin or the Backup and Migrate form in the administration area to restore the database.
The problem does not appear with backups made with phpMyAdmin so it seems to be related to this module.
Best,
Tobias
Comment | File | Size | Author |
---|---|---|---|
#7 | backup_migrate-ensure_utf8_name_setting-2749885-7.patch | 799 bytes | szeidler |
Comments
Comment #2
esolitos commented: As I pointed out in [273190], we experienced the same issues with latin1 texts (Norwegian letters like æøå, for instance).
Comment #3
esolitos commented: Also, IMHO this is major, because it makes the module unusable for any non-English language.
Comment #4
Pandelon (as a volunteer) commented: I can confirm this problem with Hungarian characters as well. I have experienced it on several sites. With a PMA export the characters are fine.
It is a major problem; the module is unusable on non-English sites. It ruins the content completely. It replaces all non-Latin-1 characters with question marks.
Comment #5
szeidler (at Ramsalt Lab) commented: I was able to narrow the problem down in my local environment. The cause is that the MySQL statements do not seem to be sent with the right character set (utf8).
The following patch ensures the usage of UTF-8. That fixes the issue for me and creates reliable backups again. I'm looking forward to reviews or alternative approaches.
Comment #6
olivier.br commented: Patch #5 worked for me on a website in French. Nice job.
Comment #7
szeidler (at Ramsalt Lab) commented: It might be better to set the charset directly when the database connection is established, as backup_migrate handles the database connection entirely on its own.
The new patch sets the charset to utf8mb4 by default, which is the default in Drupal 8. If it is not available, it falls back to utf8, as Drupal does too.
What do you think?
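The fallback described in #7 can be sketched roughly as follows. This is a minimal illustration in Python, not the code from the patch: `pick_charset` and `set_names_statement` are hypothetical helper names, and the set of charsets the server supports is passed in by hand instead of being queried from a live MySQL server.

```python
def pick_charset(supported_charsets):
    """Prefer utf8mb4; fall back to utf8 when the server lacks it."""
    if "utf8mb4" in supported_charsets:
        return "utf8mb4"
    if "utf8" in supported_charsets:
        return "utf8"
    raise ValueError("server supports neither utf8mb4 nor utf8")


def set_names_statement(supported_charsets):
    """Build the statement to run on the connection before dumping data."""
    # SET NAMES tells MySQL which charset the client sends and expects back.
    return "SET NAMES " + pick_charset(supported_charsets)
```

Calling `set_names_statement({"latin1", "utf8", "utf8mb4"})` yields the utf8mb4 variant, while a server that only offers `{"latin1", "utf8"}` gets the utf8 fallback.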
Comment #8
szeidler (at Ramsalt Lab) commented
Comment #9
Franz-m (as a volunteer) commented: #7 worked for me in German - many thanks!
Comment #10
Anonymous (not verified) commented: I tested only patch #7 on a multilingual site (Russian, Armenian, Latvian, Czech, Spanish, Georgian, Lithuanian). Before the patch I had encoding problems in the dump with all of them. After applying #7, everything is fine! I changed the status to RTBC because it looks good and has helped #6 (patch #5 has a similar solution), #9 and #10. Thank you very much! I hope to see it fixed.
Comment #11
salaDDodger commented: Confirming that patch #7 solved the bad encoding after restore on 8.1.10.
Comment #12
stevieb commented: Confirming that patch #7 works.
Comment #13
awasson commented: I can also confirm that patch #7 does the job. I've used Backup & Migrate to move several Drupal 8 sites, and prior to patch #7, question marks replaced dashes, apostrophes and other grammatical symbols.
Comment #14
szeidler (at Ramsalt Lab) commented
Comment #15
Alan D. commented: Nice.
Note that old backups made before the patch are already broken, if you are testing this.
This also affects English sites, most commonly through the good old single and double curly quotes (“ ” ‘ ’).
Comment #16
awasson commented: Any thoughts on when this will be committed? It seems to be solid and definitely a necessary piece.
Comment #17
Anonymous (not verified) commented: OK. I have two thoughts on the reasons for this delay: the patch changes part of the vendor code, and this module has no committers.
Proposed resolution:
(1): We could keep the vendor code inside the 1.x branch of the module (which keeps installation simple), and create a 2.x branch of the module with the vendor code kept outside (based on the original or a fork in git).
(2): Wait for the committers to return (please, please, please). Search for a new hero (maybe someone here?). Push the module into core :)
I also think this is Critical due to the loss of data, because keeping data safe is the main goal of this module. Thanks!
Comment #18
Alan D. commented
Comment #19
tsotoodeh commented: Hi,
I was not aware of this critical issue! If I had not also backed up my website with MySQL last night, I would have lost everything I have done.
This is a critical issue, guys. The module does not back up the database for non-English websites.
I am very lucky that I did not rely on this module alone and also backed up the database with phpMyAdmin.
Please fix this issue. Thanks.
Comment #20
Anonymous (not verified) commented: @tsotoodeh, absolutely!
I think this is a case where the module can damage the reputation of Drupal as a whole. If no one can fix this issue, we must at least add a warning in red to the description on the module page.
Comment #21
Alan D. commented: I filed an issue in the webmasters queue requesting a warning on the project page.
#2832836: Warning for Backup and Migrate 8.x users on project page
Has anyone attempted to contact the maintainers directly?
Comment #22
tsotoodeh commented: Thanks @Alan. Do you think he will see messages on LinkedIn? If so, I can send him a note.
Comment #23
Alan D. commented: Personally, I think the contact forms are best.
Ronan - https://www.drupal.org/user/72815/contact
Drew Gorton (dgorton) - https://www.drupal.org/user/19044/contact
Remember, be nice, they are giving their time for free :)
Comment #24
tsotoodeh commented: @Alan, oops, access is denied!
Comment #25
awasson commented: @tsotoodeh, did you get an Access denied message after you tried to send the message, or when you tried to access the PM pages?
Comment #26
Alan D. commented: You had not yet been confirmed as a real user. I clicked the green "Confirm" button yesterday, so you should have access now; otherwise I can step in. It is better to coordinate than to have multiple users emailing :)
Comment #27
awasson commented: @Alan D., that's exactly what I was thinking... Rather than jumping the gun and contacting Ronan and Drew myself, I thought one person should be nominated to make contact, instead of sending a cluster of messages.
Comment #28
tsotoodeh commented: Thanks Alan, I will send them a note and hope they look into this critical issue.
Drupal is a universal platform and many people rely on it.
Comment #29
vgutekunst commented: Hi guys,
I saved my database via Backup & Migrate. I have a German site. Is my data broken now? After importing this file I no longer have any "äöß" etc. on my site. The database backup seems to be UTF-8. What can I do now? :-(
best,
Comment #30
Alan D. commented: You need to get a real backup from the hosting provider; the export made without this patch is already broken.
Sadly, manually editing the content is the only way to restore :(
Comment #31
vgutekunst commented: Shit, you're sure there is no other way? :((((
Comment #32
szeidler (at Ramsalt Lab) commented: Not from within Drupal itself. The backup is corrupted and the ?? cannot be converted back to their original characters. Your only chance is to check whether you have another backup source that you can use to restore the database (from the hoster, a backup on your local hard drive, and so on).
There was also an attempt to get a warning onto the project page, but it does not seem to be possible without the original maintainers being involved or someone applying for co-maintainership (which usually also takes some time).
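Why the question marks cannot be repaired: once a character that the target charset cannot represent has been replaced by ?, different originals collapse to the same byte, so nothing is left to convert back. The following is a minimal Python sketch of that effect. It uses Python's own latin-1 codec with `errors="replace"` as a stand-in for the server-side conversion, which is an assumption; the charset actually negotiated in affected setups may differ. `lossy_export` is a hypothetical helper name.

```python
def lossy_export(text, charset="latin-1"):
    """Encode text, substituting '?' for characters the charset lacks."""
    return text.encode(charset, errors="replace")


original = "árvíztűrő"
exported = lossy_export(original)

# á and í exist in latin-1 and survive; ű and ő do not and become '?'.
assert exported == b"\xe1rv\xedzt?r?"

# Two different letters end up as the same byte, so no reverse mapping exists.
assert lossy_export("ű") == lossy_export("ő") == b"?"
```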
Comment #33
vgutekunst commented: Isn't it possible to get the correct characters back with an editor and save the file in the UTF-8 charset? I was very stupid to trust an alpha-status module :-(
Comment #34
szeidler (at Ramsalt Lab) commented
Comment #35
szeidler (at Ramsalt Lab) commented
Comment #36
esolitos commented
Comment #38
ronan commented: I applied this patch and added/updated the unit tests. Thanks everyone for finding, fixing and testing this.
Comment #40
Alan D. commented: I'd strongly recommend a new release, since 8.x-4.0-alpha1 is, well, totally bung ;)
Comment #41
gunwald commented: This is really discouraging! This bug has been known for more than eight months now, and the affected release is still active! It completely destroyed all the content of my entire site, and it took me days to find out what had caused the corruption and to restore the corrupted content! Thank you for that!
If this module is not actively maintained anymore, please make that clear. It is highly destructive and irresponsible not to publish a warning on the module page, particularly as this module used to be both essential and reliable.
There should be a way for the community to unpublish such releases!
Comment #42
vgutekunst commented: I had problems with the German umlauts and was able to restore my database with these steps: import the .mysql file from the Backup & Migrate module into a database using phpMyAdmin, then export it with "utf8" as an .sql file, then import that into the final database. I could save my whole data!!
best,
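One possible reason a re-export like the one in #42 can succeed while other backups stay broken: if the dump still contains the original UTF-8 bytes and they were only interpreted with the wrong charset, the damage is reversible, unlike the question-mark case. Whether that is what happened in #42 is an assumption; the thread does not confirm it. A minimal Python sketch, with `misread_as_latin1` and `repair` as hypothetical helper names:

```python
def misread_as_latin1(text):
    """Simulate UTF-8 bytes being interpreted as latin-1 text."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled):
    """Undo the misreading: recover the bytes, then decode them as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


original = "äöß"
garbled = misread_as_latin1(original)

# The text looks wrong, but every original byte is still present ...
assert garbled != original
# ... so the round trip restores it exactly.
assert repair(garbled) == original
```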
Comment #43
Mike Dodd commented: Patch works well. After importing to my QA site I had lost all my £ signs... After a fresh import with the patch applied, it's looking good. Thank you for the patch. I probably shouldn't be storing the values as UTF-8, but I guess in this day and age this is probably normal behaviour now.
I do think we need to get another release out, though, as this is probably going to cause a problem for a lot of people.
Comment #44
esolitos commented: @ronan: This is so critical that I think it would be nice to get an alpha-2, even if only for this commit!
Comment #45
stevieb commented: The patch works; it absolutely should be committed.
Comment #46
Arnaud01 commented: Is this issue fixed in the stable release, 8.x-4.0?
https://www.drupal.org/project/backup_migrate/releases/8.x-4.0
Thank you :)