I tried to paste this (Marathi: मकबूल फिदा हुसेन, Urdu: مقبول فدا حسين, Hindi: मक़बूल फ़िदा हुसैन) and got this error:
PDOException: SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xE0\xA4\xAE\xE0\xA4\x95...' for column 'body_value' at row 1: INSERT INTO {field_data_body} (entity_type, entity_id, revision_id, bundle, delta,
Is it possible to make this work properly?

Comments

mdupont’s picture

What character set is used on your databases and tables? Is it UTF-8 anywhere?

mdupont’s picture

No issue with these characters on my local installation (UTF-8 everywhere, including UTF-8 MySQL tables and collation).

sachbearbeiter’s picture

I have the same problem with:

"Lukšič"

PDOException: SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xC4\x8D' for column 'title' at row 1: UPDATE {node} SET vid=:db_update_placeholder_0, type=:db_update_placeholder_1, language=:db_update_placeholder_2, title=:db_update_placeholder_3, uid=:db_update_placeholder_4, status=:db_update_placeholder_5, created=:db_update_placeholder_6, changed=:db_update_placeholder_7, comment=:db_update_placeholder_8, promote=:db_update_placeholder_9, sticky=:db_update_placeholder_10, tnid=:db_update_placeholder_11, translate=:db_update_placeholder_12 WHERE (nid = :db_condition_placeholder_0) ; Array ( [:db_update_placeholder_0] => 1162 [:db_update_placeholder_1] => profile [:db_update_placeholder_2] => und [:db_update_placeholder_3] => Lukšič [:db_update_placeholder_4] => 1 [:db_update_placeholder_5] => 1 [:db_update_placeholder_6] => 1307544979 [:db_update_placeholder_7] => 1309524543 [:db_update_placeholder_8] => 0 [:db_update_placeholder_9] => 0 [:db_update_placeholder_10] => 0 [:db_update_placeholder_11] => 0 [:db_update_placeholder_12] => 0 [:db_condition_placeholder_0] => 1162 ) in drupal_write_record() (line 6859 of /xxx/includes/common.inc).

catch’s picture

Version: 7.2 » 8.x-dev
Priority: Major » Normal
Status: Active » Postponed (maintainer needs more info)
Issue tags: +Needs backport to D7

I tried Lukšič on my local install and it worked fine, so this must be an encoding/collation issue locally.

Please post the collation for the database (and for the table and column, since they can differ on MySQL) to this issue. Without that, this is going to be very hard to fix.
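
For anyone reading along, the escaped bytes quoted in the error messages above can be decoded to see what was actually sent to MySQL. Below is a minimal Python sketch of that check; it is not Drupal code, and the helper name `decode_mysql_escape` is made up for illustration.

```python
import re


def decode_mysql_escape(escaped: str) -> str:
    """Decode a '\\xE0\\xA4\\xAE'-style fragment from an error message as UTF-8."""
    # Pull out each two-digit hex value that follows a literal "\x".
    hex_pairs = re.findall(r"\\x([0-9A-Fa-f]{2})", escaped)
    raw = bytes(int(pair, 16) for pair in hex_pairs)
    # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    return raw.decode("utf-8")


# Fragments copied from the two error messages in this issue.
print(decode_mysql_escape(r"\xC4\x8D"))
print(decode_mysql_escape(r"\xE0\xA4\xAE\xE0\xA4\x95"))
```

Both fragments decode without error (to "č" and to the first two Devanagari letters of the pasted text), so the input itself looks like valid UTF-8. That would be consistent with the problem sitting in the character set of the database, table, or column rather than in the submitted text.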

sachbearbeiter’s picture

Sorry, I did some research and it turned out to be migration-related (at the field level, so it was a little tricky to track down).

mdupont’s picture

Status: Postponed (maintainer needs more info) » Fixed

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

chi’s picture

Status: Closed (fixed) » Active

I got the same error message when I tried to paste 𐤈.
(U+10908 PHOENICIAN LETTER TET)

mdupont’s picture

It appears to happen only with MySQL:

- D7, MySQL: getting error message
- D6, MySQL: no error, but character silently removed
- D7, SQLite: no error, character saved and displayed correctly

According to the MySQL documentation ( http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html ), MySQL's utf8 character set uses at most 3 bytes per character. Since your character needs 4 bytes in UTF-8, it can't be stored in a column using MySQL's default utf8 encoding.

The solution is to use MySQL 5.5.3 or later and the utf8mb4 encoding, which supports 4-byte characters.
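
To make the 3-byte limit concrete, here is a small Python sketch that prints how many bytes some of the characters mentioned in this issue take in UTF-8. It is only an illustration, not Drupal or MySQL code, and `utf8_byte_length` is a hypothetical helper name.

```python
def utf8_byte_length(char: str) -> int:
    """Number of bytes a single character occupies when encoded as UTF-8."""
    return len(char.encode("utf-8"))


# One character from each report in this issue, written as escapes so the
# snippet does not depend on the encoding of the file it is saved in.
samples = {
    "LATIN SMALL LETTER C WITH CARON": "\u010d",
    "DEVANAGARI LETTER MA": "\u092e",
    "PHOENICIAN LETTER TET": "\U00010908",
}

for name, char in samples.items():
    print(f"U+{ord(char):04X} {name}: {utf8_byte_length(char)} byte(s)")
```

Only the Phoenician letter needs 4 bytes, which is why it fails even on an installation where the other two characters save fine.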

mdupont’s picture

Status: Active » Fixed

chi’s picture

Status: Fixed » Active

Why don't we catch this exception?
Can we change drupal_validate_utf8()?

mdupont’s picture

drupal_validate_utf8() is used in a lot of places throughout the code, whereas the bug described here only happens when trying to save content to a MySQL DB that is not using utf8mb4 encoding.

However it could be useful to open a separate issue about content validation.
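
As a rough idea of what such a content validation could look like, here is a Python sketch that reports characters above U+FFFF (the ones that need 4 bytes in UTF-8) before anything is sent to the database. This is an assumption about one possible approach, not how Drupal does or should do it; `find_supplementary_chars` and `validation_message` are hypothetical names.

```python
from typing import List, Optional, Tuple


def find_supplementary_chars(text: str) -> List[Tuple[int, str]]:
    """Return (index, 'U+XXXXX') for each character above U+FFFF."""
    return [
        (index, f"U+{ord(char):04X}")
        for index, char in enumerate(text)
        if ord(char) > 0xFFFF  # outside the Basic Multilingual Plane
    ]


def validation_message(text: str) -> Optional[str]:
    """Build a readable message, or return None when the text is fine."""
    hits = find_supplementary_chars(text)
    if not hits:
        return None
    listed = ", ".join(f"{code} at position {index}" for index, code in hits)
    return f"Unsupported 4-byte characters: {listed}"


print(validation_message("Luk\u0161i\u010d"))
print(validation_message("tet: \U00010908"))
```

A check along these lines would let the form show a normal validation error instead of a PDOException, while leaving drupal_validate_utf8() itself untouched.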

fietserwin’s picture

Issue summary: View changes

How does this differ from #1314214: MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols)? Close as duplicate? If this issue is about showing a proper error message, then it should not be postponed but done now, and deprecated when the other issue gets in.

mgifford’s picture

Status: Postponed » Active

Postponed issue fixed in D8.

fietserwin’s picture

Let's close this as a duplicate of #1314214: MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols), as I don't see what this issue adds to that other one.