Using Drupal 8 (git checkout and install from today) and CKEditor, I tried to type some Greek in a node.

I typed this:

English: We have wysiwyg be default.
Greek: Άραγε δουλεύει σωστά ή τα κάνει μαντάρα?
French: à, è, ù

I then checked what gets stored in the database. All Greek text was converted to HTML entities.
This is what resides in MySQL:

<p>English: We have wysiwyg be default.</p>
<p>Greek: ?&rho;&alpha;&gamma;&epsilon; &delta;&omicron;&upsilon;&lambda;&epsilon;?&epsilon;&iota; &sigma;&omega;&sigma;&tau;? ? &tau;&alpha; &kappa;?&nu;&epsilon;&iota; &mu;&alpha;&nu;&tau;?&rho;&alpha;?</p>
<p>French: &agrave;, &egrave;, &ugrave;</p>

For those wondering, the Greek text above translates as: "Does it work or is it messy?" :)

This used to be an issue with the ckeditor module too, but it was solved long ago. I am marking the related issues.

I searched for an existing issue against D8 but found nothing, so I created this one.
I strongly believe that this conversion to HTML entities should not happen.

Comments

bserem’s picture

Issue summary: View changes

Added french characters example

bserem’s picture

The attached patch does the trick for me for Greek characters, and it does not affect basicEntities (<, >, and & still get stored as HTML entities).

If I'm heading in the wrong direction, please tell me.

bserem’s picture

Status: Active » Needs review
FileSize
7.56 KB

Also, git diff produced this, complaining about a newline. Attaching this too.

Wim Leers’s picture

Title: When writting greek or french accented characters on the wysiwyg editor they get saved as html entities in the database » Configure CKEditor to not create HTML entities
Priority: Major » Critical
Issue tags: +CKEditor in core
FileSize
778 bytes

Oh, wow, amazing catch!

You found the right settings, but we don't modify CKEditor itself. We modify CKEditor's settings :) Also, I think we just want to tell CKEditor to not create HTML entities by default at all (CKEDITOR.config.entities = false), which then also prevents the problems with Greek and French characters.
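A minimal sketch of what that setting looks like on the JavaScript side, assuming CKEditor 4's `entities` and `basicEntities` config options. The helper name `buildEditorSettings` and the `editor1` element ID are made up for illustration, and this is not the actual patch:

```javascript
// Sketch only: `buildEditorSettings` is a hypothetical helper,
// not part of Drupal or CKEditor.
function buildEditorSettings(overrides = {}) {
  return {
    // Do not turn characters such as ρ or à into named HTML entities.
    entities: false,
    // Still escape &, < and > so the stored markup stays valid HTML.
    basicEntities: true,
    // Let callers add or override individual settings.
    ...overrides,
  };
}

const settings = buildEditorSettings();
console.log(settings);

// On a page that has loaded CKEditor 4, the object would be handed over
// when the editor is created, for example:
//   CKEDITOR.replace('editor1', settings);
// ('editor1' is a placeholder element ID.)
```

Setting the same keys directly on `CKEDITOR.config` would change the default for every editor on the page, whereas passing an object like this affects only the instance being created.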

I looked up the documentation for this and http://docs.ckeditor.com/#!/api/CKEDITOR.config-cfg-entities tells me CKEditor also converts the single quote into an entity by default. We also don't want that. It also tells me Chinese is not HTML-encoded by default. We want to verify that. So I'm testing with this example:

English: We have wysiwyg be default.
Greek: Άραγε δουλεύει σωστά ή τα κάνει μαντάρα?
French: à, è, ù
Chinese: 汉语
Special characters: ', ", `, &, <, >

When stored, this yields:

<p>English: We have wysiwyg be default.<br />
Greek: Άραγε δουλεύει σωστά ή τα κάνει μαντάρα?<br />
French: à, è, ù<br />
Chinese: 汉语<br />
Special characters: ', ", `, &amp;, &lt;, &gt;</p>

As expected and needed, the characters that must be converted to HTML entities (the ampersand, less-than and greater-than symbols) are converted. (To disable that, we'd need to set CKEDITOR.config.basicEntities to false, which we don't want, because it would result in invalid HTML.)
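To make the difference between the two settings concrete, here is a toy model of the encoding step. It is not CKEditor's actual implementation: the function `encodeText` and the tiny lookup tables are assumptions made up for this illustration, and only a handful of characters are covered.

```javascript
// Toy model of entity encoding; NOT CKEditor's real code.
// The named-entity table is a small hand-picked subset.
const NAMED_ENTITIES = new Map([
  ['ρ', '&rho;'],
  ['α', '&alpha;'],
  ['à', '&agrave;'],
  ['è', '&egrave;'],
]);

// Characters that must be escaped for the markup to stay valid.
const BASIC_ENTITIES = new Map([
  ['&', '&amp;'],
  ['<', '&lt;'],
  ['>', '&gt;'],
]);

function encodeText(text, { entities, basicEntities }) {
  let out = '';
  for (const ch of text) {
    if (basicEntities && BASIC_ENTITIES.has(ch)) {
      out += BASIC_ENTITIES.get(ch);
    } else if (entities && NAMED_ENTITIES.has(ch)) {
      out += NAMED_ENTITIES.get(ch);
    } else {
      out += ch; // leave everything else as plain UTF-8 text
    }
  }
  return out;
}

const sample = 'ρα & à <b>';

// Old default: every character with a named entity gets replaced.
console.log(encodeText(sample, { entities: true, basicEntities: true }));
// → &rho;&alpha; &amp; &agrave; &lt;b&gt;

// Proposed setting: only the markup-significant characters are escaped.
console.log(encodeText(sample, { entities: false, basicEntities: true }));
// → ρα &amp; à &lt;b&gt;
```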

I asked the CKEditor team to chime in, to confirm that this is the correct approach.

Status: Needs review » Needs work

The last submitted patch, 4: ckeditor_html_entities-2345037-4.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
1.38 KB
649 bytes

Updated the test coverage, will be green now.

wwalc’s picture

This default behaviour (config.entities = true) was set many years ago, in the early FCKeditor days. A long time ago UTF-8 was not that popular on websites and people set the wrong database encoding too often, so this was a remedy for common complaints about "characters being destroyed" and the like. Since Drupal is using UTF-8 correctly, there is no sense in keeping it enabled, hence this issue is valid.
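A rough illustration of the kind of damage that default was guarding against, assuming Node.js and its `Buffer` API. The function `misreadAsLatin1` is a made-up stand-in for a misconfigured storage layer that treats UTF-8 bytes as Latin-1; it is not taken from any real system:

```javascript
// Simulate a store that receives UTF-8 bytes but decodes them as Latin-1.
// `misreadAsLatin1` is a hypothetical helper for this illustration.
function misreadAsLatin1(text) {
  return Buffer.from(text, 'utf8').toString('latin1');
}

const rawText = '\u00E0';        // the character à as a literal
const entityText = '&agrave;';   // the same character as a named entity

// The literal is two bytes in UTF-8, so it comes back as two wrong characters.
console.log(misreadAsLatin1(rawText) === rawText);       // false

// The entity is pure ASCII, so the wrong decoding cannot hurt it.
console.log(misreadAsLatin1(entityText) === entityText); // true
```

With a correctly configured UTF-8 pipeline the round trip is lossless for the literal character too, which is why the entity workaround is no longer needed.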

bserem’s picture

Status: Needs review » Reviewed & tested by the community

Oh man... I couldn't figure out where to control CKEditor's settings.
The patch in #6 works for me, and keeps the greater-than/less-than signs and the ampersand as HTML entities, as it should.

I'm moving this on to "reviewed".

As a side note: on D7 with the ckeditor module, this functionality was controllable from the UI. I never liked it, since it never made sense for Greek users. Do you believe we should re-implement something like that?
From wwalc's comment I understand that we shouldn't.

PS: Wim, will you be in Amsterdam next week?

Wim Leers’s picture

No, we shouldn't have a setting for this in the UI, because we should just always use UTF-8! :)

Yes, I'll be in Amsterdam! And so will wwalc, by the way. He's presenting about CKEditor in Drupal 8!

bserem’s picture

Good, see you both there then. I haven't found the equivalent of last year's beer museum yet though :P

On to the next bug...

Wim Leers’s picture

Another language-and-WYSIWYG-related bug is #2318237: CKEditor translates its user interface even if interface translation is turned off. If you could roll a patch for that one, I'll definitely review it :)

bserem’s picture

I'll have a look at it. I'm not a patch master, but I'll see what I can do. Thanks

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Committed and pushed to 8.x. Thanks!

  • webchick committed b693734 on 8.0.x
    Issue #2345037 by Wim Leers, bserem: Fixed Configure CKEditor to not...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.