Summary

We need some way of mapping Drupal languages to CKEditor translations/languages.

Original report

From #2036253-12: Update CKEditor library to 4.2:

Well I'm not sure we want to pick up whatever is offered to us, because we don't want to ship CKEditor translations for languages we're currently not supporting with a Drupal translation, such as currently sr-latn, fr-ca or en-** and probably more to come. So optimally we would be using LanguageManager::getStandardLanguageList.

Either way we need to find ways to match language codes like pt-pt => pt, no => nn, zh-cn => zh-hant, and zh => zh-hans (Let this be a followup, though).

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Wim Leers’s picture

Wim Leers’s picture

Assigned: Unassigned » Gábor Hojtsy

I've asked Gábor Hojtsy to look at this, who's in a much better position than me to decide how this should work. The general rule should be (AFAICT) that CKEditor should follow Drupal, so if CKEditor supports a language that Drupal doesn't support, it will never be used.

Compare the list of languages, and specifically their corresponding language codes, of Drupal\Core\Language\LanguageManager::getStandardLanguageList() with that in core/assets/vendor/ckeditor/lang:

wim.leers at wimleers-acquia in ~/Work/drupal-tres on 8.x*
$ ls core/assets/vendor/ckeditor/lang
_translationstatus.txt cs.js                  en-gb.js               fi.js                  hi.js                  ka.js                  mn.js                  pt.js                  sr-latn.js             vi.js
af.js                  cy.js                  en.js                  fo.js                  hr.js                  km.js                  ms.js                  ro.js                  sr.js                  zh-cn.js
ar.js                  da.js                  eo.js                  fr-ca.js               hu.js                  ko.js                  nb.js                  ru.js                  sv.js                  zh.js
bg.js                  de.js                  es.js                  fr.js                  id.js                  ku.js                  nl.js                  si.js                  th.js
bn.js                  el.js                  et.js                  gl.js                  is.js                  lt.js                  no.js                  sk.js                  tr.js
bs.js                  en-au.js               eu.js                  gu.js                  it.js                  lv.js                  pl.js                  sl.js                  ug.js
ca.js                  en-ca.js               fa.js                  he.js                  ja.js                  mk.js                  pt-br.js               sq.js                  uk.js
Gábor Hojtsy’s picture

Issue tags: -Spark

I agree with "The general rule should be (AFAICT) that CKEditor should follow Drupal, so if CKEditor supports a language that Drupal doesn't support, it will never be used".

Otherwise Drupal core already has a browser language code to internal language code mapping table, see that in locale module's config. That may be possible to reuse for this. May be the easiest instead of making up yet one more system.

Wim Leers’s picture

#3: can you point me to where that mapping lives?

Gábor Hojtsy’s picture

Issue tags: +Spark

This is where the mapping lives: core/modules/language/config/language.mappings.yml

Gábor Hojtsy’s picture

Ok, so the bad news is the situation is different with each missing language. Hah! I went through each language code in CK and looked at Drupal language codes. Here are the missing ones with analysis:

-----
km: This language is not listed in Drupal core but is now available in https://localize.drupal.org/translate/languages/km and has some translations, so it should be in core.

ms: This language is not listed in Drupal core but is now available in https://localize.drupal.org/translate/languages/ms and has some translations, so it should be in core.

no: This is a legacy language code for Norvegian stuff. We should likely add a mapping in core/modules/language/config/language.mappings.yml to nb (which is the main Norvegian code that Drupal understands).

en-au, en-ca: These were not proposed for localize.drupal.org, so they are not there. This does not stop sites to add these languages, which would immediately make ck's language handling kick in I believe.

en-gb: Is available on https://localize.drupal.org/translate/languages/en-gb however no translations for Drupal 6 or 7 or 8. This makes it pretty impossible to offer in Drupal core, since you connot display it in the installer and expect it will download a translation file. Such languages have been recently removed from core inclusion by default.

pt: Drupal has pt-pt for this. I think we can add a mapping to core/modules/language/config/language.mappings.yml on pt to pt-pt to resolve this. That will also be somewhat useful for browser detection, etc.

en: Drupal has this language configured by default, I assume CK should fall back on this language itself if the langcode used by Drupal is not available with CK.

sr-latn: Drupal has sr (which CK also has!), but sr-latn is a different script; Drupal does not have same language-different script translations, so since the whole of Drupal would not be translatable to sr-latn anyway, I'm not sure it is an issue that this will not be useful.

fr-ca: Drupal only has fr, no variants of it.

zh-cn and zh: Drupal has zh-hans and zh-hant; this is where core/modules/language/config/language.mappings.yml may be useful; that maps the codes proper to these

-----

So in short, looks like

1. The CK language mapper should have a full list of the CK language codes so it can decide which one to pick on the server side AFAIS. Apply a multi-pass process where different options are looked at starting with the full Drupal language code and then do transformations as explained later here to see if that matches.

2. Resolve pt, no, zh-cn and zh: pt => pt-pt, no => nb, zh => zh-hans mappings should be added to core/modules/language/config/language.mappings.yml. CK should use these mappings *in reverse* to map Drupal language codes to CK codes if there is no direct match for the langcode.

3. Resolve km and ms: km and ms need to get into Drupal core, they are now on l.d.o.

4. Defer en-* and sr-*: I think en-gb, en-ua, en-ca, fr-ca and sr-latn localizations of CK will only be accessible if the user has this manually configured on the site; this makes total sense IMHO.

Sounds like this is at least 2 different core issues. 3 should be one to refresh the language list from l.d.o to core with a new static list. Then 2 and 3 may be resolved here.

Wim Leers’s picture

Issue tags: +sprint

#6: awesome analysis, thank you! Maybe you want to work on this? (You probably know your way around this codebase very well.) The only language/translation-related thing we do is at Drupal\ckeditor\Plugin\Editor\CKEditor::getJSSettings():

  public function getJSSettings(EditorEntity $editor) {
    $language_interface = language(Language::TYPE_INTERFACE);

    …

    // Next, set the most fundamental CKEditor settings.
    $external_plugins = $this->ckeditorPluginManager->getEnabledPlugins($editor);
    $settings += array(
      …
      'language' => $language_interface->id,
     );

     …

So you just have to change those one or two lines of code, so that they call this yet-to-be-written CKEditor language mapper.

If you can't, that's fine too.

Gábor Hojtsy’s picture

I can definitely take a look!

Gábor Hojtsy’s picture

Gábor Hojtsy’s picture

FileSize
1.69 KB

Here is a very basic initial version. It falls back on English, if CK does not support the language we want. It does not yet do more sophisticated mapping.

Gábor Hojtsy’s picture

Status: Active » Needs review

Status: Needs review » Needs work

The last submitted patch, ck-langcode-v1.patch, failed testing.

Gábor Hojtsy’s picture

Status: Needs work » Needs review
FileSize
1.64 KB

CK changed to some degree. Here is a reroll.

Status: Needs review » Needs work
Issue tags: -sprint, -Spark, -CKEditor in core

The last submitted patch, ck-langcode-v1r.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review

#13: ck-langcode-v1r.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, ck-langcode-v1r.patch, failed testing.

Gábor Hojtsy’s picture

Status: Needs work » Needs review

#13: ck-langcode-v1r.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +sprint, +Spark, +CKEditor in core

The last submitted patch, ck-langcode-v1r.patch, failed testing.

Gábor Hojtsy’s picture

Status: Needs work » Needs review
FileSize
1.63 KB

Languages don't have an id() method looks like.

Gábor Hojtsy’s picture

Several updates:

- injected the language manager to get interface language since calling language() directly is deprecated
- refactored the language list compilation to a method on its own
- added caching to this, so we don't need to traverse files all the time (cache clearing still @todo)
- added the language module based mapping reversal into this language list, so we don't need to do a multi-pass matching later on; we have a complete matching map in the array this way
- injected the module handler as well, so we can call out to language module for the mapping

I'm not sure how to elegantly clear the cache honestly. It would need to be cleared when:

1. Core is updated (and ck languages may change)
2. The browser mapping changes which can be a UI based operation (form submit) or config deployment

I think (1) already clears caches widely, no? And then (2) would need an event listener on config saves to look for when the browser mappings change. That sounds a bit too heavy handed for this... So two possibilities:

A. We only cache the list from the file system and always dynamically add the mapping elements
B. We cache based on time, so it will eventually expire proper

Any bright thoughts?

Status: Needs review » Needs work

The last submitted patch, ck-langcode-complete-20.patch, failed testing.

Wim Leers’s picture

+++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/Editor/CKEditor.php
@@ -45,17 +62,30 @@ class CKEditor extends EditorBase implements ContainerFactoryPluginInterface {
+    return new static(
+      $configuration, ¶
+      $plugin_id, ¶
+      $plugin_definition, ¶
+      $container->get('plugin.manager.ckeditor.plugin'), ¶
+      $container->get('module_handler'),
+      $container->get('language_manager')

Trailing spaces.

+++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/Editor/CKEditor.php
@@ -228,6 +266,38 @@ public function getJSSettings(EditorEntity $editor) {
+  private function getLangcodes() {

We probably want test coverage for this method? :)

+++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/Editor/CKEditor.php
@@ -228,6 +266,38 @@ public function getJSSettings(EditorEntity $editor) {
+      // Get languagr mapping if available to map to Drupal language codes.

s/languagr/language/


I don't have experience with the new EventDispatcher class that we're using from Symfony (already being used by e.g. ConfigEvent/ConfigContext), but it seems it's appropriate that a TranslationEvent or LanguageEvent is fired whenever either new translations are imported, or browser mappings are changed?

Gábor Hojtsy’s picture

I don't have experience with the new EventDispatcher class that we're using from Symfony (already being used by e.g. ConfigEvent/ConfigContext), but it seems it's appropriate that a TranslationEvent or LanguageEvent is fired whenever either new translations are imported, or browser mappings are changed?

This honestly seems overly complex to me. I mean we can add an event listener for the config change and another one for config import to cover bases, but its two more mostly empty classes just so we can clear a derived cached representation of the data that is otherwise retrieved from a quick access store anyway. Sounds to me like we may be better off to do this derivative data generation on the cached data, so we can rely on the upgrade cache clear that would clear all bins(?).

Wim Leers’s picture

Coming from a C++/Qt background, where you have "signals" and "slots", this actually makes a lot of sense to me, and sounds as simple as it could possibly be.

The core update case isn't problematic I think, and if it were only that we needed to react to, I'd agree what I said is overkill. But how are you going to react to a changed browser mapping if not through an event? Inject another submit handler? Or are you saying there already is a ConfigEvent for this, and it is possible to react to just the ConfigEvent for changing the browser mapping? If that's the case, then yes, just listen for ConfigEvents, check if it's the relevant one, and clear the cache if that's the case :)

Gábor Hojtsy’s picture

Status: Needs work » Needs review

#20: ck-langcode-complete-20.patch queued for re-testing.

Damien Tournoud’s picture

Don't we want to handle CKEditor translation ourselves? I would be horribly inconsistent to be able to translate everything *except* CKEditor strings in the translation admin. Plus, it would make little sense to have a different way of translating different kinds of CKEditor plugins.

We are already handling JQuery UI's translation ourselves, so there is a precedent here.

Status: Needs review » Needs work

The last submitted patch, ck-langcode-complete-20.patch, failed testing.

Gábor Hojtsy’s picture

@Damien: Wait, how do we handle jQuery UI translations ourselves? Looks like I missed something?

Damien Tournoud’s picture

Gábor Hojtsy’s picture

@Damien: I looked into this over the weekend. The translation files for CKE have the following structure:

CKEDITOR.lang['cs']={"undo":{"redo":"Znovu","undo":"Zpět"},"toolbar":{"toolbarCollapse"...

It is essentially a nested map of string related to certain functions. There are 29 groups of strings for each language, but there are multiple strings per group as shown here, so in total there are hundreds of strings. We can obviously redo these files ourselves like this:

CKEDITOR.lang['cs']={"undo":{"redo":Drupal.t('Redo'),"undo":Drupal.t('Undo')},"toolbar":{"toolbarCollapse"...

Then we still need to ensure that CK would take the language files from somewhere else but not its default location. I did not find ways to make CK understand we would want to feed it translations from outside. Eg. see http://docs.cksource.com/ckeditor_api/symbols/CKEDITOR.lang.html

Still then, this would mean we push hundreds more untranslated source text to Drupal translators even though the translations are already available upstream. So even if this would be possible, I'm not sure it would be actually a good move in terms of our productivity. We rely on the upstream work to not need to redo stuff :)

Damien Tournoud’s picture

@Gábor: we can and should mass import the upstream translations anyway, right?

Gábor Hojtsy’s picture

@Damien: Drupal translations are the domain of each translation team. At best we could somehow provide them the CK translations as a suggestion. But I'm not sure how that could work. I'm definitely not signing up to write scripts to convert these JS arrays to .po files and then import them in some way to localize.drupal.org as suggestions and do that anytime the CK codebase is updated in core, etc. Sounds like a lot of overhead where I don't see much gain to be had.

Gábor Hojtsy’s picture

Status: Needs work » Needs review
FileSize
2.8 KB
6.22 KB

Working forward on the above patch for now. Reorganized language code list caching, so we don't cache the browser mapping configuration as applied to the list. That should be quick to produce (unlike file directory listing) and we don't need to add lots of overhead to the codebase for this.

Status: Needs review » Needs work

The last submitted patch, ck-langcode-complete-33.patch, failed testing.

Gábor Hojtsy’s picture

Status: Needs work » Needs review
FileSize
6.94 KB
2.77 KB

Fixed several PHP issues and registered the ck cache service. Makes things pass on some spot-check. Let's go testbot :)

Wim Leers’s picture

Reviewed & rerolled. I also applied the suggested additional language mappings of #6.

  1. +++ b/core/modules/ckeditor/ckeditor.services.yml
    @@ -2,3 +2,10 @@ services:
    +  cache.ckeditor:
    

    Maybe rename this to cache.ckeditor.translations?

    Because "a CKEditor cache" doesn't make any sense in my mind :)

  2. +++ b/core/modules/ckeditor/ckeditor.services.yml
    @@ -2,3 +2,10 @@ services:
    +    arguments: [ckeditor]
    

    This results in a new cache_ckeditor table. Which will always contain just one row. Isn't this a bit overkill?

    I'm fine with it in principle — I prefer logical separation. I just don't know if this considered acceptable within Drupal core?

  3. +++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/Editor/CKEditor.php
    @@ -201,13 +229,23 @@ public function getJSSettings(EditorEntity $editor) {
    +    if (isset($ckeditor_langcodes[$language_interface->id])) {
    +      $display_langcode = $language_interface->id;
    +    }
    
    @@ -227,6 +265,48 @@ public function getJSSettings(EditorEntity $editor) {
    +    // Get language mapping if available to map to Drupal language codes.
    +    // This is configurable on the user interface and not expensive to get,
    +    // so we don't include in the cached language list.
    

    I think this is wrong?

    First it checks if a mapping is available. Then it doesn't use the mapping.

    Fixed in patch.

  4. +++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/Editor/CKEditor.php
    @@ -227,6 +265,48 @@ public function getJSSettings(EditorEntity $editor) {
    +  private function getLangcodes() {
    

    Do we ever make something private? I thought we were only supposed to protected?

  5. +++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/Editor/CKEditor.php
    @@ -227,6 +265,48 @@ public function getJSSettings(EditorEntity $editor) {
    +        $langcode = basename($language_filename);
    

    The way basename() is used here, you would never get a match: you'd get af.js, nl.js, etc.

    I fixed this.

  6. +++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/Editor/CKEditor.php
    @@ -227,6 +265,48 @@ public function getJSSettings(EditorEntity $editor) {
    +        $langcodes[$langcode] = $langcode;
    

    Why use an associated array? Why is a simple array not enough? I think you chose to do this because further down, you expand it with Drupal language codes.

Wim Leers’s picture

Issue tags: +Needs tests

The code is close, what is still needed is test coverage.

Status: Needs review » Needs work

The last submitted patch, ck-langcode-2050097-36.patch, failed testing.

Wim Leers’s picture

Adding the additional language mappings broke the test coverage for that apparently.

Gábor Hojtsy’s picture

Added specific "unit" test coverage for this method. It needs language module enabled and default language config added, so that its a unit test is debatable... Also fixed the direction of the pt-pt => pt mapping, Drupal has pt-pt so that should be the other way around.

Did not look into fixing the detection tests, we should confirm the test cases are not covering important use cases that we should cover otherwise now that we expanded the mappings.

As for Wim's questions:

1/2: on cache: we can use state() for storing this language list if you feel that is better; technically this is a cache and ckeditor module owns it; I don't see a problem with that or that we'd need to make the name more specific
3: thanks for the fix.
4: it needed to be public to be "unit testable" anyway, so now public (sorry, not included in interdiff, screwed that up :D)
5: thanks!
6: I think you figured out that this is a mapping where the values are not the same as the keys, that is the point :)

Gábor Hojtsy’s picture

Status: Needs work » Needs review
Wim Leers’s picture

1. I still think we should rename the service because "a CKEditor cache" doesn't make sense. But that is a tiny nitpick about an implementation detail most people won't ever see.
2. Okay :)
4. Yes, sadly!


The added test coverage looks sufficient to me. If it comes back green and you think 1. shouldn't be changed, then I'll RTBC.

Wim Leers’s picture

Issue tags: -Needs tests

.

Status: Needs review » Needs work

The last submitted patch, ck-langcode-2050097-40.patch, failed testing.

Gábor Hojtsy’s picture

Status: Needs work » Needs review
FileSize
2.63 KB
10.36 KB

Its funny, the fails with the language detection actually exposed a bug with the implementation for browser negotiation. Huh! Since this patch introduced some more mappings, the examples in the tests now had multiple languages that have resolved to the same internal langcode. Since in this case the old code took the last value (which usually has the smaller qvalue (priority) as well), the result for the negotiation changed. We should fix the language negotiation code to take the highest qvalue (priority) in this case for the langcode and that nicely fixes the issue. See that part of the interdiff.

Also fixed the ckeditor cache name.

Any other concerns? :)

Wim Leers’s picture

Status: Needs review » Reviewed & tested by the community

Wow, nice find! :D

RTBC if green :)

Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Needs review
Issue tags: +D8MI, +language-base

Putting on the D8MI language-base tags for the cross-over to D8MI bugfixing land :)

Gábor Hojtsy’s picture

Status: Needs review » Reviewed & tested by the community

Crosspost.

Gábor Hojtsy’s picture

#45: ck-langcode-2050097-45.patch queued for re-testing.

webchick’s picture

Priority: Normal » Major
Status: Reviewed & tested by the community » Fixed

Since this is a blocker for translations, escalating this to major.

This looks good to me. Checked through the code, but nothing I could find to complain about.

Committed and pushed to 8.x. Thanks!

Wim Leers’s picture

Issue tags: -sprint

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary: View changes

Fixed some chars

ljcarnieri’s picture

Issue summary: View changes

Hi guys, I found one problem when I create custom Language.

I related the issue in this issue and one possible solution, anybody help me with review of the solution?

https://www.drupal.org/project/drupal/issues/3076413

Thanks and Regards.