Problem/Motivation
Parts of the code are in French and Dutch, and possibly other languages.
Overall, real words in foreign languages should be handled on a per-sentence or per-test basis, not on a per-word basis.
So the objective is to find all the foreign-language words in the dictionary and treat them separately from other spelling issues.
Proposed resolution
Review dictionary.txt and list the foreign words.
Decide how to proceed: ignore the words, change them, or create a separate dictionary.
Remaining tasks
Make a patch. There is no need to test just yet.
User interface changes
API changes
Data model changes
Release notes snippet
| Comment | File | Size | Author |
|---|---|---|---|
| #5 | 3333267-5.patch | 1.07 KB | quietone |
Issue fork drupal-3333267
Comments
Comment #2
lucienchalom commented
Comment #3
lucienchalom commented
Comment #4
quietone commented
The proposal in the related issue is to add a dictionary for Drupal, drupal.txt. The non-English words could be added there. But it might also help to have these in a separate dictionary for reference when writing tests. So, we could have drupal.txt for drupalisms and technical words and drupal-non-english for non-English words. The new dictionary 'drupal-non-english' would be small, but I think it is a good distinction to make.
I am removing the tag because I made that just for the issues that are removing words, as in 'remove N words ..'.
Comment #5
quietone commented
A possible way to do this is to move the non-English words to a separate dictionary. I started a patch for this, which is kind of funny because I am fluent in only one language. I haven't updated dictionary.txt or cspell.json because there needs to be agreement on this approach.
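The move described in this comment could be sketched roughly as follows. This is only an illustration in Python, not what the patch in #5 does: the function name `split_dictionary` is hypothetical, and the list of non-English words has to be supplied by a reviewer, since nothing here detects languages automatically.

```python
from typing import Iterable


def split_dictionary(
    words: Iterable[str], non_english: Iterable[str]
) -> tuple[list[str], list[str]]:
    """Partition dictionary entries into (kept, moved).

    `words` are the lines of the main dictionary; `non_english` is a
    hand-reviewed list of entries to move to a separate dictionary.
    Matching is case-insensitive; blank lines are dropped.
    """
    non_english_set = {w.strip().lower() for w in non_english if w.strip()}
    kept: list[str] = []
    moved: list[str] = []
    for raw in words:
        word = raw.strip()
        if not word:
            continue
        if word.lower() in non_english_set:
            moved.append(word)
        else:
            kept.append(word)
    return sorted(kept), sorted(moved)
```

The two returned lists would then be written back to dictionary.txt and to the new dictionary file. Registering that new file in cspell.json is a separate step that this sketch does not cover.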
Comment #6
lucienchalom commented
I agree with a separate dictionary for drupalisms and another for non-English words.
I can help with Portuguese text; that's my first language.
Comment #8
quietone commented
First, a fact. Currently, the dictionary has about 15 valid non-English words. That is a small fraction of the 943 words in the dictionary today, which is less than half of what it was when it was first committed.
After working on spelling issues for a while now, I don't think it is possible to handle valid words from other languages consistently. Such words can easily be retained in a few files with a cspell:ignore line. That gets these words out of the dictionary and associates them only with the files or subsystems where they are used. This is what is done for many words in the dictionary, whether they are valid non-English words, valid technical terms, or incorrect spellings of English words.
What makes a standard approach impossible is that Umami uses valid non-English words in CSV files, which can't have a cspell:ignore line. These words will always have to be maintained in a dictionary.
I don't think there is anything to do here. I suggest we close this as works as designed and continue to do our best to be respectful of multiple languages.
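The distinction drawn in this comment can be illustrated with a small Python sketch. It assumes a mapping of file paths to file contents is already available. The names `classify_words` and `NO_COMMENT_SUFFIXES` are hypothetical, and the substring matching is deliberately crude.

```python
from pathlib import PurePosixPath

# Assumption: file types that cannot carry a cspell:ignore comment.
NO_COMMENT_SUFFIXES = {".csv"}


def classify_words(words, files):
    """Split words into per-file ignore candidates and dictionary words.

    `files` maps a path to its text content. A word used in at least one
    file that cannot hold a comment must stay in a dictionary; a word used
    only in other files is a candidate for a cspell:ignore line there.
    Words that appear in no file are skipped.
    """
    ignore_candidates = {}
    dictionary_words = []
    for word in words:
        # Crude case-insensitive substring match, for illustration only.
        users = [
            path for path, text in files.items()
            if word.lower() in text.lower()
        ]
        if not users:
            continue
        if any(PurePosixPath(p).suffix in NO_COMMENT_SUFFIXES for p in users):
            dictionary_words.append(word)
        else:
            ignore_candidates[word] = users
    return ignore_candidates, dictionary_words
```

Under these assumptions, every word that ends up in `dictionary_words` corresponds to the Umami CSV case above, and the remaining words map to the files where a cspell:ignore line could be added.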
Comment #9
smustgrave commented
+1 to closing out per the explanation in #8.
Comment #11
smustgrave commented
If anyone disagrees, please reopen and leave a comment explaining why, but I still agree with @quietone's assessment in #8.