Handle valid words from other languages consistently in cspell [#3333267]

Problem/Motivation

Parts of the code contain words in French and Dutch, and possibly other languages.
Overall, real words in foreign languages should be handled on a per-sentence or per-test basis, not on a per-word basis.
So the objective is to identify all the foreign-language words in the dictionary and treat them separately from other spelling issues.

Proposed resolution

Review dictionary.txt and list the foreign words.
Decide how to proceed: ignore the words, change them, or create a separate dictionary.

Remaining tasks

Make a patch. There is no need to test just yet.

User interface changes

API changes

Data model changes

Release notes snippet

Comment	File	Size	Author
#5	3333267-5.patch	1.07 KB	quietone

Issue fork drupal-3333267
Branch: 3333267-handle-valid-words

Comments

Comment #1

13 January 2023 at 10:35

lucienchalom created an issue. See original summary.

Comment #2

lucienchalom commented 13 January 2023 at 10:55

Issue summary:

View changes

Comment #3

lucienchalom commented 13 January 2023 at 12:34

Issue summary:

View changes

Comment #4

quietone commented 15 January 2023 at 03:19

Component:	base system	» other
Issue summary:	View changes
Issue tags:	-cspell error
Related issues:		+#3328741: Add a dictionary for Drupal-specific words

The proposal in the related issue is to add a dictionary for Drupal, drupal.txt. The non-English words could be added there. But it might also help to have these in a separate dictionary for reference when writing tests. So, we could have drupal.txt for drupalisms and technical words and drupal-non-english for non-English words. The new dictionary 'drupal-non-english' would be small but I think it is a good distinction to make.

I am removing the tag because I created it only for the issues that remove words, as in 'remove N words ..'.

Comment #5

quietone commented 16 January 2023 at 08:18

Status	File	Size
new	3333267-5.patch	1.07 KB

A possible way to do this is to move the non-English words to a separate dictionary. I started a patch for this, which is kind of funny because I am fluent in only one language. I haven't updated dictionary.txt or cspell.json because there needs to be agreement on this approach first.
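As a rough illustration of the split described in this comment, the sketch below partitions a word list into two lists, given a hand-maintained set of non-English words. The function name `split_dictionary` and the sample words are hypothetical; this is not the content of the attached patch.

```python
def split_dictionary(words, non_english):
    """Partition words into (remaining, moved), preserving input order.

    Matching is case-insensitive. The set of non-English words is
    assumed to be curated by hand; nothing here detects language.
    """
    lookup = {w.lower() for w in non_english}
    remaining, moved = [], []
    for word in words:
        if word.lower() in lookup:
            moved.append(word)
        else:
            remaining.append(word)
    return remaining, moved


# Hypothetical sample data, not taken from core's dictionary.txt.
words = ["drupalism", "bonjour", "configurability", "fiets"]
remaining, moved = split_dictionary(words, {"bonjour", "fiets"})
```

With this sample, `remaining` would hold the words that stay in the main dictionary and `moved` the candidates for a separate non-English list.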

Comment #6

lucienchalom commented 16 January 2023 at 12:18

I agree with a separate dictionary for drupalisms and another for non-English words.
I can help with Portuguese text, since that's my first language.

Comment #7

16 January 2023 at 12:18

Version:	10.1.x-dev	» 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #8

quietone commented 9 February 2024 at 05:47

Status:	Active	» Needs review

First, a fact. Currently, the dictionary has about 15 valid non-English words. That is a small fraction of the 943 words in the dictionary today, which is less than half of what it was when it was first committed.

After working on spelling issues for a while now, I don't think it is possible to handle valid words from other languages consistently. Such words can easily be retained in a few files with a cspell:ignore line. That gets these words out of the dictionary and associates them only with the files or subsystems where they are used. This is what is done for many words in the dictionary, whether they are valid non-English words, valid technical terms, or incorrect spellings of English words.

What makes a standard approach impossible is that Umami uses valid non-English words in csv files, which can't have a cspell:ignore line. These words will always have to be maintained in a dictionary.
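The distinction drawn above can be sketched as a small check: a word can leave the dictionary only if every file that uses it is able to carry an inline cspell:ignore comment. This is a minimal sketch under stated assumptions, not part of core tooling; the helper name, the suffix set, and the paths are hypothetical.

```python
from pathlib import PurePosixPath

# Assumption: CSV is the only file type here that cannot hold an
# inline cspell:ignore comment.
NO_COMMENT_SUFFIXES = {".csv"}


def must_stay_in_dictionary(files_using_word):
    """Return True if any file using the word cannot hold an inline ignore."""
    return any(
        PurePosixPath(path).suffix.lower() in NO_COMMENT_SUFFIXES
        for path in files_using_word
    )


# Hypothetical paths, for illustration only.
in_csv = must_stay_in_dictionary(["content/recipes.csv", "tests/FooTest.php"])
php_only = must_stay_in_dictionary(["tests/FooTest.php"])
```

Here `in_csv` is true because one of the files is a CSV, while `php_only` is false because a PHP file can carry the inline comment.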

I don't think there is anything to do here. I suggest we close this as works as designed and continue to do our best to be respectful of multiple languages.

Status:	Needs review	» Closed (works as designed)

If anyone fully disagrees, please reopen and leave a comment explaining why, but I still agree with @quietone's assessment in #8.
