The form for configuring and translating numeric fields has been copied multiple times: NumericField::buildOptionsForm(), TranslateEditForm::buildForm(), PluralVariants::getSourceElement(), and PluralVariants::getTranslationElement(). (That in itself is a problem -- it should be centralized.)
The problem is that the labels used in this form are not clear. Not only that, they are not appropriate for many languages. The reason is that while English, Spanish, etc. have two forms for plural expressions (singular, plural), many other languages have either one form, multiple forms, or two forms that are not "singular" and "plural".
Here's an easy-to-read list of the rules:
And here's a more definitive set of rules that is not as easy to code or follow, but should be more accurate since it comes from Unicode.org:
This related issue demonstrates real-world problems that resulted from our localization teams misunderstanding the labels (the same labels are used on localize.drupal.org). These teams are the best experts we have in the Drupal community and can be taught how to read the labels, yet even they are having difficulty. For ordinary users outside any localization community who are translating their own Drupal site, the problem would be even worse.
So, we need to fix this!
Next: some examples of actual observed problems arising from these labels on localize.drupal.org from that other issue.
Languages with only 1 plural form
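In this case the label for the translation is "Singular form". Bad! The label suggests the form is only used for a count of 1, so translators enter a literal "1" instead of "@count".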
So we are seeing output like:
msgid_plural "@count hours"
msgstr "1 сағат"
With this translation, '12 hours' will be translated as '1 сағат'.
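A minimal sketch of why this goes wrong, written in Python rather than as Drupal code. The `format_plural` helper and its `@count` substitution are simplified stand-ins invented for illustration, and `single_form` assumes a language whose plural rule always selects form 0:

```python
def format_plural(count, forms, plural_index):
    """Pick the form for count and substitute the @count placeholder."""
    return forms[plural_index(count)].replace("@count", str(count))


def single_form(n):
    # Assumed rule for a language with one plural form: always form 0.
    return 0


# What translators entered under the "Singular form" label.
broken = ["1 сағат"]
# What a single-form language needs: the placeholder, not a literal 1.
fixed = ["@count сағат"]

print(format_plural(12, broken, single_form))  # 1 сағат
print(format_plural(12, fixed, single_form))   # 12 сағат
```

Because the only form has to serve every count, a literal number in it is always wrong except for that one count.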
Languages with 2 plural forms
For all languages with 2 plural forms, we are currently using these labels:
- Singular form
- Plural form
This is OK for English, Spanish, French, etc. But there are languages where the 1st form is actually for all numbers that end in 1 (Icelandic, for instance), or only for 0, with the 2nd form covering all non-zero numbers (Javanese). So for instance, here's a translation from the current Javanese po file:
msgid_plural "@count comments"
msgstr "1 komèntar"
msgstr "@count komèntar"
The result is that '0 comments' will be translated as '1 komèntar' in Javanese, because the first plural form is used for 0 and the second for all non-zero numbers. This is not correct.
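The mapping from count to form can be sketched as follows. The two index functions are assumptions modeled on gettext-style plural rules that match the descriptions above (Javanese: form 0 only for zero; Icelandic: form 0 for numbers ending in 1, except 11). They are illustrative Python, not the code Drupal runs:

```python
def javanese_index(n):
    # Assumed rule: form 0 for zero, form 1 for everything else.
    return int(n != 0)


def icelandic_index(n):
    # Assumed rule: form 0 for numbers ending in 1 (except 11), else form 1.
    return int(n % 10 != 1 or n % 100 == 11)


# The two translations from the Javanese example above.
forms = ["1 komèntar", "@count komèntar"]


def javanese_text(n):
    """Render the Javanese translation for a given count."""
    return forms[javanese_index(n)].replace("@count", str(n))


for n in (0, 1, 5):
    print(n, "->", javanese_text(n))
# 0 -> 1 komèntar
# 1 -> 1 komèntar
# 5 -> 5 komèntar

# Counts below 35 that select the first ("Singular") form in Icelandic.
print([n for n in range(35) if icelandic_index(n) == 0])  # [1, 21, 31]
```

In both languages the first form covers something other than "exactly one", which is why a "Singular form" label invites a literal "1".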
Languages with more than 2 plural forms
The labels in the UI currently are:
- 'Singular form'
- 'First plural form'
- '2. plural form'
- '3. plural form'
The problem here is that in many languages, the first form is for something like "Numbers that end in 1", not really singular. Since the English form they are translating will say something like "1 item", it's likely they will put "1 item" in their language as well for the first form, instead of using "@count item" or the equivalent.
What is really needed are language-specific labels. For Russian, the labels might be something like:
- Form for numbers ending in 1 but not 11
- Form for numbers ending in 2, 3, 4, but not 12, 13, 14
- Form for all other numbers
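To make the three Russian labels concrete, here is a small Python sketch of the rule they describe. The function name is hypothetical and the conditions are an assumed gettext-style formulation of the rule, not taken from Drupal's code:

```python
from collections import defaultdict


def russian_index(n):
    """Assumed gettext-style plural rule for Russian (three forms)."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # ends in 1, but not 11
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1  # ends in 2, 3, 4, but not 12, 13, 14
    return 2  # all other numbers


# Group the counts 0..24 by the form they select.
groups = defaultdict(list)
for n in range(25):
    groups[russian_index(n)].append(n)

for index in sorted(groups):
    print(index, groups[index])
# 0 [1, 21]
# 1 [2, 3, 4, 22, 23, 24]
# 2 [0, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
```

Form 0 is used for 21 as well as 1, so a translation that hard-codes "1" in that slot would be wrong for 21, 31, and so on.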
Add a centralized function or method that generates the labels appropriate for a given language, and make those labels translatable into the interface language.
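One possible shape for such a helper, sketched in Python for brevity. Everything here is hypothetical: the `PLURAL_LABELS` table, the `plural_form_labels` name, the `translate` callback standing in for the interface translation system, and the wording of the fallback label. A real implementation would live in Drupal and cover every supported language:

```python
# Hypothetical label table keyed by language code.
PLURAL_LABELS = {
    "ru": [
        "Form for numbers ending in 1 but not 11",
        "Form for numbers ending in 2, 3, 4, but not 12, 13, 14",
        "Form for all other numbers",
    ],
    "en": ["Singular form", "Plural form"],
}


def plural_form_labels(langcode, form_count, translate=lambda s: s):
    """Return one label per plural form, passed through translate()."""
    labels = PLURAL_LABELS.get(langcode)
    if labels is None or len(labels) != form_count:
        # Neutral fallback that does not claim any form is "singular".
        labels = [
            f"Plural form {i + 1} of {form_count}" for i in range(form_count)
        ]
    return [translate(label) for label in labels]


print(plural_form_labels("ru", 3)[0])  # Form for numbers ending in 1 but not 11
print(plural_form_labels("xx", 1))     # ['Plural form 1 of 1']
```

Each of the four forms listed at the top of this issue could then call the one helper instead of building its own labels.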
Decide. Implement. Review.
User interface changes
We will have appropriate labels for every language.