Currently the character я is replaced by "a", not by "ya" as expected.
See #357254: Transliteration of Russian letters as a reference for correct transliteration table.
Comment | File | Size | Author |
---|---|---|---|
#20 | maintainer_—_-bash_—_80×39.png | 352.47 KB | xjm |
#12 | transliteration-ru_2932249_12.patch | 1.69 KB | Murz |
Comments
Comment #2
petrovnn
\core\lib\Drupal\Component\Transliteration\data\ru.php
Comment #3
andypost
Comment #4
andypost
Comment #5
Murz
The problem is not only with the 'я' character but also with several others: ё, ж, й, х, ч, ш, щ, ъ, ы, ь, ю, я.
This was already fixed in the 7.x version of Transliteration, but seems to have been lost when porting: #357254: Transliteration of Russian letters
So I extended the list with all the needed overrides; a patch is attached.
Comment #6
Murz
Also note that the transliteration rules from the ru.php file apply only when the Drupal interface language is Russian, so if we try to transliterate a Russian word under a non-Russian interface language, these rules will not be applied.
So we can apply these rules globally using the /core/lib/Drupal/Component/Transliteration/data/x04.php file; this will work with any interface language and transliterate all Russian characters correctly.
Comment #7
Murz
Here is another patch that fixes the problem with Russian character transliteration globally, not only for the Russian interface language.
Comment #8
andypost
I bet we should fix both!
Comment #9
Murz
Here is a combined patch that fixes the problem globally in both files.
Comment #10
andypost
It looks great to me! Much more natural to parse.
Assigning to Maintainer!
Judging by #2926187: Better Greek transliteration, this probably requires @xjm to commit.
Comment #11
amateescu (for Pfizer, Inc.)
We only need the overrides if a character needs to be transliterated differently in a specific language. See #567832-52: Transliteration in core and the next comment for a similar question and answer.
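The distinction drawn in #11, generic defaults versus per-language overrides, can be illustrated with a minimal sketch. This is illustrative Python, not Drupal's PHP implementation: the names `GENERIC`, `OVERRIDES`, and `transliterate`, as well as the language code "xx" and its override, are hypothetical. The mappings shown are only the ones discussed in this issue (я -> ya, й -> y, ц -> c, ч -> ch) plus two plain letters needed for the example.

```python
# Illustrative sketch only; all names here are hypothetical.
# GENERIC plays the role of a generic data table (such as x04.php),
# OVERRIDES the role of per-language files (such as ru.php).

GENERIC = {
    "а": "a",
    "н": "n",
    "я": "ya",
    "й": "y",
    "ц": "c",
    "ч": "ch",
}

# Only characters that one language transliterates differently from the
# generic table belong here. "xx" is a made-up language code.
OVERRIDES = {
    "xx": {"ц": "ts"},
}


def transliterate(text, langcode="en", unknown="?"):
    """Replace each non-ASCII character, preferring a language override."""
    overrides = OVERRIDES.get(langcode, {})
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)                         # ASCII passes through
        elif ch in overrides:
            out.append(overrides[ch])              # language-specific rule
        else:
            out.append(GENERIC.get(ch, unknown))   # generic default
    return "".join(out)


print(transliterate("яна"))        # generic table, any language
print(transliterate("ц", "xx"))    # override applies only for "xx"
```

With this split, a correction made in the generic table takes effect for every language code, which is why fixing the defaults alone was enough here.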
Comment #12
Murz
OK, so here is a patch that fixes errors only in the default values, without touching the overrides.
Comment #13
Murz
Comment #14
amateescu (for Pfizer, Inc.)
Nice, the patch looks good to me. Passing over to @xjm :)
Comment #15
andypost
It would be great to see this backported to 8.5.
Comment #17
Mixologic
Temporary testbot hiccup.
Comment #19
andypost
Comment #20
xjm
(In Russian:) Thank you.
Here is the color diff:
Right now I am looking at the Unicode (is that right?). One minute...
Comment #21
xjm
Okay, I read over the diff carefully, looking at the order of the actual characters in:
https://en.wikipedia.org/wiki/List_of_Unicode_characters#Cyrillic
Everything in the 0x10 through 0x40 (Russian) rows looks correct. The other rows appear to be Ukrainian, which I don't read or speak; does anyone else here on the issue? I can try to read up on it but that will take more time. :)
One small question I had about the Russian transliteration. Ц seems to be transliterated as "c". Is that normal/what Russians use when transliterating? As an anglophone I would phonetically write it as "ts".
Thanks!
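The row check described in #21 can be reproduced with a short sketch. It assumes, as the x04.php file name suggests, that the data files are split by the high byte of the Unicode code point and indexed by the low byte; the helper name `locate` is hypothetical and the code is illustrative Python, not Drupal's implementation.

```python
# Illustrative sketch; `locate` is a hypothetical helper.
# Assumption: the data file is selected by the high byte of the code point
# (0x04 for Cyrillic) and the entry by the low byte.

def locate(ch):
    """Return (bank, row, column) for a single character."""
    code = ord(ch)
    bank = code >> 8          # high byte, e.g. 0x04 -> x04.php
    offset = code & 0xFF      # low byte, position inside that file
    return bank, offset & 0xF0, offset & 0x0F


for ch in "АяЁё":
    bank, row, column = locate(ch)
    print(f"{ch} U+{ord(ch):04X}: file x{bank:02x}.php, row 0x{row:02X}, column {column}")
```

Under that assumption, А through я (U+0410 to U+044F) fall in rows 0x10 through 0x40, while Ё (U+0401) and ё (U+0451) sit in rows 0x00 and 0x50, outside the contiguous Russian range.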
Comment #22
xjm
Hm, both seem to be used a lot, with "c" about twice as frequent as "ts". "Cvety" does give me pictures of flowers though, whereas "tsvety" seems to be about some rock band. :)
So it looks like (my phonetic assumption notwithstanding) it is usually "c". So ignore my final question; just the "help please with Ukrainian review". :)
Comment #23
xjm
Ah, the only thing changed in the Ukrainian rows is ё, which is missing from the alphabetical order of the Russian, so I think this is correct.
Back to RTBC. I'll probably commit this later today.
Comment #24
xjm
Retitling since I don't think we reviewed the rest of the Ukrainian. :)
Comment #25
Chi
In your screenshot, that band is also referred to as "The flowers".
'c' and 'ts' are used interchangeably. I propose we stick to 'ts' as we did in Drupal 7.
Comment #26
Chi
Never mind, Drupal 7 actually uses 'c'.
Comment #27
andypost
This fix is for common Russian translit.
Ц is mostly written as "c" (traditionally) but sounds like "ts" (Tsar).
The same applies to Ч, which is written as "ch", but most English speakers pronounce it like "tsh" (probably because they hear it as softer than the native "change").
Comment #28
xjm
Oops, looks like I forgot to come back to this issue. :)
Thanks @andypost and @Chi, makes sense.
Fixing title capitalization and saving issue credit. I thought about whether this might just be a normal bug, but even the А vs. Я by itself is pretty disorienting, so I've kept it as major.
Comment #31
xjm
Committed and pushed to 8.6.x. Thanks! I also backported it to 8.5.x as a major bugfix.
Comment #32
Anonymous (not verified)
#20: ❤️ xjm in Russian, awesome!
#25: This would help to eliminate a lot of spelling mistakes, e.g. "буцы/бутсы -> butsy". But I completely agree with #26/#27. For example, on the site http://translit-online.ru/ you can get 240 different combinations, and this is not the limit 😱 So it's better to focus on one popular option.
Personally, the most controversial one for me is й -> y instead of j. But after long and painful internal debate, I agree that y is preferable 🙏🏻
#28: Absolutely, I had a rather amusing embarrassment when, on a page listing employee names, the name "Яна (Янина)" was displayed as "Ana", even though these are two different female names. And "Ана" is itself misspelled (the correct form is "Анна") 😯
Now translit works fine! Many thanks!
Comment #33
xjm
Adding credit for @amateescu as well for the review in #11 (thanks @amateescu)!
Comment #38
xjm
(The revert and recommit is to add @amateescu to the commit message.)