Updated: Comment #31

Problem/Motivation

As has been mentioned elsewhere, we don't yet have guidelines on when and how to use string contexts (= translation contexts). Lacking guidelines, every developer or module maintainer reinvents the wheel and decides on their own context strings. This results in a great variety of contexts, some good, some not. The lack of guidelines also makes it difficult for module and core maintainers to accept string context suggestions. As a result, discussions come to a standstill or bad contexts get accepted.

For examples of real-life string context issues, see the list of issues tagged with "string context".

Proposed resolution

Create a description of string context for both developers and translators, including criteria for string contexts and examples of good and bad contexts.

Remaining tasks

String context criteria

Define a context that:

  • Fixes an actual translation problem that has been identified due to the lack of context.
  • Allows a translator to distinguish the meaning of the string and decide on the right translation.
  • Is reusable.
  • Provides information on the meaning of the string.
  • Is short and concise (phrase rather than a full sentence).
  • Is added only to the string in its deviating meaning. The dominant use of the string does not receive a context.

Avoid the following in context definition:

  • Linguistic contexts like "Verb", "Noun" or "Dative". These usually do not provide enough distinction.
  • The literal context "Module name". Module names are usually not translated. If the module name is the dominant use of the string, it will not receive a context.
  • The name of a specific module (e.g. "commerce"). It silos the use within (groups of) modules and prevents collaboration between modules.
  • The functional part of the module where the string is used (e.g. "Block"), a function name or a PHP class name. These are not reusable and may change over time.
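
To illustrate the criteria above, here is a minimal sketch of how a (string, context) pair selects a translation. It is not Drupal's implementation: `translate_sketch()` and the `$translations_hu` table are hypothetical stand-ins, and the Hungarian translations are only illustrative. The context "Long month name" is the one mentioned in this issue for "May".

```php
<?php
// Hypothetical translation table (not Drupal's storage format).
// Translations are keyed by context first, then by source string.
// The empty context '' holds the dominant meaning of the string.
$translations_hu = array(
  '' => array(
    'May' => 'máj.',
  ),
  'Long month name' => array(
    'May' => 'május',
  ),
);

/**
 * Hypothetical stand-in for a context-aware lookup; not the real t().
 *
 * Returns the translation stored for the (string, context) pair, or the
 * untranslated source string when no such pair exists.
 */
function translate_sketch(array $translations, $string, $context = '') {
  if (isset($translations[$context][$string])) {
    return $translations[$context][$string];
  }
  return $string;
}

// Dominant use: no context is passed.
echo translate_sketch($translations_hu, 'May'), "\n";
// Deviating use: a short, reusable context that describes the meaning.
echo translate_sketch($translations_hu, 'May', 'Long month name'), "\n";
```

In Drupal 7 code the two calls would be written as `t('May')` and `t('May', array(), array('context' => 'Long month name'))`. A context such as "Noun" or a module name in the second call would not tell the translator which meaning is intended.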

Related issues

Arguments from the original report by @zirvap

  • Context is used to give additional information about strings to translators, for strings which would otherwise be difficult to translate well.
  • General rule of thumb: If a string has several possible meanings (e.g. "May", which can be both the complete month name and a three-letter abbreviation of the name), we use the bare string for one meaning, and add context to the others. For instance: "May" without any context is the abbreviation, "May" with context "Long month name" is the complete name.
  • Contexts should be reused. (For instance, a contrib module should not introduce "May" with context "Complete month name"; it should reuse the core context.)
    (#1034882: Make list of contexts used more evident for developers is necessary for this.)

Comments

spuky’s picture

It might be good to come up with a short list of common context cases that would be the first choice for a developer looking for a context. I'll try to start off with some examples that could be extended.

A Rule for Developers:
Defining context is most important for short translatable strings like "view" or "views", where it is hard for a translator using only a tool like localize.drupal.org to get contextual information.

I'd start off by defining simple contexts like:

  • verb (may be used for buttons with actions)
  • noun
  • module_name
  • name (for names in general, since they potentially will not be translated, or will be translated differently, e.g. with an uppercase first letter)

Those are highly reusable and would help a lot. I'd try to come up with more cases, but I haven't been translating that much lately...

I hope other people will come up with more examples...

the_phi’s picture

... I've been following the German translation process for months. Unfortunately I haven't had enough time to join in. Here is my tiny contribution in the form of a quick, intuitive proposal:

A firm global translation meaning tree

A semantic taxonomy/ontology in OWL/RDFS (a standardised linguistic approach).

  • Language
  • Title
  • Function
  • Descriptor
  • Property

I haven't found any useful or fitting existing namespaces (e.g. http://usefulinc.com/ns/doap, a project vocabulary) to enrich or extend pure RDF in order to make use of RDF's basic functionalities.

plach’s picture

subscribe

Gábor Hojtsy’s picture

@spuky: the difficulty of defining contexts is exactly that we should (IMHO) avoid using simplistic contexts like the ones you propose. "Verb" or "noun" do not provide real context for meaning. The word "view", if a noun, can still mean various things, like "view defined in views", "view from Rocky Mountains", "view in a hierarchical database". These are all nouns, but might be translated differently into a foreign language. I don't have an English example offhand, but one fun Hungarian example is that we use the same noun for "lightbulb" and "pear fruit", probably because they look the same. Simply saying "noun" or "verb" does not really provide sufficient context in many cases.

For using the module name, that is yet another thing I'd avoid. Clustering strings per module name is pretty bad, since it can easily spiral into killing collaboration between modules and will lead to inconsistencies in translation.

Some strings that needed context in core include "strong" (for important and the strong HTML tag) and the above mentioned May (for short and long month name). Contrib examples include "check" for "check as in form of payment" and "check as in checking out". The context should provide supplemental information about the meaning of the string.

the_phi’s picture

... for [very] professional results one should probably take XLIFF into consideration as well:

Sabine Mehrens’s picture

Shouldn't the module be an option as a context, maybe even first choice?
Or would the module be there, anyway?

Gábor Hojtsy’s picture

@Sabine: no, the word "view" means the same in Views, Views slideshow, Views whatever modules and its use should not be siloed into module specific namespaces or it kills efforts for translators to establish a unified terminology. Contexts should provide supplemental information about the meaning of the string. Many strings have a whole ecosystem of modules using them with the same meaning.

spuky’s picture

But if I tag the string that is supposed to be translated (or not) with the tag "module_name" (not the actual module name; programmers tend to replace variables, but "module_name" is not a variable!), then Views, Views slideshow, Views whatever, CCK or whichever module has the string "views" meant as a module name could tag it that way. So even within the Views module you could distinguish between the module name and other usages of "view".

So if module names all over the place were tagged with "module_name", a translator could see that this is a module name and should be translated in a special way (in German we try to keep the English names to make modules easier to identify).

I agree that "noun" and "verb" don't reveal much context, but more than none. So in your pear example one module could have:
t("körte", array(), array("langcode" => "hu", "context" => "noun, fruit"));
and another:
t("körte", array(), array("langcode" => "hu", "context" => "noun, lighting"));

When translating that, I would in both cases see that it is a noun and get information on the context. Of course a context should be short, so as not to bloat the system.

One thing is to come up with standards for possible contexts (that don't force developers to learn a whole lot when trying to add context information to their t() functions).

The other thing is to build a feedback loop between struggling translators and the developers...

zirvap’s picture

I agree, there are some cases where "module name" should be a context. Example: The string "location" is used in core to mean location in the file system, and in GMAP and Location (and probably other geo/mapping modules) to mean geographical location. Those are two different words in Norwegian, so we need a context for "Geographical location". BUT: There's also a string for the module name "Location", and (at least until now) the policy in a lot of translation communities is to not translate module names. If we want to keep that (and I think we do), then the module Location must use two different contexts for "Location", depending on whether it's referring to the module or to a geographical location.

There's a similar situation with Webform. The string "Webform" appears in the user interface when you need to edit a webform. Users may have permission to add or edit webforms without having knowledge about or interest in what the module is called. So it would be useful to be able to translate the string "webform" in some places, but not all.

Gábor Hojtsy’s picture

Ok, got that, agree it would be useful.

Gábor Hojtsy’s picture

Marked #802980: Defining contexts as a duplicate.

Sabine Mehrens’s picture

You're right, of course.
But aren't you taking the effort from the translators and putting it on the part of the developers, instead?
When looking for a practical approach, my idea was that the module might somehow be automatically inserted as a context, and the translators would still be free to decide they need a special translation for that module, or leave that context out and provide only one translation for general use, as before. So that the context would act like an aid, to be picked up by the translator, or not. Something like that.

As to semantic contexts: In my opinion they are necessary and useful if you aim at an automatic translation or a translation by people who don't know anything about the context the string is used in. But it puts quite some effort on the developers, and they are not linguists, so how are they to know? How will you control in which cases a context has to be provided, and in which cases not?

When we are asked to suggest contexts, we should take into account the whole variety of modules.
My old dictionary gives a good choice in the list of abbreviations. There are about 135 abbreviated distinctive terms used to differentiate meaning and semantic use of words.

Gábor Hojtsy’s picture

Unfortunately Drupal itself does not know about the source of the string either, so it cannot automatically add a context. There were some efforts to make this automatically available to Drupal, but they significantly degraded the site performance. It basically requires runtime stack inspection, which is pretty expensive in PHP. So as a matter of fact, it is only developers who can add contexts, and they need to work with translators.
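
For readers unfamiliar with the technique, this is roughly what runtime stack inspection looks like in plain PHP. It is only a sketch of the general idea, not the code from the efforts mentioned above: `t_with_origin()` is a hypothetical function, and the only real API used is PHP's built-in `debug_backtrace()`.

```php
<?php
/**
 * Hypothetical sketch: records where a translatable string was requested.
 *
 * debug_backtrace() walks the call stack on every invocation, which is
 * the per-call cost that makes this approach expensive at runtime.
 */
function t_with_origin($string) {
  // Frame 0 describes the call to this function: the file and line of
  // the caller. Arguments are skipped and the depth is limited to 1
  // frame to keep the overhead as low as this approach allows.
  $trace = debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS, 1);
  $caller = $trace[0];
  return array(
    'string' => $string,
    'file' => basename($caller['file']),
    'line' => $caller['line'],
  );
}

$origin = t_with_origin('View');
echo $origin['string'], ' requested at ', $origin['file'], ':', $origin['line'], "\n";
```

Even in this reduced form the stack is inspected for every translated string on every page request, and a file name and line number still say nothing about the meaning of the string.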

the_phi’s picture

... do we need to look at something like W3C's "Internationalization Tag Set (ITS)" (http://www.w3.org/TR/its/)? The authors of this document also mention a few well-known translation tools!

DjebbZ’s picture

subscribing until I find some time to chip in.

wojtha’s picture

Subscribing

LarsKramer’s picture

Ad #13: It is a pity Drupal provides no information about the origin and context of a string. Wouldn't it be possible, when a module is installed, to save this information into some database table with the fields: string, module_name, line_number? Then this information could be retrieved when the translator or administrator opens the "translate interface" page. Just an idea...

Ad #7: Actually the word "Views" is also used in the module advanced_forum, meaning the number of times a forum topic has been read. In many languages that would conflict with the translation of the name of the module Views (if at all the module name should be translated, which I agree it shouldn't).

zirvap’s picture

Assigned: Unassigned » zirvap
Status: Active » Needs review

I’ve started a handbook page at http://drupal.org/node/1369936. The intention is that we can link to that page when we open issues about adding string contexts, so I’ve included various background and how-to info as well.

Please discuss and improve, as needed!

Quentin Albrand’s picture

I realized when I had to use contextual translation for a module that maybe the biggest problem we have here is that there is no way (or is there?) to see how a string is translated by default and in any other existing context. What I mean is that we cannot enter a string into a search bar and then get the list of all translations existing for it, depending on the context.

This would actually help a lot because we would know for sure if we have to create a context for a string or not.

jhodgdon’s picture

Title: Decide and document guidelines for using string context » [policy, no patch] Decide and document guidelines for using string context
Project: Documentation » Drupal core
Version: » 8.x-dev
Component: Missing documentation » documentation

Coding standards are normally discussed in the Drupal Core issue queue.

jhodgdon’s picture

Issue tags: +coding standards

forgot tag

Gábor Hojtsy’s picture

Well, yes, and no :) This would apply to core and definitely to contrib.

jhodgdon’s picture

Yes, but we still usually discuss coding standards for the Drupal project as a whole in the Drupal Core issue queue, rather than in Documentation where only docs writers will ever see them. Issues tend to get buried there. :)

dozymoe’s picture

Instead of "module name", shouldn't that be "noop" or "keyword"? That has a more general meaning: the word should not be translated.

the_phi’s picture

@Lars#17 (first paragraph): You probably mean a tool like 'Translation template extractor' (http://drupal.org/project/potx )! With this contrib module you can extract all the strings of e.g. your custom module and put the resulting language-specific .po file (e.g. 'modulename_date.de.po') -- after adding the [lacking] translations -- into your custom module's 'translations' folder. Alternatively you can even produce a language neutral .pot template file with this contrib module.

The .po/.pot file contains the file name(s) and even line number(s) for every occurring t()-string in the code!!! After the manual or automatic import of the .po file this additional metadata shows up in the admin backend (translation GUI).

hass’s picture

Uyghur (here in core it means the language) from _locale_get_predefined_list(). http://en.wikipedia.org/wiki/Uyghur has listed:

  • Uyghur people
  • Uyghur language
  • Some others less relevant...

So, context could be (including my previous linked examples):

  • Action
  • Module name
  • Language
  • People

manarth’s picture

An English example, close to our developer heart: "table" is (usually) a noun, but do we mean database table, dining table, or table of contents?

Globalbility’s picture

I'm building a new website and have a probably rare use case for translating strings, so I'm adding it here both to get tips on how to use the string context in the best way and to give input on the different use cases:

I have a button called "Download" that lets people download a poster. For some languages the word "Download" simply cannot be translated without a context. E.g. for Chickasaw the string would be translated to something like "Push here to get the poster", and different translations would be needed for different contexts (depending on what you want to download). So I will simply add "Poster" as the context:

t('Download', array(), array('context' => 'poster'));

For other languages only one translation is needed, so I'm working on adding a context fallback to the Language fallback module (#2002694: Add context fallback). That way, I can translate "Download" once for some languages and several times for other languages when needed. This function might be useful in other situations too. E.g. it could give different fallback priority to different types of string context (word classes, pronoun forms, objects, names, etc.).
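
The fallback idea described above can be sketched as follows. This is not the code of the Language fallback module or of #2002694; the function name, the table layout, the made-up language code 'xx' and its placeholder translations are all assumptions for illustration.

```php
<?php
// Hypothetical data: per-language tables keyed by context, then by
// source string. The empty context '' holds the general translation.
$translations = array(
  'de' => array(
    '' => array('Download' => 'Herunterladen'),
  ),
  // 'xx' is a made-up language code with placeholder translations.
  'xx' => array(
    '' => array('Download' => '[xx: download]'),
    'poster' => array('Download' => '[xx: get the poster]'),
  ),
);

/**
 * Hypothetical lookup with context fallback.
 *
 * Order: context-specific translation, then the context-less
 * translation, then the untranslated source string.
 */
function translate_with_context_fallback(array $translations, $langcode, $string, $context = '') {
  $table = isset($translations[$langcode]) ? $translations[$langcode] : array();
  if ($context !== '' && isset($table[$context][$string])) {
    return $table[$context][$string];
  }
  if (isset($table[''][$string])) {
    return $table[''][$string];
  }
  return $string;
}

// 'de' has no 'poster' translation, so the general one is used.
echo translate_with_context_fallback($translations, 'de', 'Download', 'poster'), "\n";
// 'xx' has a dedicated translation for the 'poster' context.
echo translate_with_context_fallback($translations, 'xx', 'Download', 'poster'), "\n";
```

With such a fallback, a language that needs only one translation of "Download" stores it once without a context, while a language that needs several can add them per context.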

Sutharsan’s picture

I've been through the process of creating issues to add context several times now. Also as a translator I have come across various contexts. I use these rules:

  • A context describes the linguistic meaning of the word.
  • A context should be concise, short and to the point.
  • A context is re-usable. Avoid the use of module names.
  • Generic terms such as 'verb', 'noun' or 'module name' should be avoided because they usually don't provide enough information for the translator.
  • A context is not the part of the module or function where the string is used. Translators may not be familiar with the code, sometimes not even familiar with the module at all.

Instead of compiling a list of standard contexts, as some have tried here, we should stick to guidelines and good examples. You only have to look at the list of current contexts to understand that it is impossible to come up with contexts up front. Only when we encounter translation problems due to lack of context can we come up with good contexts.

Sutharsan’s picture

@pounard mentioned in #1429822: Wrong localization context usage the gettext comments on context, which I find valuable for this discussion:

Finding a canonical msgctxt string that doesn't change over time can be hard. But you shouldn't use the file name or class name containing the pgettext call – because it is a common development task to rename a file or a class, and it shouldn't cause translator work. Also you shouldn't use a comment in the form of a complete English sentence as msgctxt – because orthography or grammar changes are often applied to such sentences, and again, it shouldn't force the translator to do a review.

Source: http://www.gnu.org/software/gettext/manual/gettext.html

Sutharsan’s picture

Issue summary: View changes

jhodgdon’s picture

Issue summary: View changes

I looked through the proposed guidelines for string contexts that are in the issue summary... Oh, also looked at the existing docs page https://drupal.org/node/1369936 -- Some comments:

- It would be clearer if some examples of good vs. bad context were shown, illustrating the guidelines.
- There are two "bad" items about not using the module name. One is probably enough. :)
- There are some grammatical and typographical issues (I've made an edit).
- The last "good" item says that you should only have a context on the "deviating" string. But that seems wrong to me. To use the example in the docs page, if "Order" is unclear, it seems to me that all instances of "Order" should have context, because how would a translator know what the "dominant" version is supposed to mean without the context being there? I think if you have a string that needs context, then each version of it should have a context?
- Maybe we need a guideline saying to check localize.drupal.org to see if a string you are putting into your module already has several choices of context defined, and pick an existing one if so?
- The last item in "bad" I don't understand at all what it means... "Don't use functional part of the module where the string is used (e.g. "Block"), function name or php class name. This is not re-usable and this context may change over time." ?!? Really I have no idea what it means.

Sutharsan’s picture

It would be clearer if some examples of good vs. bad context were shown, illustrating the guidelines.

Great idea, learn by example.

There are two "bad" items about not using the module name. One is probably enough. :)

The first one is about the context "module name" the second about a context like "commerce". But apparently we should be more clear about this.

The last "good" item says that you should only have a context on the "deviating" string. But that seems wrong to me. To use the example in the docs page, if "Order" is unclear, it seems to me that all instances of "Order" should have context, because how would a translator know what the "dominant" version is supposed to mean without the context being there? I think if you have a string that needs context, then each version of it should have a context?

It may seem that it is difficult for a translator to know what the dominant usage of a string is, but in practice we are managing quite well. Most translators know Drupal and simply know the dominant usage of the word. But most of all, adding a context for the deviating meaning only is the practical approach. Take as an example the string 'Block'. You just know that the dominant usage of 'Block' is in the context of a Drupal Block. But now the Private Message module uses 'Block' in the meaning of preventing access (#2160591: Allow translation of Block with right context). It would be practically impossible to add a context to all 'Block' strings, including old versions of modules. All translations are shared, even with Drupal 5 modules. If we limit the context to the deviating meaning, we keep it simple.

Maybe we need a guideline saying to check localize.drupal.org to see if a string you are putting into your module already has several choices of context defined, and pick an existing one if so?

True. But with the remark that the context list has several bad examples. We currently cannot filter it, and cleaning it up is impossible since strings of old releases are included too.

The last item in "bad" I don't understand at all what it means... "Don't use functional part of the module where the string is used (e.g. "Block"), function name or php class name. This is not re-usable and this context may change over time." ?!? Really I have no idea what it means.

And I find it hard to explain too ;) I think it refers to what is described in the quote in #31. Some bad examples: context 'page title' [1], 'json_error' [2]. The string may not be used as page title or json error in the future ("may change over time") and when this string is used in a different place, not as page title or json error, we may still need this different translation.

[1] https://localize.drupal.org/translate/languages/nl/translate?context=Pag...
[2] https://localize.drupal.org/translate/languages/nl/translate?context=jso...

droplet’s picture

Coming from https://drupal.org/comment/8525745#comment-8525745.

It's been 3-4 years since D7 introduced contexts. If we take a look at "REAL WORLD" usage, no module has added a context to "Weight".

This shows very clearly how maintainers think when they code modules. If they don't need X, Y or Z, they will never add it. Of course you will tell me to provide a patch. Right. I can patch ONE module and wait half a year or so, but I'm not able to patch 10 or even 100.

Please consider adding contexts to all Drupal-specific common words, e.g. "Weight" (I think 99% of other systems use the word "Order" or "Sort" instead).

Thanks.

hass’s picture

I think we need to convince the developers, however hard it is, or write a doc page and just point them there :-). Weight is really a great example. I had others like "state"; in my case it meant "territory of a country", and after 6 months (OMG) we added this as context. If someone does not understand that "weight" is bad without context, they should step back and let others become co-maintainers to get this stuff fixed. Seriously.

Sutharsan’s picture

@hass I have had many (issue) discussions with developers regarding context. In my experience developers need a few things:

  1. Clear guidelines or examples. Being able to point a developer to a coding standard is easy and convincing, this issue attempts to write rules to make such a guideline for string contexts.
  2. Awareness. Security, translatability and required context have this in common: once you are aware of the problem, you can start looking for a solution. With context, a developer needs to be pointed to the problem by a translator. For a developer it is hard, or even impossible, to predict that the translation of some string will cause conflicts in one of the 100+ languages that Drupal is translated into. It requires a low barrier between translators and developers to solve the context problem.

@droplet, developers will indeed only add solutions if they think there is a problem. I don't want them to solve problems that do not exist; that would lead to bulky and unmaintainable code. Adding a context to a string's default meaning is just the same thing. It will lead to bulky code, which does not solve real problems. That is why I propose to only add the context to those strings that represent the minor use case. In your example, don't apply it to weight in the meaning of sort order, but to weight in the meaning of mass. In Drupal 8 core alone, 'Weight' is used 53 times for sort order and never for mass. But let's continue the discussion on specific strings in their respective issues.

Sutharsan’s picture

@all Let's try to finalise this discussion and come to sensible and acceptable guidelines and examples. I propose to have a BoF discussion at Developer Days in Szeged two weeks from now.

droplet’s picture

Assigned: zirvap » Unassigned

@ALL

D8 is going to be released next week. Are there any new guidelines (policy)?

There's no perfect world. I believe we need some action instead of doing nothing.

We build framework and modules and API to solve common problems. To me, `Strings Context` is a special API layer in CORE.

Thanks All.

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.