There are multiple properties of a field that need translation to different languages. The core ones that come to mind are:

- its title (label)
- its description
- its default value
- its allowed values

Now Drupal core field modules *sometimes* do translation for the label (bust most of the time they don't) and just one time do translation for the description. First of all this is not at all consistent. The label is used many times untranslated, like in text.module references, while same style references in option.module or number.module use t(). For description, the multi-value field descriptions were translated on form display, single value descriptions were not. For default values and allowed values, no translation possibility exists.

Now, this is not just inconsistent, it is also pretty bad to use t() for this (as documented on t() btw), because all of the above data is user provided. One of the main characteristics of t() is that it relates translations to the source data. So if you fix a typo in your default value or add a new allowed value, you suddenly loose all translations for it and need to start all over again. Also, it mixes up with your default built-in UI translation strings, which do not contain your other translatables like views, blocks or menus.

So to be consistent and technically right, core should stop using t() for translating user defined strings, just like the documentation for t() says. There is nothing new here :) Then to make field labels, descriptions, default values and allowed values translatable, like for every other thing in Drupal core, i18n module includes a submodule called i18n_field. Now due to how Drupal core does these things in broken and inconsistent ways, that module is in kind of a limbo, not really sure what it should actually do. It strives to provide localization support for these 4 things on fields, but given how Drupal does this non-standard and inconsistent, it cannot do it in a reasonable and user friendly way.

Therefore I suggest we fix Drupal core to not use t() in a way that it was not supposed to be and then we can figure out the complete solution in i18n_field. This problem applies and would be great to get fixed in Drupal 7 as well soon.

CommentFileSizeAuthor
t-in-fields-is-evil.patch6.7 KBGábor Hojtsy
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Gábor Hojtsy’s picture

plach’s picture

Status: Reviewed & tested by the community » Needs review

The approach looks good to me, one thing that I would fix is that known default values aren't localized. For instance node_add_body_field() has the $label parameter that defaults to 'Body'. IMO it would be cleaner if it defaulted to t('Body', NULL, array('langcode' => language_default()->language)).

Pasqualle’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: +Needs backport to D7

the patch only removes t() from places where it should not be, so it is RTBC

plach’s picture

The subject of the OP is not about removing t() is about having a consistent and correct use of t(), so my point in #3 at least deserves a feedback.

Gábor Hojtsy’s picture

@plach: the focus of this patch is to remove runtime t() of strings. The core and contrib installer functions often use t() when pre-creating user editable things, which is a great way to set up a single foreign language site (but is mostly bad for multilingual setup). I agree we should look into creating and saving the fields with t() if for nothing else, for consistency, not sure it belongs in this patch. This is about avoiding misusing t() and enabling contrib modules to work with a consistent environment so they know what data they work with.

plach’s picture

I still think we should have a unique issue to address this: if for any reason we don't fix the behavior I'm referring to in #3 we might end up with i18n_field allowing field properties to be translated but, when the default language is not english, with wrong default values. AAMOF translated default values could be translated back through i18n_field, where non-translated values could not. This is an inconsistent behavior too.

@Pasqualle:

Anyway, feel free to RTBC this again.

Damien Tournoud’s picture

We need to move forward here. We need to stop coming up with one-shot solutions like i18n_fields and implement something consistent in core.

The idea of having a non-translating wrapper (already mentioned here: #503550-12: Translated strings are cached in _info() hooks) sounds like the best approach moving forward. It has been rejected by Angie as a DrupalWTF on wrong grounds, because it is actually a pretty common approach.

I would more strongly adding the missing t()s in the field system then removing them.

plach’s picture

@Damien Tournoud:

Are you referring to a D8 solution or do you think this is feasible in D7? I'm discussing here with the idea that this is a D7 issue which is marked for D8 just because bugs get fixed in D8 first.

Damien Tournoud’s picture

Quick overview of the sub-systems that use t() on data from info hooks and/or user input:

  • Menu system: for title and descriptions from hook_menu()
  • Locale: for language names - including those input by the user
  • DB log: for types of watchdog messages
  • Field system, as mentioned, for field types and labels (from hook_field_*_info() - inconsistently) and for field and instance label and description
  • Menu module: for custom menu names
  • System module: for package names and module descriptions
  • Theme: for region lists
  • User: for role names and permission names (inconsistently for permission names)
Gábor Hojtsy’s picture

@plach: it is in fact very consistent. When you add a block, a view, a menu item, a contact category or set your anonymous username on your site (or your date format), you are not asked about what language you are doing that in. You can add any number of blocks, views, menu items, contact categories, just like fields, and we don't know the language of those either. Now Drupal core really assumes for most things that you are running a single language environment, where your stuff is in some language. For code it assumes it is English for user provided content (again, blocks, views, menu items, etc), it assumes it is your default language. That the blocks, views, menu items, field properties etc. can be in any language regardless of your site default (eg. you might have admins in different languages creating stuff in parallel, or you might change your mind later to switch to a new default language) is a very live problem. It is not at all confined to field API.

Now the inconsistency here is that of all those things, only field API attempts to use the built-in localization system, which means mostly for our assumptions that it assumes all your fields are set up as English and they will not change (see above for issues when changing user editable strings). There are also possible security issues with using t() for user defined strings. If we use it for default values as Damien suggests, you should note that those might have formats assigned. Now once again, t() is not designed to do this, so it has no idea about those formats. Whether you'll be allowed to edit text that is going to be filtered with that format is then not considered. If you default value for a field is PHP input, you'll be able to translate that on the locale UI (but since that has some XSS validation on the input you'll most likely end up with a WSOD, because your submission might not be valid PHP). Dealing with input formats assigned to custom translatable strings is something that a contrib module needs to provide, there are no provisions in core.

In my point of view letting fields work like all other objects in core that they assume their input is single language and in whatever language we consider the site default is what makes this consistent, not to go on and use a system that was not at all designed with the workflows and security needed for user provided strings. Yes, this removes a partially implemented feature which was well intentioned but did not take the complexities of the issue into account.

Of course we can also try to solve this in core (there is a nice initiative for this in Drupal 8), but what we have in contrib is not satisfactory either. It has different problems that need to be solved. At least we can iterate on that in contrib, and intend to provide better backends for these things. We just sprinted almost all week to provide a better user experience (+ tests, and better integration among other things), but this is a huge problem space. It cannot be solved by slapping a t() to everywhere we think we miss translation.

Now we can postpone this for a couple months until we figure out a better backend in D7 contrib and can port it to D8, but in the meantime field translation will be entirely broke in Drupal 7, and would eventually still need a patch like the one I suggested.

I hope this helps get the point across.

plach’s picture

@DamZ:

I guess we crossposted, see #9

Damien Tournoud’s picture

I would like to stress one thing: the same way there is no difference between data and configuration, there should be absolutely no difference between info hook/configuration and user-entered input. Most of the time, the data available in the info hooks are user-entered input.

Gábor Hojtsy’s picture

Menu system: for title and descriptions from hook_menu()

This is coming from code, not user input, is it?

Locale: for language names - including those input by the user

The assumption is this "mostly" comes from code. We can obviously skip doing this for consistency.

DB log: for types of watchdog messages

Any module that passes on dynamic data in that field?

Field system, as mentioned, for field types and labels (from hook_field_*_info() - inconsistently) and for field and instance label and description

Yeah...

Menu module: for custom menu names

That should not happen either. i18n_menu has a solution for handling that. This should be eradicated just like fields.

System module: for package names and module descriptions
Theme: for region lists

These all come from .info files (== code). Do you know of any place where they might come from user input?

User: for role names and permission names (inconsistently for permission names)

Yes, this is a problem, and needs to be solved just like fields.

Damien Tournoud’s picture

Also, @Gabor: read the list from #10: there are plenty of places (including the damn name of custom menus) where we assume that the user-entered text is in English. All of this is nothing more then the discussion about what is the "default language" coming in circles over and over.

Gábor Hojtsy’s picture

@Damien: re #13, yes, the long term solution is definitely that we should know specifically the language of everything in Drupal. Think "Oh, you are adding a block, tell us the language for this", or "Oh, you are setting up a new field, now which language these field properties are in". Then we can mark all stuff coming from code to be language => en, or let the coder specify otherwise if they write their module in French (yes, that happens too). So then we have a clear inventory of what's in which language. Now if you then have a content type with fields coming from code in English and code in French and user input in French, you can translate only the ones in English to French when they need to be translated to French and only the ones in French to English when it needs to be displayed in English. That assumes we know about the language of everything to fine-grained details. I'm not sure how feasible if that in Drupal 7 but is definitely one of the lines being considered for Drupal 8.

Damien Tournoud’s picture

One easy way forward for Drupal 7, that will have less impact than this would be to consider that everything that is an input from the Field API is in English, and just consistently translate that.

We really need to stop trying to make a difference between what comes from code and what comes from user input and either consider that everything is in English or that everything needs to have an explicit language from somewhere. For Drupal 7, I'm getting convinced that it would be easier to just consider that everything is in English.

Gábor Hojtsy’s picture

@Damien: that is already not the case. Random Drupal shops around the world will not create their sites in English and translate it to some language just to they can maybe add multilingual support later. Random Drupal shops install Drupal localized instead, they already get their content types, forum vocabulary, anonymous username, etc. localized in their DB when installed and then they add views, panels, rules, etc. in their random language and a couple months later they decide to add a new language to the site.

Even if they start off with multilingual sites right off the gate, for most people it is just way too much to create views, blocks, content types, fields, rules, etc. all in English by default if the site will never be displayed in English. If the languages you need to support are Spanish and Catalan, why the hell would you create every configuration in English? And then still your node created menu items, etc. will not be English. So you need to create all your nodes in English as well first? Not sure these dreams are realistic at all.

plach’s picture

Status: Needs review » Reviewed & tested by the community

If the choice is between committing this as-is and not having this fixed for months, let's commit it. At least we will be able to have working multilingual D7 sites: we are already able to cope with most of what Damien cited in #13 through i18n. Finding the final solution to translate anything is not content (tm) here, is at very least out of scope.

sun’s picture

I agree with @Damien Tournoud, though not on the English by default idea. But I can definitely see how these t() calls effectively disallow any user string translation of labels via contrib, so at least for D7, it seems to be the right and necessary thing to do.

However, we should keep this issue open after commit and continue the in-depth discussion that started on this issue.

Gábor Hojtsy’s picture

It is also the best solution for Drupal 8 at the moment. We might re-introduce t() once/if t() is made aware of (a) text formats / input widget / rendering (b) source language (c) permissions (d) text independent string keys as discussed in details above. I think once we'd introduce all these to t(), its not t() anymore. (Contrib solutions attempt to solve some of these btw, but not yet all of them).

So let's get this committed to Drupal 8 and 7 too.

webchick’s picture

Issue tags: +webchick approved

Thanks for the detailed and useful discussion here. This definitely looks like the right initial step for unblocking i18n module in D7, as well as moving i18n efforts forward in D8. It also has sign-off from all the right folks. The actual patch itself just makes exactly the same change:


t($instance['label']) -> $instance['label']
t($instance['description']) -> $instance['description']

...in about 10 or 12 different spots, so it's pretty simple. It would delegate responsibility for translating these items to i18n_fields module in contrib for D7 and $forthcoming_solution for D8, since those properties are currently being treated inconsistently from all other similar user-generated field properties.

I will say though from a "way too many head-scratchy conversations with the i18n team trying to understand this stuff" perspective, I hugely agree with Damien/sun on the general concept of obliterating the distinction entirely between user generated vs. in-code content. Damien points out several places where the line is already blurry, and to most normal humans, content is content, and they just want it translated. Whether this is a Drupal 8 or more like a Drupal 10 goal, though, I can't really say. (This is just me musing; please don't let it derail the discussion here.)

But in any case, we need this patch to move forward on any and all i18n/multilingual initiatives, so I agree this should go in.

webchick’s picture

Priority: Normal » Major

And since it blocks i18n module in D7, this is pretty solidly 'major'.

yched’s picture

Just a quick note to add a bit of background :
Those t() calls in field API around field labels, descriptions et al, are remnants from CCK, where those t()s had been added at the request of the i18n module folks. IIRC, it was the only shot they had at providing translatability for those strings back then.

The current inconsistency (t($label) or not t($label) ?) appeared as new code got written in the D7 era, in the indecision whether t() was the correct approach for D7. Several issues about this (marked as duplicate at the top of this threads) had been opened and left unresolved, and the in-between status quo remained.

Now, that inconsistency is of course wrong, and I'm fine with removing t()s if that's what i18n folks think is what gives them the most room of action in the D7 cycle.

Jerome F’s picture

subscribing

Dries’s picture

This patch does not seem to apply for me locally. Asking for a re-test from the testbot(s).

Dries’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +Needs backport to D7, +di18nscb, +i18nsprint, +webchick approved

The last submitted patch, t-in-fields-is-evil.patch, failed testing.

plach’s picture

Status: Needs work » Fixed

@Dries:

This patch does not seem to apply for me locally.

Probably because it has been committed two days ago ;)

http://drupalcode.org/project/drupal.git/commit/553737dd012f7bcffdd3373f...
http://drupalcode.org/project/drupal.git/commit/78f4cc01c110731cf657deec...

plach’s picture

Gábor Hojtsy’s picture

Following on the request to keep the discussion going (for which I don't think this issue is the best place), I've posted a summary on why t() is evil for non-code text with 9 detailed reasons: http://groups.drupal.org/node/149724 (also cross-posted at http://hojtsy.hu/blog/2011-may-19/drupals-multilingual-problem-why-t-wro... for wider distribution, but commenting is only allowed on the g.do post). I wanted to cover how to not distinguish between user provided and code based text, but the post got too long already...

plach’s picture

The g.d.o. post is actually here: http://groups.drupal.org/node/149984

Gábor Hojtsy’s picture

Doh, sorry for the wrong link...

Anonymous’s picture

I've posted a task to remove this usage in Examples module as well, #1166254: t() translate function should be removed from field definitions

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

yched’s picture

Another field property that gets run through t() : #1220964: Number field prefix/suffix get t()'ed through format_plural().
Feedback from i10n / i18n folks welcome :-)