I should have filed this under "Translation Templates" but I couldn't because of this issue:
http://drupal.org/node/63350

This can be corrected, I guess.
----------------

general.pot contains the full month names and locale-module.pot contains their abbreviations.

However, the abbreviation for May (=May) is missing from locale-module.pot, because it is the same in English and general.pot has collected it. As a result, in a language where the abbreviation for May is different, you can't specify a translation for it. I think a "May" string should stay in locale-module.pot with the other abbreviations. Or wouldn't that work?

I would have marked this as "minor", but I suspect it could have a few general implications for pot generation, so I made it "normal".


Comments

killes@www.drop.org’s picture

The problem isn't where the template is stored. The problem is that you can translate each string only once and that English has no abbreviation for May. This causes a problem, e.g., for Spanish. I've no idea how to fix it, though.

cog.rusty’s picture

So, only the English string itself is taken into account? When I go to the strings management page and search for "May", I get:

May
modules/archive.module:256, modules/locale.module:0, ;0

It seems that the information from the imported translation is saved and the question is whether and how it is utilised. I was assuming that if I split the different references to the same string I would get different translations...

The context is also important for words with genders. In Greek, for example, adjectives have genders and plural. An "Active" user (not Blocked) is male singular, while an "Active" poll (not Closed) is female singular, but they go under the same string.

killes@www.drop.org’s picture

Right, each string can only be translated once, regardless where it occurs. I concur this is a problem in some cases, but I have still no idea how to fix it.

cog.rusty’s picture

I guess it is not possible for the t() function to know who asked, so that it can compare to the stored context information.

Perhaps an optional second argument for t() for the difficult or custom cases?

Gábor Hojtsy’s picture

CogRusty, the problem is that t() is very sensitive performance-wise, so adding other conditions to it is unfortunately not really a good idea.

magico’s picture

Version: 4.7.0 » 4.7.4
Status: Active » Fixed

I do not quite understand what the problem is here.
general.pot has the following:

#: modules/archive.module:256 modules/locale.module:0;0
msgid "May"
msgstr ""

allowing the translation of the "May" string.

killes@www.drop.org’s picture

Version: 4.7.4 » x.y.z
Status: Fixed » Active

This issue is unfortunately not fixed. In English, there is no separate abbreviation for the month May because the word is already so short. In Spanish, for example, the word is Mayo and the abbreviation is May. If we use English as the source language, then the source strings for the abbreviation and the full word are identical and can't be translated separately.
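A minimal sketch of the collision, assuming a translation store keyed only by the English source string, as described above. The function demo_translate() and the array are hypothetical stand-ins for illustration, not Drupal's actual t() or locale tables.

```php
<?php
// Hypothetical stand-in for a locale lookup keyed only by the English source string.
function demo_translate(array $translations, string $source): string {
  return $translations[$source] ?? $source;
}

$spanish = array();
$spanish['March'] = 'Marzo';  // full name
$spanish['Mar'] = 'Mar';      // abbreviation: different key, no conflict
$spanish['May'] = 'Mayo';     // full name
$spanish['May'] = 'May';      // abbreviation: same key, overwrites 'Mayo'

print demo_translate($spanish, 'March') . "\n";
print demo_translate($spanish, 'May') . "\n";
```

Whichever translation of "May" is stored last wins, so either the full name or the abbreviation comes out wrong in Spanish.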

magico’s picture

Ahhh! Now I understand. This is indeed an issue that needs solving.
What if we treat the month names as a single "string" (January, February, March, ...) and the abbreviations as a single "string" too (Jan, Feb, Mar, ...), with a helper function that splits each string into an array?

This way, we would use the same locale translation tools, and we would only need to call the helper function to get the month instead of calling t() directly.

Example, today we use:

$bla = t('February');
print $bla;

We could start to use:

$bla = get_months();
print $bla['February'];

where get_months() does something like:

$x = t('January, February, March, ...');
// return an associative array with the English terms and the translated ones

What do you think?

cog.rusty’s picture

Interesting. I guess the two imploded months and abbreviations strings will be in the pot file so that they can be translated.

But at which point should they be exploded to arrays of strings? In a wrapper for date function internally aware of t()? In a special case in t() aware of dates? Outside, by the developer? In the theme by the user who has the problem?

I hope the questions make sense, because I don't know much about PHP.

magico’s picture

> Interesting. I guess the two imploded months and abbreviations strings will be in the pot file so that they can be translated.

Yes.

> But at which point should they be exploded to arrays of strings? In a wrapper for date function internally aware of t()? In a special case in t() aware of dates? Outside, by the developer? In the theme by the user who has the problem?

It could be done in a wrapper for the date function (e.g. the current format_date() could support this), or outside, in the developer's own code, wherever it is needed.

kkaefer’s picture

Version: x.y.z » 6.x-dev
Component: other » locale.module
Status: Active » Needs review
FileSize
1.16 KB

The attached patch solves this issue by not piping "May" through t(), but by adding a comment: "May <!-- Long month name (remove this comment in your translation) -->". That way, we get separate strings for the abbreviation and the full name. The comment is removed afterwards (if it’s still there).
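A rough sketch of how the comment approach could behave, assuming the comment is stripped with a regular expression after the lookup. This is not the attached patch; demo_strip_month_comment() and the $spanish array are hypothetical names used only for illustration.

```php
<?php
// Hypothetical helper: drop any HTML comment left in the string, then trim.
function demo_strip_month_comment(string $text): string {
  return trim(preg_replace('/<!--.*?-->/s', '', $text));
}

// The long month name gets a distinct source string thanks to the comment.
$long_source = 'May <!-- Long month name (remove this comment in your translation) -->';

// Illustrative Spanish translations keyed by source string.
$spanish = array(
  $long_source => 'Mayo',  // full name
  'May' => 'May',          // abbreviation
);

$long = demo_strip_month_comment($spanish[$long_source] ?? $long_source);
$short = $spanish['May'] ?? 'May';
// Without a translation, the source falls through and the comment is removed.
$untranslated = demo_strip_month_comment($long_source);

print "$long / $short / $untranslated\n";
```

The two source strings no longer collide, and an untranslated site still prints plain "May".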

This requires a patch to potx.inc (see next issue).

kkaefer’s picture

FileSize
769 bytes

Gábor Hojtsy’s picture

This looks like an awkward, hackish way to solve the problem. What about a less hackish solution? I was thinking about this yesterday, and had an idea of translatable month name lists. If we have a list of abbreviated month names and a list of full month names, we can easily provide the May translation in two ways. So format_date would t('January, February, March, April, May, ...') and t('Jan, Feb, Mar, Apr, May, ...'), then split by ", " and use the list. This is unfortunately a tiny bit more resource consuming, and although format_date could check the resulting array after the split(t()), if translators fiddle with the commas improperly, or miss a month name or two, we get broken date translations.

Your solution is a tiny bit more performant than this suggestion, but it looks way too hackish as far as I can see, and could easily end up broken if translators do not repeat the comment as-is (or do not remove it).

Anyway, let's think about a more elegant solution that somehow gives more context more intuitively than adding a big chunk of HTML comment.

magico’s picture

Gábor, look at #8.

Anyway, I would prefer #12, because it's less resource consuming...

BTW, it seems WordPress is doing something along this line of thinking: http://codex.wordpress.org/Localizing_WordPress#Date_and_Time_Locale_Set...

Gábor Hojtsy’s picture

Hm, I looked at #8, and had a good laugh about how we think alike :) Having two month name lists, we would only need to t() each of them once per request, as well as only need to split them once per request. We can store the resulting array in a static var in format_date(). Performance effects:

- longer string to t(), so we don't get it cached
- we need to split it once per request
- static variable needs small amount of memory for permanent storage (think context switching costs)
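A minimal sketch of the list idea with a static cache and a sanity check on the translated list. The helper demo_month_names() is hypothetical and is not Drupal's format_date(); the translator is passed in as a callable so the example runs without Drupal, and the Spanish strings are sample data.

```php
<?php
// Hypothetical helper: returns 12 month names, translating and splitting
// each list at most once per request.
function demo_month_names(callable $translate, bool $abbreviated = false): array {
  // The static cache survives across calls within one request. It ignores
  // which translator was passed, which assumes one language per request.
  static $cache = array();
  $key = $abbreviated ? 'short' : 'long';
  if (!isset($cache[$key])) {
    $source = $abbreviated
      ? 'Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec'
      : 'January, February, March, April, May, June, July, August, September, October, November, December';
    $names = array_map('trim', explode(',', $translate($source)));
    // Guard against a translation with missing months or stray commas:
    // fall back to the English list rather than print broken dates.
    if (count($names) !== 12 || in_array('', $names, true)) {
      $names = explode(', ', $source);
    }
    $cache[$key] = $names;
  }
  return $cache[$key];
}

// Stand-in translator; $calls counts how many lookups actually happen.
$calls = 0;
$translate = function (string $source) use (&$calls): string {
  $calls++;
  $spanish = array(
    'Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec'
      => 'Ene, Feb, Mar, Abr, May, Jun, Jul, Ago, Sep, Oct, Nov, Dic',
    'January, February, March, April, May, June, July, August, September, October, November, December'
      => 'Enero, Febrero, Marzo, Abril, Mayo, Junio, Julio, Agosto, Septiembre, Octubre, Noviembre, Diciembre',
  );
  return $spanish[$source] ?? $source;
};

$long = demo_month_names($translate);
$short = demo_month_names($translate, true);
$again = demo_month_names($translate);  // served from the static cache

// Index 4 is the fifth month, May.
print $long[4] . ' / ' . $short[4] . "\n";
```

Each list is translated once, so repeated date formatting in the same request costs only an array lookup. The fallback to English is one possible policy for a malformed translation; logging the problem would be another.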

The string suggested by kkaefer is around the short string cache size limit, so it could still mean an SQL query on the locale tables every time a long month name needs to be printed. Some short-sighted translators would still translate what is in the comment, so we would get broken output (e.g. in emails, where the HTML comment has no hiding advantage for us).

I also advocated experimenting with some placeholder that translators already know should not be translated, like: t('@longmonth May', array('@longmonth' => '')), if we'd like to "tag" long month names.

kkaefer’s picture

FileSize
1.03 KB

Changed the patch as per Gábor's suggestion.

kkaefer’s picture

FileSize
719 bytes

And the potx patch.

Gábor Hojtsy’s picture

Project: Drupal core » Translation template extractor
Version: 6.x-dev » 5.x-1.x-dev
Component: locale.module » Code
Status: Needs review » Reviewed & tested by the community

Thanks for the patch. I modified it to use ! instead of @ (although clearly, I suggested @), because ! does not involve an escape call, so is slightly more performant. Committed to Drupal 6.
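For readers following along, a sketch of how the placeholder approach behaves, assuming a simplified stand-in for t() that looks up the string and then substitutes placeholders with strtr(). demo_t() and the placeholder name !longmonth (adapted from the @longmonth suggestion earlier in the thread) are illustrative only; the exact name used in the commit is not quoted here.

```php
<?php
// Simplified stand-in for t(): look up the source string, then replace
// placeholders as-is (no escaping, like the "!" prefix described above).
function demo_t(string $source, array $args = array(), array $translations = array()): string {
  $text = $translations[$source] ?? $source;
  return strtr($text, $args);
}

// Illustrative Spanish translations: the tagged string is the full name.
$spanish = array(
  '!longmonth May' => '!longmonth Mayo',
  'May' => 'May',
);

// The placeholder is replaced with an empty string; trim() removes the
// leftover space.
$long = trim(demo_t('!longmonth May', array('!longmonth' => ''), $spanish));
$short = demo_t('May', array(), $spanish);
// With no translation available, the tagged source still prints as "May".
$untranslated = trim(demo_t('!longmonth May', array('!longmonth' => '')));

print "$long / $short / $untranslated\n";
```

The tag makes the full-name source string distinct from the abbreviation, and it disappears from the output whether or not the translator kept it.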

Now, because potx has no D6 branch yet, I reclassify this issue for potx and leave it open for now. It would be an error to commit it to the D5 branch.

Gábor Hojtsy’s picture

Project: Translation template extractor » Drupal core
Version: 5.x-1.x-dev » 6.x-dev
Component: Code » language system
Status: Reviewed & tested by the community » Fixed

Thanks, now committed to potx-6.x-dev too. (So putting back to Drupal 6.x as the issue itself was for that project).

Anonymous’s picture

Status: Fixed » Closed (fixed)

ywarnier’s picture

Apparently this bug returned in Drupal 7. I'm not sure what procedure should be taken (whether to re-open this bug or to open a new one). Please advise. Just to make sure I don't forget it before I move on to something else more urgent, I'm taking a few notes here.

The problem can still be seen easily in any date form in Spanish: "May" is translated as "Mayo", whereas every other month is translated to its three-letter abbreviation.

The code relating to this now seems to be located in the _format_date_callback() function in includes/common.inc, where I can recognize patterns similar to the ones in #16, but apparently the potx module disappeared and I haven't followed the whole Drupal history :-)