Problem/Motivation

The _potx_find_language_names() function fails to parse LanguageManager.php in Drupal 11 due to PHP array syntax changes.

Drupal 11's LanguageManager.php uses modern short array syntax:

                                                     
  public static function getStandardLanguageList() {
    return [
      'af' => ['Afrikaans', 'Afrikaans'],
      // ...
    ];
  }
  

But potx's parsing logic only handles old array syntax:

  public static function getStandardLanguageList() {
    return array(
      'af' => array('Afrikaans', 'Afrikaans'),
      // ...
    );
  }
  

The parser looks for T_ARRAY tokens, which are only generated by the array() syntax. The short array syntax [] produces plain character tokens ([ and ]) instead, so the parser fails to extract the language names.
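
The token difference can be checked directly with PHP's token_get_all(). The snippet below is a minimal sketch, not potx code; count_tokens() is a hypothetical helper introduced only for this illustration.

```php
<?php
// Tokenize the same language entry in both array syntaxes.
$long = token_get_all("<?php return array('af' => array('Afrikaans', 'Afrikaans'));");
$short = token_get_all("<?php return ['af' => ['Afrikaans', 'Afrikaans']];");

// Hypothetical helper: count occurrences of a token id or a single character.
function count_tokens(array $tokens, $needle) {
  $count = 0;
  foreach ($tokens as $token) {
    // Multi-character tokens are arrays of [id, text, line]; single
    // characters such as '[' or ';' are plain strings.
    $value = is_array($token) ? $token[0] : $token;
    if ($value === $needle) {
      $count++;
    }
  }
  return $count;
}

echo count_tokens($long, T_ARRAY), "\n";  // 2: one per array() keyword
echo count_tokens($short, T_ARRAY), "\n"; // 0: no T_ARRAY in short syntax
echo count_tokens($short, '['), "\n";     // 2: plain '[' characters instead
```

A loop that advances until it sees T_ARRAY therefore has nothing to stop on when the file uses only the short syntax.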

Proposed resolution

Update _potx_find_language_names() to handle both array syntaxes by:

  1. Looking for T_DOUBLE_ARROW (=>) tokens to locate each language's inner array
  2. Detecting whether the inner array uses T_ARRAY (old syntax) or [ character (new syntax)
  3. Extracting language names from both array(...) and [...] structures
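
The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: sketch_skip_ignorable() and sketch_find_language_names() are hypothetical names, the code works on raw token_get_all() output rather than potx's $_potx_tokens global, and it scans the whole snippet instead of limiting itself to the function body.

```php
<?php
// Hypothetical helper: advance past whitespace and comment tokens.
function sketch_skip_ignorable(array $tokens, $i) {
  $ignorable = [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT];
  while (isset($tokens[$i]) && is_array($tokens[$i]) && in_array($tokens[$i][0], $ignorable, TRUE)) {
    $i++;
  }
  return $i;
}

// Hypothetical sketch: return the first string of every inner array that
// follows a "=>" token, for both array(...) and [...] syntax.
function sketch_find_language_names($code) {
  $tokens = token_get_all($code);
  $names = [];
  foreach ($tokens as $i => $token) {
    // Step 1: locate each "key => value" pair.
    if (!is_array($token) || $token[0] !== T_DOUBLE_ARROW) {
      continue;
    }
    $j = sketch_skip_ignorable($tokens, $i + 1);
    // Step 2: the value must open an array, either "array(" or "[".
    if (isset($tokens[$j]) && is_array($tokens[$j]) && $tokens[$j][0] === T_ARRAY) {
      $j = sketch_skip_ignorable($tokens, $j + 1);
      $opens = isset($tokens[$j]) && $tokens[$j] === '(';
    }
    else {
      $opens = isset($tokens[$j]) && $tokens[$j] === '[';
    }
    if (!$opens) {
      continue;
    }
    // Step 3: the first element is the English language name.
    $j = sketch_skip_ignorable($tokens, $j + 1);
    if (isset($tokens[$j]) && is_array($tokens[$j]) && $tokens[$j][0] === T_CONSTANT_ENCAPSED_STRING) {
      // Simplification: strip the surrounding quotes without unescaping.
      $names[] = substr($tokens[$j][1], 1, -1);
    }
  }
  return $names;
}

$code = <<<'CODE'
<?php
function getStandardLanguageList() {
  return [
    'af' => ['Afrikaans', 'Afrikaans'],
    'de' => array('German', 'Deutsch'),
  ];
}
CODE;

print_r(sketch_find_language_names($code));
```

For the mixed-syntax fixture above, the sketch returns 'Afrikaans' and 'German'. Anchoring on the arrow rather than on the array opener is what makes the outer "return [" or "return array(" irrelevant to the scan.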

On the 7.x-3.x branch, also extract _potx_parse_php_file() from _potx_process_file() as a separate function, so that the PHP parsing logic can be tested directly without duplicating tokenization code in tests.

Remaining tasks

None.

User interface changes

None.

API changes

7.x-3.x only: New function _potx_parse_php_file() extracted from _potx_process_file(). This is an internal API function for parsing PHP code.

Data model changes

None.

AI usage

AI assistance was used to generate this issue summary and the fixes: Claude Code (Sonnet 4.5), to be more precise.

Issue fork potx-3570518

Comments

stefan.korn created an issue. See original summary.

debdeep.mukhopadhyay’s picture

Hello @stefan.korn,
thank you for your work on the German translations.

I was able to reproduce this locally. After updating translations, the file
drupal-11.3.2.de.po does not contain the translation:

msgid "German"
msgstr "Deutsch"

Running grep "German" drupal-11*.de.po returns no result for Drupal 11, while the same translation exists in the Drupal 10.6.2 PO file.

With respect, this seems to be a missing translation on localize.drupal.org for the Drupal 11 branch rather than a code issue.

Thank you.

joachim namyslo’s picture

Basically, I can already tell you that you are welcome to write in German here. I don't know if the issue has been moved from the core, but here in the project, you can and should even write in German, because translating the user interface is a task for Drupal beginners. Carsten does the same when he replies here. Greetings from rus. If in doubt, they just want to read along, and they are welcome to do so in their native language. I don't think we have a problem here. I'll take a look at it now and then get back to you.

joachim namyslo’s picture

Hi everyone,

I took some time to investigate this locally on a fresh Drupal 11.3 installation. My suspicion is that the missing translations for "German" (and potentially other languages) aren't an issue with the translation sets themselves, but rather a side effect of code refactoring within Drupal 11 core.

Here is my analysis:

  1. Code Search: Searching the entire core/ directory for the static string yields no results:

    grep -rnE "t\(['\"]German['\"]\)" core/ -> No matches.

    In contrast, searching for t('Log in') returns results. This confirms that "German" is no longer wrapped in a t() function anywhere in the core codebase.
  2. Discovery in LanguageManager: I located the definition of these strings. In core/lib/Drupal/Core/Language/LanguageManager.php, within the getStandardLanguageList() method, the language names are stored as plain strings in an array:

    'de' => ['German', 'Deutsch'],
  3. Hypothesis: Since these strings lack a t() wrapper or any other translation hint, the automated parser on localize.drupal.org cannot identify or extract them. It's likely that in Drupal 10, these strings were still present elsewhere in the code (e.g., in now-removed form definitions) wrapped in t(), which kept them in the translation templates. With the cleanup in Drupal 11, these references are gone, making the strings "invisible" to the parser.

Could someone from the core/localization team verify this? If the strings don't make it into the .pot template, they won't appear in the exported .po files either.

drumm’s picture

Project: German translation » Translation template extractor
Version: master » 8.x-1.x-dev
Component: PO files » Code

Moving to potx module since this currently seems like an issue with translation template extraction.

gábor hojtsy’s picture

Version: 8.x-1.x-dev » 2.x-dev
Category: Support request » Bug report
Status: Active » Postponed (maintainer needs more info)

This is where the standard language list PHP is parsed. Does anyone want to check why this is no longer parsing the language names in Drupal 11? I assume there are PHP syntax changes that this code is not ready for.

function _potx_find_language_names($file, $save_callback, $api_version = POTX_API_CURRENT) {
  global $_potx_tokens, $_potx_lookup;

  if ($api_version > POTX_API_7) {
    $key = 'getStandardLanguageList';
  }
  elseif ($api_version > POTX_API_5) {
    $key = '_locale_get_predefined_list';
  }
  else {
    $key = '_locale_get_iso639_list';
  }

  foreach ($_potx_lookup[$key] as $ti) {
    // Search for the definition of _locale_get_predefined_list(), not where it
    // is called.
    if ($_potx_tokens[$ti - 1][0] == T_FUNCTION) {
      break;
    }
  }

  $end = _potx_find_end_of_function($ti);
  // Function name, (, ), {, return, array, (.
  $ti += 7;
  while ($ti < $end) {
    while ($_potx_tokens[$ti][0] != T_ARRAY) {
      if (!is_array($_potx_tokens[$ti]) && $_potx_tokens[$ti] == ';') {
        // We passed the end of the list, break out to function level
        // to prevent an infinite loop.
        break 2;
      }
      $ti++;
    }
    // array, (.
    $ti += 2;
    // Language names are context-less.
    $save_callback(_potx_format_quoted_string($_potx_tokens[$ti][1]), POTX_CONTEXT_NONE, $file, $_potx_tokens[$ti][2]);
  }
}

gábor hojtsy’s picture

Title: missing translations in po file for Drupal 11 » Core language list parsing is likely not Drupal 11 compatible anymore

gábor hojtsy’s picture

Status: Postponed (maintainer needs more info) » Needs review

I have a local setup of potx and I asked Claude this for demo purposes:

why would _potx_find_language_names not be able to parse LanguageManager.php in Drupal 11?

Answer from Claude is spot on:

The _potx_find_language_names() function fails to parse LanguageManager.php in Drupal 11 due to a PHP syntax change.

The Problem

Drupal 11's LanguageManager.php uses modern short array syntax:

  public static function getStandardLanguageList() {
    return [
      'af' => ['Afrikaans', 'Afrikaans'],
      // ...
    ];
  }

But potx's parsing logic expects old array syntax:

  public static function getStandardLanguageList() {
    return array(
      'af' => array('Afrikaans', 'Afrikaans'),
      // ...
    );
  }

Technical Issue

In _potx_find_language_names() at line 1662:

while ($_potx_tokens[$ti][0] != T_ARRAY) {

The function looks for T_ARRAY tokens, which are only generated by the old array() syntax. The modern short array syntax [] uses different tokens ([ and ] characters), so the parser never finds what it's looking for and can't extract the language names.

The Fix

The function needs to be updated to handle both syntaxes:

  1. Old syntax: Look for T_ARRAY tokens
  2. New syntax: Look for [ bracket characters

The parsing logic would need to:

  • Detect which array syntax is being used
  • Handle the different token sequences appropriately
  • Extract language names from both array(...) and [...] structures

This explains why potx works fine with older Drupal versions but fails with Drupal 11's modernized syntax in the core LanguageManager.php file.

gábor hojtsy’s picture

Summary from Claude that I agree with:

Summary of Changes

Problem: _potx_find_language_names() could not parse LanguageManager.php in Drupal 11 because:

  1. The function only looked for T_ARRAY tokens (old array() syntax), but Drupal 11 uses modern [] syntax which creates different token sequences.
  2. The function broke out of parsing when encountering any semicolon, which caused premature exit in functions with control flow (like early return []; statements).

Solution:

  1. Changed the parsing logic to look for T_DOUBLE_ARROW (=>) tokens instead of array openings. This correctly identifies inner language arrays like 'af' => ['Afrikaans', ...] regardless of whether they use array() or [] syntax.

  2. Removed the premature semicolon break since the $end boundary from _potx_find_end_of_function() already properly handles the function scope.
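
The semicolon problem can be illustrated with a small check. The fixture below is hypothetical (it is not the real LanguageManager.php or the test fixture from the patch), and the snippet only compares token positions in raw token_get_all() output.

```php
<?php
// Hypothetical fixture: an early "return [];" precedes the language list.
$code = <<<'CODE'
<?php
function getStandardLanguageList($skip = FALSE) {
  if ($skip) {
    return [];
  }
  return [
    'af' => ['Afrikaans', 'Afrikaans'],
  ];
}
CODE;

$first_semicolon = NULL;
$first_arrow = NULL;
foreach (token_get_all($code) as $index => $token) {
  // ';' is a single-character token, so it arrives as a plain string.
  if ($first_semicolon === NULL && $token === ';') {
    $first_semicolon = $index;
  }
  if ($first_arrow === NULL && is_array($token) && $token[0] === T_DOUBLE_ARROW) {
    $first_arrow = $index;
  }
}

// The first ';' comes before the first '=>', so a scanner that stops at any
// semicolon gives up before it reaches the language list.
var_dump($first_semicolon < $first_arrow); // bool(true)
```

Bounding the scan by the end of the function, as the $end value already does, avoids this without treating semicolons specially.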

Test Coverage Added:

  • testDrupal8LanguageManager - Tests old array() syntax with multiple languages
  • testDrupal8LanguageManagerShortArraySyntax - Tests new [] syntax
  • testDrupal8LanguageManagerMixedSyntax - Tests mixed array() outer with [] inner arrays
  • testDrupal8LanguageManagerEdgeCases - Tests functions with control flow (early returns, conditionals)

All 4 tests pass.

gábor hojtsy’s picture

Issue summary: View changes

Updated issue summary directly from Claude.

gábor hojtsy’s picture

Title: Core language list parsing is likely not Drupal 11 compatible anymore » Core language list parsing is not Drupal 11 compatible anymore

gábor hojtsy’s picture

Issue summary: View changes