With caching enabled, all users get the same cached page at e.g.

ca/node/32

However, with multiple languages enabled, the expected page there should differ by language.

Here is a code snippet used elsewhere that accomplishes something similar:


  // We need to store separate page caches here per-locale.
  // Alter the REQUEST_URI to include the locale string, which Drupal will use
  // to save cached versions of the page.
  $parts = parse_url($_SERVER['REQUEST_URI']);
  $new_uri = $parts['path'] . (empty($parts['query']) ? '?locale=' . $_COOKIE['locale'] : '?' . $parts['query'] . '&locale=' . $_COOKIE['locale']);
  $_SERVER['REQUEST_URI'] = $new_uri;


Comments

stella’s picture

I enabled "normal caching" on my test site. I then created two countries 'ca' and 'us' and languages 'fr', 'en', 'en-CA', 'en-US' and 'fr-CA'. I enabled country based language negotiation and enabled the country switcher and language switcher blocks.

First I created a node with multiple translations and used the country / language switcher blocks to switch between different translations. It all worked fine. However this is probably because each translation is a separate node, so the nid is different for each translation, so there's no caching involved.

I then imported the fr.po translation file into 'fr-CA'. I viewed the 'ca/admin' page and it correctly appeared in English. I then used the language switcher block to switch to 'fr-CA' and the interface text was correctly translated into French. However, the language switcher block adds a GET param to the URL (?language=fr-CA), so I repeated the same test, but instead of switching languages on the page I was interested in, I had a separate tab open on a different page and switched languages there. Then, on the tab I was interested in, I just refreshed the page and the language correctly changed from English to French.

So can't reproduce this issue yet...

catch’s picture

stella: were these tests done as an anonymous user?

stella’s picture

catch: no, they weren't. I've since repeated the tests. I've noticed, both as an anonymous and as an authenticated user, that if you view the registration / profile edit forms and switch languages, not all of the visible text is translated. Things like breadcrumbs and form labels are, but, for example, the Legal terms & conditions aren't (with the Legal multilingual patch applied).

I've also noticed that on node/add/ the text for the content type names is cached. So, for example, 'Poll' always appears even though I am viewing the site in French (Sondage) or German (Umfrage). However, I've tested this on a lot of other pages and everything is translated...

Again no problem when viewing nodes.

However, I then disabled the country code module, cleared the cache, and the same behaviour was exhibited. I then disabled caching, cleared the cache and again the same thing happened. It may be that I need to start over with a clean installation.

zroger’s picture

I can reproduce like this:

I have countries set up for US and France. Page caching is turned on, with a 1 minute minimum cache time.

1) As admin clear cache.
2) In your settings.php add $conf['country_code_ip_address'] = '62.23.8.157';. This will hard code a French IP address.
3) Log out.
4) Browse to the site root (/). You should now see the home page as France. This is the page that will be cached as the site root (/). This simulates a French user being the first user after a cache refresh to hit the home page.
5) Remove the IP address setting from settings.php.
6) In a new browser, pull up the home page. I now see the cached French home page, even though my IP should be detected as a US IP.

zroger’s picture

FileSize
938 bytes

Here's a patch to get around this. Using the above technique, I am appending a country_code_cache variable to the REQUEST_URI in hook_boot(), before the page caching happens. I originally wanted to use the language code, but language isn't available until after page caching occurs.

I decided to use country_code_cache as the variable name to try to avoid any collisions that may occur. I'm not sure if this is really a problem though.
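As a rough illustration of the technique described in this comment (not the attached patch itself), the URI rewriting can be sketched as a small pure function. The function name example_salt_request_uri() is hypothetical; only the country_code_cache variable name comes from the comment above.

```php
<?php
/**
 * Hypothetical helper: append a cache-salting query parameter to a request URI.
 *
 * This only builds the new URI string. A caller would still need to assign
 * the result to $_SERVER['REQUEST_URI'] early enough in the request.
 */
function example_salt_request_uri($request_uri, $key, $value) {
  $parts = parse_url($request_uri);
  // parse_url() returns FALSE for seriously malformed input.
  if (!is_array($parts)) {
    $parts = array();
  }
  $path = isset($parts['path']) ? $parts['path'] : '/';
  // Keep any existing query string and add the salt after it.
  $query = empty($parts['query']) ? '' : $parts['query'] . '&';
  return $path . '?' . $query . rawurlencode($key) . '=' . rawurlencode($value);
}
```

Under these assumptions, calling example_salt_request_uri('/node/32', 'country_code_cache', 'fr') gives a URI that differs per country, so each country would get its own cache entry when the request URI is used as the cache ID.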

zroger’s picture

Status: Active » Needs review
zroger’s picture

This seems to be a core issue. I did the following test on a clean Drupal 6 install, with only locale and translation turned on. I have 2 languages set up, English (default) and Spanish. Page caching is set to "normal", and language negotiation set to "path with language fallback".

I used 2 browsers to cleanly represent 2 separate users. I set Firefox to use Spanish as the first language, to simulate a native Spanish-speaking user. I used Safari with no changes to represent an English-speaking user.

The domain I was testing on is simply drupal6 so the site root is http://drupal6/.

1) While logged in as admin, clear the cache.
2) Log out. Since I was in Firefox, I ended up at http://drupal6/es
3) Go to the site root (without the language prefix). The page is still in Spanish.
4) Now, in Safari, pull up the site root. The page is in Spanish.

catch’s picture

I've updated #282191: TF #1: Allow different interface language for the same path which looks like the most relevant.

catch’s picture

None of those issues quite dealt with the specific bug outlined by Roger, so I've opened a new core bug (copy-pasted steps to reproduce) here - #337378: Interface language is cached for anonymous visitors.

catch’s picture

Should also have mentioned Roger's patch fixes the bug from my testing. I don't see the core issue being fixed any time soon, and this seems like a reliable option until then.

nedjo’s picture

Status: Needs review » Needs work

The patch relies on country, so will work only if there is a single language per country.

To make this approach work, we would need to move all the language determination to hook_boot from hook_init. This could be done, but would be costly, as we can't depend on availability of needed functions.

catch’s picture

OK, so we don't have the full language system, but since this is only for salting the cache keys, I'm wondering if we could use the first 2/3rds of http://api.drupal.org/api/function/language_from_browser/6 - then replace the call to language_list() with a variable_get('language_list', '') - which we populate in a #submit, or cron, so it's available during hook_boot(). No patch yet, since I'm not yet sure how the combination of language and country is best handled, or for that matter whether it's viable to use such thin negotiation like that.
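A minimal sketch of the kind of thin negotiation being proposed here, assuming the list of enabled language codes has already been read from a stored variable and is passed in as a plain array. The function name example_language_from_header() is hypothetical, and this is a simplified exact-match version, not a copy of core's language_from_browser().

```php
<?php
/**
 * Hypothetical sketch: pick an enabled language from an Accept-Language header.
 *
 * @param string $accept_language
 *   Raw header value, e.g. "fr-CA,fr;q=0.8,en;q=0.5".
 * @param array $enabled
 *   Enabled language codes (assumed to come from a stored variable).
 *
 * @return string|null
 *   The best matching enabled code, or NULL when nothing matches.
 */
function example_language_from_header($accept_language, array $enabled) {
  // Index enabled codes by lowercase form for case-insensitive matching.
  $enabled_lookup = array();
  foreach ($enabled as $langcode) {
    $enabled_lookup[strtolower($langcode)] = $langcode;
  }

  $best = NULL;
  $best_q = 0.0;
  foreach (explode(',', $accept_language) as $item) {
    $pieces = explode(';', $item);
    $code = strtolower(trim($pieces[0]));
    // A missing q-value means full preference (1.0).
    $q = 1.0;
    if (isset($pieces[1]) && preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/', $pieces[1], $match)) {
      $q = (float) $match[1];
    }
    // Strict comparison keeps the earliest entry when two q-values tie.
    if (isset($enabled_lookup[$code]) && $q > $best_q) {
      $best = $enabled_lookup[$code];
      $best_q = $q;
    }
  }
  return $best;
}
```

The result would only be used to salt the cache key, so a NULL return could simply fall back to the site default language.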

nedjo’s picture

Well, I believe we really need the same language determination everywhere. Otherwise we'll get languages not appropriate for a country, etc.

catch’s picture

We'd get cached records for those languages, but presumably the actual cached page would be determined by our normal language determination since the page is cached after all that has run. My hope then, is that at worst we'd get some unwanted cache misses, and maybe some duplicate cache records - but correct content displayed - since every country-language combination would be assigned a different REQUEST_URI. Obviously not ideal, but the patch should be small enough that it might be worth trying out before doing more fundamental changes.

nedjo’s picture

I don't see that it helps at all to determine a language that *might* be the one that is actually used. We'll still have the same issue that the cache ID is not the same as the actual language the user gets.

E.g., a user has their browser set to English but registers and sets their language preference on the site to French. They are the first to browse a lot of pages on the France site. The cache ID is France, English, but all the actual cached pages are in French. So all users with English set in their browser now get pages in French.

catch’s picture

The page cache is only set by anonymous users though. http://api.drupal.org/api/function/page_set_cache/6

So in that case, unless anonymous users can set a language preference, then the language would be determined by either 1. language detection/default settings 2. The path of the page. Possibly a combination of the two. Assuming our handling of language detection in hook_boot either uses, or mirrors what core does, then I think it ought to work. It's not ideal of course, but for consideration anyway.

nedjo’s picture

Current plan is to implement a minimal language determination at hook_boot for anonymous users if:

* caching is enabled
* the session language is not set, and
* the user is not switching languages using the language switcher.

Likely we'll want to directly query the database at this point as we can't rely on methods like menu_get_object() and node_load().


// Look up the node's language directly, since node_load() is not available yet.
if (arg(0) == 'node' && is_numeric(arg(1))) {
  $node_language = db_result(db_query('SELECT language FROM {node} WHERE nid = %d', arg(1)));
}

catch’s picture

Status: Needs work » Needs review
FileSize
8.93 KB
7.18 KB

Here's an initial patch. It turns out the only non-trivial functions we need in hook_boot() are menu_get_object() and node_load() - so country_code_init_language() changes to country_code_boot_language(), and we run it in hook_boot only if caching is enabled and the user is anonymous - then append the language and country code to REQUEST_URI using Roger Lopez's method above. Otherwise the code runs in hook_init() as usual. Copied language_from_browser verbatim to save loading the whole language system (for now at least) since it's only a small function.

Loading enough code to actually run hook_boot and menu_get_object() doesn't seem like a viable option - since we'd have to load all the modules implementing hook_nodeapi('load') too. So I went with a function_exists and a direct database query as mentioned by nedjo above. There's a @TODO to remove this when the core bug is fixed - since those bits aren't exactly pretty.

This means that anonymous users ought to get consistent language determination - and apart from the direct node table query, the methods used to determine this for anonymous and authenticated users are identical. We're doing a bit of extra work for each anonymous page request, but I've tried to minimise this as much as possible. Posting this up for a look at the approach, not done thorough testing yet.

Edit - forgot to mention two things:

1. $country comes up as undefined variable - because it never gets set, and as far as I can see it's a bug in the existing code, maybe that hunk isn't getting run in normal usage? I'm not really clear where that's supposed to be coming from either. There's some horrible commenting out of that hunk for now and a big @TODO

2. drupal_get_normal_path() isn't available during hook_boot() either - not started on that bit yet.

catch’s picture

Status: Needs review » Needs work

So this still needs work for a bit.

catch’s picture

FileSize
9.01 KB

drupal_get_normal_path() requires path.inc - so I've added a require_once() for that - although I'm pretty sure it won't work if Drupal is installed in a subdirectory. I've not yet looked at what's supposed to be happening with $country.

nedjo’s picture

Thanks for the patch.

It looks like we need to fundamentally rethink our approach however.

There are two major issues.

1. In _drupal_bootstrap(), page_get_cache() is called *before* hook_boot() is invoked:


      // Get the page from the cache.
      $cache = $cache_mode == CACHE_DISABLED ? '' : page_get_cache();
      // If the skipping of the bootstrap hooks is not enforced, call hook_boot.
      if ($cache_mode != CACHE_AGGRESSIVE) {
        bootstrap_invoke_all('boot');
      }

So we can't affect the cache ID in hook_boot().

2. Even if we reset $language in hook_boot(), it will be overwritten at DRUPAL_BOOTSTRAP_LANGUAGE and will cause alias failures before we can reset it again in hook_init(). (See #327487: Aliases broken for translated content, re-opened.) We need to act at the DRUPAL_BOOTSTRAP_PATH phase, which comes between language and full bootstrap.

So we need to be able to act *before* hook_boot() is called. And we need to reset language (and therefore fix aliases) *after* DRUPAL_BOOTSTRAP_LANGUAGE but *before* hook_init() is called.

Roger López is probably right in #327487: Aliases broken for translated content, re-opened in saying that our best and maybe our only option is to use our path rewriting, which will be called if present at DRUPAL_BOOTSTRAP_PATH.

But since we need to act before hook_boot() is called, and hence before any module code is present, our only option seems to be to include code in settings.php.

The first draft of country_code included rewriting of settings.php to load an include file that includes the path rewriting functions. We could go back to this approach.

However, we have one key remaining issue. We still need to affect the cache ID to reflect the page's language before page_get_cache() is called. We can't do this directly in settings.php because at that point we don't have database access. And there is no obvious opening between DRUPAL_BOOTSTRAP_LANGUAGE and the page_get_cache() call at DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE.

I'm not seeing a lot of choices here. I guess we could submit a patch to get the page_get_cache() call moved after hook_boot() invocation, allowing modules to affect the cache ID. There's no clear reason why it comes first.

zroger’s picture

Just a quick thought. Admittedly, I haven't thought this through completely.

It seems like we could save ourselves a lot of grief if we go with a system that maps each path to a language in a 1:1 relationship, and force redirects in situations where this wouldn't match.

- no prefix - after country and language are determined, force a redirect to the appropriate page. This would avoid caching of a non-language prefixed page.
- country prefix, no language - in the case where this country has a single language, serve the page; otherwise redirect as above.
- country prefix and language prefix - serve the page as usual.

Any of the above situations would serve a cached page in the requested country/language combination. For non-cached pages,

This assumes that countries with a single language can use only a single prefix, like /us/node/123. Countries with multiple languages would be forced to use a 2 prefix path like /ca/fr/node/123.
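The three cases above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function names, the shape of the $countries map, and the fallback to a country's first language are all hypothetical and not part of country_code.

```php
<?php
/**
 * Hypothetical sketch: decide whether to serve a path or redirect to a
 * fully prefixed one.
 *
 * @param string $path
 *   Request path, e.g. "ca/fr/node/123".
 * @param array $countries
 *   Map of country prefix => list of language prefixes for that country.
 * @param string $detected_country
 *   Country chosen by detection (e.g. from the visitor's IP).
 * @param string $detected_language
 *   Language chosen by detection (e.g. from the browser).
 *
 * @return array
 *   array('serve', $path) or array('redirect', $target_path).
 */
function example_resolve_prefix($path, array $countries, $detected_country, $detected_language) {
  $trimmed = trim($path, '/');
  $segments = $trimmed === '' ? array() : explode('/', $trimmed);

  // No country prefix: always redirect, so the bare path is never cached.
  if (!isset($segments[0]) || !isset($countries[$segments[0]])) {
    return array('redirect', example_build_path($detected_country, $detected_language, $segments, $countries));
  }

  $country = array_shift($segments);
  $languages = $countries[$country];

  // Single-language country: the country prefix alone identifies the language.
  if (count($languages) == 1) {
    return array('serve', $trimmed);
  }
  // Multi-language country with a valid language prefix: serve as usual.
  if (isset($segments[0]) && in_array($segments[0], $languages, TRUE)) {
    return array('serve', $trimmed);
  }
  // Multi-language country without a language prefix: redirect.
  return array('redirect', example_build_path($country, $detected_language, $segments, $countries));
}

/**
 * Hypothetical helper: build "country[/language]/rest/of/path".
 */
function example_build_path($country, $language, array $segments, array $countries) {
  $languages = isset($countries[$country]) ? $countries[$country] : array();
  $prefix = array($country);
  if (count($languages) > 1) {
    // Assumption: fall back to the country's first language when the
    // detected language is not offered in that country.
    $prefix[] = in_array($language, $languages, TRUE) ? $language : $languages[0];
  }
  return implode('/', array_merge($prefix, $segments));
}
```

With a map such as array('us' => array('en'), 'ca' => array('en', 'fr')), a request for ca/node/123 from a French-speaking visitor would redirect to ca/fr/node/123, while us/node/123 would be served directly.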

nedjo’s picture

Roger: This approach seems like it might indeed be our best solution if we can't get any changes into core.

I posted a D6 patch today on the bug at #339958: Cached pages returned in wrong language when browser language used. It is looking like a fix to that issue, if one is accepted, might open new possibilities.

mr.j’s picture

Subscribing