Bug description:
On multilingual sites, when caching is enabled and language negotiation is set to "Path prefix with language fallback", page caches are set to the first visitor's browser language preferences in cases of fallback. All subsequent visitors get pages in that first visitor's language rather than their own.
Cause:
Setting the language based on browser language means there are multiple languages in which a particular page/path can be served. However, since page caching is determined on the basis of the request URI, only one cache can be set per page/path.
More generally: The Drupal cache system doesn't support multiple versions of a page per URL. Any time a particular URL may be served in more than one version, the first visitor's version will determine the cache.
Steps to reproduce:
See http://drupal.org/node/330156#comment-1116611
Potential fixes:
-------
1. Drop browser language detection support (or limit it to authenticated users).
This approach would restrict browser language detection to authenticated users, who don't get late page caches, or else remove the "Path prefix with language fallback" option altogether.
Doing so would address the issue but lose some valuable functionality.
2. In the case of a language fallback, forward to the new path, including the correct prefix.
By forwarding, we could ensure that the path matches the language of the page served.
This approach would mean:
- Move the DRUPAL_BOOTSTRAP_LANGUAGE bootstrap phase to come before DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE. This is needed because we need to determine a potential language/path mismatch before consulting the cache.
- In the case that the path doesn't include a prefix for the negotiated language, abort bootstrapping and goto a prefixed path.
3. Include language information in the cache key.
If we determine language before caching, we can include language information in the cache key and so have language-appropriate cache results.
This approach would mean:
- Move the DRUPAL_BOOTSTRAP_LANGUAGE bootstrap phase to come before DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE.
- In page_get_cache() and page_set_cache(), call a helper function to add the language information to the cache key. E.g., append something like
'cache_language=' . $language->language
.
-------
I think all these options are worth considering, as well as other approaches others may come up with.
Options 2 and 3 are significant API changes. They represent two different basic approaches. Option 2 reflects the position that we should never have two different versions of the same URL. Option 3 assumes that having two different versions of the same URL should sometimes be supported. (See this related issue/discussion: #282191: TF #1: Allow different interface language for the same path.)
And even if we solve late page caching, we still have early page caching to worry about. Neither option 2 nor option 3 addresses the problem presented by DRUPAL_BOOTSTRAP_EARLY_PAGE_CACHE. If we have file-based caching, invoked before we have database access etc., how can we negotiate language ahead of invoking that cache so we know which version of a page we're concerned with?
Comment | File | Size | Author |
---|---|---|---|
#139 | drupal-cached-language-fix-339958-139.patch | 624 bytes | nodoxi |
#87 | browser_redirect.zip | 1.54 KB | plach |
#69 | cache_negotiation.patch | 4.99 KB | catch |
#66 | cache_negotiation.patch | 9.63 KB | catch |
#64 | cache_negotiation_D6.patch | 752 bytes | catch |
Comments
Comment #1
nedjoHere's a draft patch for the third approach I outlined above.
Comment #2
Gábor HojtsyIn Drupal 6, to keep the caching keys intact, we should redirect to the proper language URL instead when identified based on browser or user prefs (case 2 above). I am not sure what people think about applicability of that to Drupal 7.
Comment #4
catchdrupal_get_cache_id() was missing a return. Only failure remaining is the hook_boot() hook_exit() test now.
Comment #5
catchAnd the hook_boot/exit test had a bad check for page cache id, replaced it with a similar assertion to the page cache test itself. So back to code needs reviews. Will try to benchmark this next, see what the impact is.
Comment #6
catchThis is on a vanilla install of HEAD with no content, all numbers are requests per second.
No measurable difference without page cache, as you'd expect.
Aggressive caching, APC disabled:
HEAD:
66.29
67.31
67.98
Patch:
66.65
68.43
68.59
With APC enabled:
HEAD
213.60
215.24
214.45
Patch:
211.05
216.68
214.73
Normal cache:
HEAD:
190.17
196.45
191.77
Patch:
188.93
194.83
188.51
As far as I can see, no appreciable difference either way despite the extra include.
Comment #7
c960657 CreditAttribution: c960657 commented+ return $base_root . request_uri() . $language->language;
I suggest adding a space (or some other character not valid in URLs) before the language code to avoid conflicts in rare cases where one language code is a substring of another.
Example:
URL=http://example.org/foo language=es-sv (Spanish, as spoken in El Salvador)
URL=http://example.org/fooes- language=sv (Swedish)
Comment #8
nedjoThanks catch!
If we're introducing the ability to set cache keys enabling different versions of the same page, language is only one of the possible reasons this might be desired.
So, instead of hard-coding language into the page cache key, maybe we introduce a method for setting the page cache key.
And then rework the order in which page_get_cache and hook_boot are called.
If we want modules to be able to affect the cache ID, we need to call hook_boot before fetching the cache. Our current order has page_get_cache called before hook_boot. This order seems designed to allow calling of hook_boot when no cached page is found even if aggressive caching is enabled. But why, since we explicitly say that hook_boot is incompatible with aggressive caching?
I'll look at working that into the next patch iteration.
Comment #9
nedjoChanges in this patch as per #8:
1. Introduce a method for setting the page cache key. Call this method in language negotiation. Resulting page cache keys will look like:
http://example.com/node/21?language=en
2. Change bootstrapping in DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE to call hook_boot before fetching the cache, unless aggressive caching is enabled, in which case call it only after looking for the cache. This way, modules can alter the page cache key if e.g. they use a different method for language negotiation at hook_boot, or otherwise provide different versions of a given page.
@c960657: thanks. This iteration adds a key as well as a value and so shd address potential conflicts.
Comment #10
nedjoHere's a patch for D6 based on Gábor's suggested approach (my original approach 2) of redirecting to language-prefixed paths for Drupal 6 to avoid needing the change cache keys.
Gábor, can you elaborate on the problems that changing the key would cause?
We need to initialized the language system before consulting the cache. And, to do so, we need to initialize the variable system.
I've introduced a new global variable, $language_redirect, that is set to TRUE if user preference or browser settings have determined language. If so, we first call hook_boot - allowing modules to affect the outcome, e.g., by implementing their own language determination - and then conditionally redirect.
The redirection has to take into account the default language. Unprefixed paths are reserved for the default language. We only redirect if we are at a language other than the default one.
And we need to determine the normal path of $_GET['q]. This part of the patch seems particularly awkward--probably I'm overlooking a cleaner way to do this.
Comment #11
catchThe main reason I can think why changing the caching keys would be a problem is immediately after update - leading to a lot of stale keys in the cache_page table, and having to rebuild the cache. While re-building the page cache is expensive, it's a one-off expense, and we'd probably only need to do a cache_clear_all(NULL, 'page') on sites with language determination - the others could be left to slowly clear out and rebuild their page caches over time.
Ideally keys would be exactly the same for monolingual sites - it looks like #9 already accounts for that. While rebuilding the page cache is expensive, that's a one-off expense. Adding the redirect means two bootstraps for every user who needs to be redirected - although I'm not at all clear how many of such redirects might happen on multilingual sites in day to day operation. So, if we could decide whether its relatively safe to change the cache keys for multi-lingual sites (which are the ones affected by this anyway), then it would seem ideal to go for that approach.
Having said that, in both recent patches I'm concerned about moving hook_boot() around, especially for Drupal 6 - and having its execution change dependent on locale settings. Seems more potentially fragile than just going ahead and changing the cache keys. While it makes sense to leave things open for modules to intercept the page cache key setting and generation, some other modules use hook_boot() for completely different purposes, and we'd be adding that weight to normal cached pages for everyone. For Drupal 7 we can document the change (or even add another hook_intercept_page_cache()).
.
Comment #12
catchHere's an initial test to confirm the original bug - it's not working properly yet, I had the language negotiation picking up browser language at one point, but doesn't seem to be any more...
Comment #14
catchhmm, thought .txt didn't get tested (it's supposed to fail, but not those specific failures). reattaching #9.
Comment #15
catchAnd changing status.
Comment #16
c960657 CreditAttribution: c960657 commentedI suggest using something other than
?
, e.g. space, for separating the URL and the remaining part of the cache key. Otherwise the key looks like a URL, but not the actual URL being cached. Using space as separator will make it clearer that the key is actuall a URL and some more.Comment #17
XanoI don't know how much of the pages are cached, but there's a good chance this approach will conflict with #282178: Language negotiation overhaul, since the patch in this issue relies on Drupal having one language set rather than two (content/interface). Perhaps this patch could prepare Drupal for the patch in #282178: Language negotiation overhaul by taking those two languages in account already?
Comment #18
catchSome more benchmarks on #14.
- I unhid system_test.module so that we can see the impact of calling hook_boot() earlier. system_test.module adds a watchdog call to every page request. With HEAD I get about c. 6.45/6.6 reqs/sec with it disabled, c.6.3/6.42 reqs/sec enabled, with no page caching - very small but were it to make a difference on cached pages we'd notice I think.
This is on a completely blank Drupal install, ab -c -n500 - all numbers are requests per second.
APC disabled, no page cache.
HEAD:
6.36
6.37
Patched:
/node
6.49 reqs
6.35
Normal page cache:
HEAD:
42.51
41.35
Patch:
42.45
41.11
Aggressive:
HEAD
67.16
69.65
Patch
66.00
68.21
So no appreciable difference that I can see.
Patch applied with offset, re-rolled to include c960657's feedback - so using a space rather than '?' - also used drupal_query_string_encode() rather than http_build() there. We possibly don't need the encoding at all since this is just a cache key rather than a URL though.
Comment #20
catchNeed to look into those failures, just re-rolled for offset this time.
Comment #21
c960657 CreditAttribution: c960657 commentedI think using http_build_query() is better than drupal_query_string_encode(). The latter depends on drupal_urlencode() whose output depends on a variable, and the function is about to be modified in #339958: Cached pages returned in wrong language when browser language used, so http_build_query() is more stable.
If you use http_build_query(), you should supply the third argument – otherwise the separator character is based on the arg_separator.output PHP ini setting (for some reason, Drupal's default.settings.php sets this to
&
instead of&
).Comment #22
Damien Tournoud CreditAttribution: Damien Tournoud commentedI think the only acceptable approach is #2.
The approach currently worked on here can cause issues, for a very simple reason: a paged served from the cache will be returned with
Last-Modified
andETag
headers, and any proxy server on the way could return the page cached in a different language.After all, it was a main design objective of the language negotiation system of D6 to always return the same content from a given URL. After all URLs are supposed to be stable, and it's the only way HTTP/1.1 caching could possibly work.
Comment #23
c960657 CreditAttribution: c960657 commentedIf Drupal emits a
Vary: Accept-Language
header, proxy servers are able to deal with the same URL returning a page in two different languages. So I think #3 is also feasible, even with HTTP caching.Comment #24
Damien Tournoud CreditAttribution: Damien Tournoud commentedIt's technically possible, but it's always better to have stable URL: one URL should as most as possible always return the same content. I think it's easier and cleaner just to redirect the user.
Comment #25
catchThis is a re-roll of #10 for D7 which enforces a redirect. All tests pass but I've not done much manual testing and no new tests introduced. Note I had trouble writing a test for this earlier in the issue trying to alter ACCEPT_LANGUAGE for a page request - any ideas on how to get around that would be great.
Comment #26
c960657 CreditAttribution: c960657 commentedIt recently became possible to add HTTP request headers to the internal browser in SimpleTest. You should be able to set HTTP_ACCEPT_LANGUAGE using e.g.
$this->drupalGet('', array(), array('Accept-Language: en'))
.Comment #27
catchAnd some modifications in line with the other patch to ensure we don't do page_get_cache() redundantly if we're going to redirect anyway.
Comment #28
catchc960657 - thanks!
Here's a test which simply demonstrates the bug in HEAD rerolled from the one I posted above. Should have two failures.
Comment #30
nedjoNice work catch.
Some of the complexity of this issue is related to the special use of non-prefixed languages. By default, en doesn't have a prefix. But what if the default language does have a prefix? In that case, we will again have duplicate pages. E.g., if 'fr' is the default language and has a prefix of 'fr', the pages 'node' and 'fr/node' will be the same. Should we instead forward in that case even if we haven't found a user-preferred language and are defaulting to the default language (the final fallback in language_initialize())?
To decide this question, it might help to add a test where we set the default language to one with a prefix. Do the tests still pass?
Or for simplicity should we just ignore the question of non-prefixed languages and leave it to the existing issues? #146084: Default path prefix for English (and DBTNG it), #338055: Ensure non-default languages always have a path prefix.
Comment #31
nedjoSince it confirmed the bug we're addressing, that patch failure was a success ;)
Comment #32
catchHere's #27 with a slightly modified test - can't work out how to get simpletest to follow the redirect properly (yet?), but it confirms that the cache is only getting set in the correct language. On default 'en' path and requiring prefixes we should probably let those other issues deal with those and confine this to the bootstrap changes.
Comment #33
c960657 CreditAttribution: c960657 commentedThe literal string should not start with a period. This generates a warning:
Warning: require_once(/home/chsc/www/drupal7./includes/common.inc): failed to open stream: No such file or directory in /home/chsc/www/drupal7/includes/bootstrap.inc on line 1152
Without this warning, simpletest automatically follows the HTTP redirect. You can get test the actual URL using
$this->assertEqual($base_url . '/fr', $this->getUrl())
.The patch actually redirects to http://example.org/fr/node rather than just http://example.org/fr.
You may use the
$this->drupalGetHeader('ETag')
trick used in BootstrapPageCacheTestCase to verify that the page isn't just saved to the cache but also served from the cache on the second page load.Comment #34
c960657 CreditAttribution: c960657 commentedComment #35
catchModified test to check for Etag headers and minus the extra dot - all passes. Thanks again for the in depth review c960657.
Comment #36
catchAdded some additional code comments to the test, otherwise no changes.
Comment #38
catchComment #39
catchHere's a backport for D6. I confirmed it doesn't break massively, but didn't do thorough testing yet.
And a re-roll for D7 to keep the test bot happy.
Comment #40
catchAnd the D7 patch this time.
Comment #42
catchRe-rolled after the user session changes.
Comment #44
catchHere's one which doesn't completely break user logins.
Comment #45
David Lesieur CreditAttribution: David Lesieur commentedBackported to D6. Seems to work nicely! One particular thing: The call to
variable_init()
had to be moved up toDRUPAL_BOOTSTRAP_LANGUAGE
, becausedrupal_init_language()
needs the variables. We don't have theDRUPAL_BOOTSTRAP_VARIABLES
step under D6.Comment #46
David Lesieur CreditAttribution: David Lesieur commentedMmmh. I'm not too sure that the patch (at least the D6 one I have just submitted) works that well.
bootstrap_invoke_all('exit');
appears just before a call todrupal_goto()
, which will also invokehook_exit()
.Even without that apparently redundant call, we might still have issues with the statistics module (that relies on
hook_exit()
), since if we redirect to the same node, the counter will be incremented once before the redirect (fromdrupal_goto()
's invocation ofhook_exit()
), and a second time after (when bootstrapping the redirect's target page).Comment #48
catchGood catch about the duplicate call to hook_exit() - here's the HEAD patch re-rolled without that.
For statistics module, I don't think we need to deal with that here - hitting refresh has the same effect - see #90468: Only record unique hits in node counter stats
Comment #49
catchSince this makes language negotiation and page caching incompatible, I'm bumping to critical.
Comment #50
c960657 CreditAttribution: c960657 commentedAFAICT the return value is stdClass or NULL.
The comment should be removed together with the line it is referring to.
The comments do not wrap at 80 characters:
This line is 82 characters.
The first line is only 66 characters, so there is room for several more words.
Comment #51
catchThanks for the review, I think think this should cover everything.
Comment #54
catchHere's a somewhat simpler approach. If we fall back to browser language, and the language isn't the same as the default language, then we neither serve a cached page, nor cache the page request.
This is done by setting a $no_cache global, and unfortunately still requires changing the bootstrap order to avoid serving cached pages to anonymous users in the site's default language when negotiation is on (if we did that, we might as well completely remove language negotiation from browser as an option).
This version still passes all the tests, including the slightly modified new tests added with the patch.
Note that there's a more comprehensive solution to this for Drupal 7 at #54238: page cache might show messages - but that won't fix the issue here since we still need to have language_initialize() run before BOOTSTRAP_LATE_PAGE_CACHE. I'm hoping we can commit this, backport to Drupal 6 quickly, and then concentrate efforts on the more generic patch for Drupal 7.
Comment #56
catchHere's a straight backport for D6.
Comment #57
catchThought about this some more, and realised that there's no point changing the bootstrap just to avoid serving cached pages to anonymous users where the language is taken from the browser.
So this updated patch is a lot smaller, and doesn't touch bootstrap.inc at all. Also updated the simpletest to remove the check for this edge case.
Quick summary:
The main bug here, is that if your site is in English, German and Swahili, and someone visits your front page with their browser set to request Swahili, all requests for the cached version of that page will see it in Swahili if the browser fallback is enabled. Then 10 minutes later, the cache is cleared and the first anonymous visitor has their language set to German, and so on.
This (hopefully final) version of the patch prevents the page being cached when language determination is used. However to avoid bootstrap re-ordering, visitors will be served cached versions of pages in the site default language when available - since language negotiation happens after the cached page is served. This is a much smaller edge case which only affects a subset of users in certain circumstances, and IMO that particular edge case is won't fix.
This means the approach here will be compatible with the upcoming changes in #370454: Simplify page caching (which are a big win), and it also means much less intrusive changes to Drupal 6.
Comment #59
stella CreditAttribution: stella commentedI think this looks good and can be marked RTBC. Catch has summarized the underlying issue in the last comment, but to briefly summarize the patch:
Comment #60
nedjostella: agreed.
Nice work catch. This latest patch addresses the issue without the potentially disruptive bootstrap changes in previous iterations.
Introducing a new global shouldn't be done lightly, but in this case it seems warranted and could be used for purposes beyond this language-specific use case.
Comment #61
sun$no_cache is not defined anywhere else. So the proper test is
isset($no_cache) && !$no_cache
.However, while I can see and understand the actual issue, the entire, negated logic feels very poor. Please try hard to think of a way to do the opposite, or, at the very least, find a better name for that global variable.
Comment #62
toemaz CreditAttribution: toemaz commentedI applied the #57 D6 patch + the advice from sun at #61 to a production site (http://musescore.org) which was having this issue. Will report back my findings later. Thanks in advance to all developers involved.
Comment #63
catchEven smaller patch. No need to set a new global variable because we already have one (hat tip quicksketch on the imagecache in core issue).
It's not pretty, but it's a tiny change to fix a critical issue, and we absolutely need to backport this - so assuming it can go in I'd support leaving this issue open for a better system in Drupal 7 once that's done.
Comment #64
catchAnd for Drupal 6.
Comment #65
quicksketchThanks catch for pointing me to this issue. You mentioned in the #371374: Add ImageCache UI Core issue that "we need a generic way of doing this" (temporarily disabling the page cache), but the ImageCache issue was not the place to do it. This patch could use a generic way also, and then the ImageCache patch could use it. Would this be the appropriate place to add such a generic function?
Here's a stab at an implementation, which would live in bootstrap.inc below page_get_cache()
Comment #66
catchDoing it as a new function in D6 wouldn't do any harm, so let's try to fix this here. It's only a couple lines change either way anyway.
Don't like the function name though, sounds like we're going to set the page cache temporarily. Here's one with drupal_skip_page_cache() which was the least worst I could think of.
Comment #67
quicksketchNice, I like drupal_skip_page_cache(). Looks like you attached the wrong patch however. ;)
Comment #69
catchoh dear.
Comment #71
Damien Tournoud CreditAttribution: Damien Tournoud commentedAs I stated in #22 and #24, I don't believe that simply skipping page cache is the way to go. I would still prefer that we redirect the user to the proper, language-prefixed, URL. If we don't want to do that, we at least really need to output a
Vary: Accept-Language
header so as not to mess up with proxy and client-level caches.Comment #72
catch@Damien - that was implemented around #48 or so, but requires changing the bootstrap order - a change which would have to be backported to D6, and would also conflict with chx's page caching revamp which makes the page caching the first or second bootstrap phase. Are you going to support a bootstrap re-ordering in D6 (or do you have an approach which doesn't require this up your sleeve?).
Comment #73
Damien Tournoud CreditAttribution: Damien Tournoud commented@catch: If language_initialize() redirects the user to the proper, language-prefixed, URL, this page will never get cached. Basically having the same effect as your "don't cache" approach.
Comment #74
catchSummary of irc conversation with Damien:
if we do #73 (which the patches around #48 did) - then we still break language negotiation for any user visiting a cached page - so it's still a partial fix, just one which requires slightly more work. The only way the bug can be properly fixed for all users is by re-ordering the bootstrap to put language negotiation before cached pages are served.
Two other possibilities - ensuring that Drupal never outputs non-prefixed pages. Or add something to the session when language negotiation is done, and disable page caching for the visitor until that's in place.
Comment #75
quicksketchDamien, if we don't use the temporary disabling of the page cache, can you think of an alternative for #371374: Add ImageCache UI Core? I was crossing my fingers that this would get done here, since currently we need it for that patch also.
Comment #76
plachsubscribing
Comment #77
sunWould a Content-Language HTTP header be taken into account by HTTP proxies/caches?
Comment #78
intuited CreditAttribution: intuited commentedPeople looking for an immediate workaround for this issue (for Drupal 6) may have luck using the globalredirect module. This should prevent a path from displaying in two different languages at the same path.
Note that globalredirect is not normally active on the front page, however there is a patch posted in issue #317305: When browser language is used, foreign language redirection breaks on the front page. that will make it so.
Comment #79
niQo CreditAttribution: niQo commentedFor your information, there is dirty workaround using apache rewrite rules which solve this problem by addind hard coded languages in htacess. It's only for browser language detection and drop "/" frontpage.
This is not the drupal's way but perhap's can help people who absolutely need language browser detection with drupal cache.
We did this for example :
"Path prefix with language fallback" should be activated in drupal.
Comment #80
toemaz CreditAttribution: toemaz commented@niQo Thats a pretty interesting fix you have there. Any way to write a condition + rule for all language in one time, using some kind of reg exp?
Comment #81
toemaz CreditAttribution: toemaz commentedI didn't test this yet, but perhaps it's in the right direction:
Comment #82
XanoDoes this mean Drupal should be able to edit .htaccess depending on the available languages?
Comment #83
toemaz CreditAttribution: toemaz commented@Xano No, Drupal won't be able to edit the htaccess file. You'll have to do it manually for each language enabled.
Comment #84
vnb CreditAttribution: vnb commentedsubscribing
Comment #85
plachThis issue has been fixed in #282191: TF #1: Allow different interface language for the same path by disabling the browser detection when page cache is enabled and we have an anonymous user. The reasons are explained in #282191-16: TF #1: Allow different interface language for the same path. In #50-#61 we agree that we might want to find an alternative solution.
Marking as fixed, if we want to find another way to fix this and enhancing the current solution we should open a new issue taking into account the current situation.
Re-open the issue if you think we should keep handling this here (at least change the issue title in that case).
Comment #86
plachThis still needs a solution for Drupal 6.
Comment #87
plach[please ignore this post, I forgot I should never work on critical core bugs this late ;)]
Comment #88
plachWhat about something like the following?
Comment #89
ahwebd CreditAttribution: ahwebd commentedsubscribing
Comment #90
toemaz CreditAttribution: toemaz commentedI solved this issue with Varnish cache, avoiding the issue with Drupal. See http://groups.drupal.org/node/62768 and browse through the comments.
Comment #91
tomsm CreditAttribution: tomsm commentedsubscribing
Comment #92
aidanlis CreditAttribution: aidanlis commentedsubscribe
Comment #93
m.sant CreditAttribution: m.sant commentedSubscribing. I'm experiencing this issue on a site in 22 languages!
I also need the ip2locale, ip2cc, ip2country, geoip modules and therefore I cannot avoid this issue using Varnish cache.
Comment #94
plachThis is present in the latest development snapshot, please don't change the version.
Comment #95
m.sant CreditAttribution: m.sant commentedChanged status to active as that patch didn't work.
Comment #96
m.sant CreditAttribution: m.sant commentedSorry didn't see previous comment...
Comment #97
plachDowngrading to major as this does not render completely unusable the site.
Comment #98
aidanlis CreditAttribution: aidanlis commentedUpgrading back to critical: I know my english guests seeing a mongolian front page don't agree with you plach :)
Comment #99
toemaz CreditAttribution: toemaz commentedIf your problem is so critical, all I can say is that this worked for me in case you are 'not' using Varnish.
Comment #100
toemaz CreditAttribution: toemaz commented@aidanlis also, don't forget to apply this patch on the D core in case you work with exotic languages which I think you do (3 versions of Chinese is exotic).
Comment #101
aidanlis CreditAttribution: aidanlis commentedThanks toemaz, I have the patches applied and everything is working for me, but I think the "critical" status still applies here until it's fixed in D6. Plach is your patch in #88 ready for review?
Comment #102
plach@aidanlis:
I'm sorry but looking at the priority guidelines I'd say they are wrong, obviously the perception of priority is subjective :)
It's only a proposal to introduce a redirect to the prefixed version of the page as suggested by Damien. It's not a patch and can only be tested by manually applying it.
Comment #103
aidanlis CreditAttribution: aidanlis commented@plach you're indeed correct
Comment #104
Andy Inman CreditAttribution: Andy Inman commentedI'm also seeing this with a standard install. For me, the easiest work-around is MultiLink Redirect which provides a redirect from generic front-page url to language-prefixed url - but, this is my own module therefore I'm happy with it - it's overkill if all you need is a simple fix to the front-page caching problem.
Apache redirect as suggested in http://drupal.org/node/339958#comment-2503184 also seems to work fine.
Comment #105
Anonymous (not verified) CreditAttribution: Anonymous commented@#99: That didn't work for me, and neither did the patch in #63.
I have my site set such that domains are used to distinguish languages, e.g. www.example.com and zh.example.com. Language negotiation works fine for every other page on the site as they have different URLs for Chinese pages, but on the most critical page, the frontpage, I either get cached English or Chinese no matter the URL (depending on which of www.example.com and zh.example.com was visited first after clearing of caches).
I don't need to detect the anonymous user's browser language, as there are two distinct URLs for the different languages. English as default is just fine. I just need the page to display in Chinese when the user switched to zh.example.com via the language switcher!
EDIT: Using CacheExlude (http://drupal.org/project/cacheexclude) to disable caching on the frontpage seemed to do the job for me, at the cost of, obviously, a cached frontpage.
Comment #106
jlbretton CreditAttribution: jlbretton commented#63: cache_negotiation.patch queued for re-testing.
Comment #107
tomsm CreditAttribution: tomsm commentedCache Exclude is a good solution for the problem. I excluded
<front>
and enabled normal caching resulting in a much better performance for anonymous users. Thanks for the tip sypl !Comment #108
plachI have an improved version of #88 deployed on http://graphicdesignworlds.it which seems to perform well and is implemented through a contrib custom module. I ain't sure we can make such a big change for D6, perhaps it should just live in contrib for D6/D7.
We definitely need Gabor's and/or webchick's feedback before going on with a core patch implementing #88.
Edit: perhaps we just want to backport the D7 solution, i.e. disabling browser fallback for cached pages and make #88 live in contrib.
Comment #109
k3vin CreditAttribution: k3vin commented+1
Comment #110
plachComment #111
mr.j CreditAttribution: mr.j commentedSubscribing
Comment #112
iamEAP CreditAttribution: iamEAP commentedThis may be specific to my install, but is anyone else still seeing this behavior for anything other than the homepage?
All pages other than the homepage seem to successfully redirect and have the path-prefixed URLs cached.
Comment #113
iamEAP CreditAttribution: iamEAP commentedI spoke too soon. It was specific to my install, but the contrib module that was giving the (desirable) behavior was Global Redirect, as first suggested by intuited in #78.
I've since submitted a patch at the Global Redirect issue mentioned in #77 which solves this issue completely in contrib. It borrows ideas from #88.
I recommend you try Global Redirect and the patch here.
Comment #114
iamEAP CreditAttribution: iamEAP commentedNevermind, the solution above does not work.
Comment #115
goldenboy CreditAttribution: goldenboy commentedSubscribing
Comment #116
fietserwini know this is for D6, but I created a sandbox module that prevents this problem from happening by executing a real redirect instead of serving the content under the home page URL: Front page redirect.
I guess it will be very easy to backport it: the lookup using the i18n_variable module should be replaced by looking in $conf['i18n_variables'].
Comment #117
iamEAP CreditAttribution: iamEAP commented@fietserwin, took a quick look at your code and you're falling into a similar trap. hook_init() isn't called on cached pages; you'd want to implement it in a hook_boot(). That being said, you may be duplicating functionality of the Global Redirect module.
Comment #118
fietserwinThe page should not be cached because I redirect (the redirect may be cached though). I am still doing some tests though, also in a first D6 backport version.
Comment #119
Andy Inman CreditAttribution: Andy Inman commentedOn most sites, the front-page would be one of the most frequently accessed pages, so preventing it from being cached using the Cache Exclude module does not seem to me like the best solution for performance reasons.
Redirecting via a Drupal module such as Global Redirect or MultiLink Redirect is not the best solution either for performance, since quite a lot of Drupal code and MySQL queries will be run before the the user is redirected to the correct (cached) page.
So, my advice would be to redirect using Apache Rewrite, as suggested by toemaz in #81 - I have implemented this on client sites and it works well. For situations where mod_rewrite is not available then my choice would be MultiLink Redirect, but I'm biased :) Global Redirect should work too I think.
I'll do some performance comparisons and post them here.
Comment #120
fietserwinThe solution form #81 won't always cope correctly with a header like "Accept-Language: nl,en;q=0.7,fr;q=0.3". On a French-English site I want the English version. But performance wise it may still be a good bet.
There is now a patch for global redirect that works fine for me and solves this problem for D6 in contrib, though your default language must have a prefix. I guess #81 and this patch can both be applied and you will get the fast redirect if the most preferred language of the user is supported, and the slower contrib solution if not.
Comment #121
fietserwin[Edit] Oops, posted twice...
Comment #122
Andy Inman CreditAttribution: Andy Inman commentedI'm now wondering whether an Apache rewrite would be better/faster than a redirect - could make it look like the relevant language version really was at the front-page address. Just an idea, haven't tried it.
Comment #123
fietserwinShould be faster. But be careful with (intermediate) caching when serving different content under the same URL. The rewrite should also add a Vary header.
Comment #124
knalstaaf CreditAttribution: knalstaaf commentedWill there be a solution for D7 or D8?
Comment #125
Fabianx CreditAttribution: Fabianx commentedIt is possible to solve this in D6 and D7 in contrib by including a file from settings.php:
For those that need this in contrib without patching core, you can use a similar approach (only implemented for Apache so far) in a settings.inc that is included from settings.php before the page cache is used:
#1768556: Context Mobile Detect does not work with page cache enabled has a patch that makes a contrib module work that needs contextual page cache.
The idea is very simple. We adjust the request_uri() by changing the parameters to include a "?language=de" or "&language=fr" depending if the request uri has parameters or not.
On non Apache browsers will it just not work as I did not implement the other edge cases for request_uri(). As request_uri() is used for the cache key, this will always work and allow dynamic page cache keys, but the code needs to be run before bootstrap in settings.php, so it is very limited in functionality.
Enjoy!
- Fabian
Comment #126
caktux CreditAttribution: caktux commentedThis is still a serious problem in D7...
Comment #127
plachThe bug reported in the OP does not occur in D7. Please, leave the version alone.
Comment #128
paolomainardi CreditAttribution: paolomainardi commented@plach this problem occur in D7 as well.
Comment #129
plachBrowser language negotiation is disabled when page cache is enabled in D7. Hence the bug reported in the OP cannot happen. Please don't change the version again without a list of steps to reproduce the bug described in the OP (page cached in the wrong language) on a fresh D7 installation.
Comment #130
oliverpolden CreditAttribution: oliverpolden commentedI would like to propose another solution. #81 works great unless your default language doesn't have a path prefix in which case you need a bit extra...
In your settings.php (there may be a more appropriate place but hook_init() is too late) you can put:
This basically forces the root domain to always be in English whatever the browser's language. Since this runs after mod_rewrite it works perfectly well in conjunction with #81.
Comment #131
amacias CreditAttribution: amacias commentedThis is how I got around this issue. It seems that is working fine for me (I'm using Drupal 7):
Comment #132
amacias CreditAttribution: amacias commentedComment #133
pabsta CreditAttribution: pabsta commentedAs this issue been solved? I have the exact same problem with drupal 7.
Comment #134
StephenN CreditAttribution: StephenN commentedLiked the look of #131 but couldn't get it to work. My solution for Drupal 7 below. n.b you will need to resave your language negotiation settings on the admin page to get the hook to apply.
Works for me but very hacky. Would still like to see a genuine fix.
Comment #135
lmeurs CreditAttribution: lmeurs commentedFor D7 we based our fix on #131 as well, but in our case to redirect anonymous users to the frontpage at a language-prefixed URL. Since the frontpage is cached in the default language, we only need to redirect users with another preferred language set in their browsers.
Is redirecting a SEO safe approach?
Comment #136
fietserwinIf you add a "Vary: Accept-Language" header it should be.
Comment #137
lmeurs CreditAttribution: lmeurs commentedHi fietserwin,
Thanks for the comment. I changed our code by adding the "Vary: Accept-language" and "Content-language: [langcode]" headers, but also changing the HTTP response code from 301 (moved permanently) to 302 (found).
Thanks again!
Comment #138
caktux CreditAttribution: caktux commented#135 is exactly the behavior I expect from Drupal core, thanks a lot for this.
Comment #139
nodoxi CreditAttribution: nodoxi commentedI made changes to prevent saving wrong cache when language is detected by browser
Comment #141
plachPlease see #129.
Comment #142
knalstaaf CreditAttribution: knalstaaf commentedThere's a quick'n dirty fix for D7 in #81, if you're not familiar with writing modules. Some say that the Boost module does the job as well for Drupal 7.
Comment #143
kumkum29 CreditAttribution: kumkum29 commentedHello knalstaaf,
I use your fix and it's useful for the main domain (www.domain.com). But, I have several subdomains with no languages, and this fix is functionnal on my subdomains. Now the subdomains aren't accessible.
Is it possible to change the condition and create the fix for only the master domain?
Thanks.
Comment #145
xjm