Problem/Motivation
\Drupal\page_cache\StackMiddleware\PageCache::getCacheId()
uses $request->getUri()
for cache ID generation.
\Symfony\Component\HttpFoundation\Request::getUri()
doc says
Generates a normalized URI (URL) for the Request.
This means page cache IDs are based on the normalized URLs, not on the real ones.
If #2641118: Route normalizer: Global Redirect in core lands in core, we might get circular redirects, example:
- user requests http://example.com/path?test=a+b
- route normalizer responces with a redirect to nirmalized URL http://example.com/path?test=a%20b
- page_cache module caches the redirect with a normalized incoming URL - http://example.com/path?test=a%20b
- user is redirected to http://example.com/path?test=a%20b
- page_cache has a cache entry for http://example.com/path?test=a%20b
- circular redirect
Proposed resolution
Replace $request->getUri()
with $request->getSchemeAndHttpHost() . $request->getRequestUri()
As a "side effect" we'll get a small performance improvement mentioned in #2501989-17: [meta] Page Cache Performance
Comment | File | Size | Author |
---|---|---|---|
#30 | 2761639-30.patch | 2.28 KB | Leksat |
#30 | 2761639-30-test-only.patch | 1.64 KB | Leksat |
Comments
Comment #2
Leksat CreditAttribution: Leksat at Amazee Labs commentedComment #3
znerol CreditAttribution: znerol commentedThat sounds reasonable to me. I'm wondering whether we should change other instances also. Candidates are:
I guess
UrlCacheContext
should be changed also.Comment #5
Wim LeersUrlCacheContext
is intentionally only caring about the incoming site-relative URL. If you also care about the specific base URL/domain name being used, you must useSiteCacheContext
.UrlCacheContext
SiteCacheContext
We cannot change this, that'd be a huge API break.
Comment #6
Wim LeersComment #7
hitfactory CreditAttribution: hitfactory commentedWe're seeing this bug on a page which contains jobvite's iframe widget.
The jobvite iframe is on a page with URL like http://example.com/Careers.
When a user clicks a link to a job inside the iframe which has an id of, say, 'foo', the iframe will issue a redirect to http://example.com/Careers?p=job/foo which in turn triggers the redirect loop explained above.
Comment #8
hitfactory CreditAttribution: hitfactory commentedConfirming the patch in #2 resolves the redirect loop issue.
Comment #9
stella CreditAttribution: stella at Annertech for Glanbia commentedI've been testing out #2641118: Route normalizer: Global Redirect in core on a multilingual site and it works pretty well. However, I hit an infinite redirect loop once I set up url aliases per language for a view path and then tried to submit the views exposed form on it (in any language). It worked fine for logged in users, but logged out users got an infinite redirect loop.
The patch in #2 solved the redirect loop for me.
Comment #10
stella CreditAttribution: stella at Annertech for Glanbia commentedresetting accidental status change
Comment #11
stella CreditAttribution: stella at Annertech for Glanbia commentedI'm using the patch from #2641118: Route normalizer: Global Redirect in core and this one, and it has solved all of my infinite redirect loops except one. I'm not sure if it's related or not.
If I go to my home page with a query string attached e.g. www.example.com/?abc=foo or www.example.com/en/?abc=foo or www.example.com/en/node/1?abc=foo (where node 1 is my home page) then I get an infinite redirect loop.
Comment #13
SchnWalter CreditAttribution: SchnWalter as a volunteer and commentedI believe that this is an issue with redirect.module's route normalizer, and not with the page cache.
See #2852680: Redirect loop when serving cached pages to anonymous users
Comment #14
SchnWalter CreditAttribution: SchnWalter as a volunteer and commented@stella, could you please check if the patch from #2852680: Redirect loop when serving cached pages to anonymous users fixes your problem and if yes, could you please resolve this as "duplicate".
Comment #15
SchnWalter CreditAttribution: SchnWalter as a volunteer and commentedComment #16
BerdirTo be more specific, instead of using the core patch, use the latest version of redirect *and* that patch.
Comment #17
hitfactory CreditAttribution: hitfactory commentedI can confirm that after updating to Redirect 8.x-1.0-alpha4, our issue with the redirect loop above disappeared, and the patch in #2 is no longer needed.
Comment #18
daxelrod CreditAttribution: daxelrod commented@stella - I am having the same issue on homepage-only. Attributing this to the redirect module (#2852680: Redirect loop when serving cached pages to anonymous users), because I am unable to reproduce when disabled. With the redirect module disabled no redirect takes place.
I'm not entirely sure that makes this ticket a duplicate, perhaps @Leksat can confirm if the Global Redirect features are the same that are included in 8.x-1.0-alpha4.
Comment #19
gg4 CreditAttribution: gg4 commentedGiven the impact of this issue on contrib #2852680: Redirect loop when serving cached pages to anonymous users and end-users, it seems like the priority should be upgraded on this issue.
Comment #20
dawehnerSo what about just fixing page cache for now and open up a follow up to look at all the other usecases?
Comment #21
Wim LeersMy comment in #5 is irrelevant/off-topic. It talks about
\Drupal\Core\Cache\Context\UrlCacheContext
. This issue is solely about\Drupal\page_cache\StackMiddleware\PageCache
.Removing
tag.Comment #22
Wim Leers#13/#14/#16/#17 all say that this patch is no longer necessary. They say that using the latest version of the
redirect
contrib module, plus the patch at #2852680: Redirect loop when serving cached pages to anonymous users for that module is sufficient to solve the problem.So, I'm confused. Is this necessary or not?
In any case, it seems prudent/the right thing to do to commit #2 + test coverage to prove which problem it fixes.
Comment #23
BerdirThe comments from here older than some additional findings in the redirect issue, so I'm now thinking that this fix is indeed necessary.
To reproduce this example, enable redirect.module and make sure the route normalization setting is enabled.
Request: http://example.com/user/register?
Based on how redirect.module currently works, it will redirect that to "http://example.com/user/register", which I think is the right thing to do. But I'm also open to discuss that. We already made the decision to avoid touching query arguments otherwise, as other weird things can happen, wiht different order and query arguments with . or other characters that PHP converts in some way.
What happens now is that the request will then return a redirect to /user/register, but Page Cache will cache this request as /user/register, without ?, then when you access /user/register, you have a redirect loop.
Comment #24
Wim LeersFirst: we should fix the bug in Page Cache, no doubt.
Second: redirecting
/foo?
to/foo
may be more expensive than it's worth. An entire redirect, which may cost seconds of additional waiting time depending on connection speed/latency seems excessive for such a small cosmetic difference. As far as the HTTP spec is concerned,/foo
===/foo?
(AFAICT at least, it doesn't explicitly state this in RFC3986.Comment #25
dawehnerI was seeing redirects from
localhost/d8
tolocalhost/d8/
recently, which should happeny way more often.Comment #26
Leksat CreditAttribution: Leksat at Amazee Labs commentedAdded a test.
Comment #28
Leksat CreditAttribution: Leksat at Amazee Labs commentedComment #29
BerdirCan we also include an exmaple with just ? which should also be a miss?
That's the example that's problematic for redirect.module even when ignoring the query parameters.
Comment #30
Leksat CreditAttribution: Leksat at Amazee Labs commentedSure :)
Comment #31
BerdirThanks, looks good to me. Once this is in, we can cache normalized request redirects again in redirect.module and later in core.
Comment #35
catchAdded commit credit for hitfactory and stella for the manual testing.
Committed/pushed to 8.4.x, thanks! Also cherry-picked to 8.3.x since it's a pure bug fix. There's a very small chance that this results in more page cache entries, but that should be very minimal.