Problem/Motivation
#1918820: HTTP header date formats breaks page caching with Varnish and Nginx when users log out.
Prerequisites:
- You need to have Varnish configured as reverse proxy in front of your site
- Nginx configured as webserver with gzip compression enabled (
gzip on;) - Drupal Page caching is enabled
Steps to reproduce:
- As anonymous user visit the front page. You get a cached version of the page
- Login (you need a user account on the site).
- Click on the site slogan/logo to get to the front page.
- Click the logout button. You should now see the same page as anonymous users, but instead you see the front page as logged in user still. You can recognize that by still seeing a "logout" link or "my account" link.
This wrong behavior happens in all mayor browsers, tested with Firefox and Chrome.
The problem is that Nginx removes ETags when it applies gzip compression. The browsers then only send If-Modified-Since headers (no If-None-Match header), so Varnish answers with a 304 response because it has an older version of the page in memory. The browser then just displays the same logged-in user page.
Proposed resolution
We should not send out ETag and Last-Modified HTTP headers for logged-in users (uncachable pages), because those headers only make sense for cachable pages.
Remaining tasks
Review the patch.
User interface changes
none.
API changes
none.
| Comment | File | Size | Author |
|---|---|---|---|
| #24 | 2381839.patch | 3.08 KB | klausi |
Comments
Comment #1
klausiklausi opened a new pull request for this issue.
Comment #2
klausiSo this is the primitive rollback patch that we are going to use on our sites in the meantime.
Comment #3
klausiI see in my testing that browsers do not cache pages in their local cache when using the new DATE_RFC7231 as Last-Modified date. They issue requests to the server with a If-Modified-Since header (containing the last modified date from the last response) or If-None-Match headers which Varnish just answers with a 304 not modified empty response.
Then the browser just displays the last version of the page it got, which can be the authenticated user version, which is wrong when the user is logged out.
Comment #4
girishmuraly commentedWhy is Varnish storing authenticated user versions for you? It should pass through if auth session cookie is set normally.
Comment #5
damien tournoud commented@klausi: could you share your Varnish configuration (the VCL file)?
In the past, Varnish didn't respect the `no-cache` configuration from `Cache-Control`, which would explain what you are seeing. It required some specific configuration to be HTTP compliant. I don't know if that's still the case.
Comment #6
klausi@girishmuraly: Varnish is not storing anything for authenticated users. The point is that if you log out then you still get an authenticated user page from your local browser.
Varnish config file we are using attached.
Comment #7
klausiOh, and our Varnish version:
Comment #8
klausiThe issue that introduced the old DATE_RFC1123 for Last-Modified originally for Drupal 7 is #147310: Implement better cache headers for reverse proxies. It even has a comprehensive comment about the bug we are seeing here:
Comment #9
znerol commentedIt does not make any sense to add
Last-ModifiedandETagheaders to uncacheable responses. I believe that Drupal is to blame here, not Firefox.Comment #10
klausiklausi pushed some commits to the pull request.
For an interdiff please see the list of recent commits.
Comment #11
klausiRight, implemented znerol's suggestion, which also fixes this issue on my test site.
Comment #12
klausiJust confirmed that Drupal 8 also does not set Last-Modified and ETag headers for authenticated user pages. So no patch against Drupal 8 is necessary and we can continue here against Drupal 7.
znerol said on IRC that we should also check what happens with generated images (image styles) and private downloads.
Comment #13
klausiPrivate downloads are not affected by this change because they are always uncachable by design, so removing the headers is not a problem.
Image styles using file_transfer() are also not affected since they also set the default Cache-control: no-cache, must-revalidate, post-check=0, pre-check=0 header. That means the browser will request the image again anyway next time, so the Last-Modified and ETag headers are useless here as well.
Comment #14
damien tournoud commentedThe main problem seems like Varnish still doesn't respect the
Cache-Control: no-cachedirective by default, no?The description in the code comment of #8 doesn't make any sense to me. The only explanation is that the proxy server wrongly cache the page, even if the `Cache-Control: no-cache` directive is present.
Comment #15
klausiNo, the problem is that the local browser does not respect the Cache-Control: no-cache header. Then the browser sends a request to Varnish with an If-Modified-Since header and Varnish happily answers with 304, so the browser displays the old version of the page where the user is still logged in.
Comment #16
damien tournoud commented@klausi: The browser is revalidating a
no-cachepage, which is fine. Varnish is *storing* ano-cachepage, which is not.Comment #17
klausiVarnish is not storing a no-cache page. It only stores the anonymous user version with a last modified date. Then the browser asks Varnish if the page has changed with the If-Modified-Since header. Varnish answers with 304, and now the browser should display the anonymous version of the page, but it is lazy and just displays the logged in version, which it should not even have in the cache.
This all goes away when we do not send out Last-Modified headers for authenticated user pages that are not cachable anyway.
Comment #18
damien tournoud commented@klausi that doesn't make much sense. The check on the
ETagvia theIf-None-Matchshould fail anyway. The only reason for Varnish to return a 304 in the presence of both aIf-Modified-Sinceand aIf-None-Matchis for it to have cached theno-cachepage.Can you capture a complete trace of all the requests and responses (including their complete headers) going from the client to Varnish, and ideally too form Varnish to the backend?
Comment #19
klausiFirst request as anonymous user to the front page, Varnish answers, no request to the backend:
Response headers:
Then I log in on the front page login block:
Response headers:
Request headers:
Response headers:
Then I click the frontpage again:
Response headers:
Then I click the logout link:
Response headers:
Request headers:
Response headers:
And after that request my browser shows me the front page as logged in user, although I'm logged out.
All responses that have an Age header as 0 come from the nginx/PHP backend (Varnish has passed them through), all other responses with Age > 0 come from Varnish directly without invoking the backend.
Comment #20
damien tournoud commentedThe response for
GET /as an authenticated user is really wrong:No
ETag, noVary: cookie? Drupal does send both indrupal_serve_page_from_cache()so I assume something in your infrastructure is messing up with those?Comment #21
damien tournoud commentedMy bad, this is an authenticated request. Right now we don't send a
Vary: cookiefor those, which is a bug. But the response should have anETag, as one is set indrupal_page_header().So there are a series of things here:
Vary: Cookieheader for authenticated responses. This is just braindead, but it is usually not a problem because we *also* set them as non-cacheable;Last-Modified-SinceandETageven on non-cacheable responses. I don't see this as spec-breaking, but they are indeed unecessary;ETagon this authenticated homepage response?Comment #22
damien tournoud commentedJust to clarify: the absence of a
Vary: Cookieis what allows the browser to do a conditional request after logout. The absence ofETagin your particular case it what allows Varnish to satisfy the conditional request with a 304.Comment #23
klausiThanks Damien. Indeed, the Etag is missing for authenticated requests. It looks like this is related to Nginx and gzip encoding of responses. It appears that Nginx removes the Etag if it alters a response to gzip it. Could not find a good doc page or similar to confirm that yet.
Comment #24
klausiklausi pushed some commits to the pull request.
For an interdiff please see the list of recent commits.
Comment #25
klausi@Damien: the absence of
Vary: Cookieis not a bug. RFC 2616 says "A server MAY include a Vary header field with a non-cacheable response ...", so this is not a requirement.So we cleared up the mystery - Nginx messes with ETags, which we cannot easily fix (we don't want to hack Nginx sources).
Now that we have a better understanding of the situation patch reviews welcome :-)
Comment #26
David_Rothstein commentedI can put something in the 7.33 release notes about the regression, but I'm trying to figure out what... do we know this is a problem for all Nginx versions, or just some configurations, or...? (It seems like more people would have run into this if it's a problem for all Varnish/Nginx sites.)
And so the deal is that we were masking this bug by using the wrong Last-Modified format before, but since #1918820: HTTP header date formats the format is correct so browsers are now contacting the server when they're supposed to and that's what's triggering the bug?
The patch itself looks reasonable, I think. Could use another review or two given that it touches pretty fundamental code.
Comment #27
David_Rothstein commentedI went ahead and added something vague to https://www.drupal.org/drupal-7.33-release-notes now, but perhaps it could be improved upon.
Comment #28
klausiI think that addition to the release notes is fine, thanks!
You understood correctly. Now that browsers understand the Last-Modified format they make a request to the server, but Nginx has removed the ETag so they don't send a If-None-Match header. Varnish answers with 304 and the browser just displays the previous logged-in page.
Nginx removes the ETag when it applies gzip compression; I added gzip as prerequisite to the issue summary.
Comment #29
fabianx commentedRTBC, it does not make sense to send e-tag or last-modified headers for authenticated pages.
If you use Akamai or such you will send those headers yourself anyway.
Comment #32
klausiAssuming this was just a random testbot fail.
Comment #33
David_Rothstein commentedOK, I guess this is good to go - hopefully it won't break anything else :) Doesn't look like Authcache would be affected by this since it sets its own headers, and @znerol's comment above suggests he's OK with this approach also.
Committed to 7.x - thanks!
Comment #36
jbrown commentedShould this have a change record?
Comment #37
David_Rothstein commentedI didn't think it necessarily needed one since it's a pretty obscure change (though I did mention it in the release notes, in case it affects someone). If anyone does think it needs a change record and wants to create one, I'd be happy to retroactively link to it from the release notes, etc.