Problem/Motivation

#1918820: HTTP header date formats breaks page caching with Varnish and Nginx when users log out.
Prerequisites:

  1. You need to have Varnish configured as reverse proxy in front of your site
  2. Nginx configured as webserver with gzip compression enabled (gzip on;)
  3. Drupal Page caching is enabled

Steps to reproduce:

  1. As anonymous user visit the front page. You get a cached version of the page
  2. Login (you need a user account on the site).
  3. Click on the site slogan/logo to get to the front page.
  4. Click the logout button. You should now see the same page as anonymous users, but instead you see the front page as logged in user still. You can recognize that by still seeing a "logout" link or "my account" link.

This wrong behavior happens in all mayor browsers, tested with Firefox and Chrome.

The problem is that Nginx removes ETags when it applies gzip compression. The browsers then only send If-Modified-Since headers (no If-None-Match header), so Varnish answers with a 304 response because it has an older version of the page in memory. The browser then just displays the same logged-in user page.

Proposed resolution

We should not send out ETag and Last-Modified HTTP headers for logged-in users (uncachable pages), because those headers only make sense for cachable pages.

Remaining tasks

Review the patch.

User interface changes

none.

API changes

none.

Comments

klausi’s picture

Status: Active » Needs review
StatusFileSize
new1.94 KB

klausi opened a new pull request for this issue.

klausi’s picture

So this is the primitive rollback patch that we are going to use on our sites in the meantime.

klausi’s picture

I see in my testing that browsers do not cache pages in their local cache when using the new DATE_RFC7231 as Last-Modified date. They issue requests to the server with a If-Modified-Since header (containing the last modified date from the last response) or If-None-Match headers which Varnish just answers with a 304 not modified empty response.

Then the browser just displays the last version of the page it got, which can be the authenticated user version, which is wrong when the user is logged out.

girishmuraly’s picture

Why is Varnish storing authenticated user versions for you? It should pass through if auth session cookie is set normally.

damien tournoud’s picture

@klausi: could you share your Varnish configuration (the VCL file)?

In the past, Varnish didn't respect the `no-cache` configuration from `Cache-Control`, which would explain what you are seeing. It required some specific configuration to be HTTP compliant. I don't know if that's still the case.

klausi’s picture

StatusFileSize
new4.63 KB

@girishmuraly: Varnish is not storing anything for authenticated users. The point is that if you log out then you still get an authenticated user page from your local browser.

Varnish config file we are using attached.

klausi’s picture

Oh, and our Varnish version:

$ varnishd -V
varnishd (varnish-3.0.6 revision 1899836)
klausi’s picture

The issue that introduced the old DATE_RFC1123 for Last-Modified originally for Drupal 7 is #147310: Implement better cache headers for reverse proxies. It even has a comprehensive comment about the bug we are seeing here:

+ * Also give each page a unique ETag. This will force clients to include both
+ * an If-Modified-Since header and an If-None-Match header when doing
+ * conditional requests for the page (required by RFC 2616, section 13.3.4),
+ * making the validation more robust. This is a workaround for a bug in Mozilla
+ * Firefox that is triggered when Drupal's caching is enabled and the user
+ * accesses Drupal via an HTTP proxy (see
+ * https://bugzilla.mozilla.org/show_bug.cgi?id=269303): When an authenticated
+ * user requests a page, and then logs out and requests the same page again,
+ * Firefox may send a conditional request based on the page that was cached
+ * locally when the user was logged in. If this page did not have an ETag
+ * header, the request only contains an If-Modified-Since header. The date will
+ * be recent, because with authenticated users the Last-Modified header always
+ * refers to the time of the request. If the user accesses Drupal via a proxy
+ * server, and the proxy already has a cached copy of the anonymous page with an
+ * older Last-Modified date, the proxy may respond with 304 Not Modified, making
+ * the client think that the anonymous and authenticated pageviews are
+ * identical.

znerol’s picture

because with authenticated users the Last-Modified header always refers to the time of the request.

It does not make any sense to add Last-Modified and ETag headers to uncacheable responses. I believe that Drupal is to blame here, not Firefox.

klausi’s picture

StatusFileSize
new2.23 KB

klausi pushed some commits to the pull request.

For an interdiff please see the list of recent commits.

klausi’s picture

Right, implemented znerol's suggestion, which also fixes this issue on my test site.

klausi’s picture

Just confirmed that Drupal 8 also does not set Last-Modified and ETag headers for authenticated user pages. So no patch against Drupal 8 is necessary and we can continue here against Drupal 7.

znerol said on IRC that we should also check what happens with generated images (image styles) and private downloads.

klausi’s picture

Private downloads are not affected by this change because they are always uncachable by design, so removing the headers is not a problem.

Image styles using file_transfer() are also not affected since they also set the default Cache-control: no-cache, must-revalidate, post-check=0, pre-check=0 header. That means the browser will request the image again anyway next time, so the Last-Modified and ETag headers are useless here as well.

damien tournoud’s picture

The main problem seems like Varnish still doesn't respect the Cache-Control: no-cache directive by default, no?

The description in the code comment of #8 doesn't make any sense to me. The only explanation is that the proxy server wrongly cache the page, even if the `Cache-Control: no-cache` directive is present.

klausi’s picture

No, the problem is that the local browser does not respect the Cache-Control: no-cache header. Then the browser sends a request to Varnish with an If-Modified-Since header and Varnish happily answers with 304, so the browser displays the old version of the page where the user is still logged in.

damien tournoud’s picture

@klausi: The browser is revalidating a no-cache page, which is fine. Varnish is *storing* a no-cache page, which is not.

klausi’s picture

Varnish is not storing a no-cache page. It only stores the anonymous user version with a last modified date. Then the browser asks Varnish if the page has changed with the If-Modified-Since header. Varnish answers with 304, and now the browser should display the anonymous version of the page, but it is lazy and just displays the logged in version, which it should not even have in the cache.

This all goes away when we do not send out Last-Modified headers for authenticated user pages that are not cachable anyway.

damien tournoud’s picture

Status: Needs review » Postponed (maintainer needs more info)

@klausi that doesn't make much sense. The check on the ETag via the If-None-Match should fail anyway. The only reason for Varnish to return a 304 in the presence of both a If-Modified-Since and a If-None-Match is for it to have cached the no-cache page.

Can you capture a complete trace of all the requests and responses (including their complete headers) going from the client to Varnish, and ideally too form Varnish to the backend?

klausi’s picture

Status: Postponed (maintainer needs more info) » Needs review

First request as anonymous user to the front page, Varnish answers, no request to the backend:

Request headers:
GET / HTTP/1.1
Host: example.com
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2114.2 Safari/537.36
Referer: http://example.com/
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8,de;q=0.6

Response headers:

HTTP/1.1 200 OK
Server: nginx
Content-Type: text/html; charset=utf-8
X-Drupal-Cache: MISS
Etag: "1417192716-1"
Content-Language: en
X-Generator: Drupal 7 (http://drupal.org)
Cache-Control: public, max-age=1800
Last-Modified: Fri, 28 Nov 2014 16:38:36 GMT
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Vary: Cookie, Accept-Encoding
Content-Encoding: gzip
Content-Length: 2337
Accept-Ranges: bytes
Date: Fri, 28 Nov 2014 16:42:29 GMT
X-Varnish: 61743614 61743488
Age: 232
Via: 1.1 varnish
Connection: keep-alive

Then I log in on the front page login block:

Request headers:
POST /node?destination=node HTTP/1.1
Host: example.com
Connection: keep-alive
Content-Length: 117
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Origin: http://example.com
User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2114.2 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Referer: http://example.com/
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.8,de;q=0.6
Cookie: has_js=1

Response headers:

HTTP/1.1 302 Moved Temporarily
Server: nginx
Content-Type: text/html; charset=utf-8
X-Drupal-Cache: MISS
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified: Fri, 28 Nov 2014 16:46:06 GMT
Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0
ETag: "1417193166"
Content-Language: en
Set-Cookie: SESSac4b504041d138bd7b96d9d03d14626e=JbU-YqUHoUQN_XE8GN5R3sw_WbfCTCwcdi--2polO-8; expires=Sun, 21-Dec-2014 20:19:26 GMT; Max-Age=2000000; path=/; domain=.example.com; HttpOnly
Location: http://example.com/node
Content-Length: 0
Accept-Ranges: bytes
Date: Fri, 28 Nov 2014 16:46:06 GMT
X-Varnish: 61743694
Age: 0
Via: 1.1 varnish
Connection: keep-alive

Request headers:

GET /node HTTP/1.1
Host: example.com
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2114.2 Safari/537.36
Referer: http://example.com/
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8,de;q=0.6
Cookie: has_js=1; SESSac4b504041d138bd7b96d9d03d14626e=JbU-YqUHoUQN_XE8GN5R3sw_WbfCTCwcdi--2polO-8

Response headers:

HTTP/1.1 200 OK
Server: nginx
Content-Type: text/html; charset=utf-8
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified: Fri, 28 Nov 2014 16:46:06 GMT
Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0
Content-Language: en
X-Generator: Drupal 7 (http://drupal.org)
Content-Encoding: gzip
Content-Length: 2083
Accept-Ranges: bytes
Date: Fri, 28 Nov 2014 16:46:06 GMT
X-Varnish: 61743695
Age: 0
Via: 1.1 varnish
Connection: keep-alive

Then I click the frontpage again:

GET / HTTP/1.1
Host: example.com
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2114.2 Safari/537.36
Referer: http://example.com/node
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8,de;q=0.6
Cookie: SESSac4b504041d138bd7b96d9d03d14626e=FDWmwVi9DKwUSDpY-NhuEe_GVqrvPtOCLo69DMb-sRg; has_js=1
If-None-Match: "1417192716-1"

Response headers:

HTTP/1.1 200 OK
Server: nginx
Content-Type: text/html; charset=utf-8
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified: Fri, 28 Nov 2014 16:51:02 GMT
Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0
Content-Language: en
X-Generator: Drupal 7 (http://drupal.org)
Content-Encoding: gzip
Content-Length: 2082
Accept-Ranges: bytes
Date: Fri, 28 Nov 2014 16:51:02 GMT
X-Varnish: 61743800
Age: 0
Via: 1.1 varnish
Connection: keep-alive

Then I click the logout link:

GET /user/logout HTTP/1.1
Host: example.com
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2114.2 Safari/537.36
Referer: http://example.com/
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8,de;q=0.6
Cookie: SESSac4b504041d138bd7b96d9d03d14626e=FDWmwVi9DKwUSDpY-NhuEe_GVqrvPtOCLo69DMb-sRg; has_js=1

Response headers:

HTTP/1.1 302 Moved Temporarily
Server: nginx
Content-Type: text/html
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified: Fri, 28 Nov 2014 16:52:42 GMT
Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0
ETag: "1417193562"
Set-Cookie: SESSac4b504041d138bd7b96d9d03d14626e=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; domain=.example.com; httponly
Location: http://example.com/
Content-Length: 0
Accept-Ranges: bytes
Date: Fri, 28 Nov 2014 16:52:42 GMT
X-Varnish: 61743816
Age: 0
Via: 1.1 varnish
Connection: keep-alive

Request headers:

GET / HTTP/1.1
Host: example.com
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2114.2 Safari/537.36
Referer: http://example.com/
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8,de;q=0.6
Cookie: has_js=1
If-Modified-Since: Fri, 28 Nov 2014 16:51:02 GMT

Response headers:

HTTP/1.1 304 Not Modified
Server: nginx
Content-Type: text/html; charset=utf-8
X-Drupal-Cache: MISS
Etag: "1417192716-1"
Content-Language: en
X-Generator: Drupal 7 (http://drupal.org)
Cache-Control: public, max-age=1800
Last-Modified: Fri, 28 Nov 2014 16:38:36 GMT
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Vary: Cookie, Accept-Encoding
Content-Encoding: gzip
Accept-Ranges: bytes
Date: Fri, 28 Nov 2014 16:52:42 GMT
X-Varnish: 61743817 61743488
Age: 845
Via: 1.1 varnish
Connection: keep-alive

And after that request my browser shows me the front page as logged in user, although I'm logged out.

All responses that have an Age header as 0 come from the nginx/PHP backend (Varnish has passed them through), all other responses with Age > 0 come from Varnish directly without invoking the backend.

damien tournoud’s picture

The response for GET / as an authenticated user is really wrong:

HTTP/1.1 200 OK
Server: nginx
Content-Type: text/html; charset=utf-8
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified: Fri, 28 Nov 2014 16:51:02 GMT
Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0
Content-Language: en
X-Generator: Drupal 7 (http://drupal.org)
Content-Encoding: gzip
Content-Length: 2082
Accept-Ranges: bytes
Date: Fri, 28 Nov 2014 16:51:02 GMT
X-Varnish: 61743800
Age: 0
Via: 1.1 varnish
Connection: keep-alive

No ETag, no Vary: cookie? Drupal does send both in drupal_serve_page_from_cache() so I assume something in your infrastructure is messing up with those?

damien tournoud’s picture

My bad, this is an authenticated request. Right now we don't send a Vary: cookie for those, which is a bug. But the response should have an ETag, as one is set in drupal_page_header().

So there are a series of things here:

  • A real bug: we don't send a Vary: Cookie header for authenticated responses. This is just braindead, but it is usually not a problem because we *also* set them as non-cacheable;
  • An annoyance: we send Last-Modified-Since and ETag even on non-cacheable responses. I don't see this as spec-breaking, but they are indeed unecessary;
  • A mystery: why don't you get a ETag on this authenticated homepage response?
damien tournoud’s picture

Just to clarify: the absence of a Vary: Cookie is what allows the browser to do a conditional request after logout. The absence of ETag in your particular case it what allows Varnish to satisfy the conditional request with a 304.

klausi’s picture

Title: Changed date format for Last-Modified header breaks Varnish caching » Changed date format for Last-Modified header breaks caching for Varnish/Nginx configs

Thanks Damien. Indeed, the Etag is missing for authenticated requests. It looks like this is related to Nginx and gzip encoding of responses. It appears that Nginx removes the Etag if it alters a response to gzip it. Could not find a good doc page or similar to confirm that yet.

klausi’s picture

StatusFileSize
new3.08 KB

klausi pushed some commits to the pull request.

For an interdiff please see the list of recent commits.

klausi’s picture

Issue summary: View changes

@Damien: the absence of Vary: Cookie is not a bug. RFC 2616 says "A server MAY include a Vary header field with a non-cacheable response ...", so this is not a requirement.

So we cleared up the mystery - Nginx messes with ETags, which we cannot easily fix (we don't want to hack Nginx sources).

Now that we have a better understanding of the situation patch reviews welcome :-)

David_Rothstein’s picture

I can put something in the 7.33 release notes about the regression, but I'm trying to figure out what... do we know this is a problem for all Nginx versions, or just some configurations, or...? (It seems like more people would have run into this if it's a problem for all Varnish/Nginx sites.)

And so the deal is that we were masking this bug by using the wrong Last-Modified format before, but since #1918820: HTTP header date formats the format is correct so browsers are now contacting the server when they're supposed to and that's what's triggering the bug?

The patch itself looks reasonable, I think. Could use another review or two given that it touches pretty fundamental code.

David_Rothstein’s picture

I went ahead and added something vague to https://www.drupal.org/drupal-7.33-release-notes now, but perhaps it could be improved upon.

klausi’s picture

Issue summary: View changes

I think that addition to the release notes is fine, thanks!

You understood correctly. Now that browsers understand the Last-Modified format they make a request to the server, but Nginx has removed the ETag so they don't send a If-None-Match header. Varnish answers with 304 and the browser just displays the previous logged-in page.

Nginx removes the ETag when it applies gzip compression; I added gzip as prerequisite to the issue summary.

fabianx’s picture

Status: Needs review » Reviewed & tested by the community

RTBC, it does not make sense to send e-tag or last-modified headers for authenticated pages.

If you use Akamai or such you will send those headers yourself anyway.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 24: 2381839.patch, failed testing.

Status: Needs work » Needs review

klausi queued 24: 2381839.patch for re-testing.

klausi’s picture

Status: Needs review » Reviewed & tested by the community

Assuming this was just a random testbot fail.

David_Rothstein’s picture

Status: Reviewed & tested by the community » Fixed
Issue tags: +7.36 release notes

OK, I guess this is good to go - hopefully it won't break anything else :) Doesn't look like Authcache would be affected by this since it sets its own headers, and @znerol's comment above suggests he's OK with this approach also.

Committed to 7.x - thanks!

  • David_Rothstein committed f7cda60 on 7.x
    Issue #2381839 by klausi, Damien Tournoud: Changed date format for Last-...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

jbrown’s picture

Should this have a change record?

David_Rothstein’s picture

I didn't think it necessarily needed one since it's a pretty obscure change (though I did mention it in the release notes, in case it affects someone). If anyone does think it needs a change record and wants to create one, I'd be happy to retroactively link to it from the release notes, etc.