This bug has been discussed before, but not resolved.
http://drupal.org/node/view/1740
http://drupal.org/node/view/5686

One suggestion was to change the browser settings. That means I have to tell all my users to change their browser cache settings. Not a great way to run a site.

Instead, the login should just work with default browser settings. Currently, login does not work, as described below.

I log in to my drupal site, feingold.christianlong.com, and the logged-in home page appears, with my user name. OK.

The address bar shows
http://feingold.christianlong.com/node?PHPSESSID=867c4e2d57d290a3f59e138...

In the site menu, the "Home" link links to http://feingold.christianlong.com/

Here's the problem: when I click on the "Home" link, the browser takes me to the un-logged-in version of http://feingold.christianlong.com/, which was cached when I was not logged in.

When I hit refresh, I do get the logged-in version of the page.

So, to restate:

1. Start at the home page of the site, not logged in.

2. Log in. This works, and brings up a logged-in version of the page.

3. Click "Home". This is where the problem is: I get the cached version of the home page (from when I was not logged in).

4. Refresh the browser. Now I see the correct, logged-in version of the home page.

It looks like I am logging in OK, but Drupal is not telling my browser that there is a new version of the home page that it needs to check for.

Maybe the original (non-logged-in) version of the page is not marked no-cache, so when I click on the "Home" link, I get the cached (non-logged-in) version.

Browsers: MSIE 6 (happens a lot) and Firebird (happens sometimes).

Also happens with christianlong.com

Attached, find an annotated record of the HTTP header traffic.

Thanks

Christian

Comments

christianlong@christianlong.com’s picture

Forum discussion is here

http://drupal.org/node/view/5669

moshe weitzman’s picture

I am seeing this same behavior at http://www.nshp.org. To see for yourself, log in using the box on the home page while using IE (Firefox doesn't have a problem).

user: mwtest8
pass: testpass

Here is the php config of the server.

Note that IE is set to "Check for newer pages: automatically" in Options -> Temporary Internet Files -> Settings

Thoughts on how to resolve this are welcome.

duztin’s picture

I had this problem. Fix the date on your PC; mine was a month ahead and my cookies were expiring too soon.

killes@www.drop.org’s picture

Setting to cvs. Is the fix proposed by duztin a real fix? If not, what else could we do?

matt westgate’s picture

Priority: Normal » Critical
FileSize
644 bytes

I'm moving this to critical because in some cases this bug causes the hidden $edit['destination'] value of the login form to be set to 'logout', and to the user it appears they can't log in since they are immediately logged out again. Very frustrating.

The other side effect of this browser caching bug is, as stated above, that the authenticated user will receive stale pages, and if you have the login block enabled it will look as though they were inexplicably logged out.

It all depends on your browser and its settings, but to attempt to reproduce:

1. Log in to your site.

2. Next, click the homepage link. If you are served a stale copy of the page, you've hit the bug. This seems to happen more with IE and Safari than Firefox.

A potential patch is to have Drupal issue the following header:

header('Cache-Control: no-store, no-cache, must-revalidate');

It works, but I don't know the implications this has for other Drupal components such as RSS conditional GETs and gzip page serving.

matt westgate’s picture

Assigned: Unassigned » matt westgate
FileSize
593 bytes

Thanks to Ethereal and LiveHTTPHeaders, I was able to trace this problem to the Cache-Control header being sent by Mac OS X server. On this OS, mod_expires is enabled by default for Apache which sets the Cache-Control time to 60 seconds for dynamically rendered pages.

The implication is that once you log in, re-visiting any page on the site will result in a stale, locally cached version if you viewed that page within 60 seconds of logging in. Since you weren't logged in on those pages, the system will appear to have logged you out. It will keep doing this unless you wait 60 seconds to log in. Thus, users perceive this as a failure to log in.
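
To make the 60-second window concrete, here is a minimal Python sketch of the freshness check a browser cache performs. The function name and the timeline values are made up for illustration; only the 60-second lifetime comes from the mod_expires default described above. It is a toy model, not any browser's actual implementation.

```python
def is_fresh(stored_at: float, now: float, max_age: float) -> bool:
    """True if a cached copy is still within its freshness lifetime,
    so the browser may show it without asking the server."""
    return (now - stored_at) < max_age


# Hypothetical timeline, in seconds.
ANON_VIEW = 0.0     # home page fetched while logged out
CLICK_HOME = 25.0   # "Home" clicked shortly after logging in

# With a 60-second lifetime the logged-out copy is still considered fresh,
# so the browser shows it and the user appears to be logged out.
stale_copy_shown = is_fresh(ANON_VIEW, CLICK_HOME, max_age=60)

# With a 1-second lifetime the browser has to go back to the server.
stale_copy_shown_short = is_fresh(ANON_VIEW, CLICK_HOME, max_age=1)
```

Under these assumptions the stale page is reused only while the click falls inside the lifetime, which is why waiting a minute makes the symptom disappear.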

This new patch doesn't tweak bootstrap.inc. Instead it uses .htaccess to test if mod_expires is enabled and resets the caching time to 1 second for dynamically rendered content. The benefit of this approach is that it doesn't interfere with other types of caching that may be in place for images, pdf files, etc.

# Overload mod_expires variables.
<IfModule mod_expires.c>
  # Reduce the time dynamically generated HTML pages are cache-able.
  ExpiresDefault A1
</IfModule>

Down the road it may also be wise to consider sending our own caching headers to maintain control of our caching environment. I reviewed the following pieces of software, all of which intentionally disable caching by the client browser or proxy caches. I just grabbed the ones I thought were most popular.

  • Plone (Cache-Control parameter is configurable)
  • WordPress (except for RSS feeds)
  • eZpublish
  • Mambo
  • phpBB

And if you're so inclined, here are the relevant code snippets for each piece of software.

moshe weitzman’s picture

I've seen this bug on drupal.org, so this impacts more than OS X Server. Looks like a nice clean patch to me.

jvandyk’s picture

+1 from me. This reduces confusion among end users while retaining caching ability for other mime types.

joshuajabbour’s picture

Does this end up having any effect on the RSS conditional GETs? If not, then +1 from me.

(The WordPress code does check for RSS, so I'm thinking this may set them to expire too?)

jvandyk’s picture

FileSize
792 bytes

Regarding the concern about RSS caching: this patch is identical to the previous one, except that instead of targeting the default we target the text/html MIME type.

<IfModule mod_expires.c>
  # Reduce the time dynamically generated HTML pages are cache-able.
  ExpiresByType text/html A1
</IfModule>

This means that RSS feeds, which are MIME type text/xml, are not affected.
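
As a rough illustration of why the per-type rule leaves feeds alone, here is a small Python model of a lookup keyed on MIME type. The table and function names are hypothetical, and this is not how mod_expires is implemented internally; it only mirrors the single rule in the patch (text/html expires one second after access).

```python
from typing import Optional

# Mirrors the one rule in the patch: "ExpiresByType text/html A1".
# Any type not listed here gets no expiry from this rule.
EXPIRES_BY_TYPE = {"text/html": 1}


def expiry_seconds(content_type: str) -> Optional[int]:
    """Return the expiry (seconds after access) for a Content-Type header
    value, or None if no per-type rule applies."""
    # Drop parameters such as "; charset=utf-8" before matching.
    mime = content_type.split(";", 1)[0].strip().lower()
    return EXPIRES_BY_TYPE.get(mime)


html_expiry = expiry_seconds("text/html; charset=utf-8")
feed_expiry = expiry_seconds("text/xml")
```

In this model the HTML page gets a one-second lifetime, while the text/xml feed matches no rule and keeps whatever caching behaviour it had before.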

moshe weitzman’s picture

I think this is a small, worthwhile patch. This bug plagues drupal.org as well.

Junyor’s picture

@jvandyk: Is it just me or is there some junk before the meat of that patch?

Better cache header handling would be a welcome addition here[1], but I'm not sure if the suggested patch will solve the problem efficiently. Isn't this change saying that all pages must always be redownloaded if mod_expires is enabled? Won't that cause unnecessary bandwidth and performance overhead?

[1] I'm constantly seeing stale pages throughout all authenticated pages with Opera.

moshe weitzman’s picture

"Isn't this change saying that all pages must always be redownloaded if mod_expires is enabled?"

Correct. And that is desired behavior. A dynamic application like ours has no choice but to redownload every page (excluding RSS feeds).

killes@www.drop.org’s picture

This patch only affects users of mod_expires and harms nobody else. +1

Steven’s picture

Applied to CVS. Do we still need our own Cache-Control mechanism? I'm inclined to let the server handle this.

matt westgate’s picture

Title: Clicking on Home after login brings up browser-cached non-logged-in version of home page » Prevent browser page caching of dynamic content
FileSize
713 bytes

Browsers are sometimes pulling files from their cache and serving stale pages when they should instead be asking the server for a new copy. This is because Drupal doesn't issue its own Cache-Control headers like most other CMSs. Instead those details are currently left up to each server, which users on shared hosting aren't authorized to configure.

This bug rears its ugly head when site admins use Drupal's caching mechanism. If a user clicks on the homepage after logging in, they're very likely to see a stale unauthenticated view. Or if they log out and return to the homepage, it'll appear that they're still logged in. But perhaps the most confusing of these caching issues is when you click the login button and are once again presented with the same login form. These cached views occur because the browser still thinks those pages are valid in its cache. In other words, it's sending If-Modified-Since headers and receiving a valid HTTP 304 response.

Now this doesn't happen 100% of the time in all cases. It very much depends on the browser (usually Firefox or Safari but never Konqueror) and how the server is configured (Mac OS X server can be quite troublesome, for example, while a default FreeBSD install of Apache usually sends the proper headers).
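
The If-Modified-Since exchange can be sketched in a few lines of Python. This is a generic model of an HTTP conditional GET, not Drupal's code; the function name and the timestamp are hypothetical. The point it illustrates is that the decision depends only on dates, so nothing in it knows whether the user has logged in since the copy was stored.

```python
from email.utils import formatdate, mktime_tz, parsedate_tz
from typing import Optional


def status_for(page_modified: int, if_modified_since: Optional[str]) -> int:
    """Return 304 if the browser's copy is at least as new as the page,
    otherwise 200 (send the full page)."""
    if if_modified_since is None:
        return 200
    parsed = parsedate_tz(if_modified_since)
    if parsed is None:
        return 200  # unparseable date: fall back to a full response
    return 304 if page_modified <= mktime_tz(parsed) else 200


# Hypothetical moment at which the server cached the anonymous page.
CACHED_AT = 1_100_000_000
header_value = formatdate(CACHED_AT, usegmt=True)

first_visit = status_for(CACHED_AT, None)           # no copy yet
revisit = status_for(CACHED_AT, header_value)       # browser copy still valid
after_change = status_for(CACHED_AT + 10, header_value)
```

Here the first visit gets a full response, the revisit gets a 304 even if the user has logged in in between, and only a newer modification time forces a fresh page.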

The most elegant solution I've come up with is to issue our own Cache-Control headers after a request has been cached by Drupal's caching mechanism. The header workflow then becomes:

// User requests a page that isn't in the Drupal cache table
HTTP 200
(server issues its own caching headers)

// User requests the same page (now in the cache table)
HTTP 200
(Drupal issues its own caching headers explicitly stating not to cache the page)

// User requests the same page (now in the cache table)
HTTP 304
(server issues its own caching headers)

If I'm understanding how things work, this lets the browser use the cached page only as long as we're sending 304 responses. When the user logs in, the browser cache is invalidated since a different set of headers are emitted, causing the stale copies to expire.

The end result is that browsers are still allowed to do caching of requests (including XML feeds) but only hold on to those copies until Drupal says otherwise.

chx’s picture

+1

Bèr Kessels’s picture

+1 from me. Nice clean patch. Works in FF and konq, cannot test in IE, which has very aggressive caching, AFAIK.

matt westgate’s picture

Setting this to active for the moment since Safari users are still experiencing caching issues, albeit less frequently.

matt westgate’s picture

I think this is the elegant catch-all case.

The most problematic page with stale caching is the frontpage. Not variable_get('site_frontpage'), but when $_GET['q'] is NULL. The solution that works is to make sure this request is never cached by the browser (Drupal can still cache it of course).

So in summary, this patch should resolve all browser caching problems while still gracefully emitting HTTP 304 headers in all cases but the / request.

Dries’s picture

-1 for the $_GET['q'] addition. While it might be the most problematic page, it is also the most popular page. The $_GET['q'] scenario merely hides the fact that the headers/caching are not working like they should.

After a POST operation (e.g. log in), Drupal should never send a cached page. This is checked for in page_get_cache() and page_set_cache(). Maybe those checks are bogus.

matt westgate’s picture

Component: user system » other
FileSize
698 bytes

I figured it out.

Let's assume for a moment that we're using Drupal caching and the cache table is empty. A page is requested and the browser and Drupal both cache it. On the next request Drupal pulls the page from its cache and all of a sudden the browser sees a different set of headers sent for the same page (such as an ETag and gzip header), so another HTTP 200 response is invoked. Finally, on the third request for the same page, everything lines up and an HTTP 304 is emitted, saving valuable bandwidth. This is true for subsequent requests until the Drupal cache is cleared.

The caching problem is a result of some browsers getting confused between the first and second requests for the same page. Sending 304s for a page that has multiple cached copies confuses the heck out of some browsers. In the case of Safari, it tries to resolve this by displaying the most recent copy it has (even if the server told it not to cache that copy). Other browsers just show their last known cacheable copy which is still wrong.

The solution is to explicitly tell the browser when to cache a page that will be 304'd later on. In Drupal talk, this means the browser shouldn't cache a request that isn't also in the Drupal cache table. In the above example this would be the first request.
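
A minimal Python sketch of that rule, assuming a hypothetical helper that picks the extra headers for a response: the header value is the one proposed earlier in this thread, and the function name and structure are illustrative only, not the contents of the attached patch.

```python
from typing import Dict


def extra_cache_headers(in_cache_table: bool) -> Dict[str, str]:
    """Pick extra response headers for a page request.

    in_cache_table: whether Drupal already holds this page in its cache
    table (hypothetical flag for this sketch).
    """
    if not in_cache_table:
        # First request: the page will not be 304'd later in this form,
        # so tell the browser not to keep a copy of it.
        return {"Cache-Control": "no-store, no-cache, must-revalidate"}
    # Page served from the cache table: add nothing here and let the
    # validators sent with the cached page drive later 304 responses.
    return {}


first_request = extra_cache_headers(in_cache_table=False)
later_request = extra_cache_headers(in_cache_table=True)
```

With this split the browser only ever stores the copy that matches what later conditional requests will be validated against.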

Steven’s picture

That explanation makes perfect sense ;). Thanks for figuring this one out, I applied it to HEAD.

Will this patch fix the problem that some people have experienced of seeing a stale page after logging in (making it seem as if the log in failed)? This sounds like a case where the browser is confused about which cache copy to pick.

(PS: November 17 1978? Birthday? ;) )

matt westgate’s picture

In my tests this solved what appeared to be failed login attempts (stale caching).

And yes, that date is very important for the success of Drupal ;-)

sneakin@nolan.eakins.net’s picture

Version: » 4.6.0

I just set up 4.6 for progressiveindiana.org, and I've been having cache problems in Opera. The site's editor reported problems with IE. I've been able to reproduce it by changing the blocks and then going back to a page like the front page. The new block setup does not appear. I'm using the .htaccess that came with 4.6, which includes the mod_expires block, but I don't think that is doing anything.

I also have another drupal site, my own, at nolan.eakin.net. It uses 4.5, and I have not been able to reproduce the cache problem in Opera.

So perhaps something has changed in the interim which is causing this problem?

- Nolan

matt westgate’s picture

Marking this as closed since this patch was for 4.6 and not 4.5.

Also we can't offer any feedback with caching problems related to this specific issue without a dump of the headers.