Currently, Drupal spits out files that cannot be cached by browsers or proxies. While this is fine for PHP and other dynamic files, it is very bad for CSS, JS, and images. These files rarely change, and having to serve them over and over because of improper caching is extremely wasteful. Why is Google so fast? It's really great at caching :-)

So with a simple clean install, Drupal spits out:

18 files, at 49kb total

With this patch, it still spits out those files, but now a whopping 42kb of them are cached by the browser. That means the webserver only serves 7kb, a SAVINGS OF 85%!!

Now, original ideas called for modifying the core files to send out different headers -- this could lead to some hackery because we want those headers for almost all files, except a handful that we want to cache properly. Editing that file is not the best way, so instead we patch .htaccess and get a super clean patch with superb performance benefits.

More info: http://perishablepress.com/press/2006/01/10/stupid-htaccess-tricks/

Comment  File       Size       Author
#10      d_8.patch  692 bytes  m3avrck
#6       d_2.patch  655 bytes  m3avrck
#3       d_1.patch  664 bytes  m3avrck
         d_0.patch  629 bytes  m3avrck

Comments

m3avrck’s picture

If you want to see whether I'm telling the truth or not, grab firebug and run it on a regular drupal site. Gasp at the fact that *no* files are cached by the browser properly. Visit other sites to see the difference. Apply this patch and then breathe relief :-)

m3avrck’s picture

Version: 6.x-dev » 5.x-dev

Per chx's suggestion on IRC

m3avrck’s picture

FileSize
664 bytes

Updated patch after talking with Heine on IRC. Wrap the headers in an IF in case that module is turned off.
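
As a rough sketch of the idea (this is not the contents of the attached patch), guarding a mod_headers directive in .htaccess could look something like the following. The extension list and the two-week lifetime are assumptions chosen for illustration.

```apache
# Hypothetical sketch: only emit the header when mod_headers is loaded,
# so Apache does not error out on the Header directive if the module is off.
<IfModule mod_headers.c>
  # Assumed extension list; adjust to the static files you want cached.
  <FilesMatch "\.(css|js|gif|jpe?g|png)$">
    # 1209600 seconds = 14 days (assumed lifetime).
    Header set Cache-Control "max-age=1209600"
  </FilesMatch>
</IfModule>
```

With the IfModule wrapper, a server without mod_headers simply skips the block instead of failing on an unknown directive.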

Dries’s picture

Any idea why these headers are not being sent? I'd like to understand the source of the problem.

What is the availability of mod_headers? Is that a common module?

Can anyone reproduce this?

m3avrck’s picture

Dries, sure:

/**
 * Set HTTP headers in preparation for a page response.
 *
 * Authenticated users are always given a 'no-cache' header, and will
 * fetch a fresh page on every request.  This prevents authenticated
 * users seeing locally cached pages that show them as logged out.
 *
 * @see page_set_cache
 */
function drupal_page_header() {
  header("Expires: Sun, 19 Nov 1978 05:00:00 GMT");
  header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
  header("Cache-Control: no-store, no-cache, must-revalidate");
  header("Cache-Control: post-check=0, pre-check=0", FALSE);
  header("Pragma: no-cache");
}

We send the headers *not* to cache any file.

Now--this is *correct* for PHP files and any dynamic file. It's incorrect for regular text and binary files. You'd have to hack this function up to check whether the file is an image or CSS or JS or whatever, and not send those headers in that case.

This solution instead overrides the headers before anything is sent, fixing it for those files. It's very clean.

Gerhard just put this patch on Drupal.org and everyone in IRC is acknowledging the fact that the site is *much* snappier now.

m3avrck’s picture

FileSize
655 bytes

Don't cache html or htm files, as per John VanDyk's request on IRC.

chx’s picture

drupal.org is sometimes snappy, sometimes not so much:

[01:02] [craq] drupal.org is slow as hell for me today :(

seems time dependent.

Also your explanation makes very little sense -- when serving a jpg or css file, Drupal is not booted.

Dries’s picture

But, we only generate those headers for dynamically generated pages, not for static files. Static files are served by Apache and Apache deals with such headers already ... Maybe if you are using private files, but most files (i.e. those that are part of the theme) aren't stored in the database anyway ... I still don't understand.

Gerhard also wrote: "Yesterday I implemented http://drupal.org/node/104506 for example. I didn't see that much of an improvement, but with some good will you can see a dent in the bandwidth graph."

drumm’s picture

Version: 5.x-dev » 6.x-dev
m3avrck’s picture

Title: Cache files spit out by drupal, speed up by 85%+ » cache non-PHP files that live within Drupal
Version: 6.x-dev » 5.x-dev
FileSize
692 bytes

@chx and @dries - Yes, my post about the headers was incorrect. It was made in haste, and those headers apply *only* to text/html pages served by Drupal. They do not apply to images, CSS, JS, etc.

@killes - You won't see a huge drop in bandwidth/resources right away. Browsers still have local caches--some are session based, others are smarter like Opera, and so forth. But over time this patch should show a decrease in bandwidth/resources, provided regular traffic stays constant.

That said, here is what Drupal currently sends for files:

Date	Sat, 06 Jan 2007 03:50:22 GMT
Server	Apache/2.0.59 (Unix) PHP/5.1.6 DAV/2
Last-Modified	Tue, 05 Sep 2006 03:50:56 GMT
Etag	"b676b4-2cd-c40d4800"
Accept-Ranges	bytes
Content-Length	717
Keep-Alive	timeout=15, max=97
Connection	Keep-Alive
Content-Type	text/css

Note that there is nothing about caching the file. Proxies would not cache this file. Some browsers would look at and compare the Etag, but that is not guaranteed IIRC.

So here's a new patch which changes the headers to:

Date	Sat, 06 Jan 2007 03:51:46 GMT
Server	Apache/2.0.59 (Unix) PHP/5.1.6 DAV/2
Last-Modified	Tue, 05 Sep 2006 03:50:56 GMT
Etag	"b676b4-2cd-c40d4800"
Accept-Ranges	bytes
Content-Length	717
<strong>Cache-Control	max-age=1209600
Expires	Sat, 20 Jan 2007 03:51:46 GMT</strong>
Keep-Alive	timeout=15, max=98
Connection	Keep-Alive
Content-Type	text/css

Now these files can be properly cached by proxies, browsers, and everything else.

Here's what changed in the patch:

  • Instead of using mod_headers, use mod_expires; this drops the mod_headers dependency in favor of a module that Drupal already relies on
  • Fix the current incorrect usage of mod_expires, since it's worthless without turning it on ;-)
  • Add in an expiration default of 2 weeks which applies to all files
  • Override this default for PHP pages served by Drupal (which is now *actually* working)

The previous patch was correct: it set Cache-Control to 2 weeks explicitly, but it didn't have the properly paired Expires header. mod_expires adds both headers automatically.
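
A minimal sketch of what the list above describes, assuming a standard mod_expires setup (see the attached patch for the exact change; the directive values here are inferred from the description, not copied from it):

```apache
<IfModule mod_expires.c>
  # Without this line the Expires* directives below are ignored,
  # because ExpiresActive defaults to Off.
  ExpiresActive On
  # Default for all files: expire 2 weeks (1209600 seconds) after access ("A").
  ExpiresDefault A1209600
  # Assumed override for pages generated by Drupal (served as text/html):
  # expire one second after access so they are effectively not cached.
  ExpiresByType text/html A1
</IfModule>
```

With ExpiresActive On, mod_expires emits both an Expires header and a matching Cache-Control: max-age, which is the pairing shown in the second header dump above.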

This patch should be ready to go. This is a bug in 5 and it's not an API change. Not only that, but it fixes 2 bugs: incorrect usage of mod_expires and the fact that Drupal doesn't send proper headers for files to be cached.

m3avrck’s picture

Looks like my <strong> within <code> didn't get rendered. You get the point though :-p

moshe weitzman’s picture

I got Ted to explain to me what was going on because I was agreeing with Dries and others that Apache handles http headers for css, images, etc.

If you look at the patch, it changes .htaccess - so in fact we *can* influence the behavior of files which we do not serve. That's the key - Apache's default headers for these files are lousy for our needs, and we should just fix them. I am light on details, but I hope this clears up some confusion.

If this all sounds complicated, just look at the patch. It is 2 lines, plus comments.

m3avrck’s picture

If you still don't know *why* this is useful, have a read:

http://www.websiteoptimization.com/speed/tweak/cache/

(although he incorrectly uses Expires within FilesMatch, which seems to cause an error on Apache 2; perhaps it works in 1.3 -- concepts are the same though :-)

Junyor’s picture

@m3avrck: How does this work with the CSS grouping patch? Does Drupal cache the CSS files, so that the file name and content will be unchanged most of the time when a client visits? Otherwise, this patch doesn't make much sense.

moshe weitzman’s picture

Yes, the preprocessed css files keep the same name and content for a long time. See the new comment at the top of drupal_add_css for a bit more detail.

Dries’s picture

I still don't understand what files you are talking about. Do you have examples of files that benefit from this?

Gerhard Killesreiter’s picture

Any static files like the small bluebeach images and css files should profit from this.

m3avrck’s picture

Right, this applies to a significant number of files. On drupal.org this applies to 38 files out of 39. That's 97% fewer HTTP requests the *2nd* time around. Not too shabby :-p

Junyor’s picture

@moshe: Thank you for the pointer.

This patch gets a +1 from me in theory, though I haven't had a chance to test it.

FWIW, I talked about Opera's caching of documents a while back on the development list: http://lists.drupal.org/pipermail/development/2005-April/003851.html

Dries’s picture

Status: Needs review » Fixed

Alright, committed to HEAD. Thanks.

m3avrck’s picture

Version: 5.x-dev » 4.7.x-dev
Status: Fixed » Reviewed & tested by the community

Easy fix for 4.7 too :-)

In effect, the mod_expires section in 4.7 is useless without the ExpiresActive On part (otherwise it's off).

matt westgate’s picture

Just wanted to say I notice a big speed boost on Drupal.org as a result of this patch. With the cache headers being set correctly, things are much snappier now. Great patch guys!

killes@www.drop.org’s picture

Status: Reviewed & tested by the community » Fixed
Anonymous’s picture

Status: Fixed » Closed (fixed)
Rob T’s picture

Which of these patches are to be used for Drupal version 4.7.x? What about for 5.1.x?

It appears the "d8" patch is for v 5.1.x, and the "d2" is for 4.7.x. Is this correct? I'd like to apply this to some of my installs if possible, so I just want to make sure.

m3avrck’s picture

The patch that was committed (D8) will work on 5 and 4.7.