The problem

#101227: Gzip aggregated CSS and JS introduced gzipping of css and js aggregates, and added the following to .htaccess:

  <IfModule mod_headers.c>
    # Serve gzip compressed CSS files if they exist and the client accepts gzip.
    RewriteCond %{HTTP:Accept-encoding} gzip
    RewriteCond %{REQUEST_FILENAME}\.gz -s
    RewriteRule ^(.*)\.css $1\.css\.gz [QSA]

    # Serve gzip compressed JS files if they exist and the client accepts gzip.
    RewriteCond %{HTTP:Accept-encoding} gzip
    RewriteCond %{REQUEST_FILENAME}\.gz -s
    RewriteRule ^(.*)\.js $1\.js\.gz [QSA]

    # Serve correct content types, and prevent mod_deflate double gzip.
    RewriteRule \.css\.gz$ - [T=text/css,E=no-gzip:1]
    RewriteRule \.js\.gz$ - [T=text/javascript,E=no-gzip:1]

    <FilesMatch "(\.js\.gz|\.css\.gz)$">
      # Serve correct encoding type.
      Header append Content-Encoding gzip
      # Force proxies to cache gzipped & non-gzipped css/js files separately.
      Header append Vary Accept-Encoding
    </FilesMatch>
  </IfModule>
</IfModule>

This means that gzipped versions of the files are served by default when available, but there are a couple of issues:

1. The regexp matches .css.gz, so apache checks if css_blah.gz.gz exists.

2. The regexp matches any .css file anywhere in the filesystem, so unaggregated files (like modernizr) are checked too - i.e. modernizr.min.js.gz which is never going to exist.

We can tweak the regular expression to only look for files that are very likely to be css or js aggregates.

Before (with an aggregate):

130654 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css", {st_mode=S_IFREG|0664, st_size=23115, ...}) = 0
130654 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", {st_mode=S_IFREG|0664, st_size=5077, ...}) = 0
130654 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", {st_mode=S_IFREG|0664, st_size=5077, ...}) = 0
130654 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", {st_mode=S_IFREG|0664, st_size=5077, ...}) = 0
130654 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz.gz", 0x7fffd84c01d0) = -1 ENOENT (No such file or directory)
130654 openat(AT_FDCWD, "/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", O_RDONLY|O_CLOEXEC) = 16
130654 mmap(NULL, 5077, PROT_READ, MAP_SHARED, 16, 0) = 0x7fae25418000

After:

130735 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css", and {st_mode=S_IFREG|0664, st_size=23115, ...}) = 0
130735 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", {st_mode=S_IFREG|0664, st_size=5077, ...}) = 0
130735 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", {st_mode=S_IFREG|0664, st_size=5077, ...}) = 0
130735 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", {st_mode=S_IFREG|0664, st_size=5077, ...}) = 0
130735 openat(AT_FDCWD, "/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", O_RDONLY|O_CLOEXEC) = 16
130735 close(16)    

Before (with modernizr):

130733 stat("/home/catch/www/drupal-dev/core/assets/vendor/modernizr/modernizr.min.js",  <unfinished ...>
130733 stat("/home/catch/www/drupal-dev/core/assets/vendor/modernizr/modernizr.min.js.gz",  <unfinished ...>
130733 <... stat resumed>0x7fffd84c0270) = -1 ENOENT (No such file or directory)
130733 openat(AT_FDCWD, "/home/catch/www/drupal-dev/core/assets/vendor/modernizr/modernizr.min.js", O_RDONLY|O_CLOEXEC <unfinished ...>

After:

138554 stat("/home/catch/www/drupal-dev/core/assets/vendor/modernizr/modernizr.min.js",  <unfinished ...>
138554 openat(AT_FDCWD, "/home/catch/www/drupal-dev/core/assets/vendor/modernizr/modernizr.min.js", O_RDONLY|O_CLOEXEC <unfinished ...>

Proposed solution

Improve the regular expression.

Release note snippet

The htaccess rewrite rule that enables static gzipped JavaScript and CSS aggregates to be served to browsers has been modified to improve performance. Sites should update their .htaccess files to take advantage of this performance improvement.

Comments

catch’s picture

Another issue:

1. Set css/js gzipping off in Drupal
2. Make sure mode_deflate is set up correctly
3. Delete files/css/* and files/js/*
4. Clear your browser cache
5. hit a page, check the headers of the css and js files - they will /not/ be gzip
6. hack this section out of .htaccess
7. clear browser cache and hit the page again.
8. Now the files are served gzip compressed.

So I'm not sure why yet, but this appears to be preventing mod_deflate from working altogether. Running out of time this afternoon to investigate further and .htaccess rules aren't one of my favourite things in the world unfortunately.

mikeytown2’s picture

This seems interesting to me.
subscribe

bryancasler’s picture

subscribe

sun’s picture

Good idea to write that .htaccess file dynamically into the appropriate folders, if enabled. And yes, let's make that configurable.

catch’s picture

Since there's a cost to the file stats, it'd be good to understand how that compares to the cost of mod_deflate gzipping on each request. It's hard to do a direct comparison since one is i/o and one is CPU, but I wonder if it'd be possible to benchmark or get an idea from the strace output. I've not straced with mod_deflate running to see what that looks like yet (and may not for a while, will see how it goes). This doesn't affect what happens in this issue but would be nice to have documented.

And another issue I realised, if you have css/js aggregation disabled, then you also still get these extra stats for the css/js files (and ten it could be for 20-30 files per page instead of just a handful or so). I don't care that much about performance on sites that don't use aggregation, but that could also be fixed by writing the .htaccess dynamically.

Places we could do that:

1. Form submits from the performance settings screen.
2. hook_flush_caches()
3. hook_requirements()

Just the former doesn't help people who use settings.php for this kind of configuration change, so one of the latter, or elsewhere, would be handy, but we don't want that running too often either.

owen barton’s picture

I did some server side benchmarks on http://drupal.org/node/101227#comment-3176382 - mod_deflate came out over 10x slower, and the file stats were not detectable (they might be on NFS, of course). I am not sure the CPU overhead of mod_deflate is favorable compared to the file stats on the vast majority of setups. I should also point out that if you care about file stats, you should be using vhost configuration rather than .htaccess in the first place.

Tweaking the regexp here so it doesn't look for js.gz.gz sounds like a fine idea, as does dynamically writing the .htaccess to the files/css and files/js directories. The only downside of this is that I was thinking that it might be nice to write a little Drush script to make gzip versions of all the contrib js and css, and the latter change would make this harder (you would need to copy all the files to files/css and files/js and write a module to alter all the requests to this directory).

andypost’s picture

subscribe

pwolanin’s picture

I'm tempted to suggest this whole feature be rolled back - not clear it's worth the headache for a typical site.

owen barton’s picture

@pwolanin - see http://drupal.org/node/101227#comment-3152752 "this patch gives a ~20-60% saving in page load performance for initial page loads (or other page loads where you download an aggregate), saving somewhere from ~0.3-3 seconds, depending on your connection speed". I think this is *well* worth the headache - this is probably the single biggest measurable improvement in end user performance in Drupal 7.

File stats were not measurable in my tests, the IO gain from having to read/serve a smaller file in the first place may actually compensate completely. Sites with NFS storage are hardly that typical, it has not yet been shown that mod_deflate is actually faster even using NFS - and in any case this is trivial to remove from htaccess if this is a concern. The proposed changes are reasonable performance tweaks - but I don't see any fundamental problem here.

mikeytown2’s picture

I'm with Owen Barton on this one. Let's try to get rid of the .gz.gz stat, but in general this is a very good performance improvement for the vast majority of drupal sites out there.

As for the performance implications of a stat, I found one interesting article. In this article the author says that using a stat cache "just got in the way"
http://www.sirgroane.net/2010/03/tuning-glusterfs-for-apache-on-ec2/

Frequently used files will get cached in memory. I don't know about a cache miss, because the file isn't there, and how that will effect performance on a shared file system. This is something that needs to be benchmarked. For the the vast majority of drupal sites out there, they are on one box, this is a win; that's what Owen Barton's benchmarks showed. Multi webhead setups need to be tested before we make a decision on this; but because of drupal's market share with single server setups I am against reverting this change.

catch’s picture

I think we can remove the extra stats altogether while keeping the feature if we do this:

* write the .htaccess conditionally to files/css and files/js
* Only write it when both aggregation and gzipping are enabled.
* check it reasonably aggressively (performance page, cache clears at least).
* Might need to do something about existing files that aren't gzipped in the situation that aggregation is on, then gzipping is turned on later, that's not impossible to resolve.

If we get that set up, then it ought to be safe enough to just redirect to the gzip version from the generated .htaccess without the file system check.

mikeytown2’s picture

Issue tags: -Perfomance +Performance

Here's how core does "dynamic" htaccess writing.
http://api.drupal.org/api/drupal/includes--file.inc/function/file_create...

Looks like we would have to do an htaccess check every time a .gz file is created in the css/js dir drupal_build_css_cache, drupal_build_js_cache. Just like file_unmanaged_copy does the check every time a file is created in the files directory.

Fixed this issue tag's spelling.

mikeytown2’s picture

Untested; but worth testing out IMHO. RewriteCond %{REQUEST_URI} (^/(.*)(\.js|\.css)) [NC] comes from the parallel module; forget why I didn't use RewriteCond %{REQUEST_URI} (\.js|\.css)$ [NC] maybe it works; I forget.

  <IfModule mod_headers.c>
    # Serve gzip compressed JS/CSS files if they exist and the client accepts gzip.
    RewriteCond %{HTTP:Accept-encoding} gzip
    RewriteCond %{REQUEST_URI} (^/(.*)(\.js|\.css)) [NC]
    RewriteCond %{REQUEST_FILENAME}\.gz -s
    RewriteRule ^(.*) $1\.gz [QSA]

    # Serve correct content types, and prevent mod_deflate double gzip.
    RewriteRule \.css\.gz$ - [T=text/css,E=no-gzip:1]
    RewriteRule \.js\.gz$ - [T=text/javascript,E=no-gzip:1]

    <FilesMatch "(\.js\.gz|\.css\.gz)$">
      # Serve correct encoding type.
      Header append Content-Encoding gzip
      # Force proxies to cache gzipped & non-gzipped css/js files separately.
      Header append Vary Accept-Encoding
    </FilesMatch>
  </IfModule>
</IfModule>

Another alt

  <IfModule mod_headers.c>
    # Serve gzip compressed JS/CSS files if they exist and the client accepts gzip.
    RewriteCond %{HTTP:Accept-encoding} gzip
    RewriteCond %{REQUEST_URI} (^/(.*)/(css|js)/(.*)(\.js|\.css)) [NC]
    RewriteCond %{REQUEST_FILENAME}\.gz -s
    RewriteRule ^(.*) $1\.gz [QSA]

    # Serve correct content types, and prevent mod_deflate double gzip.
    RewriteRule \.css\.gz$ - [T=text/css,E=no-gzip:1]
    RewriteRule \.js\.gz$ - [T=text/javascript,E=no-gzip:1]

    <FilesMatch "(\.js\.gz|\.css\.gz)$">
      # Serve correct encoding type.
      Header append Content-Encoding gzip
      # Force proxies to cache gzipped & non-gzipped css/js files separately.
      Header append Vary Accept-Encoding
    </FilesMatch>
  </IfModule>
</IfModule>
wobbler’s picture

It must be 5 years since I used apache so my mod_rewrite skills are not the best.

RewriteCond %{REQUEST_URI} (^/(.*)/(css|js)/(.*)(\.js|\.css)) [NC]

could it be ...

RewriteCond %{REQUEST_URI} ^/.*/(css|js)/.*\.(css|js)$ [NC]

and

<FilesMatch "(\.js\.gz|\.css\.gz)$">

could be

<FilesMatch "\.(js|css)\.gz$">

? Other than that with a quick refresher through the apache docs it looks like it would work fine. Course Nginx and the gzip_static module takes care of all of that functionality anyway ;-)

omega8cc’s picture

As a side note I'm a bit surprised Drupal 7 assumes the .htaccess files will just work, while they are by default disabled in *every* Aegir install and are never used in any Nginx based install.

Is it possible to simply disable creating those never used files (and checks if they exist)? Am I missing something? It looks like a waste of resources (php + I/O) for any site not using Apache or managed in the Aegir Hosting System?

wobbler’s picture

Commenting on the issue of file stats, if you open a file with the intention to gzip it, then at os level it will perform a stat to check the existance of the file before attempting to open it anyway, then there is IO to read that file, cpu to crank out the output, if the file is large apache may write the gzip output into and temporary file etc etc.
When I first started experimenting with boost I looked into the performance cost of file stats and after a fair bit of fannying around I realised that the cost is pretty negligable.
As mentioned earlier in the post if there are 10+ files to compress per page I would definately say that 10 file stats would be quicker.
The point about misc/(js|css) files is a good one though, along with having the .htaccess files in the directory, although if memory serves me right you need a special directive in apache to get it to look for .htaccess files in all directories... oh no more file stats to check for the .htaccess files in each directory... where will it end!!! ;-)

omega8cc’s picture

Also, to speed up the regex it is a good idea to disable back-references when they are not used to match/replace anything, so in this case:

RewriteCond %{REQUEST_URI} ^/.*/(?:css|js)/.*\.(?:css|js)$ [NC]

At least it works for regex in Nginx, no idea if that is compatible with Apache, but it should be, imo.

omega8cc’s picture

Oh, and while we are at caching files checks, it is so easy in Nginx with something like:

  open_file_cache max=8000 inactive=30s; 
  open_file_cache_valid          60s; 
  open_file_cache_min_uses         3;
  open_file_cache_errors          on;
catch’s picture

@omega8cc. There's already a variable for the gzipping itself, sites using aegir should disable that. This issue is just about optimizing the regexp and making the .htaccess rules themselves optional.

ogi’s picture

subscribe

ogi’s picture

I hit a problem with these rewrite rules. Let's say I have http://example.com and http://hosting.com/example.com which lead to the same physical path with Drupal 7. example.com works perfectly. I had Alias /example.com /path/to/drupal and when hosting.com/example.com is accessed, Apache's rewrite engine mysteriously transforms the new URI /sites/...css into /path/to/drupal/sites/...css and handles it like URI, resulting in Not Found error. I removed the Alias directive and put symbolic link instead and everything works perfectly. There's something with Apache's order of URI-to-filename handling and it breaks in this case.

There has to be some Drupal docs that warns that Alias and these rewrites don't play together.

mikeytown2’s picture

issues I've encountered with boost and apache alias
#1044814: Boost working with a multisite subfolder installation (not subdomain)
Hard coding the document root is how I got around the issue. symlinks are 100% better.

owen barton’s picture

Status: Active » Needs work
StatusFileSize
new1.45 KB

Gone through the rules in some detail (at least from a theoretical regex point of view - I haven't tried to profile the code) and come up with the following patch.

The separate js/css rewrite blocks are replaced by one. I opted to leave the extension check in the rewrite - moving it to a separate RewriteCond doesn't add any performance that I can see (any complexity removed by the rewrite is added to the cond, and you have another setup/teardown of the regex matcher too).

Both the rewrite and FilesMatch regular expressions have been optimized to limit the number of character matches in the parens - \.(j|cs)s\.gz - this should avoid backtracks in some cases. Although it is a good idea, I didn't add any non-capturing parens, as they seem to really only apply to the FilesMatch (which may disable them itself) and are only supported in Apache 2 (and I suspect may break Apache 1).

I think the next step is to move this block into php such that it is written to files/css and files/js dynamically, as Catch describes.

ogi’s picture

Escaping dots (RE-style) in the second argument of RewriteRule is useless. The only special characters in substitution are $ and %.

RewriteRule ^(.*)\.(j|cs)s$ $1\.$2s\.gz [QSA]

danreb’s picture

StatusFileSize
new259.53 KB
new187.95 KB
  <FilesMatch "(\.js\.gz|\.css\.gz)$">
      # Serve correct encoding type.
      Header append Content-Encoding gzip
      # Force proxies to cache gzipped & non-gzipped css/js files separately.
      Header append Vary Accept-Encoding
    </FilesMatch>

Wonder why it's use append rather that set or merge in serving the correct encoding type, I check for documentation here - http://httpd.apache.org/docs/current/mod/mod_headers.html#examples

The problem I've encountered in Header append Content-Enocding gzip is I can't enable CSS and JS aggregation in Drupal performance page, the site breaks if it is installed in a Apache server which already doing gzip compression, the gzip in the header sent to the browser seems to be twice, please take a look in the attached screenshot.

This happens when Drupal is installed in a sub domain, sub directory as well as in multi-site setup using symlink, I've tested it in those situation. Drupal installed in the root directory seems not affected, I wonder why but if you want to verify this problem, here I setup a fresh install of a D7 site - > http://d7.levinson.com.ph

Login with user : admin/admin
go to configuration > performance
then turn on CSS / JS aggregation
VIOLA - the site is broken!

No effects even if you clear the cache several times.

Changing the code

      # Serve correct encoding type.
      Header append Content-Encoding gzip

to

      # Serve correct encoding type.
      Header set Content-Encoding gzip

seems to solve the problem.

Any suggestion on this?

mikeytown2’s picture

I can confirm the double encoding on that site. Saving one of the aggregated css files as a .css.gz and opening it has the correct contents then.

I'll be changing advagg from append to set as a result of this test.

owen barton’s picture

Please open a new issue for the double encoding header and post the issue # here - we can probably backport that change to D7

danreb’s picture

Here's the new issue - > http://drupal.org/node/1116416

batje’s picture

I came across this page

http://www.webmasterworld.com/apache/4217718.htm

where 2 .htaccess cracks (one of them they claim rewrote the Wordpress one) work on the drupal .htaccess file.

One of the parts they rewrite is the .gz part, and they make it look like

# Rules to serve gzip-compressed CSS and JS files.
# Requires both mod_rewrite and mod_headers to be enabled.
#
# If the client accepts gzip and compressed CSS and JS files exist, serve them with
# proper Content-Type headers and set the no-gzip variable to prevent double-encoding
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{REQUEST_FILENAME}.gz -s
RewriteCond $2->text/css ^css->(.+)$ [OR]
RewriteCond $2->text/javascript ^js->(.+)$
RewriteRule ^(.+\.(css|js))$ /$1.gz [T=%1,E=no-gzip:1,L]

Adding

And that <Filesmatch> on the Content-Encoding section could also be shortened-up...

<FilesMatch "\.(css|js)\.gz$">

As most people here, apache .htaccess files are not my forte either, so using knowledge of people who do understand this seems a good idea. The thread on webmasterworld is pretty old, but you could always try revive it (referring here), i guess.

owen barton’s picture

This looks like an improvement to be sure (although it further reduces the number of people who grok that piece of code). I think the main thing to do would be to test it, to see if it does indeed reduce the number of fstat calls and of course ensuring headers are all correct/unchanged with all the different combinations (Apache 1 & 2, each module enabled/disabled, .gz files present/absent).

yesct’s picture

Issue tags: -i/o +i-o

(the slash in the i/o tag breaks the autocomplete from adding new tags)

fietserwin’s picture

There is another error in the .htaccess that will be solved by this patch: json files that happen to have a json.gz version as well, will be rewritten to ....js.gz. If people think that this error needs a backport to D7, I will file it as a separate issue.

catch’s picture

Version: 7.x-dev » 8.1.x-dev

This is still an issue in 8.x and we should still remove the 'feature'. Behaviour change so moving to 8.1.x.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.0-beta1 was released on March 2, 2016, which means new developments and disruptive changes should now be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.0-beta1 was released on August 3, 2016, which means new developments and disruptive changes should now be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

catch’s picture

Issue summary: View changes
Status: Needs work » Needs review
Related issues: +#3027639: Make css/js optimized assets path configurable

Here's a patch.

catch’s picture

Issue tags: +Bug Smash Initiative
StatusFileSize
new4.71 KB

This time with the patch.

catch’s picture

Title: Lots of file stats from aggregate gzip checking in .htaccess » Stop creating gzip copies of aggregates, leave it to web servers and proxies
StatusFileSize
new4.7 KB

Removing a blank line, otherwise no changes.

Status: Needs review » Needs work

The last submitted patch, 48: 1040534-48.patch, failed testing. View results

catch’s picture

Status: Needs work » Needs review
StatusFileSize
new2.56 KB
new7.26 KB

Should hopefully resolve both test failures.

Status: Needs review » Needs work

The last submitted patch, 50: 1040534-50.patch, failed testing. View results

catch’s picture

Status: Needs work » Needs review
StatusFileSize
new1.02 KB
new8.28 KB
larowlan’s picture

Issue summary: View changes
Status: Needs review » Reviewed & tested by the community

Added a release note snippet

alexpott’s picture

I read through this issue and the original issue - #101227: Gzip aggregated CSS and JS. What seems odd to me is that the apache docs for mod_deflate say:

Since mod_deflate re-compresses content each time a request is made, some performance benefit can be derived by pre-compressing the content and telling mod_deflate to serve them without re-compressing them.

And the original issue has performance tests that show that performance is improved by Drupal's current approach as opposed to mod_deflate.

There's nothing to stop a site from disabling compression and changing their .htaccess currently. It's just what should the default be. Does the current configuration save bandwidth costs for those on shared / smaller hosting? Will this change be to their detriment. I think we need some analysis to refute the findings of #101227-122: Gzip aggregated CSS and JS

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 52: 1040534-52.patch, failed testing. View results

catch’s picture

Title: Stop creating gzip copies of aggregates, leave it to web servers and proxies » [PP-1] Stop creating gzip copies of aggregates, leave it to web servers and proxies
Status: Needs work » Postponed
Related issues: +#1014086: Stampedes and cold cache performance issues with css/js aggregation

There are four performance issues being balanced here:

1. Serving of compressed aggregates - saves bandwidth.

2. 'Caching' gzipped versions of aggregates, so that mod_deflate doesn't have to do it every request (if there's no varnish/CDN in front of mod_deflate)

This issue would definitely be a regression for the first two, for sites that aren't doing anything else.

3. lstats from checking gzip existence of every request. We check gzip existence for shipped assets which is always a miss, as well as .gz.gz, and lstat doesn't cache non-existing files. This could potentially be improved with generating the .htaccess only in the css/js aggregates directory, and making sure the rewrite doesn't try to rewrite requests that are already for .gz, but that would add more complexity.

4. The actual compression and writing of the gzip aggregate when aggregates are being created. This currently blocks requests from being served altogether, and it's already an expensive operation.

Having said all that, thinking through balancing those four, I think I might have a way to solve both #3 and #4 without a lot of additional complexity.

#1014086: Stampedes and cold cache performance issues with css/js aggregation would add a route (similar to image derivatives), which will generate CSS and JavaScript aggregates when it's requested, this partially solves #4 in that while aggregate generation would take the same amount of time, it'll happen in a dedicated request that doesn't block page generation.

However, we can go a step further, and have that route also serve .css.gz and .js.gz extensions - where it would gzip the aggregate, serve it, and also write it out ready for the next request.

This would mean: the .css request wouldn't need to also create the .gz version, saving a bit of time. Also the .gz request could look for the uncompressed version if available and use that, if not, generate the aggregate and gzip it.

But it would also solve #3, because we could do an unconditional rewrite for .gz versions whether or not they exist, because we'll always hit the PHP route if they don't and serve the gzipped version from there, removing all of the lstats from the .htaccess rule. The one part not fixed yet is that we'd still want to avoid trying to serve .gz for shipped assets, but for that I wonder if we can just make this a bit more specific:

  RewriteCond %{REQUEST_FILENAME}\.gz -s
  RewriteRule ^(.*)\.css $1\.css\.gz [QSA]

- as in have RewriteCond be a bit more restrictive so it only matches aggregate URLs.

So... I'm going to postpone this on #1014086: Stampedes and cold cache performance issues with css/js aggregation (which needs reviews, but should be very close to ready), and we can revisit it once that's in.

catch’s picture

Title: [PP-1] Stop creating gzip copies of aggregates, leave it to web servers and proxies » [PP-1] Rewrite rules for gzipped CSS and JavaScript aggregates cause lots of lstats for files that will never exist

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

catch’s picture

Title: [PP-1] Rewrite rules for gzipped CSS and JavaScript aggregates cause lots of lstats for files that will never exist » Rewrite rules for gzipped CSS and JavaScript aggregates cause lots of lstats for files that will never exist
Status: Postponed » Active
catch’s picture

Status: Active » Needs review
StatusFileSize
new847 bytes

I took a look at #3184242: Support brotli compression of assets which adds .br versions if the brotl extension is available. Unless we required the brotl extension, which we won't be able to do for years, the idea of using separate routes for the uncompressed and compressed aggregates won't work - we can only do that if we can guarantee something is there, and there's no way for Drupal to tell apache whether a route is there or not.

However, if we're going to keep the file existence checks, we can at least stop looking for css_blah.js.gz.gz and modernizr.min.js.gz which we know for a fact are never going to exist. Attaching a patch which makes the rewrite rule a bit more restrictive so it only acts on aggregates rather than literally any file with a .css extension.

Checking this using strace.

With an aggregate:

Before:

130654 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css", {st_mode=S_IFREG|0664, st_size=23115, ...}) = 0
130654 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", {st_mode=S_IFREG|0664, st_size=5077, ...}) = 0
130654 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", {st_mode=S_IFREG|0664, st_size=5077, ...}) = 0
130654 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", {st_mode=S_IFREG|0664, st_size=5077, ...}) = 0
130654 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz.gz", 0x7fffd84c01d0) = -1 ENOENT (No such file or directory)
130654 openat(AT_FDCWD, "/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", O_RDONLY|O_CLOEXEC) = 16
130654 mmap(NULL, 5077, PROT_READ, MAP_SHARED, 16, 0) = 0x7fae25418000

After:

130735 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css", and {st_mode=S_IFREG|0664, st_size=23115, ...}) = 0
130735 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", {st_mode=S_IFREG|0664, st_size=5077, ...}) = 0
130735 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", {st_mode=S_IFREG|0664, st_size=5077, ...}) = 0
130735 stat("/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", {st_mode=S_IFREG|0664, st_size=5077, ...}) = 0
130735 openat(AT_FDCWD, "/home/catch/www/drupal-dev/sites/default/files/css/css_ufcCLAJ8Y9oLIXsFfOXn4xGjiuPiyazLLWmbA8w2Hyg.css.gz", O_RDONLY|O_CLOEXEC) = 16
130735 close(16)    

With a non-aggregated file like modernizr (note that aggregation is on, it's just opted out of aggregation in the library definition)

Before:

130733 stat("/home/catch/www/drupal-dev/core/assets/vendor/modernizr/modernizr.min.js",  <unfinished ...>
130733 stat("/home/catch/www/drupal-dev/core/assets/vendor/modernizr/modernizr.min.js.gz",  <unfinished ...>
130733 <... stat resumed>0x7fffd84c0270) = -1 ENOENT (No such file or directory)
130733 openat(AT_FDCWD, "/home/catch/www/drupal-dev/core/assets/vendor/modernizr/modernizr.min.js", O_RDONLY|O_CLOEXEC <unfinished ...>

After:

138554 stat("/home/catch/www/drupal-dev/core/assets/vendor/modernizr/modernizr.min.js",  <unfinished ...>
138554 openat(AT_FDCWD, "/home/catch/www/drupal-dev/core/assets/vendor/modernizr/modernizr.min.js", O_RDONLY|O_CLOEXEC <unfinished ...>
catch’s picture

Issue summary: View changes
catch’s picture

Issue summary: View changes
catch’s picture

StatusFileSize
new846 bytes

The last submitted patch, 60: 1040534.patch, failed testing. View results

Status: Needs review » Needs work

The last submitted patch, 63: 1040534.patch, failed testing. View results

catch’s picture

Status: Needs work » Needs review
StatusFileSize
new1.75 KB

Same change needs to be made for the scaffold version.

smustgrave’s picture

Status: Needs review » Reviewed & tested by the community

Regex looks good to me.

alexpott’s picture

Version: 9.5.x-dev » 10.0.x-dev

I think this is only for 10.0.x - right?

catch’s picture

The regexp should work fine in both 9.5.x and 10.x (both aggregate URL methods have the same prefix), so I think it would be fine to backport. However we might not want to annoy people with htaccess changes in 9.5.x

alexpott’s picture

Issue tags: +Needs release note

Oh yeah .htaccess changes need a release note, I think.

catch’s picture

Adding the release note and tagging.

xjm’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +Needs change record

Can we move the details including the regex stuff into a CR and link it ? Thanks!

xjm’s picture

I would suggest something like:

The htaccess rewrite rule that enables static gzipped JavaScript and CSS aggregates to be served to browsers has been modified to improve performance. Sites should <a href="">update their .htaccess files</a> to take advantage of this performance improvement.
catch’s picture

Issue summary: View changes
Status: Needs work » Reviewed & tested by the community
Issue tags: -Needs change record

That makes sense, add a change record, and the suggest release notes text looks great so used that.

alexpott’s picture

Saving issue credit.

alexpott’s picture

Version: 10.0.x-dev » 9.5.x-dev
Status: Reviewed & tested by the community » Fixed

Discussed with @catch we agreed to backport to 9.5.x as this is a bugfix and although .htaccess is fixed applying the changes is not 100% necessary - though it is good for performance.

Committed and pushed a6c49dbae4 to 10.1.x and 3ef4faacbb to 10.0.x and 8412918b4d to 9.5.x. Thanks!

  • alexpott committed a6c49db on 10.1.x
    Issue #1040534 by catch, Owen Barton, danreb, mikeytown2, ogi, omega8cc...

  • alexpott committed 3ef4faa on 10.0.x
    Issue #1040534 by catch, Owen Barton, danreb, mikeytown2, ogi, omega8cc...

  • alexpott committed 8412918 on 9.5.x
    Issue #1040534 by catch, Owen Barton, danreb, mikeytown2, ogi, omega8cc...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

quietone’s picture