Problem/Motivation

Drupal handles 404s for certain files that are not Drupal content.

Examples are .js, .css, .jpg, .gif, ...etc.

This has two disadvantages:

- We do an expensive, slow bootstrap for every 404 for such files that cause server load and delay browser page rendering.

- We send far more bytes for that 404 than if we let the web server standard error page get sent, wasting bandwidth and further delaying browser page rendering.

From a scalability point of view 404s can be a huge problem for sites with old domain names that have gone through multiple re-structurings over the years. The 404 traffic simply from hotlinked files alone can be a significant draw on resources. The current setup is also a huge risk for active sites - if you add a 404 to a busy page, or perhaps accidentally move, chmod or otherwise make the files directory inaccessible, server load instantly skyrockets and can render the server unresponsive. An example of this is the (missing by default) favicon.ico file which, until we added a workaround for in .htaccess was basically doubling server load for all Drupal sites.

From an end user point of view this is also a major issue. Browsers will patiently sit around with the loading spinner going waiting for Drupal to fully bootstrap, generate a beautiful, fully themed page to represent an image or css file, that the browser then completely ignores as soon as it notes the 404 http code. Not only does this mean that we end up waiting for Drupal to bootstrap twice (or more!), but the response is at least a factor of ten larger than it needs to be. In some cases (e.g. whitehouse.gov or amnesty.org) it is over 100 times larger than necessary. Multiply this by all the 404s on all the Drupal 7 sites, and the end result is a lot of end user wait time, and a lot of bandwidth and cycles wasted.

Proposed resolution

The approach is to return a lightweight 200 byte page earlier in the page generation cycle. By default it is returned in drupal_deliver_html_page(), which at least cuts out some amount of page generation time. However we also have an option to return the fast 404 when settings.php is being loaded, which is extremely fast but has some tradeoffs to consider. There are settings in settings.php to disable this completely (and get pretty 404s back), to adjust the extensions that trigger a fast 404, and to enable the settings.php early return.

We spent the first zillion comments or so considering .htaccess based approaches, but these had enough tradeoffs to not work for core as a default, and issues with logging that most people would not want to opt in to.

Here is a table describing the status-quo and the 2 options added by this patch:

Configuration A- disabled Configuration B - called from drupal_deliver_html_page() Configuration C - called in settings.php
Heavy 404 page generation cost Smaller 404 page generation cost (~50% of A?) Minute 404 page generation cost (< fastpath cache)
Slow 404 page response time (e.g. 1-10 seconds) Faster 404 page response time (e.g. 0.5-5 seconds) Very fast 404 page response time (e.g. < 0.2 seconds)
Large 404 page size (2,000-30,000+ bytes, depending on site) Tiny 404 page size (~200 bytes) Tiny 404 page size (~200 bytes)
Inlined assets (images, JS, CSS) 404 as normal Inlined assets (images, JS, CSS) 404 as normal Inlined assets (images, JS, CSS) 404 as normal
Directly requested assets show fully themed 404 page Directly requested assets show simple 404 page Directly requested assets show simple 404 page
All 404s logged in watchdog as normal All 404s logged in watchdog as normal "Fast" 404s not logged in watchdog
All 404s logged in Apache logs as normal All 404s logged in Apache logs as normal All 404s logged in Apache logs as normal
Core image api works normally Core image api works normally Core image api works normally
404 based lazy loaders (e.g. D6 imagecache) work normally 404 based lazy loaders (e.g. D6 imagecache) work normally 404 based lazy loaders (e.g. D6 imagecache) don't work without tweaking the regexp
Pages with aliases that match regex load as normal Pages with aliases that match regex load as normal Pages with aliases that match regex 404 instead of load

Server side benchmarks

Using apachebench, and visting example.com/none (which is a 404), with 5 users, 1000 requests

1. Without patch
Requests per second: 49.71

2. With patch, default settings
Requests per second: 51.11

3. With patch, with function uncommented in settings.php
Requests per second: 51.15

When visiting example.com/none.jpg (which a 404 that triggers a fast 404), 5 users, 1000 requests

With patch, with function uncommented in settings.php
Requests per second: 352.31

Browser side benchmarks

Benchmarks show the effect of the current slow inline 404s on browser display latency for end users. These were done using http://www.webpagetest.org/ testing the same site (in different configurations, 404s introduced through a node body promoted to home page) using default settings, 10 test runs each time.

First View Summary (seconds)
  No patch Patch Patch (fast 404 enabled)
No 404 2.591
Image 404 2.944 2.881 2.525
CSS 404 3.571 2.890 2.501
Percentage Improvement
  Patch Patch (fast 404 enabled)
Image 404 2% 14%
CSS 404 19% 30%
Absolute Improvement (seconds)
  Patch Patch (fast 404 enabled)
Image 404 0.063 0.419
CSS 404 0.681 1.07
Repeat View Summary (seconds)
  No patch Patch Patch (fast 404 enabled)
No 404 0.886
Image 404 1.716 1.488 1.011
CSS 404 1.604 1.500 1.287
Percentage Improvement
  Patch Patch (fast 404 enabled)
Image 404 13% 41%
CSS 404 6% 20%
Absolute Improvement (seconds)
  Patch Patch (fast 404 enabled)
Image 404 0.228 0.705
CSS 404 0.104 0.317

See attached file for more detailed benchmarks.

The results are not insignificant - an inline 404 currently slows down the parent page display by anything from 500ms to a full second, depending on the type of 404. Committing the patch will shave something like 100ms-500ms off that time. If site builders enable the "fast 404" return in settings.php the pages display almost as fast as if the 404 wasn't there - anything from 500ms to a full second faster than the current performance. In other words, this patch clearly has enormous benefits for both end user responsiveness, in addition to the server utilization benefits described in depth above.

Remaining tasks

This has been through many months of revision and tweaking. One potential improvement that comes to mind is a test that the lightweight page is indeed returned (we already check for 404 status in several cases) - this would testing some pretty trivial code (one function call, one if), but is probably worth considering.

The previously submitted patch in #286 was incompatible with image styles and other generated files (such paths would return a 404). The latest patch in #316 addresses this issue by adding a configurable list of excluded paths that should not use the fast 404 feature. An interdiff showing changes between the two patches is available in #320. This revision of the patch has been tested by several people.

It may be possible to further optimize the regex; see #319 and #322.

User interface changes

None

API changes

None

Files: 
CommentFileSizeAuthor
#353 fast404-76824-351.patch5.06 KBShellingfox
PASSED: [[SimpleTest]]: [MySQL] 190 pass(es).
[ View ]
#351 fast404-76824-347.patch5.04 KBShellingfox
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch fast404-76824-347.patch. This may be a -p0 (old style) patch, which is no longer supported by the testbots.
[ View ]
#350 fast404file-76824-346.patch5.04 KBShellingfox
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch fast404file-76824-346_0.patch. This may be a -p0 (old style) patch, which is no longer supported by the testbots.
[ View ]
#347 fast404file-76824-346.patch5.04 KBShellingfox
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch fast404file-76824-346.patch. This may be a -p0 (old style) patch, which is no longer supported by the testbots.
[ View ]
#328 404_fast_paths_7x-76824-328.patch6.33 KBdroplet
PASSED: [[SimpleTest]]: [MySQL] 36,012 pass(es).
[ View ]
#328 404_fast_paths_8x-76824-328.patch6.33 KBdroplet
PASSED: [[SimpleTest]]: [MySQL] 32,866 pass(es).
[ View ]
#326 404_fast_paths_8x-76824-326.patch6.33 KBgeerlingguy
PASSED: [[SimpleTest]]: [MySQL] 32,861 pass(es).
[ View ]
#326 404_fast_paths_7x-76824-326.patch6.33 KBgeerlingguy
PASSED: [[SimpleTest]]: [MySQL] 32,860 pass(es).
[ View ]
#320 interdiff-286-316.txt3.17 KBxjm
#316 404_fast_paths_7x-76824-316.patch6.33 KBgeerlingguy
PASSED: [[SimpleTest]]: [MySQL] 33,643 pass(es).
[ View ]
#316 404_fast_paths_8x-76824-316.patch6.33 KBgeerlingguy
PASSED: [[SimpleTest]]: [MySQL] 33,642 pass(es).
[ View ]
#311 404_fast_paths_7x-76824-311.patch6.33 KBgeerlingguy
PASSED: [[SimpleTest]]: [MySQL] 33,632 pass(es).
[ View ]
#311 404_fast_paths_8x-76824-311.patch6.33 KBgeerlingguy
PASSED: [[SimpleTest]]: [MySQL] 33,647 pass(es).
[ View ]
#310 404_fast_paths_7x-76824-310.patch6.34 KBgeerlingguy
PASSED: [[SimpleTest]]: [MySQL] 33,641 pass(es).
[ View ]
#310 404_fast_paths_8x-76824-310.patch6.34 KBgeerlingguy
PASSED: [[SimpleTest]]: [MySQL] 33,640 pass(es).
[ View ]
#309 404_fast_paths_7x-76824-309.patch6.33 KBgeerlingguy
PASSED: [[SimpleTest]]: [MySQL] 33,649 pass(es).
[ View ]
#309 404_fast_paths_8x-76824-309.patch6.33 KBgeerlingguy
PASSED: [[SimpleTest]]: [MySQL] 33,634 pass(es).
[ View ]
#307 404_fast_paths_7x-76824-307.patch6.33 KBgeerlingguy
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in includes/bootstrap.inc.
[ View ]
#307 404_fast_paths_8x-76824-307.patch6.33 KBgeerlingguy
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in includes/bootstrap.inc.
[ View ]
#305 404_fast_paths_7x-76824-305.patch6.33 KBgeerlingguy
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in includes/bootstrap.inc.
[ View ]
#305 404_fast_paths_8x-76824-305.patch6.33 KBgeerlingguy
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in includes/bootstrap.inc.
[ View ]
#304 404_fast_paths_8x-76824-303.patch5.93 KBgeerlingguy
PASSED: [[SimpleTest]]: [MySQL] 33,638 pass(es).
[ View ]
#303 404_fast_paths_7x-76824-303.patch5.93 KBgeerlingguy
PASSED: [[SimpleTest]]: [MySQL] 33,620 pass(es).
[ View ]
#286 fast_404_handling-76824-286.patch5.77 KBgeerlingguy
PASSED: [[SimpleTest]]: [MySQL] 29,849 pass(es).
[ View ]
#283 76824-283.diff6.38 KBjoelstein
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 76824-283.diff.
[ View ]
#230 404.diff6.08 KBmoshe weitzman
PASSED: [[SimpleTest]]: [MySQL] 28,851 pass(es).
[ View ]
#217 404 benchmarks.ods6.14 KBOwen Barton
#221 76824-fast_404_221.patch6.05 KBscor
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 76824-fast_404_221.patch.
[ View ]
#215 76824-fast_404_215.patch6.03 KBscor
PASSED: [[SimpleTest]]: [MySQL] 22,749 pass(es).
[ View ]
#208 76824-fast_404_208_0.patch4.64 KBOwen Barton
PASSED: [[SimpleTest]]: [MySQL] 22,462 pass(es).
[ View ]
#206 76824-fast_404_206.patch6.03 KBscor
PASSED: [[SimpleTest]]: [MySQL] 20,688 pass(es).
[ View ]
#196 76824-18.patch6.18 KBkbahey
FAILED: [[SimpleTest]]: [MySQL] 20,258 pass(es), 1 fail(s), and 0 exception(es).
[ View ]
#194 76824-17.patch5.93 KBOwen Barton
FAILED: [[SimpleTest]]: [MySQL] 20,148 pass(es), 1 fail(s), and 0 exception(es).
[ View ]
#192 76824-16.patch6.1 KBkbahey
FAILED: [[SimpleTest]]: [MySQL] 20,158 pass(es), 1 fail(s), and 0 exception(es).
[ View ]
#189 76824-15.patch6.02 KBkbahey
FAILED: [[SimpleTest]]: [MySQL] 20,104 pass(es), 1 fail(s), and 0 exception(es).
[ View ]
#175 76824-14.patch5.35 KBOwen Barton
FAILED: [[SimpleTest]]: [MySQL] Drupal installation failed.
[ View ]
#167 76824-13.patch5.31 KBkbahey
PASSED: [[SimpleTest]]: [MySQL] 19,157 pass(es).
[ View ]
#163 76824-12.patch4.64 KBkbahey
PASSED: [[SimpleTest]]: [MySQL] 19,168 pass(es).
[ View ]
#157 76824-11.patch4.61 KBOwen Barton
PASSED: [[SimpleTest]]: [MySQL] 19,107 pass(es).
[ View ]
#154 76824-10.patch4.9 KBOwen Barton
PASSED: [[SimpleTest]]: [MySQL] 19,095 pass(es).
[ View ]
#153 76824-9.patch4.48 KBOwen Barton
PASSED: [[SimpleTest]]: [MySQL] 19,107 pass(es).
[ View ]
#151 76824-8.patch4.48 KBOwen Barton
PASSED: [[SimpleTest]]: [MySQL] 19,111 pass(es).
[ View ]
#149 76824-7.patch4.15 KBkbahey
PASSED: [[SimpleTest]]: [MySQL] 19,101 pass(es).
[ View ]
#148 76824-6.patch4.12 KBOwen Barton
PASSED: [[SimpleTest]]: [MySQL] 19,100 pass(es).
[ View ]
#141 76824-4.patch8.97 KBOwen Barton
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 76824-4.patch.
[ View ]
#114 76824-113.patch8.87 KBOwen Barton
Failed: 11124 passes, 1 fail, 0 exceptions
[ View ]
#110 76824-110.patch6.01 KBkbahey
Failed: 11098 passes, 18 fails, 0 exceptions
[ View ]
#107 patch12.patch2.29 KBm3avrck
Failed: Failed to apply patch.
[ View ]
#102 76824.patch4.5 KBOwen Barton
Failed: Failed to apply patch.
[ View ]
#92 htaccess.patch1.58 KBzeezooz
Passed on all environments.
[ View ]
#73 files-404-3.patch2.44 KBkbahey
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch files-404-3.patch.
[ View ]
#70 files-404-2.patch2.56 KBkbahey
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch files-404-2.patch.
[ View ]
#67 files-404.patch2.49 KBkbahey
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch files-404.patch.
[ View ]
#66 htaccess-404.patch2.15 KBkbahey
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch htaccess-404.patch.
[ View ]
#54 d_5.patch1.19 KBm3avrck
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch d_5.patch.
[ View ]
#51 htaccessjv.patch1.08 KBjvandyk
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch htaccessjv.patch.
[ View ]
#48 d_3.patch1010 bytesm3avrck
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch d_3.patch.
[ View ]
#41 404_2.patch1.2 KBm3avrck
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 404_2.patch.
[ View ]
#38 404_1.patch1.2 KBm3avrck
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 404_1.patch.
[ View ]
#36 block_9.patch4.05 KBm3avrck
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch block_9.patch.
[ View ]
#32 404_0.patch5.24 KBm3avrck
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 404_0.patch.
[ View ]
#26 htaccess-404.patch_0.txt1.19 KBrobertDouglass
#24 patch_953.35 KBmoshe weitzman
#13 htaccess-404.patch.txt1.97 KBkbahey
#11 htaccess-index.patch1.55 KBrobertDouglass
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch htaccess-index.patch.
[ View ]
#4 htaccess_new_0.txt2.59 KBagentrickard
#1 htaccess_7.patch947 byteskbahey
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch htaccess_7.patch.
[ View ]
htaccess.patch.txt939 byteskbahey

Comments

morningtime’s picture

@geerlingguy
>> You have to know what you're doing to set this up, and its buried towards the bottom of settings.php with other advanced configuration options.

I accept your logic. However, there should be an example regex included, or at least a warning it will break image styles. Neither was explained in the patch. So how are people going to figure that out? That's just my opinion.

By the way, settings.php is the first file any end-user looks at. You can't "bury" anything there :)

geerlingguy’s picture

I'll work on a new patch for D7 and D8 in a little bit. I actually need to set this up on a new site in a short bit, so it'll be good to test on.

geerlingguy’s picture

StatusFileSize
new5.93 KB
PASSED: [[SimpleTest]]: [MySQL] 33,620 pass(es).
[ View ]

Patch against 7.x attached (I'll make an 8.x patch in the next comment). Leaving needs work until I can post the 8.x patch.

A little explanation of the one change I've made (after considering a few different methods): In drupal_fast_404(), I've added a check for the string 'styles' in the query string, as this should take care of the great majority of use cases, is still very performant (doesn't require any more regex, and strpos() is faster than another preg_match()), and is simple enough for someone that needs to add more paths (for example, for the Image Resize Filter's 'resize' directory) to be able to do so by adding another condition.

if ($fast_paths && preg_match($fast_paths, $_GET['q'] && !strpos($_GET['q'], 'styles'))) { ...

We could maybe do something where the user could set an array of options in settings.php that would be checked, but I think this method makes the best tradeoff between sensible defaults and performance... thoughts?

geerlingguy’s picture

Status:Needs work» Needs review
StatusFileSize
new5.93 KB
PASSED: [[SimpleTest]]: [MySQL] 33,638 pass(es).
[ View ]

8.x patch attached.

The only way to make a configurable array of paths to check against the url for dynamic content in settings.php (so people wouldn't need to change bootstrap.inc to add more dynamic paths) would be to add a loop in drupal_fast_404() to check each string against the url using foreach... but maybe preg_match() would be quicker :-/

geerlingguy’s picture

StatusFileSize
new6.33 KB
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in includes/bootstrap.inc.
[ View ]
new6.33 KB
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in includes/bootstrap.inc.
[ View ]

Nevermind... I think this technique (everything's configurable in settings.php) should be a bit better. See attached patches, which also add an explanation of the dynamic paths configuration variable, for D7 and D8.

Status:Needs review» Needs work

The last submitted patch, 404_fast_paths_8x-76824-305.patch, failed testing.

geerlingguy’s picture

Status:Needs work» Needs review
StatusFileSize
new6.33 KB
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in includes/bootstrap.inc.
[ View ]
new6.33 KB
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in includes/bootstrap.inc.
[ View ]

Yeah, sorry about that... ending parentheses:

Status:Needs review» Needs work

The last submitted patch, 404_fast_paths_8x-76824-307.patch, failed testing.

geerlingguy’s picture

Status:Needs work» Needs review
StatusFileSize
new6.33 KB
PASSED: [[SimpleTest]]: [MySQL] 33,634 pass(es).
[ View ]
new6.33 KB
PASSED: [[SimpleTest]]: [MySQL] 33,649 pass(es).
[ View ]

Yeah, about those parentheses...

geerlingguy’s picture

StatusFileSize
new6.34 KB
PASSED: [[SimpleTest]]: [MySQL] 33,640 pass(es).
[ View ]
new6.34 KB
PASSED: [[SimpleTest]]: [MySQL] 33,641 pass(es).
[ View ]

Updated patch to use the more standardized variable name 404_fast_paths_exclude (instead of 404_dynamic_paths).

geerlingguy’s picture

StatusFileSize
new6.33 KB
PASSED: [[SimpleTest]]: [MySQL] 33,647 pass(es).
[ View ]
new6.33 KB
PASSED: [[SimpleTest]]: [MySQL] 33,632 pass(es).
[ View ]

Fixed comments.

joelstein’s picture

Status:Needs review» Reviewed & tested by the community

I like the idea of having "exclude" paths in addition to "include" paths, and I'm using the same technique with great success on one of my D7 websites. Patch works for me!

morningtime’s picture

thumbs up, i think it's perfect

droplet’s picture

Status:Reviewed & tested by the community» Needs work
+++ b/sites/default/default.settings.phpundefined
@@ -428,6 +428,42 @@ ini_set('session.cookie_lifetime', 2000000);
+$conf['404_fast_paths_exclude'] = '/\b(styles)\w*\b/';

can anyone explain why use this regexp but not \b(styles)\b ?

/\b(styles)\w*\b/
Matched:
1. \stylesxxxxxxxxxxx\abc
2. \styles\abc
3. xxxx\styles.jpg
4. xxx\styles\xxx\xxx

/\b(styles)\b/
Only matched:
2. \styles\abc
3. xxxx\styles.jpg
4. xxx\styles\xxx\xxx

or consider this:
/\b\\(styles)\\\b/
Matched:
4. xxx\styles\xxx\xxx

20 days to next Drupal core point release.

geerlingguy’s picture

Status:Needs review» Needs work

I'm presuming you meant something more like:

<?php
  $conf
['404_fast_paths_exclude'] = '/\b\/(styles)\/\b/';
?>

(forward slashes instead of back slashes). I have the updated regex on my local checkout and will roll a patch in a few minutes. Pending testbot approval, I hope we can RTBC this asap and get it in.

I've tested with the following patterns:

  • http://www.example.com/example/styles - no match
  • http://www.example.com/example/styles/file.jpg - match (won't return a 404)
  • http://www.example.com/example/styles.jpg - no match

To add more 'dynamic' or 'safe' paths (for things like ctools-generated css, or other dynamically-created content), just add a pipe, and the string to match. For example, in settings.php, instead of the code at the top of this comment, put:

<?php
  $conf
['404_fast_paths_exclude'] = '/\b\/(styles|ctools)\/\b/';
?>
geerlingguy’s picture

Status:Needs work» Needs review
StatusFileSize
new6.33 KB
PASSED: [[SimpleTest]]: [MySQL] 33,642 pass(es).
[ View ]
new6.33 KB
PASSED: [[SimpleTest]]: [MySQL] 33,643 pass(es).
[ View ]

Updated patches with regex from #315, tested thoroughly, hopefully ready for commit!

geerlingguy’s picture

Status:Needs work» Reviewed & tested by the community

Back to RTBC, as the simple line of regex is all that was changed, and it's been tested by two people.

rootwork’s picture

Subscribe (sorry, I know...)

markus_petrux’s picture

Capturing back references in regular expressions do not seem to be required. It could be optimized by adding ?: after the opening parenthesis.

xjm’s picture

StatusFileSize
new3.17 KB

Here's an interdiff against #286 to make it easier to see what's changed in #316.

xjm’s picture

Issue summary:View changes

First cut at issue summary

xjm’s picture

I updated the Remaining Tasks section of the summary to reflect the current status of the issue.

xjm’s picture

Issue summary:View changes

Updated issue summary.

droplet’s picture

@#315,
thanks geerlingguy, that's what i meant.

and we can simplify it to:

\/(?:styles)\/

(non-capturing, no word-boundary)

1. xxxxxxx/default/files/styles/large/public/field/image/xxxxxx.jpg - Matched
2. styles/default/files/xxxxxxx/large/public/field/image/xxxxxx.jpg
3. xxxxx/default/files/xxxxxx/large/public/field/image/styles.jpg

Yes, it can be more simplified to

\/styles\/

but first case, beginner users can just add a pipe:
\/(?:styles|ctools)\/

droplet’s picture

Issue summary:View changes

Removed unnecessary (and now inaccurate) phrase.

xjm’s picture

Setting to NR based on #319 and #322.

xjm’s picture

Status:Reviewed & tested by the community» Needs review

No, really. :)

geerlingguy’s picture

Will write up a new patch in the morning.

geerlingguy’s picture

StatusFileSize
new6.33 KB
PASSED: [[SimpleTest]]: [MySQL] 32,860 pass(es).
[ View ]
new6.33 KB
PASSED: [[SimpleTest]]: [MySQL] 32,861 pass(es).
[ View ]

I've set the new regex to:

<?php
  $conf
['404_fast_paths_exclude'] = '/\/(?:styles)\//';
?>

It was:

<?php
  $conf
['404_fast_paths_exclude'] = '/\b\/(styles)\/\b/';
?>

Tested with same paths as earlier, and D7/D8 patches are both attached. Nice to see profile module gone in D8!

xjm’s picture

Nice, hopefully we can get a couple people to review this version and RTBC it again.

droplet’s picture

StatusFileSize
new6.33 KB
PASSED: [[SimpleTest]]: [MySQL] 32,866 pass(es).
[ View ]
new6.33 KB
PASSED: [[SimpleTest]]: [MySQL] 36,012 pass(es).
[ View ]

Good Job! and this one too:

<?php
 $conf
['404_fast_paths'] = '/\.(?:txt|png|gif|jpe?g|css|js|ico|swf|flv|cgi|bat|pl|dll|exe|asp)$/i';
?>

It was:

<?php
  $conf
['404_fast_paths'] = '/\.(txt|png|gif|jpe?g|css|js|ico|swf|flv|cgi|bat|pl|dll|exe|asp)$/i';
?>

sorry I haven't mention it before.

Owen Barton’s picture

This looks great to me (tests not back yet though).

One further (very minor) tweak that comes to mind, is that if we used a regex delimiter other than '/' we wouldn't need to escape the slashes in the exclude expression, which might and reduce the overall "too many slashes!" fear factor and generally make the function of the slashes a bit more obvious to folks who are not regex geeks. Of course there is also a risk that a less typical delimiter would confuse people more than the slashes - feedback welcome :)

e.g. instead of:

<?php
$conf
['404_fast_paths_exclude'] = '!/(?:styles)/!';
$conf['404_fast_paths'] = '!\.(?:txt|png|gif|jpe?g|css|js|ico|swf|flv|cgi|bat|pl|dll|exe|asp)$!i';
?>

we could have:

<?php
$conf
['404_fast_paths_exclude'] = '!/(?:styles)/!';
$conf['404_fast_paths'] = '!\.(?:txt|png|gif|jpe?g|css|js|ico|swf|flv|cgi|bat|pl|dll|exe|asp)$!i';
?>

I went with '!' as it seems the most common in core (after '/') - other non-escape-needed delimiter options (also used elsewhere in core) include '@', '%', '~', '#'. What do people think?

geerlingguy’s picture

That's a simple style preference... but I would argue in favor of escaping things with slashes; exclamation points make it seem like something special is going on there (to a less-experienced regex guy), and would be more confusing, imo.

geerlingguy’s picture

Status:Needs review» Reviewed & tested by the community

Back to RTBC... I think most people are in agreement over everything, except possibly the style of regex used—semantically, it's about as concise as it can get while still allowing ease of end-user modification in settings.php (ref #322).

Owen Barton’s picture

FWIW, I am perfectly happy with the regex as is. I suggested changing the delimiters to avoid escaping, but I agree "special" looking delimiters may be similar for new-to-regex folks - don't feel strongly either way.

Dries’s picture

Version:8.x-dev» 7.x-dev

Committed to 8.x. Oh yeah.

Sorry for being slow to commit this patch. For the longest time I wasn't convinced this was the right thing to do and was quite torn about it. I just spent time thinking about it, and now think it is actually proper. Thanks for hanging on there.

Moving to 7.x.

Dries’s picture

Assigned:Dries» Unassigned
geerlingguy’s picture

Still RTBC for Drupal 7 as well - and as noted above, there are no regressions introduced by this patch, and for it to have any effect at all, a site would need to overwrite its existing settings.php file (for existing sites). Might we be able to get it committed soon?

geerlingguy’s picture

#328: 404_fast_paths_7x-76824-328.patch queued for re-testing.

geerlingguy’s picture

@webchick - any chance of getting this in before the next point release? Been RTBC for a few weeks now...

webchick’s picture

Title:Drupal should not handle 404 for certain files» Change notice for Drupal should not handle 404 for certain files
Priority:Major» Critical
Status:Reviewed & tested by the community» Active

Sorry; I've been swamped lately and haven't had a chance to give the D7 queue a good love-fest.

This seems like a pretty harmless addition for D7, and passes Dries's muster, so I'm cool with committing it.

So! Committed and pushed to 7.x.

This needs a change notice so it can be mentioned in the 7.9 release notes.

geerlingguy’s picture

The link was relative, so didn't work - should be http://drupal.org/node/add/changenotice.

I glance through there, but this is the first issue I've ever dealt with that needs a Change Notice, and I'm unfamiliar with what exactly that entails. Specifically, what would I need to do in the 'Updates Done' section, and is there need to attach any particular file (like the committed patch)?

I'd love to do this (or at least see a couple examples of change notice nodes), but just need a little nudge—do any docs exist?

geerlingguy’s picture

Status:Active» Needs review

Here's the change notice I've posted: http://drupal.org/node/1296384 — please see my questions in the comment above, as this is the first time I've submitted a change notices node...

Thanks!

webchick’s picture

Title:Change notice for Drupal should not handle 404 for certain files» Drupal should not handle 404 for certain files
Priority:Critical» Major
Status:Needs review» Fixed

That looks perfect to me, thanks! Restoring status.

The "Updates" section is for the documentation team, etc. to note when they have updated the relevant sections.

If you have suggestions on how to make this feature more clear, please file some feedback under http://drupal.org/project/issues/drupalorg.

perusio’s picture

Status:Fixed» Needs work

@geerlinguy #339

Your change notice is incomplete. It fails to mention that if you're "capturing" the handling of the static files at the server level there's no need to patch anything or fiddling with the settings.php file.

geerlingguy’s picture

@perusio - could you update the change notice, then? I'm not sure how I'd want to phrase that, as I thought it more or less self-explanatory that you bypass Drupal's handling if you handle 404s before Drupal got ahold of them.

perusio’s picture

Status:Needs work» Fixed

Status:Fixed» Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Shellingfox’s picture

Version:7.x-dev» 6.x-dev
Assigned:Unassigned» Shellingfox
Issue tags:+needs backport to D6

I'm working on this to backport to D6

Shellingfox’s picture

Status:Active» Closed (fixed)
StatusFileSize
new5.04 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch fast404file-76824-346.patch. This may be a -p0 (old style) patch, which is no longer supported by the testbots.
[ View ]

Here is my patch for #346

Shellingfox’s picture

Status:Closed (fixed)» Active
thehong’s picture

Status:Closed (fixed)» Needs review
Shellingfox’s picture

Status:Needs review» Active
StatusFileSize
new5.04 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch fast404file-76824-346_0.patch. This may be a -p0 (old style) patch, which is no longer supported by the testbots.
[ View ]

Since i was change status, last file wasn't test.

Shellingfox’s picture

Status:Active» Needs review
StatusFileSize
new5.04 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch fast404-76824-347.patch. This may be a -p0 (old style) patch, which is no longer supported by the testbots.
[ View ]

Re-create patch via git diff

Status:Needs review» Needs work

The last submitted patch, fast404-76824-347.patch, failed testing.

Shellingfox’s picture

Status:Needs work» Needs review
StatusFileSize
new5.06 KB
PASSED: [[SimpleTest]]: [MySQL] 190 pass(es).
[ View ]
droplet’s picture

we don't has such D6 backport policy? It will change the default behavior.

Nigeria’s picture

#353: fast404-76824-351.patch queued for re-testing.

jcisio’s picture

Status:Needs review» Reviewed & tested by the community
Issue tags:-needs backport to D7

It was backported to D7, so it could be backported to D6. The default behavior does not change.

There are two trailing spaces for nitpickers. The backport is straightaway. Works as expected.

pwolanin’s picture

Is this already in Pressflow? D6 is much less open to new features than D7 is right now.

jcisio’s picture

No, it is not in Pressflow. Pressflow recently has not been well maintained, no backport has been recently merged into Pressflow. Pressflow 6.26 needed two months to be tagged after the release of Drupal 6.26.

That said, if we decide not to backport this one, we can simply remove tag and close the issue. There are not many new D6 sites, so a change in the default settings file is not much useful.

donquixote’s picture

A quick note:

If this is what we currently find in
admin/config/search/search404 > Advanced settings > "EXTENSIONS TO ABORT SEARCH"
then there is the "ico" missing there, for favicon.ico

pwolanin’s picture

@jcisio - it looks like 6.26 was merged in: https://github.com/pressflow/6

Here's a 6.26 tag: https://github.com/pressflow/6/commits/pressflow-6.26.109

what's missing?

jcisio’s picture

Version:6.x-dev» 7.x-dev
Status:Reviewed & tested by the community» Fixed
Issue tags:-needs backport to D6

@pwolanin it needed two months after the release of Drupal 6.26 to be tagged. Pressflow has many pull requests (https://github.com/pressflow/6/pulls) but it looks like none was accepted, except for official Drupal releases merge.

In short:
- it is not in Pressflow now
- there is no interest for it to be in Pressflow 6 (there is mostly no new P6 sites)
- there is no interest for it to be in Drupal 6 (there is mostly no new D6 sites)

I'm closing this issue as it won't be in D6, unless Gabor goes ahead and commits it. Restoring the 7.x tag.

Status:Fixed» Closed (fixed)
Issue tags:-Performance

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary:View changes

Updated issue summary.

jelo’s picture

Issue summary:View changes

This issue states at the top: "The previously submitted patch in #286 was incompatible with image styles and other generated files (such paths would return a 404). The latest patch in #316 addresses this issue by adding a configurable list of excluded paths that should not use the fast 404 feature."

I am not sure if "other generated files" includes clashes with images that are loaded using private file download method. I just spent a couple of hours troubleshooting why all inline images had disappeared from my Drupal site before I realized that fast 404 had been enabled with the default settings, i.e. no additional paths had been added.

The result is that all private image files that are served through /system/files/filename.jpg break. All other documents still function because fast 404 only defines extensions png|gif|jpe?g, but not pdf, xls etc.

Given that private file setting options are in Drupal core and could be enabled, should we not add /system/files as path by default to be excluded in case someone enables fast 404?

jelo’s picture

Given that this issue is closed I created a new issue with a proposal to add "system/files" by default to the excluded paths at https://www.drupal.org/node/2455057

Pages