When clean urls are enabled, non-existant pages are rewritten to www.site.com/?q=non/existant. The menu handler will throw a 404 on the non-existant Drupal path.
However, if clean urls are not active, then the 404 page (index.php) is called as is. $GET['q'] will be empty, and the front page will be shown without any indication of the 404 or without a log entry in the watchdog. Note that this only happens if .htaccess is being read, but mod_rewrite is not present or enabled. Not too common, but still a bug.
The attached patch fixes this problem by checking for a 404 in index.php and calling drupal_not_found() if needed. It also sets $_GET['q'] to request_uri(), so the 404'd URL shows up correctly in the watchdog (same as the rewrite rule).
This should go in 4.5 as well as HEAD. Patch is for HEAD, but it should apply to 4.5.
Comment | File | Size | Author |
---|---|---|---|
#23 | drupal-13594-4.patch | 1.08 KB | derjochenmeyer |
#17 | drupal-13594-3.patch | 1.37 KB | JeremyFrench |
#13 | drupal-13594-2.patch | 1.23 KB | JeremyFrench |
#10 | drupal-13594.patch | 1.21 KB | JeremyFrench |
404.patch | 1.07 KB | Steven | |
Comments
Comment #1
Steven CreditAttribution: Steven commentedAn alternative solution is to drop the ErrorDocument 404 from .htaccess, then the web server will handle them if there are no clean URLs. However I think the intended behaviour is for Drupal to log all 404s.
Comment #2
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedDoesn't apply anymore.
Comment #3
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedSteven what's up with this?
Comment #4
Steven CreditAttribution: Steven commentedAs far as I can tell, the issue is still there. I can't make an updated patch atm. Either we need to check for a non-clean 404, or stop intercepting non-clean 404s.
It affects setups where htaccess is being read, but mod_rewrite is not installed or not working.
Comment #5
neclimdulI can't reproduce this on HEAD(5.0) or 4.7.
Comment #6
neclimdulAfter testing with a disabled mod_rewrite I'm fairly certain this has been fixed already.
Comment #7
(not verified) CreditAttribution: commentedComment #8
JeremyFrench CreditAttribution: JeremyFrench commentedI think I will have to re open this. As it is kind of broken with D7 now.
I did an install of drupal 7 alpha 7 on a server with no mod rewite. It looked like everything was fine, and the install wen't ok. Pages were showing ok.
BUT.
Forms would not submit, and every page was returning a 404 code, but with the correct content for the page. I only realised abuot the 404 code as I had firebug on.
This is kind of an inciduous and evil bug, as I didn't realise that mod rewrite was disabled and clean urls looked like they were working, it was only after hacking around with the rewite rules to find out what was going on that I realised that it was the 404 handler which was causing the issue. But that was the best part of an afternoon spent trying to work out what was up.
I am bumping to major as it can break Drupal.
It would be nice to see if anyone else can recreate.
Comment #9
JeremyFrench CreditAttribution: JeremyFrench commentedLooking into this a little.
I think this is more prevalent as request_path() will parse a clean URL, even if clean urls are off. So if index.php gets called by the 404 handler instead of mod_rewite it can look like the site is working.
I can see a couple of ways to stop this, none ideal
1. request_path could honor clean urls, but this would make testing if clean urls work problematic
2. We could make the 404 handler in .htaccess conditional on mod rewrite being enabled, but as mod rewrite handles non existent files would this do anything?.
3. We could check in Drupal (somehow) to see if we are being called from the 404 handler or mod rewrite.
4. We could have the 404 handler go to a file other than index.php so we can flag that this is a file not found.
Comment #10
JeremyFrench CreditAttribution: JeremyFrench commentedI have looked into this some more and created a patch. It occurs when a server becomes misconfigured and clean urls no longer work. To reproduce is easy.
On your server turn off mod rewrite, then try to log in and disable clean urls.
You won't be able to, as the 404 handler doesn’t pass on form variables, as well as forcing a 404 return code for each request.
This essentially makes a Drupal site unusable, and it is not obvious what has happened, or even that things are broken until you try to submit a form.
The patch will check to see if what appears to Drupal to be a valid URL is also considered so by the web browser, if not, it will alert the user and turn clean URLs off.
Comment #11
JeremyFrench CreditAttribution: JeremyFrench commentedComment #12
JeremyFrench CreditAttribution: JeremyFrench commentedComment #13
JeremyFrench CreditAttribution: JeremyFrench commentedUpdating patch to get rid of tabs
Comment #14
JeremyFrench CreditAttribution: JeremyFrench commentedComment #15
neclimdulPatch seems to fix as expected. Couple things though.
Comment #16
neclimdulComment #17
JeremyFrench CreditAttribution: JeremyFrench commented@neclimdul Thanks for the review. Points 2,3 & 4 have been addressed.
I'm not sure about point 1, it may be useful to alert someone to the fact that something has just broken as it isn't immediately apparent unless you are keeping an eye on the URLs.
Comment #18
neclimdulWell, that may be but your average site visitor doesn't care or understand what that means. That's what the watchdog does. I don't know if we have this sort of failure logic documented anywhere but it is consistent with handling other sorts of failures in core.
Comment #19
derjochenmeyer CreditAttribution: derjochenmeyer commented#17: drupal-13594-3.patch queued for re-testing.
Comment #20
derjochenmeyer CreditAttribution: derjochenmeyer commentedThe patch in #17 works as expected.
Another way to reproduce this behaviour and test the patch is by setting up a fresh D7 install with clean urls enabled. Then turn the RewriteEngine off in your .htaccess
The site returns "307 Temporary Redirect", clean urls are deactivated.
Comment #21
amc CreditAttribution: amc commentedIs #718848: 404 when enabling overlay module when Short URLs are not supported. a duplicate of this?
Comment #22
neclimdulI'm not sure since this isn't directly associated with the Overlay module. That said, the problem with Overlay may be symptom and that would make it a duplicate.
Why don't you try the patch and see if it fixes the problem? I don't think there was anything keeping this from being RTBC other than I wasn't comfortable forcing that message to sit visitors.
Can you image if you saw something like that on Facebook? Some user would be like "Short URLs? WTF and why do I care?" Then there would be a post on thedailywtf and we'd all laugh at them.
Comment #23
derjochenmeyer CreditAttribution: derjochenmeyer commentedI removed the set_message and some whitespaces from JeremyFrench's patch #17.
Comment #24
Starbuck CreditAttribution: Starbuck commentedI have a brand-new install of D7 and I ran into this issue. I checked this site for issues related to a 404 message. I found #886182: Editing Path in Page Redirects to admin/content 404 and that led me here.
I applied the patch from #23 and the issue has indeed been fixed. I'm hoping this makes its way back into core.
For this issue there is a common question "Are clean URLs enabled?" Personally, I have no idea. When I go to Administration » Configuration » Search and metadata » Clean URLs, and run the test, the page refreshes, but there is no success/fail message. Looking at the site with a non-registered user I don't see clean URLs, so I'm assuming it's turned off. So where do we turn it on and how can we really verify that it's on or off?
Thanks!
[EDIT] It looks like the problem persists if the admin overlay is not in use, but when overlay is in use, the 404 does not occur. I'm not 100% sure of the cause/effect situation here but will post if I can narrow it down.
Comment #25
sunPatch looks good. Wondering whether we could test this in any way (hacks).
Comment #26
JeremyFrench CreditAttribution: JeremyFrench commentedI couldn't think of a way to automate the testing.
If you want to test manually it is fairly straightforward (from memory). Install a site on a server with mod rewrite working. Then disable mod rewrite in your Apache config (and restart).
Without the patch the site will look like it is working, but you will be unable to login (or do any post form submission).
With the patch clean urls should be turned off and login should then work.
Comment #27
pillarsdotnet CreditAttribution: pillarsdotnet commentedSo the test would be to change the same things that the "clean urls" test changes (thus simulating that it once passed), and then try to log in. If the login succeeds, the test passes.
Comment #28
quicksketchReading over this patch, I can immediately see a DrupalWTF coming out of it. If I made an accidental change to .htaccess or Apache that busted clean URLs, I'd probably notice quickly. In this hypothetical situation, I'd appreciate Drupal changing to non-clean URLs temporarily, but if I then corrected the issue with .htaccess/Apache my expectation would be that clean URLs would start working again immediately. This situation could go terribly wrong when dealing with multi-site situations where you'd have to go through dozens of sites and manually re-enable clean URLs on every one of them.
Additionally, most users have no idea what Clean URLs are at all. They don't know what they're called or why they wouldn't come back on immediately after correcting the problem with their server. I'd suggest that we make this change temporary (memory-only) by setting
$GLOBALS['conf']['clean_urls'] = 0;
before the call to drupal_goto() rather than saving the change to the database. "Automatically" changing configuration and leaving the user with no idea how to turn it back on is a hundred support requests just waiting to happen.Additionally the watchdog message is incredibly vague (couldn't we at least add a link to the Clean URLs configuration page?) and should describe how to remedy the problem rather than making a vague statement. I'd suggest something like this:
Comment #29
neclimdul@quicksketch thanks for digging this up and reviewing!
That is a very nice watchdog message! I'm for changing it.
I'm not sure about the not setting the variable though. If we don't there will be a swarm of watchdog messages until the site administrator figures out there's a problem.
Comment #30
JeremyFrench CreditAttribution: JeremyFrench commentedI'm thinking that we would need a separate variable. To say that clean urls had been fixed, with a periodic check to enable again if they are working.
I'm also not sure if form submissions would be passed to pages if they are 302 redirected (I haven't tested though).