Support from Acquia helps fund testing for Drupal Acquia logo

Comments

marcvangend’s picture

Status: Active » Needs review
FileSize
704 bytes

Lucas, good catch. I can confirm the bug.

I noticed that the request_path function returns "index.php" when visiting www.example.com/index.php. Simply doing if ($path == 'index.php') { $path = ''; } seems to fix this. I'm not sure if it is the proper way to do this, but I'm attaching a patch anyway :-)

Status: Needs review » Needs work

The last submitted patch, index_php_404.patch, failed testing.

marcvangend’s picture

Version: 7.0-alpha1 » 7.x-dev
Status: Needs work » Needs review
FileSize
606 bytes

Hmmmm. Try again.

marcvangend’s picture

Title: URL without trailing "index.php" not recognized the same as without. » www.example.com/index.php returns 404 error

Phew, tests passed :-)
Also: choosing a better title.

Dries’s picture

Priority: Critical » Normal

I don't think this is critical. It is certainly not a regression from D6, as far as I can tell.

chx’s picture

Priority: Normal » Critical
Status: Needs review » Needs work

No way this is correct. Would this bomb with update.php, cron.php etc too? Why is the path index.php?

marcvangend’s picture

I think $path is 'index.php' because false assumptions are made in request_path().

request_path() assumes that, if there is no $_GET['q'], the path is equal to $_SERVER['REQUEST_URI'] without the leading directory name(s). That assumption is wrong - QED.

@Dries: this certainly is a regression, otherwise, http://drupal.org/index.php would return a 404.

chx’s picture

Then what about fixing that assumption? Why are we making such an assumption?

marcvangend’s picture

FileSize
1.36 KB

chx: sure, my patch is exactly that: an attempt to correct the logic which contains a false assumption. However I just noticed that the patch still isn't perfect.

The documentation for request_path already lists a few examples of possible paths and their return values:

 - http://example.com/node/306 returns "node/306".
 - http://example.com/drupalfolder/node/306 returns "node/306" while
   base_path() returns "/drupalfolder/".
 - http://example.com/path/alias (which is a path alias for node/306) returns
   "path/alias" as opposed to the internal path.

The first patch added support for the following example:

 - http://example.com/index.php returns an empty string (meaning: front page)

However, it didn't support the following example:

 - http://example.com/index.php?page=1 returns an empty string

This is now also supported in the attached patch.

marcvangend’s picture

Status: Needs work » Needs review

needs review!

cha0s’s picture

FileSize
1.52 KB

Cool, I rerolled it to fix a typo ($GET -> $_GET)

cosmicdreams’s picture

tested this on a fresh cvs:

http://example.com/index.php - WORKS
http://example.com/index.php?q=admin WORKS
http://example.com/index.php?page=1 WORKS
http://example.com/folder/firstpage (a page I created) - WORKS

any other paths you want me to test for you?

Damien Tournoud’s picture

Priority: Critical » Normal
Status: Needs review » Postponed (maintainer needs more info)

Why is that considered a bug?

It is definitively a change of behavior from Drupal 6, but the way I see that, we actually fixed a bug (see #432384: Any non-existent URL containing a valid parent path returns 200, not 404 for a discussion, for example).

cha0s’s picture

I would consider it a regression, but I'm curious to hear more opinions on the matter. index.php is kind of a special case, IMO.

marcvangend’s picture

Cosmicdreams:
Thanks for testing. Where you wrote "folder/firstpage", that is path alias for node/1, right?
I thinks all urls should also be tested when drupal is installed in a folder:
http://example.com/drupal/index.php
http://example.com/drupal/index.php?q=admin
http://example.com/drupal/index.php?page=1
http://example.com/drupal/folder/firstpage

Damien:
I consider this a bug because it breaks a url that works in D6. For me, the fact that example.com/index.php serves the frontpage, is expected behavior, not just on Drupal, but on every php site (like www.facebook.com/index.php). On the other hand, I'm not saying that nothing should ever change and I understand the arguments in #432384.
This change might cause broken links and bookmarks after people upgrade from D6 to D7, but I think it's hard to predict how large that problem would be. I agree with cha0s that index.php is a special case; maybe it's better to implement 301 redirect in .htaccess, from example.com/index.php to example.com?

marcvangend’s picture

Priority: Normal » Critical
Status: Postponed (maintainer needs more info) » Needs work

I'm setting this back to critical / needs work. I don't want this to be forgotten about before we decide if the current behavior is a bug or a feature.

catch’s picture

Priority: Critical » Normal

I agree with Damien that it's a feature per #23, #432384: Any non-existent URL containing a valid parent path returns 200, not 404 looks like an annoying bug which has been albeit inadvertently fixed. So although I'm not won't fixing this, it's definitely not critical. example.com/index.php not returning a page really feels like a good thing to me though.

marcvangend’s picture

Sure, this is not critical - I agree. I can also agree that, in most cases, this is a feature rather than a bug. Still, this is a change in behavior from D6.

Obviously there are lots of changes between D6 and D7, so we try to provide an upgrade path for all of them. Unfortunately, when external sites are linking to your Drupal site as www.example.com/index.php, there is no such thing as an upgrade path. How many broken links this change is going to cause? It may be close to zero or it may be much more than we expect - I'm afraid there's no way to tell.

I don't think that we need a setting in the back-end to choose the behavior of index.php, but would it be appropriate to provide some kind of toggle in .htaccess or settings.php, so site maintainers can choose to revert to the D6 behavior?

catch’s picture

I just added an alias for index.php pointing to /node, worked fine. However if sites are linking to index.php?q=node/1 etc. then that's not going to work obviously - is this actually happening? If that's the case then a (commented out) toggle seems appropriate.

marcvangend’s picture

Yes, that is actually happening. I did a quick search and I'm surprised at the results... have a look: http://www.google.com/search?q=link%3Aindex.php%3Fq%3Dnode%2F. Google indexed about 171.000 pages that either have a url containing 'index.php?q=node/' themselves, or link to a url containing 'index.php?q=node/'.

cha0s’s picture

Priority: Normal » Critical

I'm sorry but 'We fixed another bug by changing this so we have to change it' is a crap explanation when #20 provided demonstrable proof that hundreds of thousands of sites would be affected by such a careless attitude.

I'm marking this as critical again, and welcome discussion. In other words, explain why you think breaking 360,000 links (from the latest Google search from my location) is a good idea, and not a critical bug.

Damien Tournoud’s picture

Priority: Critical » Normal

Each Drupal version breaks links. This doesn't qualify as a critical issue.

Damien Tournoud’s picture

What #20 provided is a way to find Drupal sites that are running on Microsoft Windows. It would be nice to issue a 301 'Moved permanently' in that case.

catch’s picture

@cha0s, I'd rather we stop generating all that duplicate content rather than continue doing so in subsequent releases until it's not 360k paths but 3.6m. Like Damien said, issuing a 301 for the paths that already exists would be nice.

marcvangend’s picture

OK, so we need a patch that takes the index.php out of the request and redirects with a 301 - correct? Would .htaccess be the appropriate place to do that? And what about IIS & web.config?

david3605’s picture

Hi, we've noticed that the priority on this Drupal 7 issue has been downgraded. This has a significant impact on Dreamweaver CS5's ability to integrate with Drupal 7. We rely on browsing the index.php file in order to introspect dynamic includes and related files for CMS sites (a new feature in Dreamweaver CS5). With the current inability to browse to index.php in Drupal 7 alpha, we're unable to provide this Drupal integration to our users. Is there any chance we can have this reconsidered? We'd be happy to share reproducible steps to help with resolving this issue.

Thanks,
David Alcala
Adobe Dreamweaver Team

Damien Tournoud’s picture

@david3605: is there something preventing you from visiting the frontpage directly (http://example.com/ instead of http://example.com/index.php)?

@marcvangend: sorry for the delay, if you are still willing to work on this, we need to add redirect rules for both .htaccess and web.config. Those two should be quite similar.

marcvangend’s picture

sure, i'll give it a try. I kinda suck at regex stuff, but this should be simple enough I suppose.

@david: sounds like a cool feature (even though I haven't used dreamweaver for years myself). can DW cs5 also handle the dynamic function names it's using for the hook system?

david3605’s picture

We rely on browsing to a modified version of index.php in order to introspect the dynamic includes/related files. So for the Dreamweaver CS5 Dynamically-Related Files feature, we're browsing to something like http://example.com/index_foo.php. So that's why we can't browse to http://example.com/.

This issue also causes a problem with DW CS5's Live View feature, which actually was introduce in DW CS4. Live View is an embedded WebKit browser inside DW. When you turn on Live View for index.php, Live View adds index.php to the URL and you get the "File not found" message in the main content area. Live View always adds the file name to the URL.

@marcvangend - DW shows all of the .css, .inc, .module, .php include files that are used to generate the code for the home view of a Drupal site. Here are a few videos & articles that describe the new CMS functionality in more detail:

Top 3 Features in DW CS5 (7 minute video)
http://tv.adobe.com/watch/dreamweaver-cs5-feature-tour/top-3-features-in...

DW CS5 Feature Overview - video (CMS info starts at 3:40)
http://tv.adobe.com/watch/dreamweaver-cs5-feature-tour/dreamweaver-overv...

Introducing Dreamweaver CS5 (article)
http://www.adobe.com/devnet/logged_in/sfegette_dwcs5.html

You can download a free 30-day trial of DW CS5 if you want to see the problem first hand: http://www.adobe.com/products/dreamweaver/. We'd be happy to help with the DW setup if you have any questions.

Thanks,

David

catch’s picture

Not showing anything at index.php is a bug fix, not a regression, this issue is only about adding redirects for old sites.

There are other ways to find out which css, js, and php files are used to generate a page request. CSS and JS Drupal provides by itself, for php files either xdebug or the inclued extension works great without the need to hack core.

Chris Charlton’s picture

I'm surprised this issue was knocked down from critical to normal. If D7 ships once all critical bugs are resolved and this isn't in the list then tons of sites will break. Even if we try to tell people upgrading from 6 to 7 "works" this issue shows their site will not work out of the box once upgraded. And we all know how people don't read the manual/instructions until something breaks.

marcvangend’s picture

Status: Needs work » Needs review
FileSize
867 bytes

OK, here is a patch for .htaccess. I have tested this on my local Apache server, with different url's, with the following results:

Original URL:                               ->  Redirected to (http 301): 
---------------------------------------------------------------------------------
http://d7test.localhost/index.php           ->  http://d7test.localhost/ 
http://d7test.localhost/index.php?q=node/1  ->  http://d7test.localhost/?q=node/1 
http://localhost/d7test/index.php           ->  http://localhost/d7test/ 
http://localhost/d7test/index.php?q=node/1  ->  http://localhost/d7test/?q=node/1 

Regarding web.config, I don't have an IIS server but I think something like this should be added (at line 16 I guess):

        <rule name="Redirect www.example.com/index.php to www.example.com" stopProcessing="true">
          <match url="^/(.*)index.php$" />
          <action type="Redirect" url="/{R:1}" appendQueryString="false" redirectType="Permanent" />
        </rule>

As said, I'm not an .htaccess / web.config / regex expert, so all feedback and testing is very welcome.

Status: Needs review » Needs work

The last submitted patch, 711650-htaccess_redirect_index_php.patch, failed testing.

aspilicious’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 711650-htaccess_redirect_index_php.patch, failed testing.

marcvangend’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 711650-htaccess_redirect_index_php.patch, failed testing.

marcvangend’s picture

Status: Needs work » Needs review

#33: 711650-htaccess_redirect_index_php.patch queued for re-testing.

I guess head was broken, let's see what happens now.

Status: Needs review » Needs work

The last submitted patch, 711650-htaccess_redirect_index_php.patch, failed testing.

marcvangend’s picture

Nevermind, sorry about all the testing. Caching made me believe that the patch works, but it doesn't. Currently, .htaccess performs an internal redirect to index.php. The rewrite rule added by my patch did not just evaluate the original url (the one in the browser address bar), but also evaluated the internally redirected url, which of course matched and performed a 301 to the site root. I'll try to submit a new patch later this evening.

marcvangend’s picture

Status: Needs work » Needs review
FileSize
901 bytes

New patch. Should be better.

cha0s’s picture

Test bot thinks it's okay, however I'm not an expert with Apache regex/redirection (That syntax always looks like gibberish to me :P) so it'd be nice if someone else could look over it. :)

marcvangend’s picture

Re #43: and let's not forget that, if the patch from #42 is good, the same should be done for web.config.

marcvangend’s picture

Can somebody please review my patch?

chx’s picture

Status: Needs review » Needs work

yes. Why are you using THE_REQUEST? And if you must, then why are you matching the HTTP method w a regexp (which takes time)? Do you care whether it's GET HTTP or PROPFIND or whatever else? And you can skip the HTTP at the end too. I would suggest REQUEST_URI but i might have overlooked something.

marcvangend’s picture

chx, thanks for the review. I know there was a reason to use THE_REQUEST but I'm not sure wat it was... I'll get back on this.

marcvangend’s picture

Not sure why I didn't find this before, but there is indeed a way to do this without THE_REQUEST.
New patch, simpler code, less regex! :-)

Regarding performance, I'm still not sure if (.*)/index\.php(.*) is the fastest way to match a uri with "index.php" in it. For instance, would ^(.*)/index\.php(.*)$ (with caret and dollar signs) make a difference?

I have tested this patch with the following url's:

Original URL:                               ->  Redirected to (http 301): 
---------------------------------------------------------------------------------
http://d7test.localhost/index.php           ->  http://d7test.localhost/ 
http://d7test.localhost/index.php?q=user    ->  http://d7test.localhost/?q=user 
http://localhost/d7test/index.php           ->  http://localhost/d7test/ 
http://localhost/d7test/index.php?q=user    ->  http://localhost/d7test/?q=user

Sidenote:
Anarcat just mentioned (here at Drupalcon) that it might be possible (preferrable?) to get rid of the RewriteCond line and put it all in a single RewriteRule. That would create something like this: #RewriteRule ^index\.php(.*) %1 [L,R=301]. However I couldn't get that to work without requiring the user to explicitly set the rewrite base, which I don't want.

marcvangend’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 711650_48-htaccess_redirect_index_php.patch, failed testing.

marcvangend’s picture

Status: Needs work » Needs review

OK, now I remember what the problem with REQUEST_URI is. It also kicks in when you request /admin for instance, and redirects to the base url, because the last line of the .htaccess adds index.php again (as an internal redirect). The reason that THE_REQUEST did work, but REQUEST_URI doesn't, is that THE_REQUEST does not change after an internal redirect, while REQUEST_URI does. A possible fix (without reverting to THE_REQUEST) is this:

  # Redirect www.example.com/index.php to www.example.com. A URL ending with
  # index.php was allowed in earlier Drupal versions. This rule prevents
  # existing links on external sites from breaking.
  RewriteCond %{ENV:REDIRECT_STATUS} !=200
  RewriteCond %{REQUEST_URI} (.*)/index\.php(.*)
  RewriteRule ^ %1/%2 [L,R=301]

But I'm not too happy with that. Basically, we just need a rewrite condition that checks if we're dealing with the original request, of with the internal redirect to index.php. I'll follow up later.

[edited: better explanation of what's happening]

marcvangend’s picture

marcvangend’s picture

OK, so here is a new patch that actually works :-)
I'm back at using THE_REQUEST because it turns out to be the most reliable and clean method to evaluate the initial request, even after an internal redirect. Code now looks like this:

  RewriteCond %{THE_REQUEST} ^[A-Z]{3,10}\ /(.*)index\.php
  RewriteRule ^ /%1 [R=301,QSA,L]

What the RewriteCond does is: step over the request method (3-10 uppercase characters followed by a space), remember the drupal base path and see if it can find "index.php" after it. The RewriteRule then redirects to the base path. This no longer explicitly specifies the http protocol, so I think it should also work on https, but I didn't have an opportunity to test that yet.

marcvangend’s picture

New patch, improved regex for readbility, improved inline documentation.

david3605’s picture

I filed a new related bug: "renaming index.php results in 404 error": http://drupal.org/node/906032. In Drupal 6, you could rename index.php to foo.php, and then browse to example.com/foo.php and the Drupal home page would still render correctly.

chx’s picture

Hm, the question in #55 is: why are we hacking htaccess if the problem per #711650-7: When index.php appears in the URL (or is automatically added by the server) users get a "page not found" message #7 was in request_path() and the first patches fixed that?

Damien Tournoud’s picture

Title: www.example.com/index.php returns 404 error » www.example.com/index.php should redirect to /

Visiting index.php returns a 404. This is by design.

It's a change of behavior from D6, but it is actually the correct behavior. This issue is about making sure old URLs redirect.

chx’s picture

Title: www.example.com/index.php should redirect to / » www.example.com/index.php returns 404 error

#15 said that maybe it's better to implement such a 301 -- no it's not. I verified that index.php?q=user works still and there is no problem in there.
I agree that a test should check against the URLs in #15 -- actually #11 contains a patch and #12 the same (i think) list of URls so it just needs to be rolled into a test and done.

chx’s picture

There is no reason for that redirect. #11 fixed the problem, why redirect? Edit: I believe fixing this is better than redirecting.

Damien Tournoud’s picture

Title: www.example.com/index.php returns 404 error » www.example.com/index.php should redirect to /

Again, this is by design.

Damien Tournoud’s picture

Discussed with chx, we probably should accept URLs *exactly* matching the script name.

ie.

index.php should be the frontpage
index.php?q=test should be the 'test' path
index.php/test should return 404

Starting from the patch in #11 and making sure we have this behavior should allow this issue to move forward.

chx’s picture

Title: www.example.com/index.php should redirect to / » Handle index.php in the URL
marcvangend’s picture

Chx, Damien, it's nice to see progress here.

So there has been a change in direction; from a 301 redirect in .htaccess back to accepting index.php in bootstrap.inc. Personally I'm okay with that; IMHO search engines shouldn't complain about duplicate content when www.example.com equals www.example.com/index.php. However search engines don't seem to care about my humble opinions. If returning a 200 for www.example.com/index.php causes any kind of problem (a real one, not just theoretical) we should do a 301 redirect. So far, I have not seen proof of such a problem. If returning a 200 does not cause problems, let's just apply a fix to bootstrap.inc (and fix #906032: renaming index.php results in 404 error while we're at it).

marcvangend’s picture

New patch.

- index.php returns the front page
- index.php?q=user returns the 'user' page
- index.php?page=3 returns the front page with the correct sub page
- index.php/user returns 404

When copying index.php to foo.php, all of the examples above (with 'index.php' replaced by 'foo.php') have the same return values.

Note: This patch does not take care of anything else regarding the renaming of index.php. The .htaccess will still rewrite requests to index.php if foo.php isn't explicitly requested. Things like auto-generating resized images break if index.php is not present. Manually editing .htaccess will probably solve that.

Damien Tournoud’s picture

+    if ($path == pathinfo($_SERVER['PHP_SELF'], PATHINFO_BASENAME)) {
+      $path = '';
+    }

You can simply use basename() here. Can we add the tests outlined in #64 somewhere in the bootstrap tests?

david3605’s picture

The patch in #64 solves the Dreamweaver CS5 problem with Drupal 7, described in #27, #30 & #55. The new Dreamweaver CS5 CMS feature "Dynamically-Related Files" now works with Drupal 7 Alpha 6 and the patch. Thanks!

marcvangend’s picture

David, thanks for the feedback.

O've never written tests before but I'll try to do so asap. If anyone else wants to add tests, be my guest.

chx’s picture

marcvangend , simpletest help is readily available on #drupal-contribute :)

marcvangend’s picture

New patch, now using basename() and including my first (dead simple) simpletest. Yay! The test is added to modules/system/system.test but I'm not sure if that's the most appropriate place for it.

Status: Needs review » Needs work

The last submitted patch, handle-index-php-711650-69.patch, failed testing.

marcvangend’s picture

Assigned: Unassigned » marcvangend

Test failed because testbot turns my request for 'index.php' into www.example.com/?q=index.php, which obviously returns a 404. Will write a new patch later today.

marcvangend’s picture

Assigned: marcvangend » Unassigned
FileSize
3.1 KB

Go testbot!

marcvangend’s picture

Status: Needs work » Needs review

I said go! :-)

chx’s picture

You need to construct absolute URLs for drupalGet yourself, it's "$base_url/index.php".

chx’s picture

Status: Needs review » Needs work

Oh, crosspost :D Nice work, thanks. One more thing, please " This request may be using a clean URL " -- care to clarity when it uses a clean URL or when might not? Something along the lines "This request is using clean URLs or it is index.php". That index.php can be renamed is explained a few lines down.

marcvangend’s picture

We cannot be sure that it's either a clean URL or index.php. That piece of code doesn't handle just clean URLs and index.php; it handles every request we throw at Drupal, except stuff that is filtered out earlier (existing public files and requests with $_GET['q']). This includes invalid and bogus requests like www.example.com/?qq=user, www.example.com/index.php/user and www.example.com/non-existent.css.

How about this? (can't roll a patch from where I am now)

  // Handle this request as if it is using a clean URL, or it is 'index.php'.
  // Extract the path from REQUEST_URI."
chx’s picture

Much better. Any comment that says vaguely "this might be foo" and then does not do a check on foo is quite the WTF.

marcvangend’s picture

Status: Needs work » Needs review
FileSize
3.16 KB

Thanks for the continuing feedback. Feels like we're getting closer to RTBC...

chx’s picture

Status: Needs review » Reviewed & tested by the community

I think so too ;)

sun’s picture

Status: Reviewed & tested by the community » Needs work
+++ includes/bootstrap.inc	17 Sep 2010 19:57:54 -0000
@@ -2451,11 +2453,17 @@ function request_path() {
+    // Handle this request as if it is using a clean URL, or it is 'index.php'.

This reads a bit odd. At the very least, s/is/was/, but even after that, I don't fully grok it.

+++ includes/bootstrap.inc	17 Sep 2010 19:57:54 -0000
@@ -2451,11 +2453,17 @@ function request_path() {
+    // Using basename() here tolerates renaming index.php.

What kind of renaming does this tolerate? Renaming into update.php? Can we clarify what is being meant here?

+++ modules/system/system.test	17 Sep 2010 19:57:54 -0000
@@ -1883,3 +1883,36 @@ class CompactModeTest extends DrupalWebT
+class IndexPhpTest extends DrupalWebTestCase {

Class name should start with "System". See #899444: Declutter "System" test group by applying coding standards fixes to common.test

+++ modules/system/system.test	17 Sep 2010 19:57:54 -0000
@@ -1883,3 +1883,36 @@ class CompactModeTest extends DrupalWebT
+    $this->update_url = $GLOBALS['base_url'] . '/index.php';

How does "update_url" relate to the variable's value?

+++ modules/system/system.test	17 Sep 2010 19:57:54 -0000
@@ -1883,3 +1883,36 @@ class CompactModeTest extends DrupalWebT
+   * Test index.php handling. ¶

Trailing white-space.

Powered by Dreditor.

marcvangend’s picture

Status: Needs work » Needs review
FileSize
3.15 KB

Sun, thanks for your thorough review (as always). New patch attached.

I fixed all of your points except the renaming comment. The possibility to rename index.php was available in D6, so it is considered regression that D7 didn't allow this. See #906032: renaming index.php results in 404 error for the actual bug report. (Renaming into update.php is obviously not possible because that file already exists - but you knew that :-)) You do have to read it together with the line before:

    // If the path equals the script filename, the front page is served.
    // Using basename() here tolerates renaming index.php.

I'm not sure how to make that line any clearer, and I don't want to start a list here of possible reasons why people would want to rename index.php. Can you try to explain what isn't clear? Would the following be better?

    // If the path equals 'index.php', the front page is served.
    // Using basename() here tolerates renaming index.php.
david3605’s picture

When will this fix be added to the Drupal beta installation? The status is still set to 'needs review'. Thanks - David

marcvangend’s picture

David, that status is "needs review" because... it needs review! I have done my best to make this work, but Drupal is a open source project driven by volunteers and I cannot force anyone to review and commit this. As in any organization, resources are limited. If the community gives more priority to other issues, I'm afraid you can either respect that, or actively find someone to review the patch and make this RTBC.

cosmicdreams’s picture

Marc, I can test tomorrow night (Wednesday), should I test anything different than the last test I did for you?

marcvangend’s picture

cosmicdreams, thanks.

The tests and their expected results are:
- index.php returns the front page
- index.php?q=user returns the 'user' page
- index.php?page=3 returns the front page with the correct sub page (generate nodes to make sure that page 3 exists)
- index.php/user returns 404
The above must be true when Drupal is installed in the site root (www.example.com/index.php) and in a subdirectory (www.example.com/drupal/index.php). Obviously, url's that currently work (like www.example.com/user/) must not break.

Of course there is more to reviewing than just testing, as Sun showed in #80. The code also needs a review for quality, standards and documentation.

cosmicdreams’s picture

this will be my second test target for tonight. Thanks marcvangend for the added direction on the test.

cosmicdreams’s picture

Starting test this post will update as I progress...

Pre-Test: Generated 50 new page nodes so that i could use theme later in the test.

Test 1: Before the patch

/index.php : page not found
/index.php?q=user : goes to user page (matches expectation)
/index.php?page=3 : page not found
/index.php/user : page not found

Test 2:
* applied patch
* flushed cache
* turned off all 3rd party modules
* ran cron
* logged out.

/index.php : front page
* went to node/14, tried /index.php again : front page
/index.php?q=user : user login page ( matches expectation)
/index.php?page=3 : third page of front page

So all test returned expected results. I'd say this is an very good improvement that brings d7 url arguments in line with historical expectations.

cosmicdreams’s picture

+1 for RTBC, but I'd prefer for a code commander to weigh in wether to code meets standards or not.

marcvangend’s picture

Status: Needs review » Reviewed & tested by the community

cosmicdreams, thanks for testing and for the excellent test report.

Based on the facts that
- test results (automated and manual) are positive,
- chx reviewed the logic and already declared this RTBC before,
- and the code has been cleaned up, inspired by Sun's feedback,
I feel permitted to declare my own patch RTBC.

cosmicdreams’s picture

I second the motion!

Dries’s picture

Status: Reviewed & tested by the community » Needs work
+++ includes/bootstrap.inc	24 Sep 2010 20:24:03 -0000
@@ -2430,6 +2430,8 @@ function language_default($property = NU
+ * - http://example.com/index.php returns an empty string (meaning: front page)
+ * - http://example.com/index.php?page=1 returns an empty string

These sentences miss a trailing dot/point.

+++ includes/bootstrap.inc	24 Sep 2010 20:24:03 -0000
@@ -2451,11 +2453,17 @@ function request_path() {
+    // Using basename() here tolerates renaming index.php.

I don't understand this code comment. Let's try to clarify it.

Plus, I'd describe why we need the actual check.

+++ modules/system/system.test	24 Sep 2010 20:24:03 -0000
@@ -1884,3 +1884,36 @@ class CompactModeTest extends DrupalWebT
+  function setUp() {
+    parent::setUp();
+    $this->index_php = $GLOBALS['base_url'] . '/index.php';
+  }
+

I'd personally move the initialization of index_php from setUp() to testIndexPhpHandling(). It's not a given that we'd want to re-use that variable in other test functions IMO.

marcvangend’s picture

Status: Needs work » Needs review
FileSize
3.18 KB

Thanks Dries. New patch attached.

cosmicdreams’s picture

Looks like this patch will need testing. I'll put this on my list for tonight.

cosmicdreams’s picture

Status: Needs review » Reviewed & tested by the community

Yep same as before

Dries’s picture

+++ includes/bootstrap.inc	27 Nov 2010 21:49:10 -0000
@@ -2514,11 +2516,18 @@ function request_path() {
+    // Do not check for 'index.php' explicitly, but use basename() to allow
+    // renaming of index.php.

Mmm, but what is the use case for renaming index.php? Is it worth having slower code for that?

marcvangend’s picture

Dries: david3605 noted in #906032: renaming index.php results in 404 error that it used to be possible in D6, but it's not in D7. That's why it was decided to fix that (small) regression in this patch. Personally, I don't have a use case for renaming index.php (maybe installing Drupal in the same directory as another index.php, although I would never do that). I do not deeply care about this feature and IMO we could sacrifice it for performance.

Out of curiosity, I ran a quick test to see the difference between comparing with basename() and comparing with a fixed string:

$repeat = 1;
$string = 'index.php';
$i = 0;
$ii = 1;

$start_i = microtime();
while ($i < $repeat) {
  if ($string = 'index.php') {
    $i++;
  }
}
$end_i = microtime() - $start_i;
print "$repeat string comparison(s) took $end_i ms.<br>\n";

$start_ii = microtime();
while ($ii < $repeat) {
  if ($string = basename($_SERVER['PHP_SELF'])) {
    $ii++;
  }
}
$end_ii = microtime() - $start_ii;
print "$repeat basename() comparison(s) took $end_ii ms.<br>\n";

Running the test multiple times, the results were always similar to this:

1000000 string comparison(s) took 0.163939 ms.
1000000 basename() comparison(s) took 0.376968 ms.

This looks like performance impact would be minimal, even on a 3-year old laptop like mine.

Dries’s picture

Personally, I'm not convinced we should fix every regression. It's OK for some behaviors to change. To me, this looks like one of them. To the best of my knowledge, index.php could never be removed intentionally. Happy to discuss this more though.

marcvangend’s picture

Actually, David (who works for Adobe, it seems) clearly describes his use case for renaming index.php in the original issue (#906032: renaming index.php results in 404 error):

This is breaking the new CMS functionality in Adobe Dreamweaver CS5 where Dreamweaver is able to discover all of the dynamic include files that make up a Drupal site. Dreamweaver does this by creating a temporary copy of index.php (e.g. index-random-string.php) and adding some introspection code to the copy.

While that sounds as a valid use case, you could also argue that it's the world upside down if we start writing code to please IDE's.

Samion’s picture

#9: index_php_404_9.patch queued for re-testing.

marcvangend’s picture

@Samion: Any specific reason to re-test that patch? If we decide to check for index.php instead of basename($_SERVER['PHP_SELF']), let's just edit the patch from #92 which includes simpletests.

mgifford’s picture

Ok, D7 is out. I just noticed this bug. How to deal with this uni-intentional behavior change?

Is there a quick work around we can put into the .htacess file or in an Apache RewriteRule?

EDIT: This will have SEO implications for sites migrating to D7.

marcvangend’s picture

mgifford, you can try the patch from #54. I think it should work (at least in a Linux/Apache environment) but I can't give any guarantees.

I stopped investing time in this issue a while ago because not many people seemed to care, but I still think it would be good to get #92 reviewed and committed... your help is appreciated.

mgifford’s picture

The quick solution that we came up with was to create a URL alias that pointed to node.

However, I'd really like to see some sensible solution like #54 get into the next maintenance release. Seems like #92 is RTBC so I'm wondering if this will get in the next maintenance release.

Shyin’s picture

I have used Drupal for over two years on a gaming website that my daughter and I designed and built. I went back to college for IT / Web Design not long after putting my site up. I am just about to finish my fourth semester. I have already set up my summer and fall courses. My daughter started college last year and is just about to finish her second semester. She is studying Computer Arts. She has her fall courses set up. Like most colleges, ours teaches using the Adobe product line. I have two summer courses set up, one for Adobe Dreamweaver and Adobe Flash. In fact this fall we will be taking our first course together, an Adobe Photoshop course. I am personally taking two Photoshop courses this fall. A few weeks ago we decided to change the direction and goal of our website. Since we weren’t worried about saving any of the old website, we decided that we may as well start with a clean install of Drupal 7. We are both using Adobe Creative Suite 5 Web Premium and we would love to continue to be able to use Drupal in our learning environment, but this issue makes that fairly impossible right now. We're not sure what to do. We really love Drupal and want to continue using it. If you could find a solution, I’m sure many students just like us would be more than appreciative. Here’s hoping this issue gets resolved fairly soon.

mgifford’s picture

My solution in 103 works as a quick fix for anyone. See http://openconcept.ca/index.php

It's a hack. Better to fix it in php or .htaccess, but it definitely works.

marcvangend’s picture

mgifford, thanks for sharing your workaround. I'm not sure though if it will help the Adobe IDE problem.

mgifford’s picture

I've been doing web development now professionally for a decade without Adobe products. It's certainly possible to learn to go without it.

But ya, it's not likely to help the problem with Adobe's IDE. I do think it's in everyone's best interest to get this resolved though.

David_Rothstein’s picture

Priority: Normal » Major

It seems this patch is the solution for #787426: "Page not found" after new install and other broken links on IIS 5.1 & 6 (which I marked as a duplicate), so I'm upping the priority of this issue.

I'll trigger a retest also (since the patch is so old now).

David_Rothstein’s picture

#92: handle-index-php-711650-92.patch queued for re-testing.

rfay’s picture

Version: 7.x-dev » 8.x-dev
Status: Reviewed & tested by the community » Needs review

D8 first. Let's get these fixed on D8 and into D7.

marcvangend’s picture

rfay: I beg to differ. This is not a feature request, it's a bug report. It's a regression compared to D6, and IMHO it could have been fixed a long time ago. Fixing it now in D7 will not hurt anyone.

David_Rothstein’s picture

Status: Needs review » Reviewed & tested by the community

Well, we have to fix it in D8 first and then backport it (it won't be committed to D7 without that happening first).

But since it was RTBC for D7 I don't see any reason why it shouldn't be RTBC for D8 also... the two codebases are pretty close to identical right now. (Note: I only skimmed the patch myself so this isn't me RTBC-ing it personally, just restoring it to its previous state :)

rfay’s picture

Issue tags: +Needs backport to D7

Marking "needs backport"

mgifford’s picture

@David_Rothstein Agreed!

It passed it's last test yesterday against Drupal 8.

This should be fixed as part of the next D7 maintenance release.

marcvangend’s picture

I can confirm it's RTBC. I did a manual test, applying the patch from #92 to today's 8.x-dev, installed with the standard profile. All tests described in #85 produce the expected result.

Dries’s picture

Version: 8.x-dev » 7.x-dev
Issue tags: -Needs backport to D7

@webchick; if you want to apply this patch to Drupal 7, feel free to.

I have no intention to support this edge in Drupal 8. Happy to work with Adobe on a cleaner solution though.

David_Rothstein’s picture

Version: 7.x-dev » 8.x-dev
Issue tags: +Needs backport to D7

Did you see the link above to #787426: "Page not found" after new install and other broken links on IIS 5.1 & 6? This is definitely a legitimate bug.

It sounds like the part you object to is this:

+    // Do not check for 'index.php' explicitly, but use basename() to allow
+    // renaming of index.php.

But that's tangential to the main point of the patch.

Dries’s picture

I had not read up on #787426: "Page not found" after new install and other broken links on IIS 5.1 & 6.

Should we write something like?

 // Do not check for 'index.php' explicitly, because not all web
 // servers append 'index.php' (i.e. Microsoft IIS) and in some
 // cases 'index.php' is renamed (i.e. Adobe tools).

Or is that not an accurate summary?

marcvangend’s picture

Dries, I think that sums it up nicely, although (minor-minor-minor nitpick!) I guess that should be "e.g." ("for example") instead of "i.e." ("that is").

If I could re-roll the patch on my cellphone, I would be doing it right now...

David_Rothstein’s picture

Title: Handle index.php in the URL » When index.php appears in the URL (or is automatically added by the server) users get a "page not found" message

My understanding is that the comment is not quite correct, since the original goal of this issue ("handle index.php in the URL") as well as the more serious bug at #787426: "Page not found" after new install and other broken links on IIS 5.1 & 6 could both be fixed by hardcoding a check for "index.php" rather than using basename().

However, I don't have IIS to test that, and we already have confirmation that the existing patch fixes IIS. Plus, although the Adobe thing really isn't related to this particular issue, I'd vote for just sticking with basename() anyway, given that we saw above that it's a miniscule performance difference and overall it seems like it makes the code cleaner and more robust compared to hardcoding the string "index.php"...

But it seems like it would be preferable for the code comment not to refer to that edge case. How about this?

    // If the path equals the script filename, either because 'index.php' was
    // explicitly provided in the URL, or because the server added it to
    // $_SERVER['REQUEST_URI'] even when it wasn't provided in the URL (some
    // versions of Microsoft IIS do this), the front page should be served.
    if ($path == basename($_SERVER['PHP_SELF'])) {
      $path = '';
    }
mgifford’s picture

Sounds pretty clear to me David, thanks! Hopefully we can get this into D7 & D8.

Dries’s picture

David's suggestion works for me. Some people might be confused on the $_SERVER['REQUEST_URI'] vs $_SERVER['PHP_SELF'] thing. Let's make a patch! :)

marcvangend’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
2.9 KB

My first git patch, I hope it works :-)

marcvangend’s picture

Status: Needs review » Reviewed & tested by the community

Tests passed, only comments were changed according to #121, so back to RTBC.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to 7.x and 8.x. Thanks marcvangend and cha0s -- congrats on your first patch.

marcvangend’s picture

Thanks Dries! Even though it wasn't my nor cha0s' first patch, I appreciate you cheering for minor contributors like myself :-)

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Jumoke’s picture

Version: 8.x-dev » 7.4

I have version 7.4 installed and i still get not-found when i navigate to mysite/index.php. Any ideas?

Jumoke’s picture

Status: Closed (fixed) » Active
Jumoke’s picture

I still get the not-found error on my site with patch in (i am using 7.4). Is there something else i may be doing wrong?

marcvangend’s picture

Version: 7.4 » 8.x-dev
Status: Active » Closed (fixed)

Excalibur: please open a new issue, not everyone involved here will be interested to follow your specific problem being solved. In your new issue, post as much information as possible, like the name and version of your database, PHP, webserver, OS, contrib modules and everything else that might be relevant. After that, you can post a link to the issue in this thread. Thanks.

emb03’s picture

I know this post is closed but it is one of the first links to show in Google when you google this issue. I couldn't get any of the patches to work but I found this code which did work:

http://api.drupal.org/api/drupal/includes%21bootstrap.inc/function/request_path/7

It works in DW CS6 live view too.