Note: I have reported this issue to the drupal security team and was advised to create an issue.

Problem/Motivation

Drupal tries to hide its install files via robots.txt, this does not work when Drupal is installed in a sub folder since search engines only check the root of a domain for robots.txt.

Normally that is only a SEO issue but when the server is configured wrong or the user has started but not finished the installation process leaving the settings.php writeable in place it leads to a security issue on the site. A malicious user could install Drupal using an external db server and then gain php access to the server.

This is not a security issue of Drupal since it needs a mistake of the server admin or maintainer but we could save people from being hacked when we hide their vulnerability. You can find vulnerable sites using Google. I wont post the links or sites here in public but send links via mail on request. Total one can find a few dozen sites easily.

Proposed resolution

We cannot protect the user from all possible follies but the vulnerable sites shouldn't be searchable via a search engine. Google advises to use the noindex meta tag or x-robots-tag to make sure a site is not indexed:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449
Here is an overview about the support of other search engines regarding the two tags:
http://michaeljaylissner.com/blog/support-for-x-robots-tag-http-header-a...

Attached is a patch that adds the noindex meta tag. I chose the meta tag way since it has the best support among the SE although http would be more flexible for other content types than html and needs fewer LOC. Both ways work and prevent the site from being indexed. I have setup test sites to validate the patch here:
http://d7-dev-test.a7n.de/
As you can see only the unpatched site appears in google:
https://www.google.de/search?q=site:d7-dev-test.a7n.de

Remaining tasks

Review patch.

User interface changes

None.

API changes

None.

Follow-up issues

Maybe unmake settings.php writeable afer 24 hours automatically.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

s.Daniel’s picture

Status: Active » Needs review
FileSize
816 bytes

One more try.

Status: Needs review » Needs work
Issue tags: -Security

The last submitted patch, install.php-noindex-meta-tag-1760330-1.patch, failed testing.

s.Daniel’s picture

Status: Needs work » Needs review
Issue tags: +Security
cosmicdreams’s picture

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93710 provides a more concise description.

Please namespace the key for the installer. The key currently used could easily be trampled on by an unsuspecting dev.

s.Daniel’s picture

Issue tags: +Novice
FileSize
849 bytes

Thanks, I changed the variable and key name to be more descriptive.

sun’s picture

Status: Needs review » Needs work

Great work! :)

+++ b/core/includes/install.core.inc
@@ -751,6 +751,15 @@ function install_full_redirect_url($install_state) {
+  // Prevent install.php from being indexed when installed in a subfolder.

1) This already describes what is being done to some extent, but it would be great to have some additional explanation for why this is needed in this comment here (and e.g. cannot be achieved via robots.txt); i.e., let's transfer some more details from the issue summary into the code comment.

2) Minor: Can we add a blank line before and after this code block?

+++ b/core/includes/install.core.inc
@@ -751,6 +751,15 @@ function install_full_redirect_url($install_state) {
+  drupal_add_html_head($noindex_meta_tag, 'noindex_nofollow_install_drupal');

Let's rename the key to 'install_meta_robots'

s.Daniel’s picture

Status: Needs work » Needs review
FileSize
1.12 KB

Thanks for the review. :)

I made the proposed corrections.

sun’s picture

Status: Needs review » Reviewed & tested by the community

Thanks! :)

droplet’s picture

Status: Reviewed & tested by the community » Needs work
+++ b/core/includes/install.core.incundefined
@@ -751,6 +751,21 @@ function install_full_redirect_url($install_state) {
+      'content' => 'noindex, nofollow',

why nofollow ? I don't mean any SEO issue. but for this thread purpose, should not here ?

sun’s picture

Status: Needs work » Reviewed & tested by the community

I guess nofollow is there to prevent the crawler from following any links on the page?

Unless I'm mistaken, noindex only affects the page it is on, but doesn't affect any subsequent resources that can be reached from that page.

In any case, having both is safer, and I don't see a reason to hold up this patch in case noindex might semantically be unnecessary.

s.Daniel’s picture

Unless I'm mistaken, noindex only affects the page it is on, but doesn't affect any subsequent resources that can be reached from that page.

Right. For all other internal install pages I see no reason to have google crawel them so I think it's better to not let google follow the links.

We could use the first install page for SEO purpose and link to d.o. Then we could consider removing the nofollow and put it on individual internal links. However from all I know that might as well be treated as link spam by Google creating a negative effect and will in any case have no big positive effect since the drupal install sites are usually new with low page rank and link value. That would be a different issue with a lot of things to discuss though.

catch’s picture

Status: Reviewed & tested by the community » Patch (to be ported)
Issue tags: +Needs backport to D7

This seems reasonable. It's silly to have vulnerabilities you can just google for, a bit like the PHP filter.

Committed/pushed to 8.x, seems like we could backport this to D7?

jbrown’s picture

Is it possible to test the installer?

jfhovinne’s picture

Version: 8.x-dev » 7.x-dev
Status: Patch (to be ported) » Needs review
FileSize
829 bytes

Here is the patch for D7.
Tested installation with patch applied - install works as expected and new meta tag is correctly added to the install pages.

<meta name="Generator" content="Drupal 7 (http://drupal.org)" />
<meta name="robots" content="noindex, nofollow" />
jfhovinne’s picture

Issue summary: View changes

typo

parthipanramesh’s picture

Issue summary: View changes
Status: Needs review » Reviewed & tested by the community

Tested. Looks fine to me and this is a good approach to solve the "issue".

David_Rothstein’s picture

Why is the backported D7 patch so different from the D8 patch (including missing all those code comments)?...

The D8 patch literally applies directly to D7 (just without the core/ directory) and seems to work fine, so I'm posting that instead and will commit it if it passes tests.

David_Rothstein’s picture

Status: Reviewed & tested by the community » Fixed
Issue tags: +7.25 release notes

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.