Hi,

I've just installed the module, enabled it and configured it. Then I moved the old static robots.txt away. If I access robots.txt now, it displays the frontpage of my site instead of the configured robots.txt. Clean URLs are enabled. Am I missing something?

best regards,
Oliver

Comment  File            Size     Author
#23      robots_404.png  4.01 KB  hass

Comments

avpaderno’s picture

Title: robots.txt is not displayed, instead the frontpage of my site is shown » Error in the declaration of the menu

The error is caused by the wrong declaration of the menu for robots.txt. If you want to make it work, simply change the robotstxt_menu() function to the following code:

function robotstxt_menu() {
  $access_config = array('administer site configuration');
  $access_content = array('access content');

  $items['robots.txt'] = array(
    'page callback' => 'robotstxt_robots',
    'access arguments' => $access_content,
    'type' => MENU_CALLBACK,
  );
  $items['admin/settings/robotstxt'] = array(
    'title' => 'RobotsTxt',
    'description' => 'Manage your robots.txt file.',
    'page callback' => 'drupal_get_form',
    'page arguments' => array('robotstxt_admin_settings'),
    'access arguments' => $access_config,
  );

  return $items;
}

The difference is that 'access callback' is changed to 'access arguments'. The former defines the callback used to check whether a user has access to the menu item (by default user_access()), while the latter defines the arguments passed to that function.

Without that correction the module doesn't work, as it uses the value TRUE while Drupal expects a string.

magenbrot’s picture

Hi,

thanks for the answer, but it doesn't work. I still get the frontpage of the website instead of the configured robots.txt.

regards,
Oliver

avpaderno’s picture

To see it working, you need to disable and re-enable the module.
If you change a module's menu declaration by editing the code manually, Drupal will not update the definition it stores in its own database tables. To force Drupal to rebuild its list of menu items, go to the modules list page, then disable and re-enable the module whose menu definition you changed; it also works if you just go to that page and click save.

hass’s picture

Status: Active » Closed (works as designed)

The code in #1 is wrong. We do not make robots.txt depend on 'access content'. robots.txt must always be shown, even on sites that have content access disabled. We had an issue about this ~1-2 years ago, while the D6 version was in early development and I had missed this requirement.

See http://drupal.org/node/109157 and jump to

We support 'access callback' => TRUE (and FALSE of course).

This is the current and correct code:

  $items['robots.txt'] = array(
    'page callback' => 'robotstxt_robots',
    'access callback' => TRUE,
    'type' => MENU_CALLBACK,
  );
hass’s picture

Category: bug » support
Status: Closed (works as designed) » Postponed (maintainer needs more info)

@magenbrot: Are you using any type of language detection? Moving this to support first, until we are sure what's going wrong.

hass’s picture

Title: Error in the declaration of the menu » robots.txt is not displayed, instead the frontpage of my site is shown
avpaderno’s picture

Every time I used 'access callback' => TRUE, all I got was that the menu item was not available.
To make it appear, I changed the line to 'access arguments' => TRUE.

I cannot say if it's a bug in Drupal, or a bug in Drupal documentation, but I can say that 'access callback' => TRUE doesn't work.

hass’s picture

Very strange that robotstxt module works for me on my 6.8 box and all previous versions... never seen this problem myself. Could you investigate if this is a menu caching issue?

avpaderno’s picture

I checked the core code which handles the menu callbacks, and the code is the following (the code is truncated to the first lines):

function _menu_check_access(&$item, $map) {
  // Determine access callback, which will decide whether or not the current
  // user has access to this path.
  $callback = empty($item['access_callback']) ? 0 : trim($item['access_callback']);
  // Check for a TRUE or FALSE value.
  if (is_numeric($callback)) {
    $item['access'] = (bool)$callback;
  }

Therefore passing TRUE for 'access callback' in the menu callback definition should work as expected (and as the documentation states).
I also changed the menu callback definition of a module I use on my test site to use TRUE for 'access callback', and it works.
If there is a problem, then it is not caused by the menu callback definition.

hass’s picture

Thank you for investigating this. Good to know code is correct.

avpaderno’s picture

I installed the module on my test site, and I am able to see the robots.txt file.
It is surely not a bug in the module, or I would have the same issue magenbrot has.

hass’s picture

Status: Postponed (maintainer needs more info) » Closed (won't fix)

Cannot repro, no feedback from magenbrot. It must be menu path caching, or update.php was not executed cleanly in site maintenance mode.

magicyril’s picture

I'm having the same issue.

It seems to come from the dot in the menu item path: it's annoying! If I change it to robots_txt it works, but if I keep robots.txt nothing happens, and the frontpage is displayed.

hass’s picture

Clear your caches... try to disable and re-enable the module... also check your webserver logs... this may be an Apache mod_security issue or something like this. On my Apache 2 and IIS 5.1 it works well for sure.

magicyril’s picture

I cleared the cache, uninstalled and reinstalled the module, and rebuilt the menu via devel. The default robots.txt file has been renamed, but I still get the frontpage instead of a robots file.
I checked my log:
"Negotiation: discovered file(s) matching request: //robots.txt (None could be negotiated)."

magenbrot’s picture

Hi,

I'm sorry for the late response.

I've just upgraded to the new release from 2009-Mar-07. It was a fresh install: I disabled the module and removed the database entries by uninstalling it.
This is what I've done:
1. Downloaded the module and untarred it to sites/all/modules
2. chowned user and group to apache
3. put the 3 sites which will use this module in maintenance mode
4. enabled the module in the administration interface for all sites
5. configured the module for all sites
6. disabled maintenance mode
7. Tried to access the robots.txt -> It displays the frontpage of the sites

Regarding your earlier question: language detection is disabled on the site. But all 3 sites have German as the default language.

I've disabled the module on my mainpage again, but you can have a look at http://test.magenbrot.net/robots.txt and http://www.ovtec.de/robots.txt to see the results.

Let me know, if I can help further.

regards,
Oliver

hass’s picture

@magenbrot: No need to go into maintenance mode for install/uninstall. This is only required for upgrades. I checked your server and it gives me a plain 404 error and then the site shows me the front page! So it's safe to say that you may have cluttered your installation - this is not the standard Drupal behaviour!

Some questions:
1. Have you installed any 404 error handler modules like CustomError module?
2. Have you deleted the robots.txt in the website's root folder?
3. Are you using the standard .htaccess from Drupal or have you altered the file?

I wonder why http://test.magenbrot.net/?q=robots.txt works well! Please also provide a list of all enabled modules. Try disabling them yourself one by one, until the issue goes away...

There are a few possible reasons:
1. Drupal menu caching issue (I'm not aware of such a bug)
2. Apache mod_rewrite rules are wrong
3. A bad module is installed (I expect this is your issue)
4. Application firewalls in action

It's soooo time wasting to analyze this issue... If the above gives you no idea, feel free to contact me for a remote session on your server/site.

Please report back your findings.

hass’s picture

A quick look at the customerror module's source code and a request to the customerror path on your site showed me that you have customerror installed. Please disable it as the very first step.

hass’s picture

Project: RobotsTxt » Customerror
Category: support » bug
Status: Closed (won't fix) » Active

Moving to customerror issue queue.

magicyril’s picture

I don't have customerror installed on my website and I still get the frontpage instead of the robots.txt file. It's not a customerror issue.

magenbrot’s picture

Category: bug » support
Status: Active » Closed (won't fix)

@hass: where did you get this 404-error? I can't reproduce this error here.

regarding your questions:
1. As you already noticed, I have customerror installed. I just disabled all installed modules on http://test.magenbrot.net except for robots.txt, and robots.txt is still not displayed.
2. I've renamed the old robots.txt to robots.txt.old in the webroot.
3. I'm using the standard .htaccess, but I have some RewriteRules in the Apache config in conf.d:

RewriteEngine On
RewriteLog logs/magenbrot.net-rewrite_log
RewriteLogLevel 1

RewriteCond %{HTTP_HOST} !^www\.magenbrot\.net [NC]
RewriteRule ^/(.*) https://www.magenbrot.net/$1 [L,R=301]

# deactivate TRACK and TRACE
RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK)
RewriteRule .* - [F,L]

the following modules are enabled for my sites:
AdSense with:
AdSense Click Tracking
AdSense core
Managed ads

Amazon API

Core with:
Blog
Color
Comment
Contact
Database logging
Help
Locale
Menu
OpenID
Path
PHP filter
Ping
Profile
Search
Statistics
Syslog
Taxonomy
Tracker
Trigger
Update status
Upload

Counter

Image

Advanced help
Better Formats
Comment Subscribe
Custom Error
FCKeditor
Forward
IMCE
Lightbox2
PROG Gallery
robots.txt

CAPTCHA
Image CAPTCHA

Google Analytics

Views
Views UI

XML Sitemap
XML Sitemap: Engines
XML Sitemap: Node

magenbrot’s picture

Project: Customerror » RobotsTxt
Category: support » bug
hass’s picture

Status  File            Size
new     robots_404.png  4.01 KB

If you enter http://test.magenbrot.net/robots.txt in Firefox and take a look at Firebug, you see the 404 error (see attached). The page you see should come from CustomError! I expect that CustomError is the source of this issue and I will try to test this later on my own box. Since you have a test site: are you able to install a plain vanilla Drupal 6 with only the robotstxt module enabled, please? Then switch on the other modules step by step and see when it breaks, or go the other way and turn all your modules off step by step. Also please remove your robots.txt completely from the website root or rename it to "robots.rem" - not sure if this helps... but this is how I run my sites. Make sure you use only the default .htaccess files...

I'm also running many of the modules you have installed... but do not have:

AdSense with:
AdSense Click Tracking
AdSense core
Managed ads
Amazon API
Better Formats
Comment Subscribe
Custom Error
Counter
Forward
IMCE (not sure if active, but on my box)
Lightbox2 (not sure if active, but on my box)
PROG Gallery
CAPTCHA
Image CAPTCHA

magenbrot’s picture

omg... I've found it...
In the Apache config, "MultiViews" is enabled for all of my sites. I don't know if this was configured by me or if it is enabled by default.

This is from the apache documentation:

----------------------------
A MultiViews search is enabled by the MultiViews Options. If the server receives a request for /some/dir/foo and /some/dir/foo does not exist, then the server reads the directory looking for all files named foo.*, and effectively fakes up a type map which names all those files, assigning them the same media types and content-encodings it would have if the client had asked for one of them by name. It then chooses the best match to the client's requirements, and returns that document.

The MultiViewsMatch directive configures whether Apache will consider files that do not have content negotiation meta-information assigned to them when choosing files.
----------------------------
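
The matching rule in that quote can be sketched roughly in Python. This is only a simplified model of the documented behaviour, not Apache's actual code; the function name `multiviews_candidates` and the file lists are made up for illustration.

```python
def multiviews_candidates(requested_name, directory_listing):
    """Return the files a MultiViews search would look at for a missing file.

    Simplified model of the documented rule: if `requested_name` does not
    exist, every file named `requested_name.*` becomes a candidate.
    """
    if requested_name in directory_listing:
        # The file exists, so no MultiViews search takes place.
        return []
    prefix = requested_name + "."
    return sorted(name for name in directory_listing if name.startswith(prefix))


# Hypothetical web root after renaming the static file to robots.txt.old:
# the renamed file is still found when /robots.txt is requested.
webroot = ["index.php", ".htaccess", "robots.txt.old"]
print(multiviews_candidates("robots.txt", webroot))

# Hypothetical web root after renaming it to something completely different:
# nothing matches robots.txt.* any more.
webroot = ["index.php", ".htaccess", "old-robots-file"]
print(multiviews_candidates("robots.txt", webroot))
```

In the first case Apache still finds a candidate for the request, which would probably explain the "None could be negotiated" log entry quoted earlier in this thread: a file matches, but it cannot be served, and the request never reaches Drupal's menu callback as a normal missing file would.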

So it did not help to simply rename robots.txt to robots.txt.old or robots.rem. I had to rename it to something completely different. Now it works as it should. To make sure this never happens again, I've disabled the MultiViews option in my Apache configs.
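
A minimal sketch of what disabling the option can look like, assuming the site is configured through a `<Directory>` block in the Apache config. The directory path is a placeholder, not taken from the real setup:

```
# Hypothetical web root path; adjust it to your own setup.
<Directory "/var/www/example">
    # Switch off MultiViews searches for this directory.
    Options -MultiViews
</Directory>
```

Apache has to be reloaded for a change in the server config to take effect.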

Thanks for all your efforts in helping me (and sorry for taking up your time)! I think this is something to note in the README.txt.

regards,
magenbrot

vm’s picture

Category: bug » support
Status: Closed (won't fix) » Fixed
hass’s picture

Title: robots.txt is not displayed, instead the frontpage of my site is shown » Apache with MultiViews configured: robots.txt is not displayed, instead the frontpage of my site is shown

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.