I read a good approach on how to manage robots.txt in this post:
http://drupal.org/node/22265#comment-98197
I followed the directions and damned if it didn't break my cron job. Really.
I had a problem and posted a reply there, but I think the thread may have run dry. So I'll risk being redundant in hopes of getting some experienced insights on the subject.
Background
We're using multi-site, single code base with clean URLs.
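For anyone unfamiliar with how clean URLs come into play here: Drupal's stock .htaccess rewrites any request that doesn't match a real file or directory into index.php. The exact rules vary by Drupal version, so treat this as a sketch:

```apacheconf
# Sketch of Drupal's clean-URL rewrite (approximate; varies by version).
RewriteEngine on
# Only rewrite when the request is NOT an existing file or directory...
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# ...everything else gets handed to Drupal as ?q=<path>.
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
```

Which is why an aliased robots.txt page only answers once the physical robots.txt file is out of the way, and why a request for an existing file like cron.php should normally never reach Drupal at all.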
Disclaimer: my coding skills are very rusty. Sure, I used to use vi, but that was over 10 years ago, and I don't know PHP. So maybe it's just something stupid (it usually is, right? :-).
I created a "page", turned off rich text, and set the input format to PHP. Then I pasted the following into it (without the code tags, of course):
<?php
// Serve the page body as plain text instead of themed HTML.
header('Content-type: text/plain');
?>
User-agent: *
Crawl-delay: 10
Disallow: /tracker
Disallow: /comment/reply
Disallow: /node/add
Disallow: /user
Disallow: /search
Disallow: /book/print
<?php
// Stop Drupal from rendering the rest of the page.
die();
?>
I gave it a path alias of robots.txt and hit save. Then when I navigate to mydomain.com/robots.txt I see the appropriate robots text just fine.
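As a side note, one quick way to sanity-check that the body being served really parses as valid robots rules is Python's stdlib parser. This is just a local check, not part of the Drupal setup; the rules are copied from the page body above:

```python
# Sanity-check the robots rules with Python's stdlib parser.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Crawl-delay: 10
Disallow: /tracker
Disallow: /comment/reply
Disallow: /node/add
Disallow: /user
Disallow: /search
Disallow: /book/print
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "/tracker"))   # disallowed -> False
print(rp.can_fetch("*", "/node/123"))  # not a disallowed prefix -> True
print(rp.crawl_delay("*"))             # 10
```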
But I started getting cron errors right after that. I tried navigating to mydomain.com/cron.php and, what do you know, it showed me the robots.txt text.
Yet when I looked at cron.php on the file server, it was there and it was correct.