Hi,

Does anyone know anything about the behaviour of the MSN search engine bot? Compared to other bots, it has sucked about 20 times as much traffic from my site and apparently has great spidering ambitions, but my site is BARELY listed when I search at search.msn.com. I have read that it pretty much ignores robots.txt settings and does not behave nicely.

I thought we could share a few words about it, because I, at least, do not feel good seeing the Microsoft bot sucking traffic from my site without any obvious reason.

Jan

--
StandyWorld.org - Don't let M$ bastard the standards!
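For anyone who wants to check a figure like "20 times as much traffic" against their own logs, here is a minimal sketch that sums the bytes served per bot. It assumes the Apache "combined" log format; the sample lines, the `BOTS` tuple and the `bytes_per_bot` helper are made up for illustration and are not from Jan's site.

```python
import re
from collections import Counter

# Hypothetical sample lines in Apache "combined" log format (an assumption;
# adjust the pattern if your server logs in a different format).
SAMPLE_LOG = """\
192.0.2.10 - - [10/Oct/2005:13:55:36 +0200] "GET /node/1 HTTP/1.1" 200 2326 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
192.0.2.10 - - [10/Oct/2005:13:55:40 +0200] "GET /node/2 HTTP/1.1" 200 5000 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
192.0.2.11 - - [10/Oct/2005:13:56:02 +0200] "GET /node/3 HTTP/1.1" 200 1000 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
192.0.2.20 - - [10/Oct/2005:13:57:10 +0200] "GET /node/1 HTTP/1.1" 200 400 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
192.0.2.30 - - [10/Oct/2005:13:58:00 +0200] "GET / HTTP/1.1" 200 900 "-" "Mozilla/5.0 (X11; Linux i686) Firefox/1.0"
"""

# Matches: "request" status bytes "referer" "user-agent" at the end of a line.
LINE_RE = re.compile(r'"[^"]*" \d{3} (\d+|-) "[^"]*" "([^"]*)"\s*$')

# Substrings used to recognise each bot in the User-Agent header.
BOTS = ("msnbot", "googlebot", "slurp")


def bytes_per_bot(lines):
    """Return a Counter mapping bot name to total response bytes."""
    totals = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        size, agent = match.groups()
        agent = agent.lower()
        for bot in BOTS:
            if bot in agent:
                totals[bot] += 0 if size == "-" else int(size)
                break
    return totals


totals = bytes_per_bot(SAMPLE_LOG.splitlines())
for bot, total in totals.most_common():
    print(bot, total)
```

To run it on a real log, pass an open file object instead of `SAMPLE_LOG.splitlines()`. Matching on the User-Agent string only counts bots that identify themselves honestly.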

Comments

frjo

Have you tried adding "Crawl-delay" to your robots.txt? Crawl-delay instructs bots that support it to wait a number of seconds before retrieving another page from that host.

# Google Search Engine Robot
User-agent: Googlebot
Crawl-delay: 10

# Yahoo! Search Engine Robot
User-agent: Slurp
Crawl-delay: 10

# Microsoft Search Engine Robot
User-agent: msnbot
Crawl-delay: 10

User-agent: *
Disallow: /user
Disallow: /profile
etc...
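
One way to see how a file like the one above is interpreted is to feed it to a robots.txt parser. The sketch below uses Python's standard `urllib.robotparser`; it only shows how that parser reads the rules, and real crawlers may behave differently (Crawl-delay in particular is only honoured by bots that support it). The `SomeOtherBot` name is a made-up placeholder.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above, without the comments and the "etc..." line.
ROBOTS_TXT = """\
User-agent: Googlebot
Crawl-delay: 10

User-agent: Slurp
Crawl-delay: 10

User-agent: msnbot
Crawl-delay: 10

User-agent: *
Disallow: /user
Disallow: /profile
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Delay (in seconds) that applies to msnbot according to this parser.
msnbot_delay = parser.crawl_delay("msnbot")

# A bot with its own group is matched against that group only, so the
# Disallow lines under "User-agent: *" do not apply to msnbot here.
msnbot_may_fetch_user = parser.can_fetch("msnbot", "/user")

# A bot without its own group falls back to the "*" group.
other_may_fetch_user = parser.can_fetch("SomeOtherBot", "/user")

print(msnbot_delay, msnbot_may_fetch_user, other_may_fetch_user)
```

Read this way, the named bots get a delay but no Disallow rules. If /user and /profile should be off limits for them too, the Disallow lines probably need to be repeated inside each bot's own group.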

kozuch82

Frjo,

Sure, I have a robots.txt file, well configured I hope. I DISALLOWED MSNBot, and in case it ignores the ban, I've set its Crawl-delay to 100 seconds...

I just read a few topics about MSN Search on Webmasterworld.com, and the opinions on the bot there are not very positive. The worst thing is that nobody really knows how to ban that evil bot except by banning its IPs from your site (but not everyone can do this)...

Jan

--
StandyWorld.org - Don't let M$ bastard the standards!
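
Besides banning IPs, a bot that ignores robots.txt can also be refused by its User-Agent header. The following is only a sketch of that idea as a Python WSGI wrapper; `block_user_agents`, `demo_app` and `status_for` are hypothetical names invented for this example, and on most sites the same check would more likely live in the web server configuration. It does nothing against a bot that lies about its User-Agent.

```python
def block_user_agents(app, blocked=("msnbot",)):
    """Wrap a WSGI app and answer 403 when the User-Agent contains a blocked name."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(name in agent for name in blocked):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everyone else is passed through to the wrapped application.
        return app(environ, start_response)
    return middleware


def demo_app(environ, start_response):
    """Stand-in for the real site (hypothetical)."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


def status_for(agent):
    """Call the wrapped demo app with a fake request and return its status line."""
    seen = []
    app = block_user_agents(demo_app)
    app({"HTTP_USER_AGENT": agent}, lambda status, headers: seen.append(status))
    return seen[0]


print(status_for("msnbot/1.0 (+http://search.msn.com/msnbot.htm)"))
print(status_for("Mozilla/5.0 (X11; Linux i686) Firefox/1.0"))
```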

libre fan

I stop as much M$-BigBrother as I can. I added this rule to my robots.txt:
User-agent: msnbot
Crawl-delay: 1000
Disallow: /

I'm not sure I need the crawl delay if I disallow all crawling.

What's its IP? That's the trouble: M$ has so many servers used as spies, and at least several of them are unknown.
---
Libres-Ailé(e)s (Association for Linux and libre software) (France, Cévennes)
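
On the question of whether the crawl delay is still needed next to "Disallow: /", here is a small check with Python's standard `urllib.robotparser`. It is a sketch of how one compliant parser reads the rule, not a statement about what msnbot actually does; the `allowed_paths` helper and the sample paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rule from the post above.
RULE = """\
User-agent: msnbot
Crawl-delay: 1000
Disallow: /
"""


def allowed_paths(robots_txt, agent, paths):
    """Return the subset of paths that the given agent may fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if parser.can_fetch(agent, path)]


paths = ["/", "/node/1", "/user"]
msnbot_allowed = allowed_paths(RULE, "msnbot", paths)
googlebot_allowed = allowed_paths(RULE, "Googlebot", paths)

print(msnbot_allowed)
print(googlebot_allowed)
```

For a bot that obeys the file, nothing is left to fetch under "Disallow: /", so the delay has no pages to apply to. It only matters as a fallback if the bot honours Crawl-delay but not Disallow.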