Lots of page not found in logs

By ukheather on 20 Jul 2007 at 14:12 UTC

I am getting lots of log entries saying page not found. Anyone know why?

page not found 07/20/2007 - 13:06 feed/index.php/0/taxonomy/term/taxonomy/0/index.php Anonymous
page not found 07/20/2007 - 13:05 term/index.php/0/taxonomy/term/taxonomy/0/feed Anonymous
page not found 07/20/2007 - 13:05 taxonomy/term/index.php/0/taxonomy/term/taxonomy/0/feed Anonymous

Comments

Yes, I for some reason

Ev0 commented 20 July 2007 at 14:17

Yes, for some reason I'm receiving a few similar messages, where the URL just repeats itself. Are all of your messages coming from the same IP address? I looked up the IP address for mine, and for some reason they're all coming from Amazon.com...

it can well be that these

jase951 commented 20 July 2007 at 14:29

It may well be that these are crawlers or even spam bots trying to guess URLs.

Repeating URLs

JirkaRybka commented 20 July 2007 at 17:18

Some time ago, I had lots of such messages, and the cause was a badly formatted URL somewhere (a picture in my custom theme, but that's not relevant here).

You can have two styles of URLs:
--- Absolute: http://domain.com/something/something_else, or just /something/something_else
--- Relative: something/something_else

The absolute ones are based on your document root, while relative ones are - well, relative to the current document. You should always avoid relative URLs, as there are problems with URL variables.

In my case, with no clean URLs in use, the current document was /drupal/?q=something and the relative path in my theme was images/foo.gif. Any smart browser builds the real URL as /drupal/images/foo.gif, which is correct, but there are also stupid ones (perhaps crawlers, yes) that send requests to /drupal/?q=something/images/foo.gif. See what happens? The relative path is appended to the whole string, including the variable, so it's Drupal that gets the images/foo.gif part to deal with, not the web server. In my case, the resulting "not found" page contained the image again, making it even worse: the crawler appended another occurrence of the string, then one more... I ended up with thousands of messages in watchdog, each repeating the string almost endlessly.
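To make the difference concrete, here is a rough Python sketch of the two behaviours described above. The host example.com and both function names are made up for illustration, and the "broken crawler" logic is only my guess at what such a client does, not something taken from a real crawler.

```python
from urllib.parse import urljoin

# Hypothetical page and relative path, modelled on the case described above.
PAGE = "http://example.com/drupal/?q=something"
RELATIVE = "images/foo.gif"


def resolve_correctly(page_url, relative):
    """Resolve the way a standards-compliant browser does (query string dropped)."""
    return urljoin(page_url, relative)


def resolve_like_broken_crawler(page_url, relative):
    """Assumed faulty behaviour: append the path to the whole URL, query included."""
    return page_url + "/" + relative


correct = resolve_correctly(PAGE, RELATIVE)

# Each "not found" page contains the image again, so the broken client
# keeps appending the same relative path to the URL it just requested.
url = PAGE
broken = []
for _ in range(3):
    url = resolve_like_broken_crawler(url, RELATIVE)
    broken.append(url)

print(correct)
for entry in broken:
    print(entry)
```

The first line printed is the URL the web server should have been asked for; the following lines grow by one images/foo.gif each round, which is the kind of repetition that fills up watchdog.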

Okay, that was just a stupid crawler, not stripping the variables from the URL. But you seem to be using clean URLs, and in THIS case the browser cannot know what's a variable and what's just a subdirectory, so even a correct one resolves a relative path against the last "directory" of the current path. The problem with relative paths is then unavoidable... So make sure that all your URLs start with at least a slash (or, better, the full "http://"). In a theme, you can just print base_path() in front of your path to make it absolute.
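The clean URL case can be sketched the same way in Python. The page address and the make_rooted helper are hypothetical; the helper only imitates the idea of printing base_path() in front of a path and is not Drupal code.

```python
from urllib.parse import urljoin

# Hypothetical clean-URL page; "taxonomy/term/5" looks like nested directories.
PAGE = "http://example.com/taxonomy/term/5"

# A relative path is resolved against the apparent directory /taxonomy/term/.
relative_result = urljoin(PAGE, "images/foo.gif")

# A path with a leading slash is resolved against the site root instead.
rooted_result = urljoin(PAGE, "/images/foo.gif")


def make_rooted(base_path, path):
    """Prefix a path with the site's base path (hypothetical helper)."""
    return base_path.rstrip("/") + "/" + path.lstrip("/")


# Assumed example of a site installed in the /drupal/ subdirectory.
subdir_result = urljoin(PAGE, make_rooted("/drupal/", "images/foo.gif"))

print(relative_result)
print(rooted_result)
print(subdir_result)
```

Only the first result depends on which page the link appears on, which is why the same theme file can work on the front page and produce "page not found" entries everywhere else.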

Maybe the case!

jamesclarke commented 18 September 2011 at 17:49

I have been having a similar problem . . . I think it has actually been causing performance issues on a relatively low-traffic site! So, here's hoping I can figure out where this repetitive loop is!

I'll try to remember to report back if it worked.

It was

jamesclarke commented 18 September 2011 at 17:51

I'm not sure if this has solved all of my Drupal problems, but it seems to have addressed a lot of them. I had one relative link with an extra / and that seemed to send a crawler into fits . . . I also think that this was causing db problems because of all the 404 entries in the watchdog table. Time will tell.

Subscribing.

suzanne.aldrich commented 26 September 2007 at 19:23

Subscribing.
