Relative & absolute addresses
orilivni - May 14, 2008 - 17:55

Hi everybody,

On most Drupal sites I notice that the internal links are all relative, not absolute.

This is very bad for SEO because

/anyfolder (relative link)

and

www.domainname.com/anyfolder (absolute link)

when linked to, count as two different pages that Google indexes!!! The credit splits!

We want to avoid duplicate content on our sites. The solution will be pretty simple for someone
who knows PHP, I guess.

Does anyone else notice this? Am I wrong? Is there a solution that I don't know of?

Thanks

Comments

Garrett Albright’s picture

Web clients, including search engine spiders, cannot operate on relative paths alone. The entire concept of relative paths is more for the benefit of us developers than anything else. In order for a client to do anything useful with a relative path, it must convert it to an absolute path first anyway.

That is to say that Google is going to see a link to "/anyfolder" on the "example.com" server as a link to "example.com/anyfolder" anyway.
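
A minimal sketch of that resolution step, using Python's standard `urllib.parse.urljoin`. The page URL is a made-up example, and real crawlers have their own implementations; this only illustrates the standard rule they follow.

```python
from urllib.parse import urljoin

# Hypothetical page on which the link "/anyfolder" appears.
page_url = "http://example.com/some/page"

# A client resolves the href against the URL of the page it fetched,
# so the root-relative path becomes a full absolute URL.
resolved = urljoin(page_url, "/anyfolder")
print(resolved)  # http://example.com/anyfolder
```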

On a semi-related note, anyone interested in the handling of image and link paths in Drupal content should check out my Pathologic module.

or’s picture

yes, but:

example.com/anyfolder

is not the same as:

www.example.com/anyfolder.

When people link to it from some other page, they will mostly write:

www.example.com/anyfolder.

and it's not the same page as example.com/anyfolder, so the credit splits between those pages,

as far as googlebot/yahoo are concerned.

Your module sounds interesting, but does it solve that problem?

dman’s picture

Like most SEO pundits, you need to try learning about URL resolution before guessing about it.

If there is a link to /anyfolder on any page in http://example.com/ then it is identical to the link http://example.com/anyfolder
And reciprocally, the same link found on the page http://www.example.com/ means exactly http://www.example.com/anyfolder

... Unless you actually chose to prefer one over the other by setting your $base_url. Once you do that, they will all go to the one place. Is that what you want?

Have you tried reading the instructions in your .htaccess? Try it.
Discussed here http://drupal.org/node/44404 and in lots of places.

If you use the 301 redirect as instructed, Google etc. treat them as the same.

You have the choice to use either with or without www - it's a choice thing. Without slightly adjusting the server behaviour (the mod_rewrite option shown in your .htaccess) both will work the same. You can choose, try it.
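
To make the redirect idea concrete, here is a rough sketch of the decision such a rule makes. It is plain Python for illustration only, not the mod_rewrite rule from Drupal's .htaccess; `CANONICAL_HOST` and `canonical_redirect` are hypothetical names, and the choice of the www form is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the site owner has chosen the www form as the preferred host.
CANONICAL_HOST = "www.example.com"

def canonical_redirect(url, canonical_host=CANONICAL_HOST):
    """Return (301, target_url) if the request host is not the preferred
    one, or None if the URL can be served as-is. Ports are ignored here."""
    parts = urlsplit(url)
    if parts.hostname is None or parts.hostname == canonical_host:
        return None
    # Keep scheme, path, query and fragment; swap only the host.
    target = urlunsplit(
        (parts.scheme, canonical_host, parts.path, parts.query, parts.fragment)
    )
    return 301, target

print(canonical_redirect("http://example.com/anyfolder"))
print(canonical_redirect("http://www.example.com/anyfolder"))
```

Because every request for the non-preferred host is answered with a permanent redirect to the same path on the preferred host, only one of the two addresses ever serves content.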

Also, FWIW, /anyfolder is not a relative link. It's root-relative (or server-relative) but not truly relative (context-relative) because it's anchored to the top.
../over/file is relative, as is just down/path/file
Terminology counts at this level.
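
A small sketch of the difference, again with Python's `urllib.parse.urljoin`. The page URLs are invented for illustration and `resolve_all` is a hypothetical helper. Note how the root-relative link ignores the directory of the current page, while the two truly relative ones depend on it, and how all three stay on whichever host the page was fetched from.

```python
from urllib.parse import urljoin

def resolve_all(page_url, hrefs):
    """Resolve each href against the URL of the page it was found on."""
    return {href: urljoin(page_url, href) for href in hrefs}

hrefs = ["/anyfolder", "../over/file", "down/path/file"]

# The same links, found on the same path under two different hosts.
bare = resolve_all("http://example.com/a/b/page", hrefs)
www = resolve_all("http://www.example.com/a/b/page", hrefs)

for href in hrefs:
    print(href, "->", bare[href], "|", www[href])
```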

.dan.
How to troubleshoot Drupal | http://www.coders.co.nz/

or’s picture

First, I want to say thanks, because everything works fine now.

Second, it's not a guess at all. I don't know much about programming, Drupal and PHP, but I'm learning SEO from one of the strongest SEO companies in Israel. The need for uniform page URLs is well known to them; they teach it, and they teach being careful of "lazy" programmers and the use of relative URLs.

Good luck

dman’s picture

This is very bad for SEO because
/anyfolder (relative link)
and
www.domainname.com/anyfolder (absolute link)
when linked to, count as two different pages that Google indexes!!! The credit splits!

This statement is totally false when talking about links found on the same domain.
That's what makes me think that your strong SEO company has a bit more learning to do.

If your real problem was domain duplication and not relative links, then that's a different issue.

Root-relative links are better code than ramming fully-qualified URLs everywhere, and are not 'lazy' at all; if anything, they are a little bit harder. In fact, seeing a site that has fully-qualified self-references all through it is a sign that it's been designed by a coder who didn't know how to do web properly or it's been attacked by a misguided SEO hack with a thinking problem.

... If I were tuning my own search engine and found pages with more fully-qualified URL links in them than local ones, I would rank them lower as a matter of principle. Even if the F.Q. URLs resolved to the local site. Simply because that style of design is almost always a sign of a link farm.
At the very least, it's poor design.

Maybe you are lucky I'm not working at Google this year. :-)

.dan.

Garrett Albright’s picture

If by "fully-qualified URLs" you mean absolute URLs, there is at least one good reason to use absolute ones everywhere in content, especially on the kind of sites that Drupal is often employed to build: syndication. It's possible to write a feed reader which would know what to do with a path like "/foo/bar" coming from a feed at "http://example.com", but things get a little sketchy if it's not a root-relative path -- and very sketchy if the feed is more than one hop away from its source; if it's passed through a service like FeedBurner, for example. Or what if the content is syndicated via email?

That's the main reason why my aforementioned Pathologic filter always outputs absolute paths, even if the input was relative. (Had to get another plug in there…)
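
For illustration, here is a rough sketch of what "always output absolute paths" can mean for a piece of content. This is not Pathologic's actual code; `absolutize` is a hypothetical helper, the regular expression only handles double-quoted `href` and `src` attributes, and the base URL is an invented example.

```python
import re
from urllib.parse import urljoin

# Matches double-quoted href="..." and src="..." attributes only.
HREF_OR_SRC = re.compile(r'(?P<attr>\b(?:href|src))="(?P<url>[^"]*)"')

def absolutize(html, base_url):
    """Rewrite href/src values so they no longer depend on the page URL."""
    def repl(match):
        absolute = urljoin(base_url, match.group("url"))
        return f'{match.group("attr")}="{absolute}"'
    return HREF_OR_SRC.sub(repl, html)

snippet = '<a href="/foo/bar">x</a> <img src="images/pic.png">'
print(absolutize(snippet, "http://example.com/node/1"))
```

Links that are already absolute pass through unchanged, so the rewritten content reads the same in a feed reader or an email as it does on the site.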

dman’s picture

True that.
Syndication and HTML emails do need that sort of context injected into them! They are designed to be taken out of context.

But if your search engine spider can't figure out URL resolution based on the base URL it was given, it needs work.

I've just moved too many sites between dev/staging/test and sections between directories and IP addresses between hosts and secure/insecure and platform/platform and intranet/extranet/intranet and firewalls and proxies ... I believe that website pages need only link to each other, not to their one 'unique' address in the belief that this page will never ever change.

.dan.

stevethewebguy’s picture

dman is 100% correct. A link pointing to "/page" is interpreted by the search engine spiders in exactly the same way as the fully qualified http://www.domain.com/page. GoogleBot is not seeing a "local version" of the page when it crawls a relative URL on a website, and then splitting the authority between them.

Now what's really interesting is that our Israeli friend is not even using the much more arguable (yet still incorrect) logic that I hear promoted once in a while at the SEO company I work for: that the algorithms factor in absolute links from anywhere on the internet (including your own site) in a different, more favorable way than they factor in relative links between the pages of a site. As if the algorithms were just advanced enough to crawl the entire internet and return well-ranked, highly relevant results based on automated collection and analysis, but nobody ever got around to patching a bug that logs absolute paths as originating from outside the website, boosting its authority.

There's no SEO value in absolute linking between pages that are part of the same website.

For syndicated content, I believe you can send a <base href="http://www.domain.com/" /> tag out within the feed or whatever, so all your paths will work the same remotely as they would if you had absolute paths.
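
A sketch of how a client that honors a `<base href>` tag would resolve links, using Python's standard `html.parser`. Whether a particular feed reader or mail client actually does this is not guaranteed; `LinkResolver` and the reader URL below are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkResolver(HTMLParser):
    """Collect <a href> targets, resolved against <base href> if present."""

    def __init__(self, document_url):
        super().__init__()
        self.base = document_url  # fallback: where the content was found
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "base" and attrs.get("href"):
            # From here on, relative paths resolve against the base tag.
            self.base = urljoin(self.base, attrs["href"])
        elif tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base, attrs["href"]))

content = (
    '<base href="http://www.domain.com/" />'
    '<a href="/page">p</a><a href="img/x.png">i</a>'
)
resolver = LinkResolver("http://feeds.example.net/reader/item/42")
resolver.feed(content)
print(resolver.links)
```

Without the base tag, the same two links would resolve against the reader's own address instead of the originating site.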

Now, if you want to use absolute paths, it's fine; it won't hurt your site. I just see them as a sign of poor site construction and something that I need to fix before I start working on a site locally. I must set up a testing environment for a new client every month or so. If you are not using root-relative paths, you are no doubt wasting time and money (if not now, soon, trust me). At this point I would probably still use root-relative linking for my sites even if it were slightly less desirable to the search engines than absolute paths.

Oh and hey guys, could you check out the new version of my website portfolio for me?... If you have a sec. It's my first Drupal site (I'm seriously in love with this management system!), but if you see anything amateurish I'm doing, or not doing, I would really appreciate the feedback.

Thanks all, I hope I wasn't too harsh, or long-winded, but root-relative paths rule!
Great Thread,
Steve T.

Garrett Albright’s picture

The company I work for also does some SEO work… and don't tell my boss, but a lot of it seems to me to be cargo cultism: doing ritualistic things in the hope of the desired outcome even if they make little sense. I won't claim to be an expert, but it seems to me that the best way to do well in search rankings is to have a well-structured, easily-crawlable site featuring (most importantly) awesome content. Any "optimization" beyond that may help to a certain extent, but surely there's a law of diminishing returns in effect.

With regards to your site: The menu makes it look like the Portfolio page is always the current page, and please don't force Arial upon us Mac users when we could be looking at Helvetica instead; specify Helvetica before Arial in your CSS, or just specify "sans-serif" so our browsers will pick Helvetica automatically. (Google, Yahoo, et al make this annoying mistake too…) Other than that, I think it's fine, but kind of "flat" for a Drupal site.