Hi,

By default, Drupal paths look like files (e.g. site.com/test) rather than folders (e.g. site.com/test/) in the "eyes" of search engines, so it is very difficult to block the search pages with .htaccess.

One workaround is adding rel="nofollow" to the URLs generated by Facet API.

Could you please tell me where these URLs are generated? (function/file)

Thanks

Comments

cpliakas’s picture

Category: support » feature

Hi PedroMiguel.

This is interesting, and it may be a good change to explore either as a setting or as the OOB behavior of Facet API links. Changing to a feature request to explore this further.

Regarding your question, the links are generated in the theme_facetapi_link_active() and theme_facetapi_link_inactive() theme functions respectively, which you can override in your theme's template.php file.

Thanks,
Chris

cpliakas’s picture

Note that these functions have changed between the beta8 and dev versions of the module, so make sure your overrides are reflective of the module version you are using.

PedroMiguel’s picture

Thanks,

Changing the two functions does the trick:

function theme_facetapi_link_active($variables) {
  $accessible_vars = array(
    'text' => $variables['text'],
    'active' => TRUE,
  );
  $accessible_markup = theme('facetapi_accessible_markup', $accessible_vars);

  // Sanitizes the link text if necessary.
  $sanitize = empty($variables['options']['html']);
  $link_text = ($sanitize) ? check_plain($variables['text']) : $variables['text'];

  // Resets link text, sets to options to HTML since we already sanitized the
  // link text and are providing additional markup for accessibility.
  $variables['text'] = t('(-) !markup', array('!markup' => $accessible_markup));
  $variables['options']['html'] = TRUE;
  $variables['options']['attributes']['rel'] = 'nofollow'; // added the rel="nofollow"
  return theme_link($variables) . $link_text;
}
function theme_facetapi_link_inactive($variables) {
  $accessible_vars = array(
    'text' => $variables['text'],
    'active' => FALSE,
  );
  $accessible_markup = theme('facetapi_accessible_markup', $accessible_vars);

  // Sanitizes the link text if necessary.
  $sanitize = empty($variables['options']['html']);
  $variables['text'] = ($sanitize) ? check_plain($variables['text']) : $variables['text'];

  // Adds count to link if one was passed.
  if (isset($variables['count'])) {
    $variables['text'] .= ' ' . theme('facetapi_count', $variables);
  }

  // Resets link text, sets to options to HTML since we already sanitized the
  // link text and are providing additional markup for accessibility.
  $variables['text'] .= $accessible_markup;
  $variables['options']['html'] = TRUE;
  $variables['options']['attributes']['rel'] = 'nofollow'; // added the rel="nofollow"
  return theme_link($variables);
}

It would be nice to have an option in the admin menu to turn nofollow on or off.

cpliakas’s picture

Issue tags: +low hanging fruit

Yes, that approach seems correct. For now you can add those functions to your theme's template.php file, replacing "theme" in each function name with the name of your theme, and you should be all set. To get this into core Facet API, I will need someone to add an option to the FacetapiWidgetLinks plugin and have the value passed to the theme functions listed above. If the setting is TRUE, then add the rel="nofollow" attribute to the links. Adding to the "low hanging fruit" list.
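
As a minimal sketch of that renaming, assuming a theme whose machine name is "mytheme" (a placeholder), the overrides can set the attribute and then hand off to the module's own theme functions instead of copying their bodies. The helper mytheme_facetapi_add_nofollow() is a hypothetical name introduced for illustration. The two overrides only run inside Drupal 7 with Facet API enabled, and they assume the module's theme functions are loaded when the override is called.

```php
<?php

/**
 * Hypothetical helper: returns the link variables with rel="nofollow" set.
 */
function mytheme_facetapi_add_nofollow(array $variables) {
  $variables['options']['attributes']['rel'] = 'nofollow';
  return $variables;
}

/**
 * Overrides theme_facetapi_link_inactive() from the theme's template.php.
 *
 * "mytheme" is a placeholder for the machine name of your theme.
 */
function mytheme_facetapi_link_inactive($variables) {
  // Delegate to the module's implementation so the markup stays in sync
  // with the installed Facet API version.
  return theme_facetapi_link_inactive(mytheme_facetapi_add_nofollow($variables));
}

/**
 * Overrides theme_facetapi_link_active() in the same way.
 */
function mytheme_facetapi_link_active($variables) {
  return theme_facetapi_link_active(mytheme_facetapi_add_nofollow($variables));
}
```

Delegating keeps the override short and avoids the version drift between beta8 and dev that copied function bodies are exposed to.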

cpliakas’s picture

On second thought... if the setting is implemented in the widget then we won't have to touch the theme functions. We can simply check the setting and add the option to the $variables array in the FacetapiWidgetLinks::buildListItems() function.
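
A rough sketch of that check, written as a standalone function so it can be read outside the plugin. The function name and the 'nofollow' settings key are assumptions made for illustration, not the module's confirmed API. Inside FacetapiWidgetLinks::buildListItems() the same test would presumably run against the widget's own settings before the variables are handed to the theme function.

```php
<?php

/**
 * Adds rel="nofollow" to a facet link's variables when the setting is on.
 *
 * @param array $variables
 *   Variables that will be passed to the link theme function.
 * @param array $settings
 *   Widget settings; the 'nofollow' key is an assumed name.
 *
 * @return array
 *   The variables, with the attribute added only if the setting is truthy.
 */
function example_facet_link_variables(array $variables, array $settings) {
  if (!empty($settings['nofollow'])) {
    $variables['options']['attributes']['rel'] = 'nofollow';
  }
  return $variables;
}

// Usage: the same link built with the setting on and off.
$link = array('text' => 'Blue', 'path' => 'search/site', 'options' => array());
$with = example_facet_link_variables($link, array('nofollow' => 1));
$without = example_facet_link_variables($link, array('nofollow' => 0));
```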

cpliakas’s picture

Title: How to add "rel=nofollow" to facet links? » Implement a setting to add "rel=nofollow" to facet links
Category: feature » task
Priority: Normal » Major

The article at Faceted Search = SEO Death supports this conversation, so I am marking this issue as a task since I feel this is pretty important from an SEO standpoint.

PedroMiguel’s picture

Yes, it's critical. I asked about this because in the past I had problems with Googlebot indexing lots and lots of pages. I was advised by a former member of Google's quality (spam) team to add the nofollow attribute.

Before that I got the same input on the Google webmaster forums (http://www.google.com/support/forum/p/webmasters/thread?tid=2594f5a8e956... ; the discussion is not in English), where a Google employee said I should add a meta tag to the page (which is easy to do with the Meta tags module). In my opinion that is not necessary because we already have that on the links.

So the correct approach is to have the canonical tag point to the search page index and to have rel='nofollow' on the links.

The canonical tag is provided by core, so the only change needed in this module is the rel='nofollow' attribute.

Sorry about my English ;)

cpliakas’s picture

Thanks for that link. Your English is great! All the information is clear and concise, so thanks.

cpliakas’s picture

Status: Active » Needs review
Attached file (new): 3.41 KB

The attached patch adds the setting and defaults to "checked" meaning rel="nofollow" is added by default.
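
For readers who have not opened the patch, here is a sketch of what "defaults to checked" can look like in Drupal 7 terms. It is not the patch itself: both function names, the 'nofollow' key and the checkbox label are assumptions made for illustration.

```php
<?php

/**
 * Fills in the default for an assumed 'nofollow' widget setting.
 *
 * Array union keeps any saved value and only supplies missing keys, so a
 * widget that was never configured ends up with nofollow enabled.
 */
function example_links_widget_settings(array $saved) {
  return $saved + array('nofollow' => 1);
}

/**
 * Builds a Form API checkbox element for the setting.
 *
 * The label is passed in; real module code would wrap it in t().
 */
function example_nofollow_checkbox(array $settings, $label) {
  return array(
    '#type' => 'checkbox',
    '#title' => $label,
    '#default_value' => $settings['nofollow'],
  );
}

// Usage: a fresh widget is checked, an explicit opt-out is preserved.
$fresh = example_links_widget_settings(array());
$opted_out = example_links_widget_settings(array('nofollow' => 0));
$element = example_nofollow_checkbox($fresh, 'Add rel="nofollow" to facet links');
```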

cpliakas’s picture

Issue tags: -low hanging fruit

Interesting approach by David at #197783: Module makes database balloon in size - avoid logging the guided searches, where it is suggested to only add nofollow when more than one facet is selected. That way the top-level searches are crawled, but the crawling loops are still prevented.
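
A small sketch of that rule, under one assumption that the other issue does not spell out here: the count passed in is the number of facet items that would be active on the page the link points to. The function name is hypothetical.

```php
<?php

/**
 * Adds rel="nofollow" only for links that lead to a multi-facet page.
 *
 * @param array $options
 *   Link options in the shape accepted by l() and theme_link().
 * @param int $target_active_count
 *   Assumed input: number of facet items active on the linked page.
 *
 * @return array
 *   The options, with the attribute added when more than one facet applies.
 */
function example_nofollow_for_deep_links(array $options, $target_active_count) {
  if ($target_active_count > 1) {
    $options['attributes']['rel'] = 'nofollow';
  }
  return $options;
}

// Usage: a single-facet page stays crawlable, a combination does not.
$top_level = example_nofollow_for_deep_links(array(), 1);
$combined = example_nofollow_for_deep_links(array(), 2);
```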

PedroMiguel’s picture

That depends on the use case...

If you use facets for taxonomy search, you end up with two versions of each taxonomy page (duplicate content): the facet result and the taxonomy page itself.

As an option it is a good feature to have, disabled by default, because it fits more use cases.

In relation to the patch:

git apply -v facetapi-1370342-10.patch                  
facetapi-1370342-10.patch:19: trailing whitespace.
    
Checking patch facetapi.admin.js...
Checking patch plugins/facetapi/widget_links.inc...
Applied patch facetapi.admin.js cleanly.
Applied patch plugins/facetapi/widget_links.inc cleanly.
warning: 1 line adds whitespace errors.

I don't know if this error is due to my reverted changes or to the patch itself.

The patch works: all the checkboxes appear on the widgets and nofollow is added to the URLs.

cpliakas’s picture

Status: Needs review » Postponed

Thanks for testing. I'm not sure the whitespace error is a problem here; as long as the functionality works we are all set. I am going to hold off on the option mentioned in #11. I think the rel="nofollow" is a great first step and I don't want to try to boil the ocean.

Marking as postponed pending completion of #593658: Make the current search block more configurable. That change will introduce additional links which would benefit from this issue as well, so the options should be applied there as well.

cpliakas’s picture

Status: Postponed » Needs review
Attached file (new): 7.48 KB

Revised patch with functionality applied to the changes made in #593658: Make the current search block more configurable.

cpliakas’s picture

Attached file (new): 7.48 KB

Language change in the checkbox label from "Prevent crawlers from indexing ... links" to 'Prevent crawlers from following ... links'

cpliakas’s picture

Status: Needs review » Reviewed & tested by the community

Based on the testing in #12 I think this is good to go. The other additions are fairly minor.

cpliakas’s picture

Status: Reviewed & tested by the community » Fixed

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

giorgio79’s picture

Here is an idea that may let us remove nofollow and get some SEO love:

#2018449: Ability to specify facet weight, so we can allow search engines to index facet pages

fago’s picture

If you want to use facet links for SEO, you should probably use #2183757: Hierarchical rel-nofollow on links for search engines.