Hi

We use Google Custom Search Elements for our site so the accuracy of Google's index is very important. We have found that, when we unpublish content (which we do as part of our versioning, worklow etc), the links aren't removed from Google's index. On investigation, I realised that the redirect 403 to login gives a '302 temporararily moved' status code. This means that Google won't remove it from its search index. I guess this is expected behaviour but it is really bad for us as it means that out of date content remains in google's search index so is shown to users when they search our site, even within the site. Is anyone else experiencing this? Does anyone have any suggestions?

Thanks
Chris

Comments

deekayen’s picture

Category: bug » support

Are you proposing a change?

macman’s picture

Hi

Sorry for the delay in getting back to you. It seems this module is working as designed. I just wanted to share the issues that it causes us with our setup and hopefully stimulate some discussion on how to solve this.

Cheers

michalmajzlik’s picture

Title: Redirect 403 to login creates 302 http status code which means links aren't removed from google index. » Redirect 403 to login creates 302 http status code which means links aren't removed from google. 302 needs to be changed to 301
Version: 6.x-1.4 » 7.x-1.x-dev
Category: support » feature
Priority: Normal » Major
Status: Active » Needs review

Solved this by modifying one line on r4032login.module, where is hardcoded "302", so I changed it to "301".

michalmajzlik’s picture

StatusFileSize
new673 bytes

Here is a patch.

deekayen’s picture

Status: Needs review » Needs work

Since I don't entirely agree with you, how about changing the patch to make the redirection code configurable? Then admins could pick 301, 302, or 307.

michalmajzlik’s picture

StatusFileSize
new1.92 KB

Good idea. Here is updated patch.

michalmajzlik’s picture

Status: Needs work » Needs review
greggles’s picture

@deekayen can you clarify why you think 302 is the right error here? It seems that Google's interpretation of 302 results in clearly undesirable behavior in this scenario.

deekayen’s picture

Status: Needs review » Fixed

I don't feel strongly enough to block this any longer. Committed #4. I wasn't a fan of #6's ability to insert any random text into the HTTP header.

Committed to 7.x-1.x, 6.x-1.x, and 5.x-1.x.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

pwolanin’s picture

Category: feature » bug
Status: Closed (fixed) » Active

I think this change was a mistake. Yes, for unpublished content you'd want a 301 perhaps. However, if the goal is to e.g. help an admin to log in if they got logged, you don't want a 301 which may be cached in their browser.

deekayen’s picture

I would entertain inserting some conditional checks to do alternations between 301 and 302.

pwolanin’s picture

Status: Active » Needs review
StatusFileSize
new3.39 KB

Here's a 7.x version that allows you to set a default code and match paths for the other code.

pwolanin’s picture

oops, that has a typo in the code comment. I'll fix after rolling the 6.

pwolanin’s picture

StatusFileSize
new3.26 KB

here's the corresponding 6.x patch

deekayen’s picture

Since we're injecting a user-defined HTTP code, is there any point in adding extra #validate to the settings form to make sure a valid 301 or 302 setting was actually submitted vs someone firebugging the form?

pwolanin’s picture

@deekayen - normally the Drupal Form API only allows you to post the options defined in the form. If you modify the options Drupal gives this error message:

"An illegal choice has been detected. Please contact the site administrator."

pwolanin’s picture

StatusFileSize
new3.4 KB

Here's the 7.x patch with typo fixed.

pwolanin’s picture

StatusFileSize
new4.91 KB

And with support for the Variable module.

pwolanin’s picture

Title: Redirect 403 to login creates 302 http status code which means links aren't removed from google. 302 needs to be changed to 301 » Allow coice of 301 or 302 http status so links can be purged from search engines
Category: bug » feature

updating title. Since this wasn't released yet, moving back to feature.

pwolanin’s picture

Title: Allow coice of 301 or 302 http status so links can be purged from search engines » Allow choice of 301 or 302 http status so links can be purged from search engines

committed to 7.x

pwolanin’s picture

Status: Needs review » Fixed

and 6.x

greggles’s picture

Status: Fixed » Needs work

This fix lacks real documentation of when to use which code or why one might be better than the other for a particular path. If we're going to give people this kind of interface I think we owe it to them to explain when they would want 302 vs. 301. I think it would be ideal to add some documentation before a releases explaining which one people should use and the conditions that they might want the other.

Mostly this makes me think the underlying issues are not understood well enough and we should explore the problem and potential solutions some more.

I think we need a bigger solution (that might be bigger than this issue). What about returning the 403 and putting the login form inside the page instead of doing a redirect. Is there a reason the module doesn't do that?

pwolanin’s picture

I agree I'm still not sure this is the right solution.

Maybe we should change this so instead of letting you have pages with a different redirect code, we instead skip the redirect and so serve the 403.

That should be effective in removing e.g. outdated content from search engines. We could have a default node/* pattern

pwolanin’s picture

Status: Needs work » Needs review
StatusFileSize
new4.25 KB

this changes it so we skip the redirect on match, and by default don't redirect any node pages.

greggles’s picture

StatusFileSize
new2.59 KB

How about this?
It keeps the 403 header so that search engines can do the right thing.

deekayen’s picture

I'm alright with keeping 403.

pwolanin’s picture

StatusFileSize
new4.69 KB

Combines the 2 approaches

greggles’s picture

This works in testing of a few scenarios.

pwolanin’s picture

Ok, thanks - since this provides everything the OP wanted, and also leaves the default behavior unchanged w.r.t. the prior release, let's include it.

pwolanin’s picture

Version: 7.x-1.x-dev » 6.x-1.x-dev
StatusFileSize
new4.11 KB

committed to 7.cx, here's the corresponding patch for 6.x

pwolanin’s picture

Status: Needs review » Fixed

tested locally and committed.

pwolanin’s picture

Status: Fixed » Needs review
StatusFileSize
new458 bytes

follow-up to delete variables in uninstall

pwolanin’s picture

Status: Needs review » Fixed

committed to 6 and 7

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.