Problem/Motivation

  • Create a redirect from: foo to: sites/default/files/foo%20bar.pdf.
  • When the redirect is followed, it will be double-encoded as sites/default/files/foo%2520bar.pdf, leading to a 404 error.
  • Entering a literal space produces the error "The redirect path sites/default/files/foo bar.pdf is not valid.".

There is a workaround for the To URL: by entering a full http:// URL, and substituting some uppercase letters in the domain name, you can trick the module into thinking it's an external URL, and it will not double-encode the percent characters.

The workaround for the From URL appears to be replace each %20 with a space.

Comments

dave reid’s picture

I can't remember - can you enter an 'to' URL of 'foo bar' (not encoded)?

EDIT: Nope it doesn't work.

dave reid’s picture

Status: Active » Postponed (maintainer needs more info)

Also, if 'foo bar' is an alias for something then it should work automatically. Otherwise you're redirecting to an invalid path?

grendzy’s picture

Status: Postponed (maintainer needs more info) » Active

I should have specified - the redirect was intended for a file; sites/default/files/foo%20bar.pdf.

grendzy’s picture

Issue summary: View changes

typo

dave reid’s picture

Status: Active » Needs review
StatusFileSize
new652 bytes

Ah thanks for clarifying! Please test the following patch. I should probably write a follow-up issue that redirects should be able to point to stream wrapper destinations like public://file.txt or private://file.txt.

grendzy’s picture

I tested the patch, and it does allow redirecting to files with a literal space character. This seems like a usability problem though - users expect to be able to copy / paste a URL into the "add redirect" form. With this approach users will have to sometimes manually decode the URL after a paste, and it's hard to predict when. Here are the use cases I've tested:

  • http://example.com/foo%20bar - requires encoding
  • foo bar (path alias) - requires literal
  • node/1?baz=foo%20bar - accepts both literal and encoded versions
  • sites/default/files/foo bar.pdf - requires literal
onco_p53’s picture

subscribe

j0rd’s picture

Status: Needs review » Reviewed & tested by the community

#4 + @DaveReid. It's important that this patch get added to the redirect module. I have files with colons and spaces in them, and I'm unable to add them to redirect module currently due to the double encoding.

#5 + @Grendzy While I agree with you in theory that this is a usability quirk, but currently with out the patch, neither method will work:
* Pasting with out the encodings creates a not valid error.
* Pasting with the encoding, creates a double encoding problem and you get a 404.

The problem comes down to the fact that redirect module passes the redirect through url() function, which encodes the path through rawurlencode. This means that in the database and via this form, the "TO" always has to be stored un-encoded.

---

As you mentioned the problem will be when the user copy's and pastes the URL from their Browser into this "TO" field. It will most likely be encoded in that point.

The only way to resolve this problem, would be to add some ajax which when something is entered, compares the current version vs. the current version decoded to see if they are different.

If they are different, the AJAX should convert the pasted string into the decoded version. And prompt the user that "All URLs must be stored un-encoded. We've decoded this URL automatically for you. Is this OK? [X]".

Should the user uncheck the checkbox, the AJAX will revert it back to the original pasted URL and allow the value to be saved un-decoded.

This could also be done via a two step submit.

--

So my recommendation is, add this patch as it stands, as it does resolve one problem.

Then work on a method to deal with the already encoded URL issue as mentioned by @Grendzy in #5

j0rd’s picture

There's another problem which relates to redirects & files actually.

When a node is deleted, any redirects to it's files are not deleted as well.

I've posted this issue here:
#1432946: Deleting file redirects, when files are deleted.

wizonesolutions’s picture

Status: Reviewed & tested by the community » Needs work

This doesn't fix double-encoding for regular paths.

If I put in cr%C3%A9ation, it still comes out with the percent signs encoded. I can't redirect to création directly because that doesn't pass valid_url().

I'll look in the issue queue a bit more to see if there's a patch for what I need. If I have inappropriately changed status, then change it back and probably make the title file-specific.

wizonesolutions’s picture

Yeah, this still happens. I think redirects need to be saved unencoded in the DB, but they should be checked for validity in encoded form (see patch in #1451868).

For now, though I think we should decode them right before passing them to url(). I don't know if any other modules depend on the way Redirect stores its data, but I don't want to make too big of a change if avoidable.

Gonna try re-rolling this patch with that functionality added.

wizonesolutions’s picture

Priority: Normal » Major
Status: Needs work » Needs review
StatusFileSize
new4.02 KB

I'm daring to bump this to major. I know that I have an unusual use case (redirecting to paths that don't exist — long story), but the module should still work appropriately in that case.

This patch gets it closer. I don't think it's the best, and it's not exhaustive, but it does pass tests. I don't know if it needs new ones to accommodate this case. Let me know if so, though I can't guarantee I'll be able to be the one to do it.

Status: Needs review » Needs work

The last submitted patch, 1238418-redirect-check-if-file-plus-decode-11.patch, failed testing.

Anonymous’s picture

Issue summary: View changes

clarified use case

leewillis77’s picture

leewillis77’s picture

There's a patch over on the linked issue (https://www.drupal.org/node/1451868#comment-8984133) that addresses the "can't add redirects where the "TO" address contains spaces" case. Not sure if the issues should be merged or not - I can't tell if they're 100% the same?

vmettem’s picture

Excellent, this workaround mentioned worked for me. But permanant solution is better

Talkless’s picture

Any progress with this?

I have migrated my site from Lithuanian to transliterated characters in paths, but there are still links in net that try to open nodes using percent encoding, and redirecting them to "right" paths does not quite work...

jh sio’s picture

delete.

Infoloko’s picture

this issue seems to have fixated on spaces .. however, the same error occurs for accented characters.

For example, é in a URL becomes %25C3%25 .. try redirecting this to an unaccented version of the same URL and the URL Redirect page rejects the redirect as being to itself (ie TO and FROM are identical) even though the FROM field contains %25C3%25 where the TO field contains an unaccented e

(and before anyone suggests it, yes the accented URI's were an error on my part but now I can't redirect to the correct unaccented URI's)

charles belov’s picture

Issue summary: View changes
charles belov’s picture

Issue summary: View changes

Edited workaround to clarify it is available for the To URL only. This is an issue for me as I need a workaround for both URLs.

charles belov’s picture

Issue summary: View changes

I spoke/edited too soon. It appears to work in the To URL if I replace the %20 with a space. Not ideal, but it works.

charles belov’s picture

@j0rd #7

Re: "If they are different, the AJAX should convert the pasted string into the decoded version. And prompt the user that 'All URLs must be stored un-encoded. We've decoded this URL automatically for you. Is this OK? [X]'."

I think this will be confusing for non-technical users who are just copying and pasting from a browser address bar. Just store it the way you need to store it and don't say anything. What would be their option if the answer was "No, it's not okay," how would they know that this was the case, and what could they do instead?

summit’s picture

Hi,
I implemented this patch #11, is this ok for now?
Greetings, Martijn

leksat’s picture

Status: Needs work » Needs review
StatusFileSize
new1.77 KB

Here is another approach.

1. In my case, file redirects do not work always. Besides the double-encoding issue, the language path prefix is also added to the redirect URL because it's set for the default language.

2. User may add the redirect URL both encoded and not encoded. The reason of this is that browsers may behave different when you copy URL from the navigation bar. For example, if you copy the following URL
http:/example.com/sites/default/files/file(1).jpg
in Firefox, it becomes
http://example.com/sites/default/files/file%281%29.jpg
in the clipboard, but if you do it in Chromium it remains
http:/example.com/sites/default/files/file(1).jpg

The attached patch addresses both issues. However it does not change redirect_url() function, so it needs some more work.

srclarkx’s picture

What is wrong with decoding the (source or redirect) url, if it isn't empty, in an edit_form_alter hook before the redirect is saved?

leksat’s picture

StatusFileSize
new2.84 KB
new3.1 KB

What is wrong with decoding the (source or redirect) url, if it isn't empty, in an edit_form_alter hook before the redirect is saved?

@srclarkx, you absolutely right!
Inconsistent browsers behavior confused me and I went a wrong direction in #24. Yes, browsers behave differently, but anything copied from the address bar should be considered as raw URL.

Issues with URL-encoding/decoding:
1. Drupal 7 core stores paths (and path aliases) in the database in URL-decoded form.
2. Redirect module can accept paths, and in this case, they should be stored in the way Drupal core does this, so URL-decoded.
3. Redirect module can accept absolute URLs, and in this case they only can be stored in raw form.
4. Users enter raw URLs in most cases (for example, by copying them from the browser address bar).

Current module behavior: store URLs/paths in the form they were entered by user.
Attached patch changes this so that paths are stored URL-decoded and absolute URLs are stored as is. (And it handles the file-URLs issue)

revathi.b’s picture

#26 I have applied this patch.Its working but I faced the small issue.In the end of url the double encoded to be applied.Like in the end of url %25 to be applied.

G42’s picture

This patch works, is there a reason #26 has not been committed? Are there unintended side effects?

chris matthews’s picture

Status: Needs review » Needs work
Issue tags: +Needs reroll

The patch in #26 no longer applies to the latest 7.x-1.x-dev and may be too old to re-roll, but I went ahead and tagged the issue accordingly.

Checking patch redirect.admin.inc...
Hunk #1 succeeded at 505 (offset 12 lines).
Hunk #2 succeeded at 539 (offset 12 lines).
Checking patch redirect.module...
error: while searching for:
 */
function redirect_goto($redirect) {
  $redirect->redirect_options['absolute'] = TRUE;
  $url = url($redirect->redirect, $redirect->redirect_options);
  drupal_add_http_header('Location', $url);
  drupal_add_http_header('Status', redirect_status_code_options($redirect->status_code));

error: patch failed: redirect.module:1121
error: redirect.module: patch does not apply
wylbur’s picture

Status: Needs work » Closed (outdated)

Closing this as Outdated as Drupal 7 is EOL.