I ran into issue when source url was like this: en/products/tshirts/black+and+white/ from old site and was need to redirect to new store. But there were problem with this because Drupal received url without these "+" just with spaces between. Therefore hash that is stored in DB did not matched with source urls hash..

I created really quick and easy solution in patch, everyone welcome to create better and more Drupal way solution. --This has been already improved.

Problem/Motivation

The Redirect module fails to match redirects when the source URL contains encoded characters such as
+, %20, or other special characters.

When a request comes in for a URL like en/products/tshirts/black+and+white/, Drupal decodes
the URL before processing, converting + to spaces. This causes a hash mismatch between the
hash stored in the database (generated from the encoded source path) and the hash generated from the
decoded incoming request. As a result, the redirect is never matched and the user receives a 404.

This affects any site migrating from a legacy platform where old URLs contain encoded characters that
need to redirect to new paths.

Steps to reproduce

  1. Create a redirect with source path products/tshirts/black+and+white
  2. Visit /products/tshirts/black+and+white/ in the browser
  3. Expected: User is redirected to the configured destination
  4. Actual: The redirect is not matched and the user gets a 404 page

Proposed resolution

Normalize the source path using urldecode() before generating the hash in
Redirect::generateHash(). This ensures the stored hash matches the hash generated from the
decoded incoming request.

A post-update hook is included to regenerate hashes for existing redirects that contain encoded
characters.

Remaining tasks

  • Fix PHPCS failures in src/RedirectRepository.php from href="https://git.drupalcode.org/issue/redirect-2884630/-/pipelines/585283/fai...">pipeline #585283:
    • Line 266: Missing newline after closing brace
      (Drupal.ControlStructures.ControlSignature.NewlineAfterCloseBrace)
    • Line 307: Missing blank line after last function
      (Squiz.WhiteSpace.FunctionSpacing.AfterLast)
    • Line 308: Missing empty line before class closing brace
      (Drupal.Classes.ClassDeclaration.CloseBraceAfterBody)

    All 3 are auto-fixable via phpcbf.

  • Review and test the latest patch (#36) against redirect 1.12.x
  • Address performance concerns for sites with large numbers of redirects during the post-update hash
    regeneration
  • Ensure no duplicate hash collisions after normalization

User interface changes

None.

API changes

Redirect::generateHash() now applies urldecode() to the source path before
generating the hash.

Data model changes

No schema changes. Existing redirect hash values are regenerated via a post-update hook to reflect the
new normalized hash calculation.

Issue fork redirect-2884630

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

maris.abols created an issue. See original summary.

amaisano’s picture

Same issue (I think):

Source (old site):
/files/directory/file%20name.pdf

To:
/sites/default/files/new-file.pdf

The %20 is messing things up in the source - I'm getting a 404 when trying to test the redirect.

amaisano’s picture

Since this affects %20, the above patch doesn't work. I will try and create a patch that covers both situations!

amaisano’s picture

Here is one solution.

Given the following params...

  $source_path = 'uploadedfiles/Business_Center/DBE/DBE%20Policy%20Statement%20-%20sign%20by%20GM%20on%209.22.2017.pdf';
  $source_query = [];
  $language = 'en';

the vanilla Redirect code generates a hash of:

_i_oMuvBlA4Qsn-VEClBOj-_Tt8iMZsvRENKZyg3s7E.

This however does not match the hash generated by trying to compare/find a match, which ends up being:

m42bDy29-bg4mbmKGJfTZoNlpFbnuS031Wb63k-n2Zo.

With this new patch, the hash that is initially stored/created in the database now matches the lookup comparison hash in these situations.

The other solution would to do the inverse, and change the lookup/comparison hash check to *en*code the path prior to hash generation.

amaisano’s picture

@maris.abols

If you substitute rawurldecode with urldecode, it will also fix situations with "+" spaces, like your sample URL. I did not include that however because of RFC 3986, which is the current standard. The issue with this is there may be non-conformant encoding/decoding going on elsewhere in the Redirect module or in core.

idebr’s picture

Status: Active » Needs review

Status: Needs review » Needs work

The last submitted patch, 4: 0001-align-hash-generation-with-hash-matching.patch, failed testing. View results

idebr’s picture

Status: Needs work » Needs review
StatusFileSize
new816 bytes

Reroll against 8.x-1.x after #2995075: Unicode methods are deprecated in Drupal 8.6.0 was committed.

berdir’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Shouldn't be hard to extend our test coverage to cover this?

akshayadhav’s picture

Status: Needs work » Reviewed & tested by the community

Patch #8 works as expected.
Successfully redirecting URL aliases with spaces to mentioned URLs.

Steps:
1. Applied patch.
2. Checked the URL redirection.
3. The redirection was not working. It was showing 404 page.
4. Deleted the old redirect.
5. Cleared cache.
6. Added URL redirect again.

And now the URL redirection for URL with spaces works.

paulocs’s picture

Status: Reviewed & tested by the community » Needs work

It needs work because of #9.

paulocs’s picture

Btw I'm not sure if it works because rawurldecode() does not decode '+' into spaces.

See: https://www.php.net/manual/en/function.rawurldecode.php#refsect1-function.rawurldecode-notes

paulocs’s picture

StatusFileSize
new1.33 KB

Here is a patch that proves that the solution #8 does not work. The tests will fail.

paulocs’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests
StatusFileSize
new1.33 KB

Here is a patch that will work using urldecode()

The last submitted patch, 13: 2884630-with_tests-12.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

marcos_lima’s picture

Status: Needs review » Reviewed & tested by the community

Hi everyone. I did a review of patch #14. I reproduced the error before applying it, using spaces in the redirect, both with %20 on the source and with a + on it. Then I applied the patch and tested the redirects, as well as created new ones to test. It worked well in all cases.
I think it is important to state that the old redirects (created before the patch) won’t automatically work. You have to update them by going to the edit page and then saving, or deleting and creating them again.
Then I did a code review of the changes in the redirect entity class and in the test, and everything looks good to me.

paulocs’s picture

berdir’s picture

Status: Reviewed & tested by the community » Needs work

Can we include the functional test from #2364881: Redirect not working when path contains a (+) symbol here (note that it needs some coding standard improvements). And maybe test some other special characters too. This looks like something that could result in regressions.

paulocs’s picture

Status: Needs work » Needs review
StatusFileSize
new2.93 KB

Addressed #18.

hchonov’s picture

StatusFileSize
new2.9 KB

re-roll only.

lucienchalom’s picture

Assigned: Unassigned » lucienchalom

i will review it

lucienchalom’s picture

Assigned: lucienchalom » Unassigned
Status: Needs review » Reviewed & tested by the community

I reviewed checking the CS, no new problem added.
just like #10 and #16, creating redirects before and after the patch.
tested with space, + and %20 in both fields To and From, and all work fine after the patch.

As was said on #16 the ones created before the path need to be saved again, deleted, or edited.
maybe add a note about it on the page update?

Anyway, all looks good

codebymikey made their first commit to this issue’s fork.

codebymikey’s picture

Addressing #22, I've added a post update hook which regenerates the redirect hash for existing content.

It also includes a helper static function which can be reused for future updates to the redirect hash.

codebymikey’s picture

codebymikey’s picture

Title: Source url with spaces » Add support for encoded URLs (with spaces or other special characters)
paulocs’s picture

Looks good @codebymikey!
Thanks!

dpagini’s picture

+1 for this patch, it solves the same problem I'm facing... Url's in the "From" field with a special character, like `path/to/a%20file.pdf` won't work.
This issue has been RTBC for almost a year now, is there anything left the maintainers need on this issue? I see that, it appears, the last requested changes by Berdir were completed, so I think this just needs eyes again from an maintainer.

berdir’s picture

Status: Reviewed & tested by the community » Needs work

I'm concerned about the upgrade path. I maintain sites with tens and hundreds of thousands of redirects, a forced upgrade path that will recalculate the hash for all of them is going to take time and could possibly mean quite a long downtime for a site of that size. What about adding a button to the settings page to do recalculate hashes, and the update function could just add a message that people might want to run that? We could reuse that if other changes need to happen.

The cache tag handling is also not correct, the list cache tag is mostly just for render caching, the entity storage will still cache those entries, should either invalidate them all through EntityStorageInterface::resetCache() or alternatively the specific entities that are being changed here.

+++ b/tests/src/Kernel/RedirectAPITest.php
@@ -62,6 +62,10 @@ class RedirectAPITest extends KernelTestBase {
     $this->assertEquals(Redirect::generateHash('some-url', ['key1' => 'val1'], Language::LANGCODE_NOT_SPECIFIED), $redirect->getHash());
+    // Update the redirect source with a + sign in it.
+    $redirect->setSource('some+url');
+    $redirect->save();
+    $this->assertEquals(Redirect::generateHash('some url', [], Language::LANGCODE_NOT_SPECIFIED), $redirect->getHash());
     // Update the redirect source path and check if hash has been updated as

this isn't testing much because it would also pass on HEAD, it's just testing that the hash is generated. We probably want to hardcode the hash for that case.

richardbporter’s picture

Rerolled #25 for 8.x-1.9.

cmd87’s picture

Uploading the patch for version 8.x-1.10

dark05’s picture

Thanks for patch! It works for me.

joao.ramos.costa’s picture

Hi @cmd87, I tend to agree in what @berdir concerns, also I'd like to add that currently the upgrade doesn't take care when a given hash already exists.

esod’s picture

Here's a patch reroll for Drupal 10.5.x, redirect v1.12.

anicoto’s picture

Issue summary: View changes