Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Hi,
I think I found an issue when using Pathauto and Redirect modules.
See more on #1287238: Conflict between an old redirect (with Redirect module) and a new automatic alias (with Pathauto)..
Thanks for any idea or solution.
Comments
Comment #1
nagba CreditAttribution: nagba commentedPathauto can be set to create alias for a given entity - lets call this ALIAS-A. The alias may change (ALIAS-B) on update and at this point the redirect module may create a redirect from the old alias to the node. Now when a new entity is being created and it gets ALIAS-A then due to the existing redirect loading ALIAS-A will get the visitor to ALIAS-B, so suddenly two aliases lead to the same page even though the entities are completely different.
I'm attaching a patch for consideration that would delete existing aliases when a path would take on it. The other option (or one that could complete this approach) is to make sure that when pathauto generates a new path the alias would be altered when the above mentioned case would happen - considering current hook possibilities it might not be the prettiest though.
Comment #2
nagba CreditAttribution: nagba commentedbit the bullet in the end, new patch which acts as if pathauto detected that the new path already exists
Comment #3
azinck CreditAttribution: azinck commentedThis can happen in other ways, even without pathauto. The root problem is that it's possible to have a url alias that collides with a redirect but which each point to different endpoints.
Redirect:
mypathname => node/10
While at the same time there's an alias:
mypathname => node/5
Currently the redirect is preferred over the alias. But I'd argue the alias should be preferred over the redirect. I'm not sure of the best place to put the logic to detect this...perhaps redirect_can_redirect()? Or in redirect_redirect() itself? What say others?
By the way, this logic could exist alongside the patch in #2. #2 avoids one way of creating this situation in the first place but there are still other ways it can happen (e.g., by manually creating aliases or using other modules besides pathauto which might create aliases) which is why I think we need the logic I've outlined above.
Comment #4
heddnI had to update patch from #2 because it wasn't working. The logic check for if a redirect already exists wasn't working.
I started writing up a patch to address the concerns in #3. It would incur a serious performance hit to check for this small edge case. Even if we did a variable_get, we'd have to call off to the DB on every page render to get that variable. Are we sure this is an important enough scenario to account for? At this point, I've left it out of the attached patch.
For posterity's sake, here's the logic if we wanted to include a check for #3:
Comment #5
heddnOK, here's a better approach. Almost the same as previous #4 but it adds some logic to remove a redirect if it would clash with a pathauto alias. It then lets pathauto proceed with creating the alias.
Comment #6
nagba CreditAttribution: nagba commentedpersonally i dont agree with #3.
consider SEO or UX. The redirect is in place because we want the visitors to find the page which has moved - a url could have been bookmarked, etc. If someone creates a new content and you blindly kill the redirect cos you think the new content should get priority then confusion will rise for users.
heddn: hm. weird. i think #2 works on Drupal Gardens.
Comment #7
azinck CreditAttribution: azinck commentedHi nagba: there are downsides to either way of handling such a conflict but in my mind, a URL alias is more authoritative than a redirect.
Consider this situation again:
Redirect:
mypathname => node/10
While at the same time there's an alias:
mypathname => node/5
In this case, any links to node/5 that get passed through the url() function will get re-written to 'mypathname', but then a user attempting to use that link will end up viewing node/10. To me, that's far worse than any SEO considerations.
That said, heddn's patch looks like a great step in the right direction. It deletes redirects if they exactly match aliases and prevents collisions between automatically generated aliases and redirects. It looks to me like it solves the problem well for any collisions that might have happened "automatically". So if there are logic problems that remain at least they're confined to situations where someone has explicitly created conflicting aliases.
Comment #8
nagba CreditAttribution: nagba commentednot just SEO but also user and user confusion.
and also consider timing.
if mypathname => node/10 was there first it should stay there and any AUTOMATIC pathalias generation should not reuse the alias, but rather use something else.
if the user tries to alter node/5 to use mypathname as alias then 1) alert him 2) make him delete the redirect . Automatically deleting a redirect is bad, cos you are deciding what is best for the user instead of lettimg him clear up the situation.
Comment #9
azinck CreditAttribution: azinck commentednagba: If you read my comment in #7 carefully I think you'll notice that I agree with you about not allowing automatically generated paths to collide with redirects and about not deleting redirects. I think the only point of disagreement regards whether to prefer a redirect or path alias when two are in direct conflict with each other.
As it is, that discussion doesn't really belong on this issue so I won't muddy it up any further. heddn has provided a patch in #5 that looks like it should address the problem at hand. It needs review (I'm not able to do proper testing on it right now myself due to rather all-consuming life events at the moment).
Comment #10
heddnLast patch (hopefully). This one fixes an undefined index error when no redirect exists on a page.
Comment #11
Dave ReidI think #1909370: Allow other modules to chime in when checking if a to-be-generated alias is already reserved might help clean this up a bit. Let's get that in first.
Comment #12
michaelfavia CreditAttribution: michaelfavia commentedOur 2c:
While solving the pathauto side of this is important it is equally important to solve the higher level path.module issue.
Pathauto is just an automated version of what a user might accidentally do by hand via the standard path alias system and that is where the problem should be solved if possible.
I agree that the redirect module should block off the paths previously claimed by the path module. It should also not allow the alias module to reclaim them.
There are lots of solid arguments for why this should be the case but i dont think there is sufficient hookage in path_save(). It doenst have an module_invoke_all() for "alter" or "validation".
If it isnt possible to prevent path.moudle from creating the problem aliases, then redirect should be prevented from overriding known path aliases.
This took us a really long time to unwind today and it has and will bite others too. Thanks for the great work.
Comment #13
thtas CreditAttribution: thtas commentedI've fixed this using the patch #1 for the reasons outlined in #12 - We needed the fix for both automatic aliases and manually entered aliases
Comment #14
narendraR2: redirect_maintenance_on_new_path-1288768-2.patch queued for re-testing.
Comment #15
narendraR4: redirect_maintenance_on_new_path-1288768-4.patch queued for re-testing.
Comment #16
donquixote CreditAttribution: donquixote commented@michaelfavia (#12) is right.
Imo, the problem should be solved like this:
Ideally, whatever was first should win:
- If the alias was first, don't create a conflicting redirect.
- If the redirect was first, don't create a conflicting alias.
However, since we don't have any hooks in path_save(), and this function is called all over the place, this ideal situation won't ever happen.
So, what we should do instead:
If we get this far, path has won, and all we can do is leave.
This should be in addition to (1.). So if this stuff all fails, point (1.) kicks in as a fallback.
Imo we should implement step (1.) asap and then talk about step (2.)
Comment #17
Pere OrgaThanks for the comments #12 and #16. However, what happens with languages? Redirects (and aliases?) can be created for specific languages or for all languages, so it may happen that an alias exists for a specific language, and the same path exists a redirect for all (other) languages. In that case it would make sense to have both aliases and redirects for the same paths, ideally giving more priority to the alias/redirect that has more specificity.
And couldn't we create a hook in Drupal core path_save()?
Comment #18
Pere OrgaI've just closed #1459800: Re-assign old alias leads to duplicate as a duplicate of this one
Comment #19
donquixote CreditAttribution: donquixote commentedA new idea:
Create an entry in url_alias for every redirect.
The alias target url would be a path like redirect/%, with a page callback saying "You are not supposed to be here.".
Or the page callback could already do the redirect..
But since redirect does its magic in hook_init(), it does not really matter what the page callback does.
hook_init() redirect will only happen if the entry in url_alias is present and points to a redirect/% url.
Comment #20
Pere OrgaI like that idea, that's very different of what is being done.
I'm closing #1988450: From (source) should not change aliases to node numbers as a duplicate of this one.
Comment #21
mian3010 CreditAttribution: mian3010 commentedI have recently run into a problem with this patch.
I have applied patch from #10, but the condition that deletes a redirect, if the generated alias is the same will never be executed.
The function
redirect_load_by_source
returns an array, indexed by the rid, and therefore$redirect->redirect
will never be set.I have attached a revised patch from #10 that fixes this.
Comment #22
Dave ReidNow that #1796596: Fix and prevent circular redirects is resolved, how does that affect this issue? Could we get a test-only patch to confirm what the problem is?
Comment #23
mian3010 CreditAttribution: mian3010 commentedAnd i do not seem to have taken into account when no redirects are found. I have corrected this in attached patch.
Comment #24
mian3010 CreditAttribution: mian3010 commentedI can see that i have not been focused enough while solving this problem. The if-sentence below is now never true. Fixed in attached patch.
Sorry for all the rookie-mistakes ;-)
Comment #25
smustgrave CreditAttribution: smustgrave commentedHas anyone made any progress on this issue? I believe this is the same issue I'm currently having where a redirect is created but in the redirect list it's displaying as a node number. I made a small tweak in the redirect.module file at line 385
$path['source'];
to$path['alias'];
. What this seems to do is store the alias in the database under the 'redirect' column vs storing the node number. Thoughts anyone?Comment #26
smustgrave CreditAttribution: smustgrave commentedPlease review and give any feedback. I'm hoping this helps solve the problem
Comment #27
smustgrave CreditAttribution: smustgrave commentedComment #29
smustgrave CreditAttribution: smustgrave commentedComment #30
smustgrave CreditAttribution: smustgrave commentedUpdated the patch I previously submitted. Now a the redirect_alias table will get a value when a redirect is added between two different nodes as well as when a node changes it's url. The failure the patch is having is because there is no default value but I'm not entirely too sure how to solve that one.
Comment #31
smustgrave CreditAttribution: smustgrave commentedComment #32
smustgrave CreditAttribution: smustgrave commentedComment #33
jyraya CreditAttribution: jyraya at European Commission and European Union Institutions, Agencies and Bodies commentedHello,
I read carefully each proposal of this issue and I think we should not automate so much the treatment of this case.
From feature point of view, we should treat this edge case as is:
Why should we go against the site functional requirements?
I think that we could display a message on the "new" page that uses the "old" path of the redirection as current path.
This message could look like: "The page presents a new content. If you want to consult the old content, please follow this [URL]."
The message is more than perfectible. :-) but the idea is there.
So, we give enough flexibility to the webmasters to choose the right solution compared to their constraints.
Comment #34
scm6079 CreditAttribution: scm6079 as a volunteer and commentedSince this ticket was first created, workflow has been introduced into Drupal core and has a very real impact - there will be more webmasters wanting to create redirects from nodes that are archived. In that case, the pathauto aliases will exist, and users will be fed a 403 error instead of a redirect from this module that may exist at that path.
It seems worthwhile to consider this potentially common use-case. Brainstorming possible workarounds, we could use the EventSubscriberInterface and alterResponse to check for nodes that are not published and perform a check for the redirect there only if the status code is 403 - allowing pathauto to remain untouched and handling the case where a node may become active again based on workflow.
I'm not sure this use-case justifies a unique ticket, but it may? What are everyone's thoughts?