Problem/Motivation
Documentation on hook_pathauto_alias_alter() suggests that the source path can be altered for a pattern. This works when creating the initial alias, but subsequent saves of the entity create duplicate aliases because the source hasn't been changed yet when checking for an existing alias.
Steps to reproduce
Implement hook_pathauto_alias_alter(), change $context['source'], and save an entity twice. Two duplicate aliases are created.
Proposed resolution
Move "This can be altered by reference." documentation for $source to hook_pathauto_pattern_alter(), and ensure $source is passed by reference.
API changes
Both hooks change, but it wouldn't affect anything that was previously working.
| Comment | File | Size | Author |
|---|---|---|---|
| #5 | pathauto-2907491-3187945.patch | 15.36 KB | akalam |
| #2 | pathauto-3187945-2.patch | 2.81 KB | akalam |
Issue fork pathauto-3187945
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
akalam commentedComment #4
akalam commentedHere is a patch combining patches from #2907491-3: Creating multiple aliases for a single entity and #3187945-2: Fix hooks and documentation to alter source path
Comment #5
akalam commentedThe previous patch #4 was missing a call to getPatternByEntity, while the new method is called getPatternsByEntity(). Here is a patch fixing it.
Comment #8
karlsheaI don't think a new alter hook is needed, I just think there's a bug here.
The documentation for
$context['source']onhook_pathauto_alias_alter()states "This can be altered by reference." That is true, altering there will save an alias with a different source. However, re-saves of the same entity will create multiple aliases because the source hasn't been altered when checking for an existing alias.Moving that documentation to
hook_pathauto_pattern_alter()and passing$sourceby reference to that hook works through multiple saves.Comment #10
mably commentedCode review of MR #69
The commit fixes the issue where altering the source path in
hook_pathauto_alias_alter()caused duplicate aliases on subsequent saves.Root cause
In
PathautoGenerator::createEntityAlias(), the flow is:$sourceis set to the entity's internal path (e.g./node/1)$contextarray is built with$sourcehook_pathauto_pattern_alter()fires$source(loadBySource($source))hook_pathauto_alias_alter()fires$sourceBefore the fix,
$context['source']was passed by value at step 2, and only became a reference to$sourceright before step 6. This meant that if a module altered$context['source']inhook_pathauto_alias_alter()(e.g. changing it from/node/1to/custom/1), the alias was created with the altered source at step 7. But on the next save, the existing alias lookup at step 4 still used the original/node/1, could not find the alias created with/custom/1, and created a duplicate.The fix
The
$sourcevariable is now passed by reference ('source' => &$source) from the start when building the$contextarray. This means source alterations should now be done inhook_pathauto_pattern_alter()(step 3), which fires before the existing alias lookup (step 4). On subsequent saves, the same alteration happens again before the lookup, the existing alias is found, and no duplicate is created.The redundant
$context['source'] = &$sourceline beforehook_pathauto_alias_alter()is removed since the reference is already established at the top.Documentation changes
In
pathauto.api.php:hook_pathauto_pattern_alter(): signature changed fromarray $contexttoarray &$context, and "This can be altered by reference" is added to thesourceparameter — this is now the correct place to alter the source path.hook_pathauto_alias_alter(): "This can be altered by reference" is removed from thesourceparameter — altering source here is too late for the existing alias lookup and leads to duplicates.The fix is correct and does not break existing code that alters source in
hook_pathauto_alias_alter()— the reference is still there — but it steers developers to the right hook where the alteration has the intended effect.Comment #11
mably commentedTest added for source path alteration
Added a kernel test (
PathautoSourceAlterTest) and a test module (pathauto_source_alter_test) to verify the fix. The test module implementshook_pathauto_pattern_alter()to rewrite the source path from/node/Nto/custom/N. The test creates a node, verifies the alias is stored with the altered source, then re-saves the node and asserts that no duplicate alias is created. Without the fix, the test fails because the source alteration is ignored and the alias is created with the original/node/Nsource.Comment #12
mably commentedComment #13
mably commentedWon't we have the same problem with
$context['language']?Example code from
pathauto.api.php:Here is what says Claude about it:
The same duplicate alias problem also applies to language changes in
hook_pathauto_alias_alter().The documented example in
pathauto.api.phpshows:This has the same root cause as the source alteration issue:
loadBySource()is called with the original$langcodebeforehook_pathauto_alias_alter()runs. On resave, the lookup uses the original language again, fails to find the alias (which was saved with the altered language), and creates a duplicate.Example scenario:
en, hook changes$context['language']tofr, alias saved with langcodefrloadBySource('/node/1', 'en')— looks forenalias, doesn't find it (was saved asfr). Fallback only checksundand default language, notfr. So$existing_aliasis NULL.fragainfralias created — duplicateThe special case of
undhappens to work becauselookupBySystemPath()always falls back tound. But any other language change would produce duplicates on resave.The fix in MR !69 (passing
$sourceby reference early in the$contextarray) should probably also move the language alteration documentation fromhook_pathauto_alias_alter()tohook_pathauto_pattern_alter(), where it would run beforeloadBySource(). Note that$context['language']is already passed by reference ('language' => &$langcode) so it would work inhook_pathauto_pattern_alter()without any code change — just a documentation update.Comment #14
mably commentedPushed a voluntarily failing test that generates a duplicate alias when language is modified in
hook_pathauto_alias_alter()(as shown in PHP doc) to emphasize that this needs fixing too.Comment #16
anybodyTests are failing
Comment #17
mably commentedTests are voluntarily failing as per #14 to emphasize that fixing the problem for
sourceis probably not enough and that we must also fixlanguagehere.Comment #18
anybodyThanks @mably and sorry, I didn't see #14
Comment #19
mably commentedNo pb :) Would still be happy to have your feedback on this.
Comment #20
mably commentedComment #21
mably commentedComment #22
berdirI'm not quite sure what the use cases are for altering the source path. What would that achieve, doesn't the source path need to match the entity path?
We only generate one alias per entity, there's an issue that would allow to set up multiple patterns, for example one for the edit path.
Also, what happens with the widget, bulk update processes, those wouldn't know about a changed path and would not show existing/create duplicates?
Maybe @karlshea or @akalam can shed some light on their use case?
Similar to language. There are definitely use cases there, lots of people would like to have language undefined aliases across translations. But that too has a long and complicated history, again conflicts with the widget and so on (I always pushed back on pathauto supporting this scenario as long as the core path widget doesn't support that).
Comment #23
mably commentedThe problem is not really about allowing changing source and language here, but rather allowing it to be done after having loaded the existing entity, generating a risk to create duplicates, the new created alias being then different than the existing one.
What's your point of view on this @berdir?
Comment #24
berdirThe question to me is whether we are fixing something that is meant to work or not and what the implications of that are. I'd rather make it clear (with docs updates and possibly code changes, maybe with deprecations) that you're *not* supposed to do that and that it's not supported, if it doesn't properly and consistently work in the scenarios that I mentioned.
That's why I'd like to hear what the use cases here are.
Comment #25
karlsheaHey @berdir, I'm using this on a project where we use pathauto normally for all of our content entities, but we have a custom entity where there were two issues:
1. The entity is used to alter some branding on the site, and we're using Pathauto to create the aliases for that entity, but that entity isn't meant to be viewed. Instead, we're pointing the alias for that entity to a custom controller that handles that altering/session setting.
Pathauto was basically the easiest solution to "We need users to create entities where they can type in a path + more info, and when you hit that path it calls a controller to get that entity's info back" because it's already doing the alias -> source mapping, editing, and cleanup.
2. Admins needed to enter exactly the alias that they wanted on the entity form and I'm using
hook_pathauto_alias_alter()so we don't runcleanTokenValues()in that particular case:I don't remember exactly the specifics without debugging through everything again, but I think without this patch something wasn't quite right? I think we were getting multiple aliases on every entity save? Probably because the alias alter wasn't seeing the change from the pattern alter. Or later on in the process after the pattern alter something wasn't linked together? Something like that.
Also having tried to work through all of this it just didn't seem to make sense to not allow the source alter in
pattern_alter()but to allow it inalias_alter(). The only place altering it works end-to-end is if you do it in the first one, not the second one.Comment #26
karlsheaSorry, edited my comment with more context if you're only seeing the email.
Comment #27
berdirThanks for the feedback.
I think for your scenario, I'd consider not relying on pathauto at all. I see how themodule_pathauto_pattern_alter() wouldn't work without this change, but I can't see how this would result in a consistent state. If you have a pathauto widget then how should that know that the source path is '/thesite_custom_brand/' . $brand->id();? How should the bulk update feature know that and figure out which aliases already exist? Why not point the canonical link template of your custom entity to that? If you hardcode stuff like this, does pathauto really add a benefit or just complexity?
I think for pathauto, the canonical source path must be consistent, it can not be consistent when just using this hook.
Also, for the second part, did you consider using the safe tokens setting in pathauto? That's the thing that allows more complex tokens that typically are existing paths with / and other special characters to bypass the sanitization. We use that for a similar use case in our projects.
Even though it's possible to generate code now that satisfy practically every use case, I think it's our responsibility as maintainers to balance use cases against code complexity and consistency. *If* we were to support it, I think it should need to be backed by a functional test to ensure something does work consistently. Which I don't think it can using this approach.
So, instead, I'd propose that we deprecate the ability to do what you do and instead suggest different approaches. The way to do that would be to detect if the source path was attempted to be altered in a hook and trigger a deprecation message or even warning in that case (as you only see deprecations in tests, which are unlikely to be triggered in custom code).
A possible alternative would be #3217830: Allow patterns to affect non-canonical entity URLs.
Comment #28
mably commented@berdir, how do you suggest we proceed with this issue?
How do we fix the risk of alias duplication if
sourceorlanguageis modified in thepathauto_alias_alterhook?Comment #29
berdirI propose that we do not allow to alter them in the first place.
Comment #30
mably commentedOK, I’ll create a new MR for this.
Should we prevent
sourceandlanguagefrom being passed by reference in$context, or simply trigger a deprecation notice / warning if hooks modify them?The example below in
pathauto.api.phpwill also need to be updated to reflect this change:Comment #32
mably commentedI tried to implement something in MR 164.
Deprecated altering
$context['source']and$context['language']by reference in bothhook_pathauto_alias_alter()andhook_pathauto_pattern_alter(), with runtimeE_USER_DEPRECATEDwarnings and updated PHPDoc with@deprecatedtags, to be removed in pathauto:2.0.0.Comment #33
karlshea> Why not point the canonical link template of your custom entity to that? If you hardcode stuff like this, does pathauto really add a benefit or just complexity?
I... don't know why I didn't think of that. I updated our code and that is totally working, I was able to remove a bunch of hacks (I guess I had also needed to create a custom AliasType probably to fix one of the other issues you brought up).
Thanks for the tip! We no longer need this patch.
> Also, for the second part, did you consider using the safe tokens setting in pathauto?
Not without that setting affecting all of the regular content entity types, and we needed to allow *anything*. These were promo paths for other companies used on their marketing materials and couldn't be changed, and were imported when I originally built the site on Drupal 7.
Comment #34
berdirDidn't carefully review yet, but yes, the MR is what I had in mind.
#33 confirms that at least for that use case, there are better solutions.
In regards to the second part about the safe tokens, the setting is global but it's also per token. If the field that holds those values isn't already unique, you could add your own token like [entity-type:special-safe-whatever-token] and then you only allow "special-safe-whatever-token" and that should work? And if it's already something that's unique to your entity type, you could just use that. Note that the examples are just a single token part like alias, but the setting should work if you use the complete token name, such as entity_type:field_name, see \Drupal\pathauto\AliasCleaner::cleanTokenValues.
Comment #35
karlsheaOh sure enough, I can remove that other hook too. Thanks for the tip!
Comment #36
berdirDidn't test this myself, but we have confirmation that those alter hooks and tricks aren't needed and there are better solutions.
Comment #38
mably commented