Problem/Motivation
The D8 builtin language system was a huge step forward and adding a fallback system to core an important appreciation that we need something like fallbacks. There are some reports about inconsistencies, which at first sound quite different, but may be foundation for improving / fixing the API. Here's a well-shuffled list:
- EntityRepository::getTranslationFromContext returns the passed-in entity translationif it does not find a fallback. This makes "Entity does not have a fallback" indistinguishable from "Entity fallback is the passed-in entity translation.
- Also: In the absence of a fallback, the returned fallback is the passed-in entity translation (which may be random).
- The above leads to the fact that results depend which of 2 translations is the original and which the translation. Which is at least highly counter-intuitive.
- The above makes it impossible to "distinguish between "current translation is a fallback $langcode" and "$langcode does not have a fallback"" (axel.rutz #16)
- The above leads to the fact that adding an unpublished translation makes that language go from "fallback to original" to "access denied" (which is at least counterintuitive).
- "content_translation removes fallbacks for entities that the current user does not have access to. But access [should not] be mixed with fallbacks" It torpedoes e.g. getting fallbacks for search_api indexing. (axel.rutz #18)
- #2972308: Allow users to translate content they can edit "shows that accessing an entity through an entity reference field vs. its route has a completely different behavior, that again differs based on whether the entity has its own status that it checks too vs. just a translation status."(@berdir #2978048-10: Unpublished translations should fallback to the original language)
- #3067999: ProcessedText filter doesn't pass in correct language when using language fallback
- Last and minor: EntityRepository::getTranslationFromContext assumes the first fallback of a language is the language itself. While there may or may not be use cases for dropping that assumption, if we lay hands on this anyway, there's no reason to hinder the flexibility of dropping that assumption.
Some exegesis on how this came into being and how to proceed:
- "the main goal [of the fallback system] was to keep the navigation structure intact for highly "symmetric" multilingual sites". (@plach #9) As of the above, implementing "asymmetric" sites where some languages do not have a fallback currently is not possible. Fixing is not easy "since the language fallback API does not support returning no candidates"(@plach #9)
- "I spoke with Gabor about this and he reminded that at the time we introduced this functionality we were planning to let contrib take care of all the possible fallback alternative behaviors. Of course we are both fine with performing any core change needed to enable this to happen in contrib." (@plach #13)
Proposed resolution
The big picture:
- Provide a setting and upgrade path that makes legacy sites behave as they did. So the following only applies to sites that are new or actively opt in.
- Make all fallbacks explicit. Translations without fallback return 403 (@plach #9, satisfies @sdewitt #8)
- Add the option to "let everything fallback to original translation" (@plach #9, satisfies @berdir #7)
- To add that option, we can simply add a fallback UI to core. The code is rather trivial which led to the unfortunate sutiation that we currently have at least 3 contrib modules for this. See #2552663: Consider deprecating language_fallback in favor of entity_language_fallback. This should satisfy >90% of all fallback use cases, everything more complex can easily be done via a custom hook implementation.
Concrete steps:
(TBD)
Remaining tasks
* Sort out the problem(s)
* Create consensus on how to proceed with it
** Especially: What is the semantics of fallback and how does it relate to entity access)
** And: Do we need separate "entity_view" and "entity_upcast" operations? [#]
* Do it
User interface changes
None.
API changes
TBD
Data model changes
None.
Release notes snippet
TBD
Original report
Problem/Motivation
DESCRIPTION
Moderated content created in a source language using the default Editorial workflow is accessible live using all language prefixes as soon as it is published and prior to actually adding any translations for the node.
REPRODUCTION STEPS
Setup
- Install Drupal 8.5
- Enable content translation
- Enable content moderation
- Install drush via composer
- Run
drush cr - Login to Drupal
- Navigate to
/admin/config/regional/language - Add French and German languages
- Navigate to
/admin/config/regional/content-language - Enable "Content" for translation under "Custom language settings"
- Enable "Article" for translation under "Content"
- Navigate to
/admin/config/workflow/workflows - Edit the "Editorial" workflow and apply it to the "Article" content type
- Navigate to
/admin/structure/types/manage/article - Under "Publishing options", uncheck "Published" and "Promote to front page" and save
Execute
- Navigate to
/admin/content - Create a new Article node in "Draft" workflow state
- Use the moderation state widget to move the new node from Draft to Published
Verify
- In an incognito window or another browser, attempt to hit /node/nid (e.g. /node/5) and notice that it shows the expected published content.
- In the same window, change the URL to /fr/node/nid (e.g. /fr/node/5) and notice that it shows the published English (or source language) content!
- Try /de/node/nid and notice it also shows the published English (or source language) content
- Try using a language code/prefix that is not registered with Drupal yet, e.g. Arabic, /ar/node/nid and notice you get a 404 Not Found
EXPECTED RESULT
No content should be available at a language prefix URL until a default published revision of the content exists with that language code/prefix.
ACTUAL RESULT
The default published revision of the source language content is available under all registered language prefixes.
Preliminary Analysis
My cursory analysis of this problem suggests this behavior occurs due to the following sequence of events.
- A drupal request enters the kernel for /fr/node/nid
- The HttpKernel fires the REQUEST event
- The ContainerAwareEventDispatcher invokes appropriate callables for the REQUEST event
- The Symfony RouterListener invokes the AccessAwareRouter to match the request
- The Drupal routers
getInitialRouteCollection()is invoked to collect candidate routes - The Drupal route providers
getRouteCollectionForRequest()is invoked to build the candidate routes - Drupal's path processor system is run to process the inbound path
- The
PathProcessorLanguageuses theLanguageNegotiationUrlto parse the language prefix and it rebuilds the inbound path without the prefix - Candidate route definitions are determined by
getRouteByPath() - The
entity.node.canonicalroute is chosen for the request - The route enhancers are invoked
- The
ParamConversionEnhanceris invoked to resolve/convert inbound parameters - The
ParamConverterManagerinvokes converters for all matching parameters - The
EntityRevisionConverteris invoked which defers to its parentEntityConverterfor routes not loading the latest revision - The
EntityConverterconverts thenodeparameter from the URL pattern for theentity.node.canonicalroute definition to an actualNodeinstance. It is important to note that at this step, theEntityConverterloads the default revision from the node storage. In our example above, this means that it loads the published revision of the nodenid. It does not yet take into account the langcode. - Next, the
EntityConverterinvokes theEntityRespository::getTranslationFromContext()method to potentially replace the default published entity instance with a translation instance. The critical step causing the issue happens next. - The
getTranslationFromContext()method performs the following check:if ($entity instanceof TranslatableDataInterface && count($entity->getTranslationLanguages()) > 1) {At this point, the default published revision of
$entitydoes not have any translation languages so it acts as an identity transform and returns$entityunchanged. This is probably the correct and desired behavior in many situations where this method is called (which I believe is in many different places). It does not seem like a great behavior in this situation since the calling code has no idea that no translation was found for the requested entity + current language. - The default published revision of
$entityis returned from theEntityConverterand thenodeparameter is replaced by the$entityinstance for further processing down the line. - The now filtered, and enhanced route definition is returned to the router and the access is checked. At this point, since we have the default published
$entity, access checks pass, and control is returned to theContainerAwareEventDispatcherwhich subsequently invokes the rendering engine to render the page.
Proposed resolution
A few possible solutions might include the following, the impact of which would have to be carefully considered for existing behavior and BC.
- Add a flag to the
getTranslationFromContext()method that when set to TRUE (defaults to FALSE to maintain BC), returns NULL if no translation is found for the current language context. - Update the
EntityConverterto check if the$entityreturned fromgetTranslationFromContext()has a language code that matches the current language or any of it's fallbacks and if not, throw aResourceNotFoundException - Add a
RouteEnhancerthat performs some similar check as above. - Could we use route requirements somehow so that
handleRouteRequirementswould report a mismatch?
Remaining tasks
User interface changes
API changes
Data model changes
Issue fork drupal-2951294
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
cilefen commentedI feel like putting this into the content moderation issue component (in the issue metadata), but I can see how it's difficult to categorize.
Comment #3
sdewitt commented@cilefen I actually selected that initially and then changed my mind. I am certainly fine with wherever you all think it should go.
Comment #4
timmillwood@sdewitt Thank you for an exemplary issue summary.
I'm not sure I'll have the time this needs over the next couple of days. It'd be awesome to see a test replicating this bug. I also wonder if this is possible to reproduce via the API without Content Moderation? Is it another one of these issues that content moderation just exposes and doesn't cause?
If I understand correctly this is causing unpublished content to be viewed by users who don't have permission to see it. Therefore should we bump this to critical?
Comment #5
cilefen commentedI don't think this is critical on the basis that what is being wrongly shown on what should be translated paths is already published in the source language.
Comment #6
catchThis looks like it would happen without content moderation too - can you try with a clean install of the standard profile?
There's no information disclosure here, just content shown for language prefixes you're not expecting it to be shown for, I think it might possibly be by design even to show English as a fallback in the absence of a translation.
Comment #7
berdirI recently also noticed this in a slightly different context. And yes, it actually doesn't matter if you use content_moderation or even forward revisions at all.
Interestingly, it doesn't happen when you *reference* an entity and display it through that. The reason is that we use two different contexts for getTranslationFromContext(), \Drupal\Core\ParamConverter\EntityConverter::convert() uses 'entity_upcast' while \Drupal\Core\Entity\EntityViewBuilder::viewMultiple() uses the default entity_view context.
This is relevant because the access logic for translations happens in content_translation_language_fallback_candidates_entity_view_alter(), as you can guess from that name, it only applies to the entity_view context.
So the fix should be as simple as removing the entity_view context or applying the content_translation logic to both (or all?) context strings.. I honestly don't quite understand why we need that context feature, maybe @plach can help with that.
I also mentioned this to him a while ago but forgot to create an issue.
Comment #8
sdewitt commented@berdir @catch Fair enough w/r/t content_moderation. The problem we initially had started with us seeing this behavior with nodes in draft and needs_review which is maybe why this left a taste of content_moderation.
Also I agree this is not critical which is why I opened it as Major. There is no information here that is disclosed at an inappropriate time. The issue is that routes appear to exist that really shouldn't.
*If* it is "by design", would it be possible to make it an option. We do not want to have routes available that we don't actually have content for. This is especially detrimental to SEO since it would appear as though we have the same content on multiple URLs.
@berdir I don't understand the "entity_upcast" vs "entity_view" discussion. However, if, as you suggest, it is an easy fix, I'm happy to try and roll a patch for it if you can point me in the right direction. We certainly need a "fix" or a new option (if it's by design) or a workaround soon.
One more thing. In the EntityConverter case, does it really matter what is passed to getTranslationFromContext() in the $context parameter since in the scenario I described above, the following guard statement will keep us out of the logic that even uses the $context, no?
Thanks all for your input.
Comment #9
plachI investigated this issue and I believe @Berdir and @sdewitt are describing two different, albeit related, scenarios.
First of all I would like to confirm that current core behavior is by design: we do want to have a valid route for a node (or any other entity) regardless of the current (content) language. This could be seen as a form of language fallback, if you will, but I believe that the main goal was to keep the navigation structure intact for highly "symmetric" multilingual sites, regardless of language. The SEO concerns @sdewitt brought up should be mitigated by the fact that we have a
canonicalURL link in the page metadata pointing to the original language URL, when no translation is available for the current language.That said, I can see why the current behavior can be deemed problematic and/or inconsistent:
/node/1and/it/node/1return a 200./node/1returns a 200 and/it/node/1returns a 403.If I understand correctly, what @Berdir is suggesting would make case 2 behave as case 1: CT's language fallback would kick in and the English translation would be displayed also on
/it/node/1.OTOH what @sdewitt suggests is to ensure that no "fallback logic" is applied also for case 1, which would mean that in case 1
/node/1returns a 200 and/it/node/1returns a 404 (403 would not be appropriate, since there is no Italian translation in that case).Given that we need to retain the current core behavior, because there are most certainly existing sites relying on it, I think the only solution to address this issue is to add an option to the Content Translation module allowing to pick one of these three language fallback modes.
While implementing @Berdir's suggestion is indeed trivial, I think supporting @sdewitt's use case is a bit trickier, since the language fallback API does not support returning no candidates. One possible workaround could be to throw
ResourceNotFoundExceptionfromcontent_translation_language_fallback_candidates_entity_upcast_alter(), but this may have unwanted side-effects when checking route access while generating links. A POC patch would be welcome to check whether anything breaks.Comment #10
plachComment #11
berdirSorry, yes, I confused those things, my scenario only happens when you have unpublished translations in the default revision in which case they are then not filtered out on /node/ID and you then get an access denied but inside an entity reference, it falls back and shows a matching translation.
I think it is conceptually related though because I think it should behave consistently between having no translation and an unpublished one and between node/ID and an entity reference (worth pointing out that for a list of entities in a view, the behavior is already configurable).
Comment #12
mbovan commentedI created an issue for the case described in #7.
Comment #13
plachI spoke with Gabor about this and he reminded that at the time we introduced this functionality we were planning to let contrib take care of all the possible fallback alternative behaviors. Of course we are both fine with performing any core change needed to enable this to happen in contrib.
Comment #16
geek-merlinAs of proposed solution #1:
Add a flag to the getTranslationFromContext() method that when set to TRUE (defaults to FALSE to maintain BC), returns NULL if no translation is found for the current language context.I came across this when implementing #1960684: Allow filtering by "Language (with fallback)" in search_api. From contrib i can not use getTranslationFromContext() to distiguish between "current translation is a fallback $langcode" and "$langcode does not have a fallback" (nor do i have another API). But for correct search API indexing, we need this.
Patch flying in that implements this. Not sure how to document the new paramter though, and what to do with the interface.
(Meta: Is this a hijack? Feel free to split this into its own issue.)
Comment #17
geek-merlin@plach in #13
> I spoke with Gabor about this and he reminded that at the time we introduced this functionality we were planning to let contrib take care of all the possible fallback alternative behaviors. Of course we are both fine with performing any core change needed to enable this to happen in contrib.
Matching this, i could not find any EntityRepository::loadEntityByConfigTarget unit tests for that fallback behavior. (but quite some webtests that may cover this)
Comment #18
geek-merlinThis is a complex issue. Contemplating a bit on this.
As of @berdir #11, i think the original sin is content_translation_language_fallback_candidates_entity_view_alter.
It removes fallbacks for entities that the current user does not have access to. But should access be mixed into fallbacks? I strongly feel no. It will create all kinds of inconsistencies.
And beyond all: Thinking of above case where we want to index entities. We surely do not want the index to depend on the indexing user. neither do we want to embody user1 for the indexing operation. Yes we can add an "index" operation (and curse that noone knows it). But the problem does not arise if in the first place we separate fallbacks from access.
Comment #19
geek-merlinComment #20
geek-merlin(#18...) Let's see what breaks on removal of content_translation_language_fallback_candidates_entity_view_alter()
EDIT: Also see @berdir's comments on that function in #2972308-22: Allow users to translate content they can edit
Comment #21
geek-merlinInteresting. We should investigate if the above is result of lack of tests for this code path or if this code path is indeed unnecessary as access is checked elsewhere too (which my gut feeling is).
As of patch #16:
> Add a flag to the getTranslationFromContext() method that when set to TRUE (defaults to FALSE to maintain BC), returns NULL if no translation is found for the current language context.
As we have no base class i can not see how we can do this in a BC way with an extra parameter (am i missing something?).
But OTOH we have a $context array where we can overload it there. Patch flying in that does this and removes said hook implementation.
Comment #22
geek-merlinCarried together arguments in the IS. See there why i deem this a bug (but do not want to bikeshed about that one).
Comment #23
geek-merlinComment #24
geek-merlinComment #25
geek-merlinComment #26
geek-merlinAlso i'd say this is broader than content_translation.module
Comment #28
joseph.olstadTHANK YOU ALL!!!
I have been searching for a fix to access_unpublished working with latest revision drafts for content that has a published revision. Without this core patch only the default language works correctly in this scenario (hashkey access using access_unpublished)
however with this core patch, it resolves my issue.
Prior to finding this I was trying to run backtraces but those seem to crater my server that only has 16 gigs of ram, I think I need about 64 gigs of ram just to be able to do a proper backtrace.
Honestly I spent all weekend working on this, and 2 nights this week, I was debugging like crazy trying to figure out what the heck was going on, didn't even really know what to look for, then I just started trying core patches that had the word 'moderation' and 'translation' in it.
This is the right fix!
Bumping up the priority on this, this fix is going to be used on a Drupal site for some very important and internationally famous persons!
Million thank yous! I tested this fix to work on Drupal 9.0.x dev. I'm going to be using this on a production system using either 8.7.10 or 8.8.0.
Comment #29
joseph.olstadComment #30
joseph.olstadto explain at a very high level how this resolves issues, during my testing without this core patch using the access_unpublished module with content moderation and pending draft revisions, the route and route_match in the latest revision check class of access_unpublished loads the entity based on route and route_match but the default language is loaded instead of the correct language. I'm not sure why but this access check processes the same entity several times in this scenario and which probably explains why a get translation won't work even if hard code loading the expected revision. there's several entities involved not just node entities that get processed, so this is why this core patch is needed in this case.
There's likely more fixing that this does internally, with this patch the access checks refer to the correct language revision and works as expected however without this patch no contrib code change would do. It's too deep into the internals of core.
I highly recommend this core patch. We just need tests for it.
Comment #32
joseph.olstadStill needs tests.
This patch is critical to many of my clients, for now it is in all of our builds
Comment #34
nwom commentedHere is a re-roll for 9.1.x.
Comment #36
anybodyThis might be related #2964383: Access denied on content translation and unpublished content with non-english fallback language so I tried Axels patch from #21 but it didn't change the situation. Anyway both issues seem to be closely related?
Comment #38
joseph.olstad@TODO:
review whether or not we still need the patch for Drupal 9.1.10 or the latest D9.2.x branch or 9.3.x
#2972308: Allow users to translate content they can edit
Comment #39
joseph.olstad@TODO:
review whether or not we still need the patch for Drupal 9.1.10 or the latest D9.2.x branch or 9.3.x
#2972308: Allow users to translate content they can edit
Comment #40
joey-santiago commentedIt surprises me a lot this issue hasn't had that much followers?
I'm afraid we are experiencing this in all of our sites, but just never realised. We just noticed it when doing some deeper testing than usual and at first we were thinking it was a problem with the content_access module that we are using. Instead it seems it's a wider thing, so we'll test the patch from #21 on 8.9.17.
If i'm reading the tests correctly, something should be added to web/core/tests/Drupal/KernelTests/Core/Entity/EntityTranslationTest.php in order to test this properly? I'll try doing my best there.
Comment #41
joey-santiago commentedI hope i managed to write a test for this? Here's my effort. So now i'll try patching :)
web/core/tests/Drupal/FunctionalTests/Entity/EntityTranslationPublishTest.php
The test fails for me ATM... i suspect because of a bug somewhere?
Comment #42
joey-santiago commentedSorry the test i added was definitely wrong. This is a better version i hope :)
Comment #43
joey-santiago commented... And finally a complete, real patch with this tests.
Comment #44
suresh prabhu parkala commentedTried to fix custom failures. Please review.
Comment #47
seanbI think #9 is a perfect summary of what is happening.
It would be great to have an option to select what kind of fallback behaviour is appropriate for a specific site. Site editors probably need different rules though. If a user has access to an unpublished/draft translations, we can probably not simply ignore any fallbacks.
Comment #48
murzI've figured out a problem with language fallback logic, which it always returns the default language, even if fallback to this language is disabled and this language is absent in the fallback candidates array.
Here is my issue about this: #3308838: EntityRepository::getTranslationFromContext() function forcibly adds default entity language to fallback candidates
The proper solution for that problem is just return
NULLfromEntityRepository::getTranslationFromContext, but this breaks a lot of other Drupal Core functions (and most likely Contrib too) that doesn't expect theNULLresponse from this function.Comment #49
anchal_gupta commentedRerolled patch 9.5x
Comment #51
joseph.olstadStill needing and using this patch in a few hundred Drupal 8 and 9.x sites, three years into this now.
Comment #52
d34dman commentedNOTE:
We observed similar issue when writing test for decoupled menu. @bbrala took a deep dive into what is happening and came up with a summary here https://www.drupal.org/project/drupal/issues/3227824#comment-14757962
Comment #53
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It either no longer applies to Drupal core, or fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
Apart from a re-roll or rebase, this issue may need more work to address feedback in the issue or MR comments. To progress an issue, incorporate this feedback as part of the process of updating the issue. This helps other contributors to know what is outstanding.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #54
joseph.olstadtry this, changes to the automated tests as per the bot suggestions.
Comment #55
joseph.olstadComment #56
joseph.olstadComment #57
joseph.olstadComment #58
joseph.olstadcode sniffer.
Comment #60
anybodyAdded #3067999: ProcessedText filter doesn't pass in correct language when using language fallback
Comment #62
kristiaanvandeneynde+1 for somehow making this configurable.
Currently we have a site where I would like to see @sdewitt's suggestion as explained by @plach in #9. I.e: No translation -> 404, unpublished translation -> 403
But I can definitely see websites where you want a fallback at all cost and then it would make more sense to have a fallback in both the above cases. The downside to a toggle is that if you flip it on a live website with existing data, all of a sudden your access logic gets turned upside down.
Query access seems to be fine, as missing or unpublished translations don't show up anyway (fulfulling scenario 1's desires) and the fallback always shows up for both scenarios.
Comment #63
kristiaanvandeneyndeWe ran into another demand for disabling language fallbacks, adding that here to emphasize the need to be able to do this. Currently, entity repository adds the default language as a fallback and there is no way to stop it from doing so as it runs after the fallback hooks.
We are currently relying on the buggy behavior explained in #2978048-33: Unpublished translations should fallback to the original language and #2978048-35: Unpublished translations should fallback to the original language as at least it is some way to make Drupal not use fallbacks. All we have to do then is make sure that whenever a new content entity is published, we create unpublished translations for it.
It would be far superior for Drupal to be consistent across the board and to allow developers to choose whether or not they even want fallbacks to happen. It's perfectly valid to say: "Deny access to translated versions of a node until it's translated and ready". Especially in legal contexts where you are not allowed by law to show certain content outside or inside of certain countries.
Our current work-around doesn't work perfectly, because ER field, the entity_upcast operation and the default entity_view operation all behave differently.
Comment #64
rcodinaPatch on #58 works like a charm on Drupal 10.3.5. In my case the problem was paragraph related: we have a paragraph published in Catalan and unpublished in English . However, before applying patch, it was still showing publicly in English.
Comment #66
joseph.olstadThanks @rdocina, only one fail now
https://git.drupalcode.org/issue/drupal-2951294/-/jobs/4065933
Comment #67
markwittens commentedComment #68
rcodinaWith latest merge from 11.x, all pipeline tests are all green.
Comment #69
a.dmitriiev commentedMR doesn't have all the changes from patch #58. Now the hook
content_translation_language_fallback_candidates_entity_view_alteris in the class, so MR actually does not remove it, but patch - does.Comment #70
norman.lolTrue, now the hook is in src/Hook/ContentTranslationHooks.php?ref_type=tags#L382-415 and the 11.x MR still has it, while the 10.x patch has the hook removed.
But all tests are green 🤔
Are the tests wrong? Or could the hook still stay?