Currently hreflang_page_attachments() acts on all routes. This is nice because it fills in gaps left by content_moderation module in regards to hreflang tag generation, for instance on view pages.
However, this also means that this module overrides the work that the content_moderation does when determining when to set hreflang tags on content entities. The work done in that module (see content_translation_page_attachments() ) is more nuanced than what is currently in this module. For instance it takes into account page access, which is quite a nice feature.
I propose that this module should expressly not act on content entities as that logic is handled by content_moderation. This is good because we're cutting down on duplicative logic, and also because content_moderation has more eyes on it, which means we should continue to get nice features, such as the aforementioned route access checking.
To reproduce this issue:
- Set up a site with more than one language, say English and Spanish.
- Install hreflang module
- Create a node with a single English translation
- View the node detail page and note that hreflang tags exist for both Spanish and English
- Disable hreflang module (or apply patch attached here) and reload the node detail page. You should only see an hreflang tag for english now.
| Comment | File | Size | Author |
|---|---|---|---|
| #28 | interdiff-25-28.txt | 1.05 KB | mfb |
| #28 | hreflang-2995378-28.patch | 9.23 KB | mfb |
Comments
Comment #2
zweishar commentedHere's the patch. This logic is borrowed directly from content_translation_page_attachments() and retrofitted for our needs here.
Comment #3
zweishar commentedHere's an updated patch that doesn't throw warnings when no parameters are available.
Comment #4
mfbI would say this issue should be addressed upstream in Drupal core.
This module uses the language switch links API provided by core to get the URLs for the page with a different user interface language. These links are also used for the language switch block (hreflang tags are more or less a machine-readable version of the language switch block).
If a core module or custom module needs to modify the language links, it can implement hook_language_switch_links_alter().
Comment #5
zweishar commentedWhen content_tranlsation module generates hreflang links it does not rely on the language switching link API. Instead it looks for existing translations of an entity and generates links based on that.
IMHO the issue with the language switching links API is that it generates links to translations regardless of whether or not the translation actually exists, but the way that API works is not the concern of this module.
Translation existence is something that can be checked for content entities. Deferring to core on that matter and filling in the gaps for other pages seems like a nice use of this module. As is this module clobbers what content_tranlsation does.
I know that "links to non-existent translations" is something that has come up in the issue queues for this module before; I'm not looking to drag up that topic if I can help it.
I just think that letting core do core and filling in the gaps is preferable, but it's up ultimately up to you as the maintainer.
Comment #6
mfbMany sites don't translate all content into all languages, so it's valid to link to a page with the user interface in a given language, but some content on the page using a fallback language. That's basically what the language switcher block ends up doing.
We could perhaps add a configuration to control this behavior if it's causing problems. But that configuration should arguably be in core since the language switcher should use the same logic.
Comment #7
mfbOh and... if the language URL with untranslated content is linked some other way aside from language switcher, e.g. views, or manually created links, then you have real SEO problems if the pages do not have hreflang tags.
Untranslated pages showing up in Google results is exactly why I had to make this module.
So really, the unwanted page would need to give a 404 [or 301] status to avoid SEO issues, not just disappear from language switcher.
Comment #8
zweishar commentedI guess the problem here is that there's no right or wrong answer to that. It comes down to the requirements of the site in question. I created this patch for a site where that's not acceptable. Instead they prefer to redirect to the homepage in the requested language if a node translation does not exist. This has a nice side effect of not allowing google to index untranslated pages.
I think a configuration option is a decent way to handle it. I also agree that that this configuration should really be in content_moderation as I believe that's the module that provides the language_switcher.
Comment #9
mfbThe language switcher is provided by Language module, see https://cgit.drupalcode.org/drupal/tree/core/modules/language/src/Plugin...
Comment #10
Adam_M commented@zweishar's substantive issue is getting lost underneath patches and opinions:
I would like to second this approach. Making Hreflang add on to gaps of core's hreflang handling makes much more sense than duplicating core's logic, only to then suppress core's output.
It would also obviate the need for proposals like #2978112 Remove Core hreflang tags.
In fact, in its capacity as an enhancer of core, Hreflang could help avoid the whole "show vs. don't show links to non-existent translations" argument by making this a setting. It could offer a checkbox to "Add links to untranslated content in language set". (Personally, I would never check this, but that's the beauty of it).
Comment #11
mfbThe problem is that on some sites, depending on the configuration of menu module, views module, etc., links to untranslated content are automatically generated, and therefore show up in search engine results.
The hreflang tags are then required so that search engines know how to index these URLs.
If a site can guarantee that there are no links to untranslated content then yes, we could defer to core here.
So at best, we could add a "defer to core" configuration which you could set to true, if desired. The default configuration would have to continue the status quo so there are no unexpected SEO problems re: untranslated URLs.
Comment #12
mfbSetting to "needs work" as I believe this needs a configurable "defer to core" option.
Comment #13
mfbComment #14
nedsbeds commentedThis patch does not work now that #2978112: Remove Core hreflang tags has been merged.
An updated patch is attached which reverts #2978112 so that the core functionality is not unset.
Comment #15
mfbbtw I created a task at #3245010: Workaround for content translation issue can be removed once support for Drupal 8.9 is dropped to remove the suppression of content_translation hreflang tags, as it should no longer be needed in core ≥ 9.1 thanks to #2796399: Problematic hreflang links when using entity as front page.
Comment #16
mfbok #3245010: Workaround for content translation issue can be removed once support for Drupal 8.9 is dropped is in, so this no longer applies
Comment #17
mfbAdding a related issue where Hreflang module and Content Translation module have different logic for adding hreflang tags: Hreflang module includes any query string in the hreflang tags; Content Translation module removes any query string from the hreflang tags.
Comment #18
mfbBy the way, now that this module has configuration (thanks to another issue), it would actually be super easy to add configuration and tests for this issue as well, if anyone is interested :)
Comment #19
bahbka commentedRerolled last patch against latest 8.x, with few additions:
Please review.
Comment #20
bahbka commentedComment #21
bahbka commentedI've just realised that x-default tag, for content, was missing when defer_to_content_translation option was enabled.
Here is an updated patch with tests that fixes this gap.
Comment #22
mfbIt seems ok to add x-default even if the content only has one language, as you might want the page to show up in search results for folks in other languages, if that is the only available translation? E.g. if someone with a different primary language was searching for this issue page and I hadn't translated it yet :) In any case, the logic is wrong - you need to wrap it in count() - but I think it'd be ok to remove this clause.
Comment #23
mfbBy the way, it's possible that folks will request more control over the x-default tag when they do have multiple translations - e.g. maybe they want it to point at their site's default language rather than the content's original language. But we could probably wait on that until someone files a feature request? :)
Comment #24
bahbka commentedRemoved language count check. I agree that there should be an option to set x-default to the site default language and not to the content source language, but it should be a part of the separate ticket.
Comment #25
mfbThis is the same patch but changes some strings. I tried to add some more explanation of the setting based on my understanding.
Comment #27
mfbIt looks like there is some kind of concurrency issue re: sqlite ("database is locked") so I modified CI to use postgres.
Comment #28
mfbI couldn't see a reason to call
setOption('language', $entity->getUntranslated()->language())twice so I removed one of the calls. Also changedtoUrl('canonical')totoUrl()because canonical is the default as far as I understand. Also fixed indentation typo.Comment #29
bahbka commentedLooks good for me.
Comment #30
mfbDoes anyone have thoughts about what the defer_to_content_translation setting for new installs should be? I'm fine with leaving it at FALSE. Could also enable it if there is a sense that most sites using Content Translation want it enabled.
My personal experience on a few sites was that it took a while to translate all content, but we did translate some parts of the interface, and since links to the untranslated pages were automatically generated by views and the site navigation, it was useful to have hreflang tags on not-yet-translated content.
Comment #32
mfb