I found this bug while using the search_api_solr module, but I think the problem should be fixed in the search_api module.

Problem/Motivation

Taxonomy terms are not indexed as expected for multilingual sites.

Steps to reproduce:
1) Enable the Search API, Search API Solr, Views, Language and Content Translation modules...
2) Add a new language (e.g. Spanish).
3) Create a server and an index (datasource: Content).
4) Add a new term to the Tags vocabulary ("foo EN").
5) Translate the term ("foo ES").
6) Add the field Tags » Taxonomy term » Name and the node title to the index.
7) Create one node in English and add the tag "foo EN".
8) Add the translation for this node and add the tag "foo ES".
9) Index the content.

The title is indexed correctly:

EN
"ss_5f_title":"test EN"

ES
"ss_5f_title":"test ES"

But the name of the term is not indexed as expected:

EN and ES
"ss_5f_name":"foo EN"

I think that the term should be indexed using the active langcode.

Proposed resolution

For nested values, Utility::extractFields() ends up using x-default as the active langcode, because a term loaded via $item_nested->getTarget() has x-default as its active langcode by default. When this item (the term) is then passed as the argument to the recursive self::extractFields() call, we obtain the value in EN, because getTranslatedField() uses the active langcode of the term (x-default) and not that of the node.

The attached patch fixes the problem, but I'm not sure this is the best way. I've tried getting the langcode from the parents, among other approaches, without success so far...
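The behaviour described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: SketchEntity and sketch_extract_nested() are hypothetical stand-ins invented for illustration, not the Drupal entity API or the actual Utility::extractFields() code. It shows why the referenced term comes back in its default language unless the parent's langcode is passed along.

```php
<?php

// Hypothetical stand-in for a translatable entity; not the Drupal entity API.
final class SketchEntity {

  /** @var array Field values keyed by langcode, then by field name. */
  private array $translations;

  private string $activeLangcode;

  public function __construct(array $translations, string $default_langcode) {
    $this->translations = $translations;
    $this->activeLangcode = $default_langcode;
  }

  public function language(): string {
    return $this->activeLangcode;
  }

  // Returns a copy of this entity with a different active langcode.
  public function getTranslation(string $langcode): self {
    $translation = clone $this;
    $translation->activeLangcode = $langcode;
    return $translation;
  }

  public function get(string $field) {
    return $this->translations[$this->activeLangcode][$field];
  }

}

// Follows a reference and reads a field of the target. When $langcode is
// given, the target is switched to that translation before reading.
function sketch_extract_nested(SketchEntity $item, string $reference, string $field, ?string $langcode = NULL) {
  $target = $item->get($reference);
  if ($langcode !== NULL) {
    $target = $target->getTranslation($langcode);
  }
  return $target->get($field);
}

$term = new SketchEntity([
  'en' => ['name' => 'foo EN'],
  'es' => ['name' => 'foo ES'],
], 'en');
$node = new SketchEntity([
  'en' => ['title' => 'test EN', 'tags' => $term],
  'es' => ['title' => 'test ES', 'tags' => $term],
], 'en');
$node_es = $node->getTranslation('es');

// Without a langcode, the term stays in its default language.
echo sketch_extract_nested($node_es, 'tags', 'name'), "\n";
// Passing the parent's langcode yields the translated name.
echo sketch_extract_nested($node_es, 'tags', 'name', $node_es->language()), "\n";
```

In this model the first call prints "foo EN" even though the node is the Spanish translation, which mirrors the indexed values shown in the summary; the second call prints "foo ES".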

Comments

marthinal created an issue. See original summary.

marthinal’s picture

Issue summary: View changes
borisson_’s picture

Status: Needs review » Needs work

I think this will need additional test coverage. I also found a couple of smaller things that should be changed.

  1. +++ b/src/Utility.php
    @@ -184,8 +184,16 @@ class Utility {
    +   * @param string $active_langcode
    +   *   The active langcode obtained from the parent item.
    

    string|NULL

  2. +++ b/src/Utility.php
    @@ -184,8 +184,16 @@ class Utility {
    +  public static function extractFields(ComplexDataInterface $item, array $fields, $active_langcode = null) {
    

    This should be NULL instead of null.

  3. +++ b/src/Utility.php
    @@ -184,8 +184,16 @@ class Utility {
    +    if(!$active_langcode) {
    +      $language = $item->getValue()->language();
    +      $active_langcode = $language->getId();
    +    }
    

    I think it's better to do if ($active_langcode === NULL) {. The $active_langcode can't be FALSE according to the @param.

  4. +++ b/src/Utility.php
    @@ -215,16 +223,21 @@ class Utility {
    +          if ($active_langcode) {
    +            $entity = $item_nested->getValue();
    +            $entity = $entity->getTranslation($active_langcode);
    +            $item_nested->setValue($entity);
    +          }
    

    I don't think we need this if statement; the langcode should always be set.

drunken monkey’s picture

Thanks for creating this issue!
You're right, this does seem like a problem. Thanks for also already creating a patch!

+++ b/src/Utility.php
@@ -184,8 +184,16 @@ class Utility {
+  public static function extractFields(ComplexDataInterface $item, array $fields, $active_langcode = null) {

I'd just call it $langcode, I don't think the adjective is necessary.

More importantly, though, this new code will fail (unless I'm very much mistaken) for non-entity items, where getValue() doesn't have to have a language() method – it doesn't even need to be an object. So I fear the code has to get a bit more complicated.
Maybe there is even a common interface for things which know their language (or, ideally, even for those with getTranslation()), so it can at least be more or less clean, even if complicated.
Also, if we do #2641392: Review our language/translation support, we could just always pass the language code when calling this, instead of having to magically determine it in the outer-most call.
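A minimal sketch of the kind of guard described above, assuming some interface exists that marks language-aware values. SketchTranslatable, SketchTerm and sketch_detect_langcode() are hypothetical names used only for illustration; whether a suitable Core interface can be used in their place would need to be checked against the Core API.

```php
<?php

// Hypothetical stand-in for "things which know their language".
interface SketchTranslatable {

  public function language(): string;

}

// Hypothetical language-aware value, e.g. an entity.
final class SketchTerm implements SketchTranslatable {

  private string $langcode;

  public function __construct(string $langcode) {
    $this->langcode = $langcode;
  }

  public function language(): string {
    return $this->langcode;
  }

}

// Returns the langcode of $value, or NULL when $value is not language-aware
// (for example a plain array or string from a non-entity datasource).
function sketch_detect_langcode($value): ?string {
  return $value instanceof SketchTranslatable ? $value->language() : NULL;
}

var_dump(sketch_detect_langcode(new SketchTerm('it')));
var_dump(sketch_detect_langcode(['title' => 'not an entity']));
```

The instanceof check means that values which are not objects at all never have a method called on them, which is the failure case raised for non-entity items.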

However, all in all this does kinda look like a Core bug. Would you agree?
I mean, if I have an Italian node and want its related taxonomy terms, why would I ever want them in English?
So maybe we should open an issue in Core for this and see how that goes. (Probably we'll need to do at least a work-around, though, in any case. Slim chance it'll get fixed soon, I'd say.)

drunken monkey’s picture

I've created a Core issue for this now: #2702909: Entities accessed through properties are always in the current content language. Let's see whether someone responds.

I also noticed that the problem is a bit better and a bit worse simultaneously: the indexed language is actually the current page's content language, so if you use a Dutch UI to edit Dutch content (and items are immediately indexed), the indexed data should be fine. On the other hand, this makes the bug even less predictable, since it might work for some items but not for others, depending on the content language of the page request during which they're indexed.

Finally, here is an overhauled patch, with increased stability and also some kernel tests. It needs to be applied on top of #2641392-12: Review our language/translation support, though (I think).

drunken monkey’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 5: 2684465-5--translated_related_entities.patch, failed testing.

drunken monkey’s picture

The last submitted patch, 9: 2684465-9--translated_related_entities--tests_only.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 9: 2684465-9--translated_related_entities.patch, failed testing.

drunken monkey’s picture

The last submitted patch, 12: 2684465-12--translated_related_entities--tests_only.patch, failed testing.

zalak.addweb’s picture

Issue tags: +taxonomy terms
borisson_’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: -taxonomy terms

  • drunken monkey committed e591b33 on 8.x-1.x
    Issue #2684465 by drunken monkey, marthinal: Fixed indexing of related...
drunken monkey’s picture

Status: Reviewed & tested by the community » Fixed

Thanks for reviewing (and cleaning up the issue tags)!
Committed.
Thanks again, everyone here!

Boobaa’s picture

Status: Fixed » Needs work

Sorry, I have to reopen this. Using drupal-8.1.x (as of a734cfae5dba958e3922b9291795b72b50a89f5a, which is equivalent to 8.1.9), search_api-8.x-1.0-beta1+10-dev (from 2016-09-17), search_api_solr-8.x-1.0-alpha6+8-dev (from 2016-09-19) and facets-8.x-1.0-alpha4+16-dev (from 2016-09-12, with the patch from #2794745-39: Use Search API's display plugin to fix facets), I cannot translate nodes any more. All I get is an exception:

Uncaught PHP Exception Drupal\\Core\\Entity\\EntityStorageException: "Invalid translation language (hu) specified." at …/d8/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorage.php line 770, referer: http://d8.local/hu/node/3/translations/add/en/hu

After quite a bit of debugging it turned out this was caused by this change in search_api/src/Item/Item.php:

-             Utility::extractFields($this->getOriginalObject(), $fields_by_property_path);
+            Utility::extractFields($this->getOriginalObject(), $fields_by_property_path, $this->getLanguage());

With this single change to this single line reverted, I could translate nodes again. (I suppose having the search_api_solr and facets modules in the picture doesn't affect this, but I haven't confirmed that yet.)

audriusb’s picture

I get this error with no languages installed.
InvalidArgumentException: Invalid translation language (und) specified. in Drupal\Core\Entity\ContentEntityBase->getTranslation() (line 745 of core/lib/Drupal/Core/Entity/ContentEntityBase.php).

ksavoie’s picture

This was working previously; the error appeared after updating to the latest versions:
Drupal 8.1.10
Search API 8.x-1.0-beta2
Search API Exclude Entity 8.x-1.0-alpha1
Search API Solr Search 8.x-1.0-alpha6

Indexing fails with the same error audriusb is receiving.

Type: search_api
InvalidArgumentException: Invalid translation language (und) specified. in Drupal\Core\Entity\ContentEntityBase->getTranslation() (line 745 of /var/www/html/my_site/core/lib/Drupal/Core/Entity/ContentEntityBase.php).
Severity: Error
audriusb’s picture

Yeah, I rolled back to beta1. Uninstalling the module and then configuring it again did not solve the problem, so I believe this isn't just a missing upgrade path.

drunken monkey’s picture

Could you provide steps for reproduction from a clean install of the latest module versions?

Dropa’s picture

Status: Needs work » Needs review

This should fix the problem with undefined or not applicable languages.

david.gil’s picture

  • drunken monkey committed 2563d84 on 8.x-1.x authored by Dropa
    Issue #2684465 by Dropa, david.gil, drunken monkey: Fixed indexing of...
drunken monkey’s picture

Status: Needs review » Fixed

Oh, yes, looks like we indeed need to check for that. Just because, e.g., a node has been translated to German, it doesn't mean (all of) its taxonomy terms have been translated, too.
I'd say the same check is needed a few lines above, too, though.

Made that change and committed. Thanks a lot, both of you!
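The check described above can be sketched as follows. This is a simplified, self-contained model rather than the committed change: SketchTranslatableTerm and sketch_translate_if_available() are hypothetical stand-ins, and the exception is only modelled on the message reported earlier in this thread.

```php
<?php

// Hypothetical stand-in for a translatable term; not the Drupal entity API.
final class SketchTranslatableTerm {

  /** @var array Term names keyed by langcode. */
  private array $names;

  private string $activeLangcode;

  public function __construct(array $names, string $default_langcode) {
    $this->names = $names;
    $this->activeLangcode = $default_langcode;
  }

  public function hasTranslation(string $langcode): bool {
    return isset($this->names[$langcode]);
  }

  public function getTranslation(string $langcode): self {
    if (!$this->hasTranslation($langcode)) {
      // Modelled on the error message quoted in the comments above.
      throw new InvalidArgumentException("Invalid translation language ($langcode) specified.");
    }
    $translation = clone $this;
    $translation->activeLangcode = $langcode;
    return $translation;
  }

  public function getName(): string {
    return $this->names[$this->activeLangcode];
  }

}

// Switches to the requested translation only when it exists; otherwise the
// entity is returned unchanged, in whatever language it is currently in.
function sketch_translate_if_available(SketchTranslatableTerm $entity, ?string $langcode): SketchTranslatableTerm {
  if ($langcode !== NULL && $entity->hasTranslation($langcode)) {
    return $entity->getTranslation($langcode);
  }
  return $entity;
}

$untranslated = new SketchTranslatableTerm(['en' => 'foo EN'], 'en');
$translated = new SketchTranslatableTerm(['en' => 'foo EN', 'es' => 'foo ES'], 'en');

echo sketch_translate_if_available($untranslated, 'de')->getName(), "\n";
echo sketch_translate_if_available($untranslated, 'und')->getName(), "\n";
echo sketch_translate_if_available($translated, 'es')->getName(), "\n";
```

With the guard in place, a langcode such as "de" or "und" that the term has no translation for falls back to the term as loaded instead of raising the exception.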

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.