This fixes an issue where a Search API Processor is adding fields from multiple entities to the index, but the related entity's field definition is returning a cardinality of 1.

Possibly fixes #3050475 in 4.x if the hook is implemented for Reverse Entity References.

Comments

KarlShea created an issue. See original summary.

karlshea’s picture

Status: Active » Needs review
StatusFileSize
new2.21 KB
mkalkbrenner’s picture

Status: Needs review » Needs work
+++ b/src/Plugin/search_api/backend/SearchApiSolrBackend.php
@@ -2084,7 +2084,15 @@ class SearchApiSolrBackend extends BackendPluginBase implements SolrBackendInter
+                    $cardinality = $this->getPropertyPathCardinality($property_path, $datasource_definitions);
+
+                    // Let other modules alter cardinality.
+                    $cardinality = $this->moduleHandler->alter('search_api_solr_field_cardinality', $cardinality, $property_path, $datasource_definitions);

The hook should be invoked within function getPropertyPathCardinality()

Two more things should be dicussed in general.
Isn't it better to introduce events instead of new hooks.
Can you provide a concrete example when this "hook" is needed? I ask myself if getPropertyPathCardinality() itself could be improved.

karlshea’s picture

Hey thanks for taking a look! #3059170-6: Referencing of User Entities via "Group Membership" using referencing entity is an example of where I'm hitting this issue (and the linked sandbox project shows how I'm handling it with the hook), and the patch at #3050475-21: SearchApiSolrBackend is setting incorrect prefix to Search API reverse entity references is hard-coding a fix for reverse entity references (since this is broken again in 4.x).

In summary, let's say a custom processor is adding an entity type property (exactly like ReverseEntityReferences). Even if is_list is set on that property, getPropertyPathCardinality() is only looking at the target field storage definition to determine cardinality. So then this happens:

[indexed entity] ---> [multiple entities from processor] ---> [single-value field]

getPropertyPathCardinality() returns "1" in this situation for the final field, even though when indexing actually happens there are multiple entities added by the processor, so Solr errors out giving an error about needing a multi-valued field.

> The hook should be invoked within function getPropertyPathCardinality()

I initially thought that too, but getPropertyPathCardinality() is recursive and in order to fix the problem you need to look at the full property path, not the nested path.

It could probably also be an event? I was just following the existing patterns in the module.

One thing that occurred to me just now is that maybe getPropertyPathCardinality() needs to short-circuit any time it hits an is_list property and immediately return CARDINALITY_UNLIMITED? That would probably fix the issue without the hook.

karlshea’s picture

> One thing that occurred to me just now is that maybe getPropertyPathCardinality() needs to short-circuit any time it hits an is_list property and immediately return CARDINALITY_UNLIMITED? That would probably fix the issue without the hook.

I just tried this and it didn't work. It looks like getPropertyPathCardinality() only runs before the processors, so it never actually checks anything a processor adds and always returns "1".

Maybe it needs to run again (or only after) all the processors with the "add_properties" stage have run?

mkalkbrenner’s picture

Category: Feature request » Support request
Status: Needs work » Fixed

You can simply use the existing hook hook_search_api_solr_field_mapping_alter(). Just prefix the field name as required. For example "sm_" stands for "String Multiple":

**
 * Change the way the index's field names are mapped to Solr field names.
 *
 * @param \Drupal\search_api\IndexInterface $index
 *   The index whose field mappings are altered.
 * @param array $fields
 *   An associative array containing the index field names mapped to their Solr
 *   counterparts. The special fields 'search_api_id' and 'search_api_relevance'
 *   are also included.
 * @param string $language_id
 *   The language ID that applies for this field mapping.
 */
function hook_search_api_solr_field_mapping_alter(\Drupal\search_api\IndexInterface $index, array &$fields, string $language_id) {
  $fields['foo'] = 'sm_foo';
}

karlshea’s picture

Category: Support request » Feature request
Status: Fixed » Active

Sorry, but that does not work:

function my_module_search_api_solr_field_mapping_alter(\Drupal\search_api\IndexInterface $index, array &$fields, string $language_id) {
  // Full field path is "search_api_reverse_group_content_entity_references_group_content__entity_id:gid"

  // $fields['gid'] is 'its_gid' here.

  $fields['gid'] = 'itm_gid';
}

Solr output:

{
  "its_uid":58,
}

This is an option to fix #3050475 without hard-coding "search_api_reverse_entity_reference" in `getPropertyPathCardinality()` so that other modules can do the same thing Reverse Entity References (which is currently broken) are trying to do.

mkalkbrenner’s picture

uid?

karlshea’s picture

Oops, copied the wrong field! It does look like it's working, it has both!

{
  "its_gid":3,
  "itm_gid":[3],
}

It looks like that hook is also called for a view, so I'm assuming the view is using the correct re-mapped field as well?

It sounds like your suggestion to fix #3050475: SearchApiSolrBackend is setting incorrect prefix to Search API reverse entity references is:

function search_api_solr_search_api_solr_field_mapping_alter(\Drupal\search_api\IndexInterface $index, array &$fields, string $language_id) {
  foreach ($index->getFields() as $key => $field) {
      $propertyPath = $field->getPropertyPath();

      // Look for strings starting with "search_api_reverse_entity_references_" and alter each field's cardinality by replacing "s" with "m" in each prefix?
  }
}

I spent a lot of time months ago figuring out what was going wrong with Reverse Entity References and also a module I was writing that did something similar, and it seemed like there needed to be a better way to fix the issue with child fields. I'd be happy to change how this patch works (or how a patch for #3050475 would work) but I'm out of ideas about what the right approach is from the perspective of the module maintainers.

If this is the direction you're suggesting I can change the patch, but it just seems incredibly fragile.

karlshea’s picture

This is what I've got to fix Reverse Entity References—is this what you were thinking?


/**
 * Implements hook_search_api_solr_field_mapping_alter().
 */
function search_api_solr_search_api_solr_field_mapping_alter(\Drupal\search_api\IndexInterface $index, array &$fields, string $language_id) {
  foreach ($index->getFields() as $search_api_name => $field) {
    $property_path = $field->getPropertyPath();

    if (0 === strpos($property_path, 'search_api_reverse_entity_references_')) {
      $type = $field->getType();
      $type_info = Utility::getDataTypeInfo($type);
      $pref = $type_info['prefix'] ?? '';

      if (strpos($pref, 't') !== 0) {
        $fields[$search_api_name] = Utility::encodeSolrName($pref . 'm_' . $search_api_name);
      }
    }
  }
}


mkalkbrenner’s picture

Title: Hook to alter field cardinality » Improve field cardinality detection
Status: Active » Needs review
StatusFileSize
new4.42 KB

The hook implementation seems to be right. But obviously it is an individual solution. It would be better to have a generic solution within the module that also doesn't depend on a hardcoded name.

By reviewing the code, I wonder if we just need to change the order and give "lists" precedence over the storage cardinality.

mkalkbrenner’s picture

StatusFileSize
new1.68 KB

Uploaded wrong patch :-(

Here is the right one.

karlshea’s picture

This is the issue I'm seeing now:

[$key, $nested_path] = SearchApiUtility::splitPropertyPath($property_path, FALSE);

// $key: 'search_api_reverse_group_content_entity_references_group_content__entity_id'
// $nested_path: 'gid'

if (isset($properties[$key])) {

So isset($properies[$key]) always fails.

What's strange is that $properties['gid'] is never true, but "its_gid" shows up in the index. It's almost as if this is never run after the processor adds the fields.

karlshea’s picture

Maybe a little closer:

Custom processor properties are added in search_api's Entity\Index::getPropertyDefinitions($datasource_id).

The only place getPropertyPathCardinality() is called is in SearchApiSolrBackend ($this->getPropertyPathCardinality($field->getPropertyPath(), $datasource->getPropertyDefinitions())).

However, search_api's Plugin\search_api\datasource\ContentEntity::getPropertyDefinitions() (which is what gets passed to getPropertyPathCardinality() doesn't return the property list from Entity\Index. It only returns field definitions from Entity Field Manager—none of the processor's fields will ever be there.

That must be why the other issue's patch was just hard-coding the key path name, because the field definition from the processor will never exist in the array.

karlshea’s picture

And this seems to be where the logic may need to be improved. SearchApiSolrBackend::formatSolrNames:


// $field here is 'gid'. The field itself in the content entity is NOT a list, so the data definition will return false. It should only be a list because because ReverseEntityReferences returns multiple values when it's indexed.

if ($field->getDataDefinition()->isList() || $this->isHierarchicalField($field)) {
  $pref .= 'm';
}
else {

  // Falls through.

  try {
	$datasource = $field->getDatasource();
	if (!$datasource) {
	  // Be paranoid: getDatasource() should have thrown the
	  // exception already.
	  throw new SearchApiException();
	}

        // $datasource->getPropertyDefinitions() is from ContentEntity, which does not include processor fields, so there can't be a check.

	$pref .= $this->getPropertyPathCardinality($field->getPropertyPath(), $datasource->getPropertyDefinitions()) != 1 ? 'm' : 's';
  }

The other issue is that even if the field were there (the definition for 'gid' is in $fields, which is being looped over), it's checking 'gid', which is NOT a list. The 'is_list' => TRUE is set in ReverseEntityReferences, which 'gid' is nested through, but that property doesn't appear to be checked anywhere.

It almost seems like where I commented "Falls through" that entire try/catch where it checks by datasource property definitions needs to be replaced with

$index_properties = $index->getPropertyDefinitions($field->getDatasourceId()); // Returns the correct list of field definitions including processor-added properties.
$pref .= $this->getPropertyPathCardinality($field->getPropertyPath(), $index_properties) != 1 ? 'm' : 's';

If I replace that I get

{
  "its_gid":14,
  "itm_gid":[14],
}
karlshea’s picture

StatusFileSize
new3.29 KB

And here's the full patch.

mkalkbrenner’s picture

StatusFileSize
new2.76 KB

It almost seems like where I commented "Falls through" that entire try/catch where it checks by datasource property definitions needs to be replaced with

$index_properties = $index->getPropertyDefinitions($field->getDatasourceId()); // Returns the correct list of field definitions including processor-added properties.
$pref .= $this->getPropertyPathCardinality($field->getPropertyPath(), $index_properties) != 1 ? 'm' : 's';

This seems to be the missing piece. Good catch!

But your patch doesn't take into account that $index->getPropertyDefinitions() could throw the same exception.

This should be the right patch.

mkalkbrenner’s picture

StatusFileSize
new3.15 KB

Unfortunately the tests failed:
https://github.com/mkalkbrenner/search_api_solr/runs/1405216683

The problem is that all base fields are "potential" lists:

class BaseFieldDefinition extends ListDataDefinition

So this "safety net" check needs to go last as it was before.

  • mkalkbrenner committed 16a992a on 4.x
    Issue #3153385 by mkalkbrenner, KarlShea: Improve field cardinality...
  • mkalkbrenner committed d6e4212 on 4.x
    Issue #3153385 by mkalkbrenner, KarlShea: Improve field cardinality...
mkalkbrenner’s picture

Wether if the last patch solves the initial issue or not, it is already an important improvement. I therefore committed it.

Any further patch schould base on the current dev version.

karlshea’s picture

Category: Feature request » Bug report
Status: Needs review » Fixed

This does fix the problem for me, so I think this can be closed. I'll also update the other issue.

Thanks so much for working through this with me, glad we figured it out!

mkalkbrenner’s picture

Thank you!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

johnlutz’s picture

Uploading a patch to backport for 8.x-1.x version. This is needed by us because we are on acquia and to my knowledge they don't yet support the 4.x version.