Symptom

Filtering for referenced entities stops working.

Problem

After updating Search API from 7.13 to 7.18 on a site (due to the security release), we ran into a regression. The site uses the Elasticsearch Connector module.
After some debugging, I figured out that the reason is that a single entity reference field is now indexed as a JSON object (thus having keys and values) and not as a JSON array (having only the IDs, without keys). The field is indexed as a list of integers in Search API, although it is a single entity reference based on a field. Thus the normal entity wrapper just returns the single entity ID. However, somewhere on the Search API processing side it gets converted to a list of integers, and not into a properly numerically indexed array but into an associative array with the entity ID as the key as well.
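
To illustrate the difference, here is a minimal sketch in Python rather than PHP. It is only an analogy, not the actual Search API or Elasticsearch Connector code: a dict stands in for a PHP array keyed by the entity ID, a list stands in for a numerically indexed PHP array, and the field name field_ref, the ID 42, and the helper encode_field are made up for the example.

```python
import json


def encode_field(name, value):
    """Serialize one field of a document to JSON, as a backend might."""
    return json.dumps({name: value})


# Hypothetical value for a single entity reference with ID 42.
keyed_value = {42: 42}  # stands in for an array keyed by the entity ID
list_value = [42]       # stands in for an array with keys 0..n-1

# The keyed value becomes a JSON object, the list a JSON array.
print(encode_field("field_ref", keyed_value))  # {"field_ref": {"42": 42}}
print(encode_field("field_ref", list_value))   # {"field_ref": [42]}
```

A filter that expects a list of integers will match the second form but not the first, which would explain why filtering on the referenced entity stops working.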

Unfortunately, I did not figure out where Search API does that conversion. So the attached patch just works around the issue by post-processing the values, but that does fix the problem.

Interestingly, single entity references based on an entity property seem to work still.


Comments

fago created an issue. See original summary.

fago

Status: Active » Needs review
FileSize
848 bytes
fago

Issue summary: View changes
drunken monkey

Are you using the "Index hierarchy" data alteration? Could that be what changes the indexed values?
Then it would seem we just need to apply that fix there.
On the other hand, I think an indexed field's values should never be an associative array, so your patch might still make sense – even minus the $is_entity in the condition. However, it's of course a larger change then, and thus more likely to break some special use cases somewhere else.
Maybe the Elasticsearch backend should instead just ensure that all field values are properly indexed?

In any case, I'm happy to see we already thought of that in the D8 version and make sure all field values have their normal numerical indexes.

drunken monkey

Status: Needs review » Postponed (maintainer needs more info)
Island Usurper

Status: Postponed (maintainer needs more info) » Active

I have the same issue, and I believe you're right about the "Index hierarchy" alteration being the problem. My nodes have several term references, but only the ones with that enabled cause problems when indexing.

Using git bisect I was able to determine that the commit that broke things for me was 21e7d623d6e3fb7d913cc8c9031ffa2f52bb0dfd which is the follow-up patch on #2450333. I'm really not sure how to fix it, though.

drunken monkey

Status: Active » Needs review
FileSize
482 bytes

OK, thanks for finding that out, that's already good to know! Although this makes it seem like the "Index hierarchy" data alteration isn't the problem after all? So, it also worked for you before the initial commit of #2450333: Increase performance of indexing entity references? Then maybe just reverting that whole issue would be the best option after all. It seems that it caused more problems than it solved, on the whole.

And I guess you are also using Elasticsearch? I still think the best solution would be for that module to ensure the data looks the way it expects, since (as far as I remember) we don't document or guarantee anywhere that the values will be numerically indexed. That would also be the safest option regarding future changes or other unexpected scenarios.

Also: Does the patch in #2 solve the problem for you, too?
And what about the attached patch?

In any case, thanks for helping me fix this!

Island Usurper

Yes, I'm using Elasticsearch. I did wonder whether the fix would be best in search_api or there, but then I found this issue. The patch in #7 is what I came up with myself, and it seems to work very well. I haven't tried #2, but it looks like the same fix, just in a different place.

I would support moving this issue to Elasticsearch Connector, since its use of JSON means the data format is more finicky than other implementations that would just loop over it.

Island Usurper

As for reverting #2450333, that might make sense too. I expect it would be rare to index a nested entity's ID but not any of its other fields. To index any other fields, you'd have to load the nested entity anyway. On my site, I've converted taxonomy pages into search pages that automatically filter on the term reference ID. But I'm also indexing the terms' names so that keyword searches work too.

  • drunken monkey committed a5ba346 on 7.x-1.x
    Issue #2749963 by drunken monkey: Fixed "Index hierarchy" not having...
drunken monkey

Status: Needs review » Fixed

OK, thanks a lot for testing! Committed #7.
Regarding the ES Connector module: Please just create a new issue for that, saying that they should check that any value arrays are numerically indexed before converting to JSON.
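
A rough sketch of the kind of check meant here, again in Python as an analogy and not the Elasticsearch Connector's actual code (in PHP, array_values() does this kind of reindexing). The helper names reindex_values and encode_document and the sample field names are hypothetical, and the sketch assumes that field values are flat lists of scalars, so a keyed value never represents a legitimate nested object.

```python
import json
from typing import Any, Dict


def reindex_values(value: Any) -> Any:
    """Drop the keys of a keyed value so it serializes as a JSON array."""
    if isinstance(value, dict):
        return list(value.values())
    return value


def encode_document(fields: Dict[str, Any]) -> str:
    """Normalize every field value, then serialize the document to JSON."""
    normalized = {name: reindex_values(value) for name, value in fields.items()}
    return json.dumps(normalized)


# The keyed value {42: 42} is sent as [42]; other values pass through.
print(encode_document({"field_ref": {42: 42}, "title": "Example"}))
```

Doing this in the backend right before serialization keeps it independent of which data alteration or processor produced the keyed values.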

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.