When the Highlight processor is set to build the excerpt from the Rendered HTML output field (rendered_item), the excerpt shown in the view's search results is empty. Excerpts work for the Title and Body fields.

I have seen two related issues, https://www.drupal.org/node/2703915 and https://www.drupal.org/node/2305021, both of which are already committed in the latest dev version.

I have debugged as far as this function in the file src/Plugin/ProcessorPluginBase.php:

protected function extractItemValues(array $items, array $required_properties, $load = TRUE) {

I don't know how rendered_item is extracted there, but the value comes back empty.

I have also attached screenshots of my Highlight processor config and Rendered HTML config.

Comments

drunken monkey’s picture

Component: General code » Plugins

rendered_item cannot be extracted/computed without an associated field, so it won't work at query time (i.e., for highlighting). Same, e.g., for "Aggregate field".
However, I guess there's not really any reason not to just use the first field with that property and its configuration. For most users, the distinction won't be clear in any case.

Unfortunately, I'm currently too busy trying to get this module to Beta, and this is something that can even be fixed after a stable release, so I'll have to postpone this a bit.

That said, I'd be glad to give pointers to anyone who wants to tackle this. Basically, in this code in \Drupal\search_api\Processor\ProcessorPluginBase::extractItemValues():

          if ($property instanceof ProcessorPropertyInterface) {
            $field_info = array(
              'datasource_id' => $datasource_id,
              'property_path' => $property_path,
            );
            if ($property instanceof ConfigurablePropertyInterface) {
              $field_info['configuration'] = $property->defaultConfiguration();
            }
            $processor_fields[] = Utility::createField($this->index, $combined_id, $field_info);
            $needed_processors[$property->getProcessorId()] = TRUE;
          }

Just look for a field with that property on the index first, and only create a dummy field object if none can be found.
Then, please also add a test case (to the highlight test and/or the aggregated fields test) to check whether this is working correctly. (That's actually probably the longer part, and why I can't just implement this right away.)
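
The lookup described above (reuse a field that already exists on the index, and create a dummy only as a fallback) can be sketched in isolation roughly as follows. This is a minimal, self-contained illustration and not the committed patch: SketchField and find_or_create_field() are hypothetical names that stand in for the real Search API classes, and matching on datasource ID plus property path is an assumption about what "a field with that property" means.

```php
<?php

/**
 * Hypothetical stand-in for an indexed field (not the Search API class).
 */
final class SketchField {

  public function __construct(
    public string $id,
    public ?string $datasourceId,
    public string $propertyPath,
    public array $configuration = []
  ) {}

}

/**
 * Returns the first matching field on the index, or a dummy field.
 *
 * @param SketchField[] $indexFields
 *   The fields currently defined on the (hypothetical) index.
 */
function find_or_create_field(array $indexFields, ?string $datasourceId, string $propertyPath, array $defaultConfiguration): SketchField {
  foreach ($indexFields as $field) {
    // Assumption: a field "has the property" when both values match.
    if ($field->datasourceId === $datasourceId && $field->propertyPath === $propertyPath) {
      // Reuse the real field so its saved configuration is applied.
      return $field;
    }
  }
  // Nothing on the index: fall back to a dummy with default configuration.
  return new SketchField('dummy_' . $propertyPath, $datasourceId, $propertyPath, $defaultConfiguration);
}

// Usage: the index has a configured "rendered_item" field but no
// "aggregated_field" field.
$fields = [
  new SketchField('title', 'entity:node', 'title'),
  new SketchField('rendered', NULL, 'rendered_item', ['view_mode' => 'teaser']),
];
$existing = find_or_create_field($fields, NULL, 'rendered_item', []);
$dummy = find_or_create_field($fields, NULL, 'aggregated_field', ['type' => 'union']);
```

With this approach the highlighting code would see the configuration the site builder actually saved, rather than the empty defaults that produce an empty rendered_item value.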

zuhair_ak’s picture

Small Clarification - Does Rendered HTML index the whole Node HTML rendered in the browser with the given view mode in DB/Solr? Or is it only the fields which we add in the Search API UI?

drunken monkey’s picture

Small Clarification - Does Rendered HTML index the whole Node HTML rendered in the browser with the given view mode in DB/Solr? Or is it only the fields which we add in the Search API UI?

It uses the selected view mode, renders the node (or other entity) using that and then indexes that HTML.
So, for instance, it would also be possible to only enable this one field in an index, and nothing else, and still get a decent fulltext search.

drunken monkey’s picture

Title: Excerpts from Rendered HTML returns empty in views » Fix extraction of configurable properties in processors
drunken monkey’s picture

Issue tags: +Release blocker
drunken monkey’s picture

This should implement the necessary change. Please test to see whether it resolves your problem!

For reviewing: I've taken the chance to refactor a bit (marked the old methods as deprecated – I think that's still OK for Beta, if there's time enough until we remove them for RC1; if anyone disagrees, please tell me so), so I've also attached a "simplified" patch which just includes the actual changes in the code. (Otherwise, it's hard to see, since I've moved the whole method.)

The last submitted patch, 9: 2782577-9--processors_property_extraction--tests_only.patch, failed testing.

borisson_’s picture

For reviewing: I've taken the chance to refactor a bit (marked the old methods as deprecated – I think that's still OK for Beta, if there's time enough until we remove them for RC1; if anyone disagrees, please tell me so), so I've also attached a "simplified" patch which just includes the actual changes in the code. (Otherwise, it's hard to see, since I've moved the whole method.)

Refactoring + marking as deprecated is still ok now I believe, so that's good!

I haven't tested the patch and I only looked at the simplified patch but that looks great.

I will not set to RTBC until @zuhair_ak confirms that this solves their problem.

zuhair_ak’s picture

I had to change the line numbers for ProcessorPluginBase.php in the patch to apply it to the latest dev version. After the patch, excerpts from Rendered HTML are indeed shown in the results. Thanks for your effort.

I searched for the string "admin" before and after the patch in a view page created from the search index, and I have attached screenshots of the results below.

I have a couple of questions based on the screenshots:
1) There are more than three content items (generated by devel) with admin as the author, but only 3 are shown in the search results. I have configured the view to search Rendered HTML, so I don't know why it shows only 3.
2) When I searched for "submitted", I got no results. The keyword "submitted" is present in every node's HTML, so shouldn't the search return all the nodes?

I have also attached the modified patch that I used; the only change is a minor adjustment to the line numbers.

Status: Needs review » Needs work

The last submitted patch, 12: processors_property_extraction-2782577-12.patch, failed testing.

zuhair_ak’s picture

Corrected file paths in patch.

borisson_’s picture

Status: Needs work » Needs review

Needs review to see what the bot thinks.

borisson_’s picture

Status: Needs review » Reviewed & tested by the community

The bots agree and so do I, RTBC.

  • drunken monkey committed b8271cf on 8.x-1.x
    Issue #2782577 by drunken monkey, zuhair_ak: Fixed extraction of...
drunken monkey’s picture

Status: Reviewed & tested by the community » Fixed

Thanks for the re-roll, looks perfect!
Committed.
Thanks again, everyone!

As for your questions: The "Rendered item" processor might use a different view mode than the result listing, and that view mode might (should, really) be configured not to contain any UI text (since that would mess up the results).
If that's not it, please create a new issue for your problem.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

tjtj’s picture

I am still getting this with the latest release:
Warning: While indexing items on search index Default content index, 1 item(s) did not have a view mode configured for one or more "Rendered item" fields.

jbd44’s picture

These are my versions:

Search API: 8.x-1.0-beta4
Solr search: 8.x-1.0-beta1

I also get:

Warning: While indexing items on search index Default Solr content index, 1 item(s) did not have a view mode configured for one or more "Rendered item" fields.

when trying to create an index.

Jody Lynn’s picture

I believe you will get this error if you have added a new content type since you configured the 'Rendered HTML Output' field in your Search Index Fields config. Go back to the Fields admin page, Edit the Rendered Item field, and re-save it.

I would prefer that it simply fall back to the 'default' view mode when none is configured.
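
A fallback of that kind could look roughly like the sketch below. It is only an illustration of the idea, not the module's code: resolve_view_mode() is a hypothetical helper, and the nested configuration layout (datasource ID, then bundle, then view mode) is an assumption made for the example.

```php
<?php

/**
 * Looks up the view mode for a bundle, falling back to 'default'.
 *
 * Hypothetical helper; the array layout is assumed for this sketch.
 */
function resolve_view_mode(array $configuration, string $datasourceId, string $bundle): string {
  // The null coalescing operator handles any missing level of nesting
  // without raising a notice.
  return $configuration['view_mode'][$datasourceId][$bundle] ?? 'default';
}

// Usage: 'article' was configured, 'event' was added to the site later.
$configuration = [
  'view_mode' => [
    'entity:node' => ['article' => 'search_index'],
  ],
];
$known = resolve_view_mode($configuration, 'entity:node', 'article');
$missing = resolve_view_mode($configuration, 'entity:node', 'event');
```

Here a content type added after the field was configured would be rendered with 'default' instead of triggering the warning.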

ccshannon’s picture

Thanks @Jody, that did the trick. One thing to note: I hadn't added any additional content types to the index, but honestly I may have been getting that error for a long time on the one content type we index and just not known it. I definitely had changed some field processor settings along the way. But, yes, just editing the Rendered HTML Output field and re-saving all settings removed the error when I re-indexed.