When the Highlights Processor is set to take the excerpt from the Rendered HTML output (rendered_item), the excerpt in the search results view is empty. Excerpts work for the Title and Body fields.
I have seen two related issues, https://www.drupal.org/node/2703915 and https://www.drupal.org/node/2305021, both of which are committed in the latest dev.
I have debugged as far as this function in the file src/Plugin/ProcessorPluginBase.php:
protected function extractItemValues(array $items, array $required_properties, $load = TRUE) {
I don't know how rendered_item is extracted here, but it comes back empty.
I have also attached screenshots of my Highlights Processor config and Rendered HTML config.
Comment | File | Size | Author |
---|---|---|---|
#14 | processors_property_extraction-2782577-14.patch | 32 KB | zuhair_ak |
#12 | before_after_patch.png | 34.04 KB | zuhair_ak |
#9 | 2782577-9--processors_property_extraction.patch | 31.96 KB | drunken monkey |
#9 | 2782577-9--processors_property_extraction--tests_only.patch | 12.9 KB | drunken monkey |
Comments
Comment #2
zuhair_ak
Comment #3
zuhair_ak
Comment #4
drunken monkey
rendered_item cannot be extracted/computed without an associated field, so it won't work at query time (i.e., for highlighting). Same, e.g., for "Aggregate field".
However, I guess there's not really any reason not to just use the first field with that property and its configuration. For most users, the distinction won't be clear in any case.
However, I'm currently too busy trying to get this module to Beta, and this is something that can even be fixed after a stable release, so I'll have to postpone this a bit.
However, I'd be glad to give pointers to anyone who wants to tackle this. Basically, in \Drupal\search_api\Processor\ProcessorPluginBase::extractItemValues(): just look for a field with that property on the index first, and only create a dummy field object if none can be found.
Then, please also add a test case (to highlight test and/or aggregated fields test) to check whether this is working correctly. (That's actually probably the longer part, and why I can't just implement this right away.)
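The lookup described above can be illustrated with a small, language-neutral sketch. It is written in Python rather than PHP and does not use any Search API classes: `Field`, `field_for_property`, the `dummy_` prefix and the `view_mode` value are all hypothetical stand-ins, and the patch that was eventually committed may be structured differently.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    """Hypothetical stand-in for an index field (not the Search API class)."""
    field_id: str
    property_path: str
    configuration: dict = field(default_factory=dict)
    is_dummy: bool = False


def field_for_property(index_fields, property_path):
    """Return the first indexed field that uses property_path.

    Only when the index has no such field is a dummy field created,
    which then carries no configuration.
    """
    for candidate in index_fields:
        if candidate.property_path == property_path:
            return candidate
    return Field(
        field_id="dummy_" + property_path,
        property_path=property_path,
        is_dummy=True,
    )


# Example index: the rendered_item field carries its own configuration.
index_fields = [
    Field("title", "title"),
    Field("rendered", "rendered_item", {"view_mode": "search_index"}),
]
```

With this shape, a processor asking for `rendered_item` at query time would receive the configured field, so the value can be computed, while a property with no field on the index would still fall back to an unconfigured dummy.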
Comment #5
zuhair_ak
A small clarification: does Rendered HTML index the whole node HTML, as rendered in the browser with the given view mode, in the DB/Solr? Or only the fields that we add in the Search API UI?
Comment #6
drunken monkey
It uses the selected view mode, renders the node (or other entity) using that and then indexes that HTML.
So, for instance, it would also be possible to only enable this one field in an index, and nothing else, and still get a decent fulltext search.
Comment #7
drunken monkey
Comment #8
drunken monkey
Comment #9
drunken monkey
This should implement the necessary change. Please test to see whether it resolves your problem!
For reviewing: I've taken the chance to refactor a bit (marked the old methods as deprecated – I think that's still OK for Beta, if there's time enough until we remove them for RC1; if anyone disagrees, please tell me so), so I've also attached a "simplified" patch which just includes the actual changes in the code. (Otherwise, it's hard to see, since I've moved the whole method.)
Comment #11
borisson_
Refactoring and marking as deprecated is still OK at this point, I believe, so that's good!
I haven't tested the patch and I only looked at the simplified patch, but that looks great.
I will not set to RTBC until @zuhair_ak confirms that this solves their problem.
Comment #12
zuhair_ak
I had to change the line numbers for ProcessorPluginBase.php in the patch to apply it to the latest dev version. After the patch, excerpts from Rendered HTML are indeed shown in the results. Thanks for your effort.
I searched for the string "admin" before and after the patch in a view page created from the search index, and I have attached screenshots of the results below.
I have a couple of questions based on the screenshots:
1) There are more than three content items with the author (generated by Devel) set to admin, but only 3 are shown in the search results. I have configured the view to search Rendered HTML, so I don't know why it shows only 3.
2) When I searched for "submitted", I got a blank result. The keyword "submitted" is present in every node's HTML, so shouldn't the search results show all the nodes?
I have also attached the modified patch that I used; the only change is in the line numbers.
Comment #14
zuhair_ak
Corrected file paths in patch.
Comment #15
borisson_
Needs review to see what the bot thinks.
Comment #16
borisson_
The bots agree and so do I, RTBC.
Comment #18
drunken monkey
Thanks for the re-roll, looks perfect!
Committed.
Thanks again, everyone!
As for your questions: The "Rendered item" processor might use a different view mode than the result listing, and that view mode might (should, really) be configured not to contain any UI text (since that would mess up the results).
If that's not it, please create a new issue for your problem.
Comment #20
tjtj
I am still getting this with the latest release:
Warning: While indexing items on search index Default content index, 1 item(s) did not have a view mode configured for one or more "Rendered item" fields.
Comment #21
jbd44
These are my versions:
Search API: 8.x-1.0-beta4
Solr search: 8.x-1.0-beta1
I also get:
when trying to create an index.
Comment #22
Jody Lynn
I believe you will get this error if you have added a new content type since you configured the 'Rendered HTML Output' field in your Search Index Fields config. Go back to the Fields admin page, edit the Rendered Item field, and re-save it.
I would prefer that the view mode just default to 'default'.
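As a rough illustration of the fallback suggested here (not the module's actual behaviour, which is to emit the warning quoted in comment #20), the idea amounts to the following. The sketch is in Python with hypothetical names (`resolve_view_mode`, the bundle-to-view-mode mapping), and assumes the field configuration can be modelled as a plain mapping from content type to view mode.

```python
def resolve_view_mode(configured, bundle, fallback="default"):
    """Return the view mode configured for a bundle, else the fallback.

    `configured` is a hypothetical mapping of bundle name to view mode,
    saved before any newer content types existed.
    """
    return configured.get(bundle, fallback)


# Configuration saved when only "article" existed; "page" was added later.
saved_config = {"article": "teaser"}
```

Under this assumption, `resolve_view_mode(saved_config, "page")` would return `"default"` for the newly added content type instead of leaving it without a view mode, while already configured bundles keep their saved value.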
Comment #23
ccshannon
Thanks @Jody, that did the trick. One thing to note: I hadn't added any additional content types to the index, but honestly I may have been getting that error for a long time on the one content type we index and just not known it. I definitely had changed some field processor settings along the way. But, yes, just editing the Rendered HTML Output field and re-saving all settings removed the error when I re-indexed.