Highlighter erroneously overwrites excerpt.snippets and excerpt.fragsize configs [#2881587]

Comment	File	Size	Author
#8	2881587.patch	4.25 KB	mkalkbrenner
#5	search_api_solr-2881587-5.patch	1.17 KB	les lim
#2	search_api_solr-2881587-2.patch	927 bytes	les lim

Comment #1

26 May 2017 at 04:57

Les Lim created an issue. See original summary.

Comment #2

les lim

he/they

English

commented 26 May 2017 at 04:58

Status:

Active

» Needs review

Status	File	Size
new	search_api_solr-2881587-2.patch	927 bytes

Patch attached.

Comment #3

mkalkbrenner

German

🇩🇪

commented 26 May 2017 at 14:38

Status:

Needs review

» Needs work

From a Search API perspective, snippets and fragsize are configurable for the excerpt, not the highlighting.
We try to keep the Solr backend compatible with the DB backend.
Which kind of error are you facing?

Comment #4

les lim

he/they

English

commented 27 May 2017 at 06:33

Status:

Needs work

» Closed (works as designed)

Oh, I misunderstood the difference between the Highlight retrieved data checkbox and the Return an excerpt for all results checkbox in the Solr server Drupal interface. Now that I look at it more closely, I didn't really want the highlighting; I just wanted the excerpting.

Having highlighting turned on appeared to help with a problem I was having with the excerpts and stemming, but I ended up resolving that in schema.xml.

There is still a bug here, but I'll open a new issue.

Comment #5

les lim

he/they

English

commented 27 May 2017 at 06:49

Status:

Closed (works as designed)

» Needs review

Status	File	Size
new	search_api_solr-2881587-5.patch	1.17 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	search_api_solr-2881587-2.patch	927 bytes

Actually, no, it's this same issue.

When both highlighting and excerpting are enabled, the highlight preparation clobbers the excerpt field (f.spell) preparation, so the f.spell.hl.* parameters are never set. I still get excerpts, but they are prepared with hl.snippets=1&hl.fragsize=0, as used for highlighting purposes. That ends up showing the whole document.

The problem is $hl->setFields('*');, which unsets the previously prepared excerpt field. The new patch uses the addField() method instead against individual text fields.
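
To make the clobbering concrete, here is a minimal, self-contained sketch. `HighlightingSketch` is a hypothetical stand-in that only mimics the behaviour described above (a `setFields()` that resets the field list, an `addField()` that appends to it); it is not the real Solarium class. The values 3 and 70 and the field name `tm_body` are illustrative assumptions.

```php
<?php

// Hypothetical stand-in for a highlighting component. It models only the
// behaviour described in this issue, not the real library class.
final class HighlightingSketch {
  /** @var array Per-field highlighting options, keyed by field name. */
  private $fields = [];

  // Registers a field, keeping any options it already has.
  public function addField(string $name, array $options = []): void {
    $this->fields[$name] = $options + ($this->fields[$name] ?? []);
  }

  // Replaces the whole field list, discarding earlier per-field options.
  public function setFields(string $list): void {
    $this->fields = [];
    foreach (explode(',', $list) as $name) {
      $this->addField(trim($name));
    }
  }

  // Flattens the state into Solr-style request parameters.
  public function toParams(): array {
    $params = ['hl.fl' => implode(',', array_keys($this->fields))];
    foreach ($this->fields as $name => $options) {
      foreach ($options as $key => $value) {
        $params["f.$name.hl.$key"] = $value;
      }
    }
    return $params;
  }
}

// Excerpt preparation first, then highlight preparation with setFields('*').
$clobbered = new HighlightingSketch();
$clobbered->addField('spell', ['snippets' => 3, 'fragsize' => 70]);
$clobbered->setFields('*');

// Same excerpt preparation, but highlight fields are added one by one.
$kept = new HighlightingSketch();
$kept->addField('spell', ['snippets' => 3, 'fragsize' => 70]);
$kept->addField('tm_body');

print_r($clobbered->toParams());
print_r($kept->toParams());
```

In the first case the `f.spell.hl.*` entries are gone from the parameters; in the second they survive alongside the added text field.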

Comment #6

mkalkbrenner

German

🇩🇪

commented 27 May 2017 at 14:14

Status:	Needs review	» Needs work
Issue tags:		+Needs tests

That seems like a good approach to work around the missing support for wildcards.
But I think we have to consider more field prefixes and special fields, for example

ts_*
tm_*
tus_*
tum_*
tos_*
tom_*
tes_*
tem_*
tws_*
twm_*

I think we should introduce a function for this. Maybe getFulltextFields is part of the solution.
Additional fields should be configurable.
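
As a rough sketch of what such a function could look like: the helper below filters a list of Solr field names by the prefixes listed above and appends configurable extras. The name `highlightable_fields()`, the constant, and the sample field names are hypothetical; this is not necessarily how the module implements it.

```php
<?php

// Dynamic field prefixes taken from the list in this comment.
const FULLTEXT_PREFIXES = ['ts_', 'tm_', 'tus_', 'tum_', 'tos_', 'tom_', 'tes_', 'tem_', 'tws_', 'twm_'];

// Hypothetical helper: returns the Solr field names that start with one of
// the fulltext prefixes, followed by any additionally configured fields.
function highlightable_fields(array $solrFieldNames, array $additionalFields = []): array {
  $matches = array_filter($solrFieldNames, function (string $name): bool {
    foreach (FULLTEXT_PREFIXES as $prefix) {
      if (strncmp($name, $prefix, strlen($prefix)) === 0) {
        return TRUE;
      }
    }
    return FALSE;
  });
  // Re-index and drop duplicates so each field is requested only once.
  return array_values(array_unique(array_merge($matches, $additionalFields)));
}

// Sample input: two fulltext fields, two non-fulltext fields, one extra.
$fields = highlightable_fields(['tm_body', 'ss_type', 'ts_title', 'its_nid'], ['spell']);
print_r($fields);
```

Each returned name could then be passed to addField() individually instead of relying on '*'.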

And we need to deal with third-party indexes, which we'll break if we remove '*'. See #2881369: Make the backend more robust concerning unexpected schemas.

But again, I really like the idea to get rid of this @todo in the code :-)

BTW, what did you solve by modifying the schema.xml?

Comment #7

les lim

he/they

English

commented 27 May 2017 at 22:30

To clarify, you want something like a `protected function getSolrFulltextFields()` that gets the Solr names of fulltext properties for an index, but uses `getFulltextFields()` instead of looking for a dynamic field prefix?

The schema.xml change was to add `SnowballPorterFilterFactory` to the textSpell field type. Without it, I was getting stemmed hits in my search results, but I'd only get excerpts for literal matches.

Comment #8

mkalkbrenner

German

🇩🇪

commented 29 May 2017 at 09:48

Title:	Highlighter does not respect excerpt.snippets or excerpt.fragsize	» Highlighter erroneously overwrites excerpt.snippets and excerpt.fragsize configs
Status:	Needs work	» Needs review

Status	File	Size
new	2881587.patch	4.25 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	search_api_solr-2881587-5.patch	1.17 KB

> The problem is $hl->setFields('*');, which unsets the previously prepared excerpt field. The new patch uses the addField() method instead against individual text fields.

You're right. This is the important detail.

I extended the patch to also solve the already existing todos in this area :-)

> The schema.xml change was to add `SnowballPorterFilterFactory` to the textSpell field type. Without it, I was getting stemmed hits in my search results, but I'd only get excerpts for literal matches.

That will break the spell / autocomplete feature. But there's already another issue for this topic: #2735881: Solr field spell is not suitable for generating excerpts

Comment #9

mkalkbrenner

German

🇩🇪

commented 29 May 2017 at 10:09

Issue tags:

-Needs tests

Since the existing tests still pass I'll merge this patch.
https://travis-ci.org/mkalkbrenner/search_api_solr/builds/237105725

Comment #10

29 May 2017 at 10:10

mkalkbrenner committed 8346f0f on 8.x-1.x

Issue #2881587 by Les Lim, mkalkbrenner: Highlighter erroneously...

Comment #11

mkalkbrenner

German

🇩🇪

commented 29 May 2017 at 10:11

Status:

Needs review

» Fixed

I'm looking forward to getting your feedback on the commit.

Comment #12

12 June 2017 at 10:15

Status:

Fixed

» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.
