There's a common misunderstanding of (e)dismax, which is the default query parser used by Search API Solr Search. Even the multilingual backend adds more query fields using the qf parameter. But that leads to fewer or no results in some cases, especially if people add more fulltext fields to the index and configure an exposed fulltext search in Views that queries multiple fields.
See https://opensourceconnections.com/blog/2013/04/15/querying-more-fields-m...
Querying More Fields != More Results
The Search API database backend doesn't know anything about a dismax concept at all. And the Solr "Any Schema Backend" might hit a select handler that is configured with any kind of strange combination of query parsers. So let's force explicit query handlers and build an edismax query the way most people understand the documentation.
Example: If we send the parameters q=(+a +b) and qf=field1^50 field2, the edismax query parser creates something like
DISMAX(field1:a^50 OR field2:a) AND DISMAX(field1:b^50 OR field2:b)
But what we and most users expect is something like
DISMAX((field1:a AND field1:b)^50 OR (field2:a AND field2:b))
That is mostly what the database backend does, with the benefits of edismax on top.
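To make the difference between the two shapes concrete, here is a minimal Python sketch that builds both of them from a list of terms and a map of boosted fields. The function names `term_centric` and `field_centric` are made up for this illustration, and the output uses the informal DISMAX(...) notation from the example above, not real Solr syntax.

```python
def _boost(boost):
    """Render a boost suffix, omitting the default boost of 1."""
    return f"^{boost}" if boost != 1 else ""


def term_centric(terms, fields):
    """One DISMAX group per term, groups joined with AND (what edismax creates)."""
    groups = []
    for term in terms:
        clauses = [f"{name}:{term}{_boost(boost)}" for name, boost in fields.items()]
        groups.append("DISMAX(" + " OR ".join(clauses) + ")")
    return " AND ".join(groups)


def field_centric(terms, fields):
    """One conjunction per field, combined in a single DISMAX (what users expect)."""
    clauses = []
    for name, boost in fields.items():
        conjunction = "(" + " AND ".join(f"{name}:{term}" for term in terms) + ")"
        clauses.append(conjunction + _boost(boost))
    return "DISMAX(" + " OR ".join(clauses) + ")"


fields = {"field1": 50, "field2": 1}  # corresponds to qf=field1^50 field2
print(term_centric(["a", "b"], fields))
# DISMAX(field1:a^50 OR field2:a) AND DISMAX(field1:b^50 OR field2:b)
print(field_centric(["a", "b"], fields))
# DISMAX((field1:a AND field1:b)^50 OR (field2:a AND field2:b))
```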
At the moment it is very easy to end up with zero results if you simply add many fulltext fields, or if you use the multilingual backend and enter stopwords, or whenever you do something that is too sophisticated for the edismax query parser.
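The stopword case can be illustrated with a toy model in Python. This is only a sketch under stated assumptions, not how Solr evaluates queries: the field names `title` and `author`, the per-field stopword sets, and both matcher functions are hypothetical, and every term is treated as required, as in q=(+a +b).

```python
# Hypothetical per-field analysis: "title" drops "the", "author" keeps everything.
STOPWORDS = {"title": {"the"}, "author": set()}


def analyze(field, term):
    """Return the term, or None if this field's analyzer removes it as a stopword."""
    return None if term in STOPWORDS[field] else term


def matches_term_centric(doc, terms, fields):
    """Every term must match in some field whose analyzer still keeps it."""
    for term in terms:
        alive = [f for f in fields if analyze(f, term) is not None]
        if not alive:
            continue  # removed everywhere, so the clause disappears
        if not any(term in doc[f] for f in alive):
            return False
    return True


def matches_field_centric(doc, terms, fields):
    """Some single field must contain all terms it keeps after analysis."""
    for f in fields:
        kept = [t for t in terms if analyze(f, t) is not None]
        if kept and all(t in doc[f] for t in kept):
            return True
    return False


# Indexed tokens per field; "the" was already stripped from the title at index time.
doc = {"title": {"cat", "hat"}, "author": {"seuss"}}
terms = ["the", "cat"]
fields = ["title", "author"]

print(matches_term_centric(doc, terms, fields))   # False: "the" survives only in author
print(matches_field_centric(doc, terms, fields))  # True: title matches on "cat"
```

In this model the term-centric query fails because "the" is still a required clause on the one field that does not treat it as a stopword, which is the zero-results situation described above.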
On the other hand, it would be a huge amount of work and an API break to adjust the code to use the standard query parser again, including all its disadvantages.
Therefore I suggest keeping the current erroneous usage of the solarium edismax component in our hooks and tweaking the query right before sending it to Solr.
Maybe we can add a switch to turn that tweaking off for real experts who don't use Views but just the API.
| Comment | File | Size | Author |
|---|---|---|---|
| #5 | 2948469.patch | 11.13 KB | mkalkbrenner |
| #3 | 2948469.patch | 4.21 KB | mkalkbrenner |
Comments
Comment #2
mkalkbrenner
Comment #3
mkalkbrenner
Here's a first patch. Testers welcome!
Comment #4
mkalkbrenner
Unfortunately this patch only solves the issue for Solr versions up to 7.1.
For 7.2 it doesn't work anymore:
see https://lucene.apache.org/solr/guide/7_2/solr-upgrade-notes.html
But that change introduces another critical issue as well. We already use different query parsers within an edismax, especially in the location searches. These will lead to an error if you replace the currently preset luceneMatchVersion=7.0 by luceneMatchVersion=7.2.
Comment #5
mkalkbrenner
OK, I swapped the edismax parser for the standard lucene parser, the only one that gives us full flexibility. Therefore I refactored the query builder. Most users shouldn't notice a difference, except that they get better(!) search results.
But for advanced users I'll re-add edismax on demand in a follow-up issue.
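As a rough illustration of the idea, and not the code in the patch, the following Python sketch builds a field-centric query string in standard Lucene syntax and selects the parser explicitly with Solr's `defType` request parameter. The helper names `lucene_query` and `select_params` are hypothetical, and the terms are assumed to be already escaped.

```python
from urllib.parse import urlencode


def lucene_query(terms, fields):
    """Require all terms within each field; the fields themselves stay optional."""
    required = " ".join(f"+{term}" for term in terms)
    clauses = []
    for name, boost in fields.items():
        clause = f"{name}:({required})"
        if boost != 1:
            clause += f"^{boost}"
        clauses.append(clause)
    return " ".join(clauses)


def select_params(terms, fields):
    """URL-encode the request parameters, naming the query parser explicitly."""
    return urlencode({"defType": "lucene", "q": lucene_query(terms, fields)})


fields = {"field1": 50, "field2": 1}
print(lucene_query(["a", "b"], fields))
# field1:(+a +b)^50 field2:(+a +b)
print(select_params(["a", "b"], fields))
```

With the default OR operator between the top-level clauses, a document matches as soon as one field contains all terms, which mirrors the expected query shape from the issue summary.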
Comment #7
mkalkbrenner