I have installed the Search API module together with Database Search and Search Pages. The database service is configured to search on parts of a word.

I have added an index for taxonomy terms (to search for names of persons) that includes only the fields name (boost 13) and description (boost 1). Because all the descriptions are empty, only the term names are searched. Searching for the string "schmid" via a search page returns the results in the following order:

Claudia Schmid
Georg Schmidt
Manfred C. Schmidt
Silvia Schmid
Peter A. Schmidt
Jochen Schmidt
Thorsten Schmidt
Rainer Schmidt
Dagmar Isabell Schmidbauer
Bernhard Kleinschmidt
Dietmar Schmidt
Niklaus Schmid
Sibylle Schmidt
Beate Schmidt
Andreas Schmidt
Christina Schmidtke

Doing the same search with a view (Filter: Fulltext search (exposed); Sort: Relevance (asc)) yields the following order of the results:

Dagmar Isabell Schmidbauer
Bernhard Kleinschmidt
Dietmar Schmidt
Niklaus Schmid
Sibylle Schmidt
Beate Schmidt
Andreas Schmidt
Christina Schmidtke
Claudia Schmid
Georg Schmidt
Manfred C. Schmidt
Silvia Schmid
Peter A. Schmidt
Jochen Schmidt
Thorsten Schmidt
Rainer Schmidt

For another test I created a second index of type "Multiple types" in which I included only taxonomy terms, with the fields name (boost 13) and description (boost 1). Searching that index with another search page gives a third order of the search results:

Manfred C. Schmidt
Dagmar Isabell Schmidbauer
Sibylle Schmidt
Silvia Schmid
Bernhard Kleinschmidt
Christina Schmidtke
Thorsten Schmidt
Claudia Schmid
Georg Schmidt
Rainer Schmidt
Beate Schmidt
Peter A. Schmidt
Dietmar Schmidt
Andreas Schmidt
Jochen Schmidt
Niklaus Schmid

So my questions are:
- How does Search API calculate the order of the search results?
- Why does the order differ between search pages and views, and between different search indexes?
- Can this order be influenced by any means?
- Especially: How can I get the search results in the order I would have expected:

First the best matches (those that contain the search string exactly as a word):

Claudia Schmid / Silvia Schmid / Niklaus Schmid

followed by those with the smallest difference from the search string:

Georg Schmidt / Manfred C. Schmidt / Peter A. Schmidt / Jochen Schmidt / Thorsten Schmidt / Rainer Schmidt /
Dietmar Schmidt / Sibylle Schmidt / Beate Schmidt / Andreas Schmidt

then those with increasingly larger differences:

Christina Schmidtke / Dagmar Isabell Schmidbauer / Bernhard Kleinschmidt

Comments

smitty created an issue. See original summary.

drunken monkey commented:

Project: Search API » Search API Database Search
Version: 7.x-1.16 » 7.x-1.x-dev
Component: Framework » Code
Status: Active » Fixed

Sorting by relevance works (more or less) by counting how often the search term occurs within a result item, weighted by the boost of the fields it appears in. If each result contains the search term only once (and in the same field), every result gets the same score and the order among them is arbitrary, depending on the internals of your database server.
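To illustrate why the three listings above tie, here is a minimal sketch of the scoring idea in Python. It is a simplified model of the description above, not the actual query the database backend runs; the function name `relevance` and the dictionary layout are made up for this example.

```python
def relevance(fields, boosts, term):
    """Hypothetical simplified score: occurrences of the term per field,
    multiplied by that field's boost, summed over all fields."""
    term = term.lower()
    return sum(boosts[name] * text.lower().count(term)
               for name, text in fields.items())

# Boosts as configured in the index described in the question.
boosts = {"name": 13, "description": 1}

# Three of the taxonomy terms from the result lists; descriptions are empty.
terms = [
    {"name": "Claudia Schmid", "description": ""},
    {"name": "Bernhard Kleinschmidt", "description": ""},
    {"name": "Christina Schmidtke", "description": ""},
]

scores = [relevance(t, boosts, "schmid") for t in terms]
print(scores)
```

Every term contains "schmid" exactly once in the name field, so all scores are equal and a sort by relevance has nothing to distinguish them.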

It seems you have enabled the "Search on parts of a word" option. With that, any words containing the search term will be found. Due to the mechanism used to achieve this, it is not possible to order "exact" matches higher than partial matches. If you want this, you'll have to implement it yourself, probably re-implementing partial matching with some other method. As a starting point you might want to have a look at how Apache Solr's NGramFilterFactory works.
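As a rough illustration of the n-gram idea, the following Python sketch generates the n-grams of each word and distinguishes whole-word matches from matches on a word fragment. It does not reproduce Solr's implementation; the helper names `ngrams` and `match_kind` and the default sizes are assumptions chosen for this example.

```python
def ngrams(token, min_size, max_size):
    """Return all substrings of token with a length from min_size to max_size."""
    token = token.lower()
    return {token[i:i + n]
            for n in range(min_size, max_size + 1)
            for i in range(len(token) - n + 1)}

def match_kind(name, query, min_size=3, max_size=12):
    """Classify a match as 'exact' (whole word), 'partial' (n-gram) or 'none'."""
    query = query.lower()
    tokens = name.lower().split()
    if query in tokens:
        return "exact"
    if any(query in ngrams(t, min_size, max_size) for t in tokens):
        return "partial"
    return "none"

print(match_kind("Claudia Schmid", "schmid"))
print(match_kind("Bernhard Kleinschmidt", "schmid"))
```

Because whole words and fragments are told apart, a backend built this way could give exact matches a higher score than partial ones.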

Or, specifically for this use case, you might want to do it by getting all results and then manually sorting them in PHP using something like the Levenshtein distance.
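A minimal sketch of that post-sorting step, written in Python for brevity (in PHP the built-in `levenshtein()` function could take the place of the hand-written one). The choice to compare the query against each word of the name and keep the smallest distance is an assumption of this example, as are the helper names.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def closest_word_distance(name, query):
    """Smallest edit distance between the query and any word of the name."""
    query = query.lower()
    return min(levenshtein(word, query) for word in name.lower().split())

names = [
    "Bernhard Kleinschmidt",
    "Georg Schmidt",
    "Christina Schmidtke",
    "Claudia Schmid",
    "Dagmar Isabell Schmidbauer",
]

# sorted() is stable, so names with equal distance keep their incoming order.
ranked = sorted(names, key=lambda n: closest_word_distance(n, "schmid"))
for name in ranked:
    print(closest_word_distance(name, "schmid"), name)
```

With this key, whole-word matches (distance 0) come first, followed by names whose closest word differs more and more from the search string.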

In any case, this is currently not supported by the DB backend and will certainly need custom code.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.