Searching for multiple words using "OR" condition is actually impossible due to this block of code here:

        if ($mul_words) {
          $db_query->having('COUNT(DISTINCT t.word) >= ' . $var, array($var => $subs));
        }
        else {
          $db_query->having('COUNT(DISTINCT t.word) >= ' . $var, array($var => $subs));
        }

1) When multiple words are provided to filter results using an OR condition, any matching word should result in a hit.
However, this block of code requires at least $var hits, where $var is the number of words provided.
Effectively, this transforms the query into an "AND", since the only hits will be those records which match all inputs.

2) The if() condition is unnecessary since the behavior is always the same.

Comments

aaronbauman’s picture

Here is the generated query:

SELECT COUNT(*) AS expression 
FROM (
  SELECT DISTINCT t.item_id AS item_id, '1' AS score, 1 AS expression
  FROM search_api_db_drealty_search t WHERE 
  (( (t.item_id IN (
    SELECT t.item_id AS item_id 
    FROM (
      SELECT t.* FROM search_api_db_drealty_search_field_listing_property_type t 
      WHERE ( (word = 'residential') OR (word = 'rental') )) t 
    GROUP BY t.item_id HAVING (COUNT(DISTINCT t.word) >= '2') )) ))) subquery

As you can see, the inner-most query `t` uses an "OR" condition, but the `subquery` requires two hits for any match.

drunken monkey’s picture

Status: Active » Needs review
StatusFileSize
new3.38 KB

Thanks for reporting this issue!
You are doubtlessly correct about the if being useless, that really doesn't make any sense. It seems to have been introduced in the early stages of the module, with this change (3e930208):

@@ -755,12 +758,13 @@ class SearchApiDbService extends SearchApiAbstractService 
         $db_query->addExpression('SUM(t.score)', 'score');
         $db_query->groupBy('t.item_id');
       }
-      if ($conj == 'AND') {
+      if ($conj == 'AND' && $subs > 1) {
+        $var = ':subs' . ((int) $subs);
         if ($mul_words) {
-          $db_query->having('COUNT(DISTINCT word) = :subs', array(':subs' => $s
+          $db_query->having('COUNT(DISTINCT t.word) >= ' . $var, array($var => 
         }
         else {
-          $db_query->having('COUNT(word) = :subs', array(':subs' => $subs));
+          $db_query->having('COUNT(DISTINCT t.word) >= ' . $var, array($var => 
         }
       }
     }

So maybe we should just remove the DISTINCT in the else branch? It makes sense: if there aren't multiple words, there will be just a single "distinct" word. There is some weird magic with nested keys going on, though, so I'm not entirely sure. God give I'd still understand my own code … I guess, for this module I never really did.

Anyways, that being said, I can't really find any actual bug that occurs related to this. (As said, my interpretation would be that the DISCTINCT is unnecessary, but not problematic.) Attached are some additional tests (which we could of course add in any case), all of which pass without a problem (with or without the second DISTINCT).
Can you come up with a test that fails? How did you generate the above query?

drunken monkey’s picture

Issue summary: View changes
Status: Needs review » Fixed

I now committed the patch with the additional tests (and removed the DISTINCT in the else branch), can't do any harm to have them.
If you can provide more information about what you think is actually going wrong, please re-open.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.