After upgrading, if I use the word "to" in a search I get this error:
Fatal error: Call to undefined method Zend_Search_Lucene_Search_Query_MultiTerm::getSubqueries() in path \sites\all\modules\luceneapi\luceneapi.module on line 1170

I'm using the latest Zend and also the "Did you mean" feature, in case that affects it.

Other small words such as "is" and "us" don't seem to cause a problem. Words that contain "to", such as "top", also seem to work fine.

Comments

cpliakas

Priority: Normal » Critical

Hi Platinum.

Sorry for your troubles, and thanks for reporting the issue. One question: are these words configured as stopwords? Knowing that would greatly help in narrowing down what is wrong. For a quick fix, apply the patch at #724074: PHP error on search to prevent the fatal errors. It will be included in the 2.2 release, so it is forward compatible.

Thanks,
Chris

Platinum

Status: Active » Fixed

Ah yep, that seems to fix it. I was actually trying to find something similar yesterday, sorry for not noticing that thread.

drunken monkey

Couldn't this problem arise because Lucene thinks the "to" is part of a ranged query? I checked, and the same happens e.g. when you use "and" as the last search key.
I don't think that the real problem behind this can be fixed, since it stems from the ability to use the complete Lucene query syntax. If you let users utilise Lucene's full power, they have to mind the right syntax, i.e. avoid using reserved words like "to" without quotes (e.g., welcome "to" the jungle instead of welcome to the jungle works fine). Otherwise, you could just parse the query beforehand and quote all words or something like this.
Of course, the fatal error can be suppressed, but normal searches including "to", "and", etc., still won't work properly.
So, maybe displaying some information about the Lucene query syntax would be a better "bugfix" here?
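
A minimal sketch of the "quote all words beforehand" idea follows. It assumes the words that need protecting are "and", "or", "not" and "to", matched case-insensitively; that list is an assumption based on the behaviour described in this thread. The function name example_quote_reserved_words() is hypothetical and is not part of luceneapi or Zend_Search_Lucene.

```php
<?php
/**
 * Hypothetical helper: wraps bare reserved words in double quotes so the
 * query parser sees them as ordinary keywords rather than operators.
 */
function example_quote_reserved_words($keys) {
  // Assumed list of words the parser may treat as operators.
  $reserved = array('and', 'or', 'not', 'to');

  // Split on whitespace, but keep already-quoted phrases as single tokens.
  preg_match_all('/"[^"]*"|\S+/', $keys, $matches);

  $out = array();
  foreach ($matches[0] as $token) {
    // Only bare tokens can match; quoted ones still carry their quotes.
    if (in_array(strtolower($token), $reserved, TRUE)) {
      $token = '"' . $token . '"';
    }
    $out[] = $token;
  }
  return implode(' ', $out);
}

echo example_quote_reserved_words('welcome to the jungle'), "\n";
// prints: welcome "to" the jungle
```

Note that this sketch would also quote the TO inside a bracketed range such as nid:[1 TO 5], so it gives up part of the query syntax in exchange for safer plain-keyword searches.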

cpliakas

Status: Fixed » Active

Hi drunken monkey.

Thanks for your input, but I respectfully disagree and am confident this is not the case. If you take a closer look at the Lucene query syntax, using the word "to" doesn't constitute a range query by itself. You have to wrap your range in brackets and capitalize the TO, otherwise a range query won't be detected. An example of a valid range query is nid:[1 TO 5], but neither of the following will be picked up as a range query: nid:1 to nid:5, or nid:[1 to 5].

Furthermore, even if the query parser did interpret this as a range query, you would not see a fatal error, because the query parser wraps everything in a boolean query. As a result, the getSubqueries() method would be available. The real problem here is that when you search only for terms that are in the stopwords list, the analyzer strips them out of the keys and passes an empty string to the parser. As a result, the parser returns an "insignificant" query object, which does not have a getSubqueries() method. This is my fault, as I do not have enough defensive coding to pick up this use case.
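
The kind of defensive check meant here could look roughly like the sketch below. It is an illustration only, not the code in luceneapi.module: the two classes are stand-ins so the example runs without Zend Framework, and example_get_subqueries() is a hypothetical helper name.

```php
<?php
// Stand-in for a query object that exposes getSubqueries().
class ExampleBooleanQuery {
  private $subqueries;

  public function __construct(array $subqueries) {
    $this->subqueries = $subqueries;
  }

  public function getSubqueries() {
    return $this->subqueries;
  }
}

// Stand-in for a query object that has no getSubqueries() method.
class ExampleInsignificantQuery {
}

/**
 * Hypothetical helper: returns the subqueries when the object offers them,
 * and an empty array otherwise, instead of triggering a fatal error.
 */
function example_get_subqueries($query) {
  if (is_object($query) && method_exists($query, 'getSubqueries')) {
    return $query->getSubqueries();
  }
  return array();
}

var_dump(count(example_get_subqueries(new ExampleBooleanQuery(array('a', 'b')))));
var_dump(example_get_subqueries(new ExampleInsignificantQuery()));
```

The caller can then treat an empty array as "nothing to search for" and show a message instead of crashing.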

Thanks,
Chris

cpliakas

Status: Active » Fixed

drunken monkey

Sorry to dig this up again, maybe it's a moot point anyway, but I still don't think your explanation covers the entire problem. At least it doesn't seem to correspond with the results I get from various queries.
For one thing, I didn't have any words in my stopword list. But if I enter e.g. "is an" as stopwords, and then search for these keys, no fatal error occurs. Instead, the warning "You must include at least one positive keyword with 0 characters or more." is displayed, which seems reasonable enough. However, "is an to", "is to an" or "is an and" always result in fatal errors (no matter whether some or all of the keys are stopwords). "[is to an]" or 'is "to" an' (with "to" quoted, which shouldn't matter for stopwords) work fine, however.
Also, with the patch applied, "is to an" doesn't display the aforementioned warning if all three keys are stopwords, which also indicates that this issue occurs before stopwords are considered.
If the Lucene query syntax allows such "incomplete" range queries to be successfully parsed with the "to" as a normal keyword, then this may be a bug in the parser and should maybe be filed as an issue there. But I'm pretty certain that it doesn't depend on stopwords.
As said, maybe it doesn't matter anyways, as long as the errors are prevented, but I thought it could still be helpful for the future (or a true fix for the issue, if it really is a bug in the parser).

cpliakas

Hi drunken monkey.

Thanks for bringing it up again. If there is a problem and I am not diagnosing it correctly, I appreciate someone such as yourself bringing it to the forefront. I've done a lot of digging around as a result of your posts, and there is something going on with the parser when executing the searches illustrated in your detailed use cases above. I am still not seeing anything with range queries (I am var_dump()ing the query object after parsing). However, I am seeing that queries are NOT wrapped in a boolean query when the words "and" or "to" are searched for, which is inconsistent with how all other queries are parsed. Instead, a multiterm query is returned, which IMHO is a bug in the parser. Looking at your use cases above, it makes sense that all your searches will produce fatal errors, because the multiterm object doesn't have the getSubqueries() method available. If you have some time, take a look at the query objects that are returned by the Zend_Search_Lucene_Search_QueryParser::parse() method for various searches. I think you will find the results very interesting. I definitely had a few WTF moments when looking at them.

In terms of the "critical bug", the fixes in commit #334096 do prevent the fatal errors from happening and were applied to the 2.2 release. They ultimately normalize the query object so that it is wrapped in a boolean query if the words "and" or "to" are searched for. This is a very important first step, but I really like the direction you are going with your patches at #730064: Query won't execute if first key is a stopword, because they handle these use cases much more gracefully and display a more user-friendly message as to what is going on. I definitely want to apply them towards the 6.x-2.3 release.
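
For readers who want a picture of what "normalizing" means here, a rough sketch follows. It is not the code from the commit: the classes are stand-ins for the parser's query objects so the example runs on its own, and all names (SketchMultiTermQuery, SketchBooleanQuery, sketch_normalize_query()) are hypothetical.

```php
<?php
// Stand-in for a multiterm query object, which has no getSubqueries().
class SketchMultiTermQuery {
  public $terms;

  public function __construct(array $terms) {
    $this->terms = $terms;
  }
}

// Stand-in for a boolean query object that holds subqueries.
class SketchBooleanQuery {
  private $subqueries = array();

  public function addSubquery($subquery) {
    $this->subqueries[] = $subquery;
  }

  public function getSubqueries() {
    return $this->subqueries;
  }
}

/**
 * Hypothetical helper: returns boolean queries unchanged and wraps anything
 * else in a new boolean query, so getSubqueries() is always available.
 */
function sketch_normalize_query($query) {
  if ($query instanceof SketchBooleanQuery) {
    return $query;
  }
  $wrapper = new SketchBooleanQuery();
  $wrapper->addSubquery($query);
  return $wrapper;
}

$parsed = new SketchMultiTermQuery(array('is', 'to', 'an'));
$normalized = sketch_normalize_query($parsed);
echo count($normalized->getSubqueries()), "\n";
// prints: 1
```

Code that later walks the subqueries then only has to deal with one shape of object, whatever the parser returned.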

Thanks for all your hard work and input regarding this issue. It is very much appreciated.
~Chris

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.