Problem/Motivation
Follow-up to #2356991-12: PostgreSQL: Fix tests in search test group that shows that parts of the search query are run (notably: the count query is created) even when the search term has only 1 character. This sets up problems with query altering and unpredictable behaviour in tests, such as PostgreSQL problems.
Proposed resolution
Do not run search query when there are fewer then 3 characters in the search term [replace 3 with the actual setting value, of course!]. There may also be other conditions when the search query should not run; check for these as well.
The fix is to add a preExecute() method to the SearchQuery extender, which will abort queries if there is a problem. This is the standard way to abort query; SearchQuery was previously using a non-standard method that only aborted in execute() but allowed countQuery() to run.
During the course of patching this, we also noticed that it's vitally important that the Pager extender be added to queries after SearchQuery. This is a fundamental quality of Pager extender -- it needs to be last so that its counts/pagination/limits are correct. So we added a couple of lines to the Pager extender class to document this. It's slightly maybe out of scope for this issue, but it's only a couple of lines of docs and it came up during patching (one of the attempted fixes was to change the order of extenders in NodeSearch, but this didn't work, see failed patch in comment #6, and explanation of why this doesn't work in comment #16).
Remaining tasks
Get patch committed.
User interface changes
no.
API changes
SearchQuery will be more standard and more efficient about no-results cases. But it doesn't actually change the API.
Issue category | Bug because SearchQuery should not be allowing CountQuery to run when it has already determined in the preExecute() step that the query is no good. Its method for aborting execute() was non-standard. |
---|---|
Prioritized changes | It's a bug fix. Also makes SearchQuery extender more robust and more standard. |
Disruption | No disruption. |
Comment | File | Size | Author |
---|---|---|---|
#20 | interdiff.txt | 1.32 KB | jhodgdon |
#20 | 2425759-search-19.patch | 2.91 KB | jhodgdon |
Comments
Comment #1
jhodgdonThanks for filing this! Cleaning up issue summary and title.
Comment #2
jhodgdonSo... I took a look at the code in the SearchQuery class.
Under normal circumstances, I do not see how the query is being run in this case. Here's what the code should/is doing:
execute() calls prepareAndNormalize(), if it hasn't already been called elsewhere.
prepareAndNormalize() calls parseSearchExpression() to parse out the string the user submitted into search words and and/or SQL conditions.
parseSearchExpression() calls parseWord() repeatedly, to actually parse out the words in the submission. In this method, only words longer than \Drupal::config('search.settings')->get('index.minimum_word_size') are added to $this->words and returned. I have verified that the parseWord() method is the only place that anything is added to $this->words.
After calling parseSearchExpression(), prepareAndNormalize() says:
It is only after this if() statement that prepareAndNormalize would set $this->normalize to a non-zero value. I have verified that $this->normalize is never set to something other than zero outside this method.
After calling prepareAndNormalize(), execute() says:
So. It looks to me as if with the default setting where anything less than 3 characters is discarded, no query will be run if you enter a search with only 1 character words.
We do have a few tests that change that setting; in this case, the query should be run if it finds any words with length >= the minimum word setting.
So I do not think there is a bug here. I'll just set this to "postponed (maintainer needs more info)". If you can find a case where something is actually wrong, please point out where it is so I can investigate.
Thanks!
Comment #3
andypostThis code is tricky because was introduced #1366020: Overhaul SearchQuery; make search redirects use GET query params for keywords to properly display message about keywords.
Specifically NodeSearch do that in
findResults()
but the needed result is set inprepareAndNormalize()
so it seems that bug in order of execution.@jhodgdon Just debugged that, patch is just and idea, but to fix that properly I'm going to check that more
Let's see how many tests are affected by that
Comment #5
andypostPagerExtender causes a count query to execute, that's why not grouping in previous issue
Comment #6
andypostnew patch, the issue caused by count query because
PagerSelectExtender
runs beforeSearchQuery
extender for query.Patch just changes an order of extenders so no count query executed.
Steps to reproduce: try to search "a aa" setting breakpoint into
\Drupal\search\SearchQuery::countQuery()
No idea how to add tests for that
Comment #7
andypostActually search plugin should care about grouping
Comment #10
jhodgdonOhhhhh... so you're saying that the PagerSelectExtender being added after SearchExtender means that its execute() takes precedence over SearchQuery. Right!
But your patch that switches the order, in #7, doesn't work -- the paging doesn't work. So it looks like the order has to be what it was. Also we cannot remove the grouping, that was necessary to get PostgreSQL to work on the other issue.
So. Given that, I think this can still be fixed. Looking at PagerSelectExtender::execute(), it will abort if preExecute() returns FALSE. That will call preExecute() on SearchQuery and any other extenders, or the base query.
So I think all we will need to do is to add a preExecute() method to SearchQuery, which it should have anyway to make it more standard. Here's a proposed patch. Thoughts?
Comment #11
andypostNice idea!
preExecute()
exactly for that.otoh the behaviour of the order is strange and not documented!
When I changed the order of extenders it makes "search" run before "pager" - the order is FILO, check extend() method.
So we also need document this if any search plugins will try to add own extenders
Comment #12
andypostLet's fix the order with comment.
Comment #14
bzrudi71 CreditAttribution: bzrudi71 commentedNice, seems that patch from #10 makes the groupBy() obsolete. So patch should include the removal of the groupBy() and I will do a another final test run before commit to make sure.
Comment #15
andypostAdded comment and reverted change, in other order of extenders no paging displayed... somehow
Comment #16
jhodgdonIt looks like the patch in #15 is exactly the same mine in #9 except for the added comment.. but I am not sure about the comment really.
So. Let's think about this. What happens really on extenders is that when you do something like $query->execute() or $query->countQuery, the last added extender's method gets called first, and it generally adds stuff to the query and then passes the call on to the next extender, and so on down the chain until you get to the innermost select itself.
The Pager extender, during its execute(), invokes the countQuery() method to do a count and set up the query limits. When it does that, it needs to get a count on the actual, final, fully-modified query, not a partially-constructed query. So this is a fundamental property of the PagerExtender: it has to be last in the list of extenders. It is not something specific to a Search query.
So. I think maybe we should take that comment out of NodeSearch, and instead add a note to the PagerSelectExtender class that it must always be the last extender added to any query. Maybe that's a separate issue, but since it was noticed here and really only needs to be one or two lines of documentation, let's just do it now.
Also, in #14 it says we can take out the groupby that was added in the other issue. Should we do that now?
Comment #17
jhodgdonAlso fixing the title.
Comment #18
andypostI'd prefer to change PagerSelectExtender doc block here, because absence of it could cause the bugs
the main difference
The final patch for @bzrudi71 to test on postgres, I've no way now
Comment #19
bzrudi71 CreditAttribution: bzrudi71 commentedPatch in #15 just passed PG bot, nice work!
Comment #20
jhodgdonOh yeah I see we have removed the groupby. Ok then all we need to do is put docs in Pager extender. How about this? @bzrudi71 can you test on PostgreSQL? Adding tag for reference.
I am +1 on marking this RTBC, if both @andypost and @bzrudi71 agree.
Comment #21
jhodgdonOh, OK, so we are good with PostgreSQL! The new patch only differs in documentation so it should be fine too.
Either @andypost or @bzrudi71 want to review the docs changes and mark this RTBC?
Comment #22
andypostI'm +1 rtbc, comments are great!
Comment #23
jhodgdonGreat thanks! I'm adding beta eval to issue summary and updating title slightly again.
Comment #24
jhodgdonAdding note to the summary about why we are adding docs to Pager extender too.
Comment #25
alexpottNice. Committed 09aad27 and pushed to 8.0.x. Thanks!
Thanks for adding the beta evaluation to the issue summary.