Problem/Motivation

Follow-up to #2356991-12: PostgreSQL: Fix tests in search test group that shows that parts of the search query are run (notably: the count query is created) even when the search term has only 1 character. This sets up problems with query altering and unpredictable behaviour in tests, such as PostgreSQL problems.

Proposed resolution

Do not run search query when there are fewer then 3 characters in the search term [replace 3 with the actual setting value, of course!]. There may also be other conditions when the search query should not run; check for these as well.

The fix is to add a preExecute() method to the SearchQuery extender, which will abort queries if there is a problem. This is the standard way to abort query; SearchQuery was previously using a non-standard method that only aborted in execute() but allowed countQuery() to run.

During the course of patching this, we also noticed that it's vitally important that the Pager extender be added to queries after SearchQuery. This is a fundamental quality of Pager extender -- it needs to be last so that its counts/pagination/limits are correct. So we added a couple of lines to the Pager extender class to document this. It's slightly maybe out of scope for this issue, but it's only a couple of lines of docs and it came up during patching (one of the attempted fixes was to change the order of extenders in NodeSearch, but this didn't work, see failed patch in comment #6, and explanation of why this doesn't work in comment #16).

Remaining tasks

Get patch committed.

User interface changes

no.

API changes

SearchQuery will be more standard and more efficient about no-results cases. But it doesn't actually change the API.

Reference: https://www.drupal.org/core/beta-changes
Issue category Bug because SearchQuery should not be allowing CountQuery to run when it has already determined in the preExecute() step that the query is no good. Its method for aborting execute() was non-standard.
Prioritized changes It's a bug fix. Also makes SearchQuery extender more robust and more standard.
Disruption No disruption.
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

jhodgdon’s picture

Title: Search query has no grouping when less then 3 chars are searched » Search query should not run when search term has less then 3 characters
Issue summary: View changes

Thanks for filing this! Cleaning up issue summary and title.

jhodgdon’s picture

Status: Active » Postponed (maintainer needs more info)

So... I took a look at the code in the SearchQuery class.

Under normal circumstances, I do not see how the query is being run in this case. Here's what the code should/is doing:

execute() calls prepareAndNormalize(), if it hasn't already been called elsewhere.

prepareAndNormalize() calls parseSearchExpression() to parse out the string the user submitted into search words and and/or SQL conditions.

parseSearchExpression() calls parseWord() repeatedly, to actually parse out the words in the submission. In this method, only words longer than \Drupal::config('search.settings')->get('index.minimum_word_size') are added to $this->words and returned. I have verified that the parseWord() method is the only place that anything is added to $this->words.

After calling parseSearchExpression(), prepareAndNormalize() says:

    if (count($this->words) == 0) {
      $this->status |= SearchQuery::NO_POSITIVE_KEYWORDS;
      return FALSE;
    }

It is only after this if() statement that prepareAndNormalize would set $this->normalize to a non-zero value. I have verified that $this->normalize is never set to something other than zero outside this method.

After calling prepareAndNormalize(), execute() says:

    if (!$this->normalize) {
      // There were no keyword matches, so return an empty result set.
      return new StatementEmpty();
    }

So. It looks to me as if with the default setting where anything less than 3 characters is discarded, no query will be run if you enter a search with only 1 character words.

We do have a few tests that change that setting; in this case, the query should be run if it finds any words with length >= the minimum word setting.

So I do not think there is a bug here. I'll just set this to "postponed (maintainer needs more info)". If you can find a case where something is actually wrong, please point out where it is so I can investigate.

Thanks!

andypost’s picture

Status: Postponed (maintainer needs more info) » Needs review
FileSize
1.28 KB

This code is tricky because was introduced #1366020: Overhaul SearchQuery; make search redirects use GET query params for keywords to properly display message about keywords.

Specifically NodeSearch do that in findResults() but the needed result is set in prepareAndNormalize() so it seems that bug in order of execution.

@jhodgdon Just debugged that, patch is just and idea, but to fix that properly I'm going to check that more

Let's see how many tests are affected by that

Status: Needs review » Needs work

The last submitted patch, 3: 2425759-search-3.patch, failed testing.

andypost’s picture

PagerExtender causes a count query to execute, that's why not grouping in previous issue

andypost’s picture

Status: Needs work » Needs review
FileSize
1.45 KB

new patch, the issue caused by count query because PagerSelectExtender runs before SearchQuery extender for query.

Patch just changes an order of extenders so no count query executed.

Steps to reproduce: try to search "a aa" setting breakpoint into \Drupal\search\SearchQuery::countQuery()

No idea how to add tests for that

andypost’s picture

FileSize
1.33 KB

Actually search plugin should care about grouping

Status: Needs review » Needs work

The last submitted patch, 7: 2425759-search-7.patch, failed testing.

The last submitted patch, 6: 2425759-search-6.patch, failed testing.

jhodgdon’s picture

Status: Needs work » Needs review
FileSize
1.93 KB

Ohhhhh... so you're saying that the PagerSelectExtender being added after SearchExtender means that its execute() takes precedence over SearchQuery. Right!

But your patch that switches the order, in #7, doesn't work -- the paging doesn't work. So it looks like the order has to be what it was. Also we cannot remove the grouping, that was necessary to get PostgreSQL to work on the other issue.

So. Given that, I think this can still be fixed. Looking at PagerSelectExtender::execute(), it will abort if preExecute() returns FALSE. That will call preExecute() on SearchQuery and any other extenders, or the base query.

So I think all we will need to do is to add a preExecute() method to SearchQuery, which it should have anyway to make it more standard. Here's a proposed patch. Thoughts?

andypost’s picture

Nice idea! preExecute() exactly for that.
otoh the behaviour of the order is strange and not documented!

+++ b/core/modules/node/src/Plugin/Search/NodeSearch.php
@@ -206,8 +206,8 @@ protected function findResults() {
       ->select('search_index', 'i', array('target' => 'replica'))
-      ->extend('Drupal\search\SearchQuery')
-      ->extend('Drupal\Core\Database\Query\PagerSelectExtender');
+      ->extend('Drupal\Core\Database\Query\PagerSelectExtender')
+      ->extend('Drupal\search\SearchQuery');

When I changed the order of extenders it makes "search" run before "pager" - the order is FILO, check extend() method.

So we also need document this if any search plugins will try to add own extenders

andypost’s picture

FileSize
878 bytes
2.79 KB

Let's fix the order with comment.

Status: Needs review » Needs work

The last submitted patch, 12: 2425759-search-12.patch, failed testing.

bzrudi71’s picture

Nice, seems that patch from #10 makes the groupBy() obsolete. So patch should include the removal of the groupBy() and I will do a another final test run before commit to make sure.

andypost’s picture

Status: Needs work » Needs review
FileSize
1.47 KB
2.94 KB

Added comment and reverted change, in other order of extenders no paging displayed... somehow

jhodgdon’s picture

Status: Needs review » Needs work

It looks like the patch in #15 is exactly the same mine in #9 except for the added comment.. but I am not sure about the comment really.

So. Let's think about this. What happens really on extenders is that when you do something like $query->execute() or $query->countQuery, the last added extender's method gets called first, and it generally adds stuff to the query and then passes the call on to the next extender, and so on down the chain until you get to the innermost select itself.

The Pager extender, during its execute(), invokes the countQuery() method to do a count and set up the query limits. When it does that, it needs to get a count on the actual, final, fully-modified query, not a partially-constructed query. So this is a fundamental property of the PagerExtender: it has to be last in the list of extenders. It is not something specific to a Search query.

So. I think maybe we should take that comment out of NodeSearch, and instead add a note to the PagerSelectExtender class that it must always be the last extender added to any query. Maybe that's a separate issue, but since it was noticed here and really only needs to be one or two lines of documentation, let's just do it now.

Also, in #14 it says we can take out the groupby that was added in the other issue. Should we do that now?

jhodgdon’s picture

Title: Search query should not run when search term has less then 3 characters » Search lacks preExecute(), making countQuery() run even when there are problems in prepare step

Also fixing the title.

andypost’s picture

I'd prefer to change PagerSelectExtender doc block here, because absence of it could cause the bugs

+++ b/core/modules/search/src/SearchQuery.php
@@ -606,8 +615,6 @@ public function countQuery() {
-    // PostgreSQL requires a group by condition to prevent a GROUPING ERROR.
-    $inner->groupBy('i.sid');

the main difference

The final patch for @bzrudi71 to test on postgres, I've no way now

bzrudi71’s picture

Patch in #15 just passed PG bot, nice work!

jhodgdon’s picture

Status: Needs work » Needs review
Issue tags: +PostgreSQL
FileSize
2.91 KB
1.32 KB

Oh yeah I see we have removed the groupby. Ok then all we need to do is put docs in Pager extender. How about this? @bzrudi71 can you test on PostgreSQL? Adding tag for reference.

I am +1 on marking this RTBC, if both @andypost and @bzrudi71 agree.

jhodgdon’s picture

Oh, OK, so we are good with PostgreSQL! The new patch only differs in documentation so it should be fine too.

Either @andypost or @bzrudi71 want to review the docs changes and mark this RTBC?

andypost’s picture

Status: Needs review » Reviewed & tested by the community

I'm +1 rtbc, comments are great!

jhodgdon’s picture

Title: Search lacks preExecute(), making countQuery() run even when there are problems in prepare step » SearchQuery lacks preExecute(), making countQuery() run even when there are problems in prepare step
Issue summary: View changes

Great thanks! I'm adding beta eval to issue summary and updating title slightly again.

jhodgdon’s picture

Issue summary: View changes

Adding note to the summary about why we are adding docs to Pager extender too.

alexpott’s picture

Status: Reviewed & tested by the community » Fixed

Nice. Committed 09aad27 and pushed to 8.0.x. Thanks!

Thanks for adding the beta evaluation to the issue summary.

  • alexpott committed 09aad27 on 8.0.x
    Issue #2425759 by andypost, jhodgdon: SearchQuery lacks preExecute(),...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.