When the data source of an index contains zero item, running drush sapi-i prints out:

Couldn't index items. Check the logs for details.

Of course no messages are in the logs, because there were no actual error.

This is just wrong. Let's fix it.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Damien Tournoud’s picture

Status: Active » Needs review
FileSize
1.25 KB

I suggest we just remove the condition on zero items indexed. This is just not an error.

Damien Tournoud’s picture

Title: Deceptive error message when no item is indexable » Deceptive error message when no items are indexable

Status: Needs review » Needs work

The last submitted patch, 1992684-search-api-zero-index.patch, failed testing.

Damien Tournoud’s picture

I see that there are actual tests for this, but they are most likely testing the wrong thing.

drunken monkey’s picture

I suggest we just remove the condition on zero items indexed. This is just not an error.

It is an error if there are items to index. The test is therefore correct, too. What you want is a special case for when there are no items to be indexed. The UI doesn't allow indexing in this situation, probably Drush should just do the same.

Damien Tournoud’s picture

It is an error if there are items to index. The test is therefore correct, too.

Well, that's why we have this condition:

if (!empty($results['not indexed'])) {
}

I haven't checked, but I'm pretty sure the test would trigger this condition too?

"There is precisely zero item indexed" is a pretty poor error checking condition, as it can really only happen in very very very specific cases (like a fatal PHP error?).

drunken monkey’s picture

"There is precisely zero item indexed" is a pretty poor error checking condition, as it can really only happen in very very very specific cases (like a fatal PHP error?).

Not at all. First of all, a fatal PHP error would terminate the page request, and no message at all would be displayed. But if the server configuration is wrong, or the (Solr) server couldn't be reached or in other such cases, indexing will fail completely, not just for some items. And in those cases, it's much more user-friendly to tell users that indexing failed completely instead of displaying "X items could not be indexed" and leaving them to conclude that those were all of the items. If indexing only failed for some items, it's much more likely that something is wrong with the data (or the code handling specific data) – so the two messages tell us two different things.

Damien Tournoud’s picture

@drunken monkey: well, in that case at the minimum we should store somewhere in the state that indexing has been attempted at all. Due to various race conditions, having indexed zero items can be totally legitimate (say you delete an entity between the batch being registered and the batch executing).

As I said, it's a poor error checking mechanism.

drunken monkey’s picture

You are welcome to provide a patch for that.

jazzslider’s picture

Issue summary: View changes

So I'd like to take a stab at explaining why this messaging is problematic. I apologize in advance if there's something significant I'm missing in the following explanation; please feel free to correct me as needed.

The error message "Couldn't index items. Check the logs for details." very strongly implies two things:

  1. The word "couldn't" implies that there was a failure or an error —i.e., we tried to index some items but some sort of error condition stopped us from doing so.
  2. Directing the user to "check the logs for details" strongly implies that when they do so, they'll find some additional logs about what that error condition was.

However, if you look at the code here, one of those implications is essentially completely false, and the other is often false as well. With my own comments added:

if ($success) {
  // The indexing operation was "successful".
  if (!empty($results['indexed'])) {
    // There were some items indexed, and the messaging in this branch of the conditional reflects that correctly.
  } else {
    // This is where our error message comes from.  It's still part of `if ($success)`, and the only thing that's
    // gone "wrong" is that nothing was indexed (presumably because nothing _needed_ to be indexed).
    $drupal_set_message($t("Couldn't index items. Check the logs for details."), 'error');
  }
} else {
  // This, in contrast, is where $success is false, meaning an actual error actually occurred.
}

Since the "Couldn't index items" message occurs in cases where the indexing operation _didn't_ encounter any errors, then it seems like we ought to re-think the messaging a bit to remove the confusing implications. Some ideas, which may need tweaking for complete semantic accuracy, but you get the idea:

* $drupal_set_message($t("No items needed to be indexed. Drupal logs may contain more information."), 'status');
* $drupal_set_message($t("The indexing operation ran successfully and found no items matching this index's indexing criteria. Drupal logs may contain more information."), 'status');

Again, I'm not 100% certain I've accounted for all the possibilities here, but based on how the code itself reads as well as some of our own troubleshooting on this, I tend to like the second suggestion above more than just about anything. The existing message just causes unnecessary anxiety about whether indexing is actually working, and I'd love to see it changed to avoid that confusion.

If you agree, please let me know if you'd like me to make any additional messaging changes, and I'll be happy to roll a patch.

drunken monkey’s picture

I completely disagree with that. The $success has little to do with the actual indexing process, but comes from the batch processing. I.e., if $success is TRUE nothing went wrong from what the Batch API knows (e.g., you get FALSE there if you leave the page while the batch is running). Since we do (at least in this case) a good job of catching our exceptions, indexing errors will almost never lead to $success = FALSE.

Instead, if, e.g., Solr could not be reached, or there is another basic error that prevents indexing completely, there is a SearchApiException which we catch in _search_api_batch_indexing_callback(). Unless some items were already indexed before the state changed in a way that made indexing impossible (which is unlikely, but can of course happen), this will result in no items being indexed and thus, in my eyes correctly, in the "Couldn't index items. Check the logs for details." error message.

The case that no exception occurred, but still no item was indexed successfully, can only occur under special conditions. If you have those frequently, please describe how to reproduce this and why the error message is confusing in this case. Then we can maybe just add some code when catching the exception to distinguish between the two cases.

When no items need to be indexed, you can either not even execute the batch operation or (when processors like the "Bundle filter" remove all the items) they will still be counted as "indexed" and won't trigger that error message.

Damien's example of deleting the (sole) item just as the batch is started is the only realistic condition I can think of, and I'd say it's far too improbable to change the code for that.

janusman’s picture

Status: Needs work » Needs review
FileSize
1.5 KB

Here's my proposal.

My assumptions are that, at the end of the batch, this is the information we have:

  • $results["indexed"] is a count of items **added** to the index.
  • $results["indexed"] could be '0' under many circumstances, say, if we processed a batch of 1 item where that item was filtered out by the Bundle filter.
  • $results["not indexed"] just means "count of items not added to the index". It basically equals [total items processed] - $results["indexed"], regardless of any exceptions being triggered or not
  • During 'error-less' indexing we could have either of the above numbers be zero or more.
  • Also, these numbers don't tell us if there were 'errors' (exceptions) during indexing.

Because of this, we should never output anything with 'error' level at the end of the batch... we just don't have the needed information.

The patch, thus:

  • Eliminates the message 'no items indexed' as a separate condition... it made no sense as a separate condition since both "indexed" and "not indexed" numbers can always be >=0 (and empty(0) === TRUE).
  • Logs a message of counts for both indexed/non-indexed items for a batch.
  • Changes the "Could not index items" message to a more neutral "Items were not indexed" wording, and is output as a 'warning' level message.
  • $success = false directly means a Batch API failure, so that error message is also expanded.

Status: Needs review » Needs work

The last submitted patch, 12: 1992684-12-search-api-zero-index.patch, failed testing.

drunken monkey’s picture

$results["indexed"] could be '0' under many circumstances, say, if we processed a batch of 1 item where that item was filtered out by the Bundle filter.

That's simply not true, as I've now explained multiple times in this issue. Please try to understand the code before proposing changes.

legolasbo’s picture

Status: Needs work » Closed (outdated)

This issue has not seen activity in over 2,5 years. I am therefore closing this issue to clean up the issue queue. Feel free to re-open and update this issue if you feel this issue is still relevant and of importance.

klausi’s picture

This error message can be caused by a race condition when search_api_cron() is running at the same time as drush sapi-i. Adding this to our bash script:

# Sometimes indexing happens at the same time as search_api_cron() and then we
# get a meaningless error message because 0 items were indexed. Just suppress
# this warning because Thomas does not want to fix it in
# https://www.drupal.org/project/search_api/issues/1992684
drush --pipe sapi-i job_search 200 2>&1 | grep -v "Couldn't index items. Check the logs for details."