Follow-up of #2679773: Indexing fails to load some entities, entity paths end with ':en'.

In order to debug my errors from #2920943: Improve error handling and reliability on indexing, I only enabled indexing for one content type (Index-Bundle-Option: None except those selected: Article). On indexing I get the following error message:

Could not load the following items on index Default content index: "entity:node/1318:de", "entity:node/1319:de" ...

I can't see what is wrong with these nodes; they shouldn't be there anyway, as they belong to another content type that is not included in indexing.

I got this error with 8.x-1.5 and also with the current dev version. I also ran "Clear all indexed data" and "Queue all items for reindexing".

Temporary Workarounds

Option 1: Readonly/NotReadonly Index + Reindex

This option sets the index to read-only, then back, then clears the index, then reindexes. See comment #2, which introduced this. Steps are:

  1. Edit your search index; under "Index options", check "Read only" and save
  2. Edit your search index again; under "Index options", uncheck "Read only" and save
  3. Go to the View tab of the search index and click "Clear all indexed data"
  4. Go to the View tab of the search index and click "Index now"

Option 2: Disable/Enable Index + Reindex

This option disables the index, then re-enables it, then clears the index, then reindexes. See comment #5, which introduced this. Steps are:

  1. Edit your search index, uncheck "Enabled" and save
  2. Edit your search index again, check "Enabled" and save
  3. Go to the View tab of the search index and click "Clear all indexed data"
  4. Go to the View tab of the search index and click "Index now"
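
For Options 1 and 2, the final two steps (clear and reindex) can also be run with Drush instead of the UI. A minimal sketch using the sapi-* aliases mentioned later in this thread; the index machine name "default_content_index" is only an example, use your own index ID:

  drush sapi-c default_content_index    # equivalent of "Clear all indexed data"
  drush sapi-i default_content_index    # equivalent of "Index now"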

Option 3: Destroy data in index + Reindex (Risky!)

This option completely destroys all data in the search index, then reindexes. You'll need to adjust the curl command according to your SOLR setup.

WARNING! Use this at your own risk! Do not use this in production.

  1. curl -X POST -H 'Content-Type: application/json' http://localhost:8983/solr/drupal/update?commit=true -d '{"delete":{"query":"*:*"}}'
  2. Go to the View tab of the search index and click "Clear all indexed data"
  3. Go to the View tab of the search index and click "Index now"
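
As a sanity check after step 1, you can ask Solr how many documents are left in the core; a sketch assuming the same host and core name ("drupal") as the curl command above:

  curl 'http://localhost:8983/solr/drupal/select?q=*:*&rows=0'    # a response with "numFound":0 means the core is empty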

Comments

osopolar created an issue. See original summary.

abramm’s picture

Hi,
You may be experiencing an issue with your Search API tracking table.
Can you please try the following steps to see if it helps:

1) Edit index.
2) Check 'Read only' (found under 'Index options').
3) Save.
4) Edit again, uncheck 'Read only', save again.
5) Clear all data and reindex.

Note that this is a way to fix a broken index status, but not a real solution; it's still unclear why your index became broken.

osopolar’s picture

I deleted all items from the search_api_item table (DELETE FROM search_api_item;), then disabled and re-enabled the index to get the "Track now" button on the index view. Indexing the items brought them back into the search_api_item table.
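
For reference, a rough sketch of that manual cleanup via Drush; it simply mirrors the step above and removes tracking data for all indexes, so use it with care:

  drush sql-query "DELETE FROM search_api_item;"
  # afterwards, disable and re-enable the index in the UI to get the "Track now" button back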

It might be better if there were a button beside "Clear all indexed data" for that, or if it happened as part of "Clear all indexed data".

Edit:
I had all content type bundles enabled for indexing, but experienced some bugs. So, for debugging purposes, I decided to enable just one at a time to find the cause of the errors.

abramm’s picture

Hi osopolar,
What you did led to the same result as what I suggested, just in a different way. So, did it help?

Also, can you please confirm that de-selecting some bundles breaks the Search API item tracking table?

osopolar’s picture

Yes, there is some weird behavior. I checked again and found that the items get removed from the search_api_item table via the cron job, via "Index now" from the UI, and also by calling drush sapi-i, after clicking "Clear all indexed data" (drush sapi-c && drush sapi-r did not have the same effect). The items aren't removed all at once but rather step by step, with each invocation of the index function removing up to the batch size. After the error "Could not load the following items on index ... Couldn't index items. Check the logs for details", the index job stops and needs to be executed again, so it takes a while to remove all items.

After that I tried to add all node bundles again, but the search_api_item table didn't grow again. I tried drush sapi-c && drush sapi-r && drush sapi-i; it says "Indexing a maximum number of X items ...", but X is just the number of items in the search_api_item table, while the real number should now be higher.

Checking and unchecking read-only didn't work for me, but disabling and enabling the index also brought the "Track now" button back to fix the number of items to index.

abramm’s picture

> The items get removed from search_api_item table via cron job.

This happens if Search API fails to load the entity. In this case, the code realizes that something has gone wrong (as if someone deleted the entity in an unusual way and didn't call any hooks) and tries to clean up the table entries that aren't loadable.

> The items aren't removed all at once but rather step by step, with each invocation of the index function removing up to the batch size.

That's normal. The 'Clear all indexed data' button (or drush sapi-c) does the following:

1) Clears all indexed data using backend-specific code.
2) Marks items as not indexed with 'UPDATE search_api_item SET changed = 0 WHERE index_id = XXX;' (the query is not that simple really, just FYI).
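
A rough way to check the effect of step 2 on the tracking table, based only on the simplified query above (the real schema and query may differ, and the index ID is just an example):

  drush sql-query "SELECT COUNT(*) FROM search_api_item WHERE index_id = 'default_content_index' AND changed = 0;"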

> After that I tried to add all node bundles again, but the search_api_item table didn't grow again.

That may be a bug. Could you please try reproducing it in a clean environment such as simplytest.me and write up a few simple steps to reproduce it? That will help formally describe the issue.

osopolar’s picture

FWIW: I saw that there is a drush command "search-api-reset-tracker". I would expect that to work around the problem, but as sapi-r, which I was already using, is an alias of that command, I guess not.

drunken monkey’s picture

FWIW: I saw that there is a drush command "search-api-reset-tracker". I would expect that to work around the problem, but as sapi-r, which I was already using, is an alias of that command, I guess not.

No, all of these are just for re-indexing items; we currently assume that the tracking data is always correct and provide no (explicit) means of regenerating it. It might make sense to change that, I guess – but finding the root cause of your problem would still be important.

johnvb’s picture

I'm seeing the "Could not load the following items on index Default content index" log message when trying to build a search index as well. I have a site with approx. 110,000 nodes of content type A that I want to exclude from the index, plus content types B, C and D covering about 5,000 nodes I do want indexed. I've set up the index to index content only for the content types B, C and D that I want to include in the index. When I do this, the index status bar on the index's View tab shows 0/115000 indexed - it seems to be counting all the nodes, including the excluded content type A. When I click "Index now", it gives up after a few seconds with the error "Couldn't index items. Check the logs for details." My logs have the aforementioned error and a list of the nodes it failed on, which are all of content type A.

My guess is the "Index now" button gives up if it finds nothing to do in a batch. The query which grabs the list of nodes to index grabs all the nodes, and the content type filter is applied after the fact, as it goes through each item. Because it skips the items, and most items are un-indexed in my case, it ends up aborting.

johnvb’s picture

I've just found a workaround, so I'm posting it here in case it helps anyone else. Here are the steps to show the problem and then fix it.
1) On a site with content in several different content types, set up an index for content, limiting it to only some of the content types. At this point, the search index thinks it needs to index the whole set of content for all types. The index progress on the view tab will show an incorrect total. When you "Index now" you will get errors in the log and a possible abort as a result of content nodes which belong to the un-indexed content types.
2) Now, alter the settings for indexing content so that all content types are included, then index the site. This will of course work, but you end up with an index that contains items you don't want.
3) Having indexed the site, alter the settings for indexing content again so that the content types you don't want are excluded.
The total number of items to index is now correct and you can re-index the site without errors.

osopolar’s picture

I am glad that you found a way to index only the desired content; I guess it took some time to index all those items only to remove them from the index again later. Did the workaround from #5 not work for you?

Disabling and enabling the index also brought the "Track now" button back to fix the number of items to index.

drunken monkey’s picture

In any case, I'm almost positive you don't have to actually index the items before changing the settings again. Which items are indexed shouldn't affect which items are tracked in the slightest. So, just creating the index with all content types enabled and then editing it right away to have only the desired content types enabled should work fine, too.

However, it's still a mystery how this could happen in the first place. We have test coverage to ensure this is working (only tracking selected bundles when creating an index) and I've also just tested again to see that it works fine. So, there must be something special causing this in your setup.

drunken monkey’s picture

Btw, see #2930720: Add UI for rebuilding the tracking table for an index for adding a way to quickly clear and rebuild the tracking table.

marcvangend’s picture

Workaround #5 worked for me.

jimkeller’s picture

I ran into this problem this evening and the workaround of disabling my index, re-enabling, then clicking the "track now" button seems to have worked.

I believe I might have some insight into the steps to reproduce. When creating my index, I inadvertently set up my bundles backwards; I didn't realize the default was to "index all BUT the selected", and I had simply selected the (2 of 8) bundles I wanted to index.

While setting up my fields, I noticed that I wasn't seeing the right options for rendered_html view modes (I was seeing bundles I wasn't expecting), which is when I realized my error. I went back and changed the radio button to index only the selected bundles.

When I went back to index my content, this issue was present; it seems like it got confused by my messing with the config.

drunken monkey’s picture

Status: Active » Postponed (maintainer needs more info)

Sorry, I still can't reproduce this. No matter which content types I create the index with, and how I edit them afterwards, the tracking info remains correct for me.
I'd really need a failing test case, or some steps to (reliably!) reproduce this.

marcvangend’s picture

Re #16: Just a guess, maybe it has to do with config synchronisation. At least in my case, the problem occurred after a co-worker had excluded certain node types from the index, exported/committed/pushed the config, and I pulled in the changes.

drunken monkey’s picture

Re #16: Just a guess, maybe it has to do with config synchronisation. At least in my case, the problem occurred after a co-worker had excluded certain node types from the index, exported/committed/pushed the config, and I pulled in the changes.

How does "pulling in changes" work in your setup?
I just tried it once again, and it works flawlessly through Drupal's own "Synchronize" tool at admin/config/development/configuration.

marcvangend’s picture

My standard procedure for pulling in changes:

git pull
drush updb
drush cim -y
drush cr

On practically every project we use the Config Split module; not sure if that makes a difference.

drunken monkey’s picture

Component: General code » Drush / Rules
Status: Postponed (maintainer needs more info) » Needs review

For me, that works just the same as the Core "Synchronize" UI and only fails to execute the tasks batch right away (since batches work a bit differently in Drush). I have to go to the site's "Status report" or the Search API overview page to see that there are pending tasks and execute them; then everything is fine, too.

If the whole problem is just executing these tasks automatically, you should have just said so and saved all of us a lot of time. The attached patch should implement that; please tell me if it works for you.

marcvangend’s picture

you should have just said so and saved all of us a lot of time

I'm sorry that you feel like your time was wasted. But personally, I'm a bit offended by the tone of voice here. I had no idea that something like a Search API tasks batch even exists, and if I had, I probably would not have expected to find it on the status report page. We're all trying to help here, just like I know you do, and no one is withholding relevant information on purpose.

That said, I thank you for the patch and I will try to test it and provide feedback. It would help if you could tell me which kinds of changes cause tasks to be added to the tasks batch; it would make my testing less random.

drunken monkey’s picture

I'm sorry that you feel like your time was wasted. But personally, I'm a bit offended by the tone of voice here.

Yes, I have to apologize, I'm sorry about that. It just gets exhausting and frustrating from time to time, but of course it's not your fault. It's really not as apparent as it should be that there are tasks that still need to be run – especially with a Core issue (#2722237: "Local actions" block doesn't take cache data of route access information into account) preventing the pending tasks from also being shown as an action on the Search API overview page.
And making this work in Drush, too, is apparently almost trivial, so we could have just covered that from the start.
Again, sorry!

That said, I thank you for the patch and I will try to test it and provide feedback. It would help if you could tell me which kinds of changes cause tasks to be added to the tasks batch; it would make my testing less random.

Changing the settings for an existing datasource on an index (in a way that more or fewer bundles will be included). With such a change, just try to reproduce the bug again. (It should actually be the same for adding or removing whole datasources – those tasks are just additionally shown on the index's "View" tab, which makes them more readily apparent.)

marcvangend’s picture

Status: Needs review » Reviewed & tested by the community

I have used the patch for a while and it works as expected.

  • drunken monkey committed 3a92a63 on 8.x-1.x
    Issue #2921022 by drunken monkey, marcvangend: Fixed automatic task...
drunken monkey’s picture

Status: Reviewed & tested by the community » Fixed

Great, thanks a lot for reporting back!
Committed.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

ndobromirov’s picture

I am getting this at the moment.
I am 99% certain of the reproduce scenario...

1. Have a content type that is multilingual and nodes that are translated into 3 languages, for example English (EN), Bulgarian (BG) and French (FR).
2. Index that content & be sure it is in the index and in the trackers.
3. Delete one of the languages (for example FR).
4. Delete the nodes.
5. Try to re-index anything.

I am getting this error:

Could not load the following items on index NAME: "entity:node/843:sv", "entity:node/844:fr", "entity:node/844:it", "entity:node/844:nb", "entity:node/844:nl", "entity:node/844:sv", "entity:node/845:fr", "entity:node/845:it", "entity:node/845:nb", "entity:node/845:nl", "entity:node/845:sv", "entity:node/846:fr", "entity:node/846:it", "entity:node/846:nb", "entity:node/846:nl", "entity:node/846:sv", "entity:node/847:fr", "entity:node/847:it", "entity:node/847:nb", "entity:node/847:sv", "entity:node/848:fr", "entity:node/848:it", "entity:node/848:nb", "entity:node/848:nl", "entity:node/848:sv", "entity:node/849:fr", "entity:node/849:it", "entity:node/849:nb", "entity:node/849:sv", "entity:node/850:fr", "entity:node/850:it", "entity:node/850:nb", "entity:node/850:nl", "entity:node/850:sv", "entity:node/851:fr", "entity:node/851:it", "entity:node/851:nb", "entity:node/851:sv", "entity:node/852:fr", "entity:node/852:it", "entity:node/852:nb", "entity:node/852:nl", "entity:node/852:sv", "entity:node/853:fr", "entity:node/853:it", "entity:node/853:nb", "entity:node/853:sv", "entity:node/854:fr", "entity:node/854:it", "entity:node/854:nb".

The site had ~10 languages and then they were dropped to 3...
In the problematic IDs I am seeing only deleted languages.
I think a good solution would be to disregard items if they are not loadable...

The workaround of disabling and enabling the index helped reindex the items there.
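
A hedged way to spot such orphaned tracker entries for a removed language, based only on the item ID format shown in the error above (the 'fr' langcode and direct table access are just examples):

  drush sql-query "SELECT item_id FROM search_api_item WHERE item_id LIKE 'entity:node/%:fr';"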

cmthiele’s picture

I have run into a similar issue, running the most recent versions of Search API and Apache Solr. I think the developer is awesome, and I'm sure this is 99-100% user error, but I'll post my story for the next mistaken user.

Background

I have a core with two indexes that connect Solr to two different tables. Table A is 400k rows (with one long text field) and is a subset of Table B, which is over a million rows (no long text fields are indexed). Both A and B are indexed via Search API.

Problem

After updating the underlying tables and reindexing through Search API, I got the same error. The unloadable entities were entities that had been deleted from table A, yet Search API was trying to load them into some (but not all) results. If it didn't load a previously deleted entity, the search performed fine. If it did, the result came back empty and the error was logged.

Issue #1

I found updating tracking through Search API was very, very slow. With one table losing and gaining a small number of entries with each monthly update, it took a dozen-plus hours to retrack and reindex that table.

While I'm no expert, I decided to use Python, MySQL and multiprocessing to simply populate the search_api_item table independently. What took many hours now takes a few minutes. I saved the old table as search_api_item_backup for comparison purposes. I still use Search API to index, but use this script to update tracking only. Everything shows up properly in Search API and seemed to work (when I was only adding to the size of A and B, but not deleting from A).

After a recent update (which deleted from A), I noticed the update did not commit on Apache Solr, and ended up doing so manually. This might be something or nothing.

Solution #1

Before retracking through Apache Solr (I had already reindexed without success), I wanted to try other solutions. One suggested idea had to do with batch updates.

I did notice the Search API "update pending tasks" item was not complete and still showed up on the status report, so I decided to run it. It seems to perform a bunch of inserts into the cache_entity table; I'm not 100% sure why, but it does. If I turn off Search API tag-based caching, would this fix the issue? It didn't seem to during preliminary attempts.

Anyway, looking at the progress (about 1,000 every few minutes), it'll likely take a few hours to complete. I'll update with results. But even if it works, I'll probably want to add an additional script to take over this process as well, to speed up performance.

Conclusion

I'm still not quite sure why and how Search API is sending A to Solr, Solr is sending back results {X, Y, Z} to Search API, yet Search API is trying to load results {X, Y, B} and failing. A caching error seems like an obvious culprit.

cmthiele’s picture

Update to Solution #1:

After letting the cache_entity insert process play out overnight for over a million entries, the status report message went away, but the error did not. My guess is that I previously stopped this process before it was complete, and it needed to finish regardless of whether I disabled caching or not. It seems unrelated to the error.

Solution #2:

After attempting the read-only on/off, the current number of items indexed changed in the sentence string ("There are ..."), but was still correct in the bar above. I decided to investigate Apache Solr and realized the number of entities indexed in Apache Solr was actually slightly larger than it should be. Thus the problem was that Solr added the new entries (and updated existing ones) but never actually deleted the old entries. This is because I clicked "Queue all items for reindexing" without first clearing the index. I assumed this option would also delete prior to inserting (while preserving Apache Solr's replication feature). It did not.

The solution was thus to click "Clear all indexed data", and THEN index. If I simply reindexed without first clearing (even if I cleared the MySQL table itself), I got the error because instructions were never sent to Apache Solr to clear the data from there.

This may be due to the Python script I use to update tracking (in place of Search API), but I'm willing to live with this bug (and bug fix) if it means I get to retrack a million-plus rows in a couple of minutes rather than overnight.

Anyone who has a similar problem, hope my story helps.

josephdpurcell’s picture

Issue summary: View changes

Since I am experiencing this issue I thought I would add a comment here.

The observed issue
The behavior I observe is that when the index is run it puts these errors in the log:

> Could not load the following items on index Acquia Search Solr Index "entity:commerce_product_variation/101871:en".

Then, the View that renders the list of indexed products will sometimes only show 3 of the 20 items per page.

This issue only happens on DEV and STAGE. It does not happen on PROD.

Hypothesis

Since this only happens on DEV and STAGE, I'm assuming it's because, when we deploy to those environments, we copy the PROD database. But when we do that, the search_api_item table is copied but not the SOLR index itself (obviously). Probably more importantly, the products in the database do not match the products in the SOLR index.

I also notice it locally and for that I can just completely destroy the data in my local SOLR index like so:

curl -X POST -H 'Content-Type: application/json' http://localhost:8983/solr/drupal/update?commit=true -d '{"delete":{"query":"*:*"}}'

WARNING! This command completely drops your SOLR index that is named "drupal".

Given that no permutation of the "Clear all indexed data" and "Rebuild tracking information" buttons plus a re-index has resolved the issue, but completely dropping the SOLR index using that curl request does, I assume that those two buttons don't fully clear out the SOLR index.

This is a hypothesis. I've not taken the time to prove it out. I'd love to hear if someone else has a different hypothesis.
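
One rough way to test it would be to compare the number of documents Solr actually holds with the number of items Drupal tracks for the index; a sketch assuming the same local core as the curl command above, with a made-up index ID:

  curl 'http://localhost:8983/solr/drupal/select?q=*:*&rows=0'    # documents actually in the Solr core
  drush sql-query "SELECT COUNT(*) FROM search_api_item WHERE index_id = 'acquia_search_index';"    # items tracked in Drupal

A large discrepancy between the two would point to the index and the tracking table being out of sync.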

Workarounds (summary of this ticket)

To summarize this thread, it sounds like there are two workarounds, which I'll add to the description for anyone else passing by this ticket.

I'll say that Option 1, which sets the index to read-only and then back, worked. It's completely obscure to me why that would eliminate the errors.

I'll also add Option 3, which is to completely drop the search index. You can only use that locally, though; I don't believe you can do that on Acquia.

Fernly’s picture

For future reference: only option 3 worked in my case. It was an effect of a down-sync of production data to our staging server which caused wrong solr index hashes. Long story short, clearing the index using the curl command and reindexing fixed it.

judapriest’s picture

Happened to me. For future reference:

  • I have 2 environments, "live" and "test"
  • I have 2 Solr instances, "live Solr" and "test Solr"

The live environment was plugged into "test Solr".

I plugged my production environment back into my production Solr, and everything is fine now.

brooke_heaton’s picture

This issue is happening to me on version: '8.x-1.23'

Very bizarre and confusing behavior:

>  [warning] Could not load the following items on index Content: "entity:node/10346:und", "entity:node/103471:und", "entity:node/103476:und", "entity:node/103481:und", "entity:node/103486:und", "entity:node/103491:und", "entity:node/103496:und", "entity:node/1034:und", "entity:node/103501:und", "entity:node/103506:und", "entity:node/103511:und", "entity:node/103516:und", "entity:node/10351:und", "entity:node/103521:und", "entity:node/103526:und", "entity:node/103531:und", "entity:node/103536:und", "entity:node/103541:und", "entity:node/103546:und", "entity:node/103551:und", "entity:node/103556:und", "entity:node/103561:und", "entity:node/103566:und", "entity:node/10356:und", "entity:node/103571:und", "entity:node/103576:und", "entity:node/103581:und", "entity:node/103586:und", "entity:node/103591:und", "entity:node/103596:und", "entity:node/1035:und", "entity:node/103601:und", "entity:node/103606:und", "entity:node/103611:und", "entity:node/103616:und", "entity:node/10361:und", "entity:node/103621:und", "entity:node/103626:und", "entity:node/103631:und", "entity:node/103636:und", "entity:node/103641:und", "entity:node/103646:und", "entity:node/103651:und", "entity:node/103656:und", "entity:node/103661:und", "entity:node/103666:und", "entity:node/10366:und", "entity:node/103671:und", "entity:node/103676:und".
>  [notice] Successfully indexed 1 item.
>  [warning] Could not load the following items on index Content: "entity:node/103681:und", "entity:node/103686:und", "entity:node/103691:und", "entity:node/103696:und", "entity:node/1036:und", "entity:node/103701:und", "entity:node/103706:und", "entity:node/103711:und", "entity:node/103716:und", "entity:node/10371:und", "entity:node/103721:und", "entity:node/103726:und", "entity:node/103731:und", "entity:node/103736:und", "entity:node/103741:und", "entity:node/103746:und", "entity:node/103751:und", "entity:node/103756:und", "entity:node/103761:und", "entity:node/103766:und", "entity:node/10376:und", "entity:node/103771:und", "entity:node/103776:und", "entity:node/103781:und", "entity:node/103786:und", "entity:node/103791:und", "entity:node/103796:und", "entity:node/1037:und", "entity:node/103801:und", "entity:node/103806:und", "entity:node/103811:und", "entity:node/103816:und", "entity:node/10381:und", "entity:node/103821:und", "entity:node/103826:und", "entity:node/103831:und", "entity:node/103836:und", "entity:node/103841:und", "entity:node/103846:und", "entity:node/103851:und", "entity:node/103856:und", "entity:node/103861:und", "entity:node/103866:und", "entity:node/10386:und", "entity:node/103871:und", "entity:node/103876:und", "entity:node/103881:und", "entity:node/103886:und", "entity:node/103891:und", "entity:node/103896:und".
>  [warning] Message: 27110 items could not be indexed. Check the logs for details.
tonytheferg’s picture

I am getting the same warning messages when rolling back taxonomy term migrations. I'll try to post the exact message next time I'm migrating.