I have a Search API index on some external entities that connect to a database. Using Search API is great because it lets me use Views with these entities. I am using the database backend.
https://www.drupal.org/project/external_entities
I realized that when a new row is created in the external database, the total number of items is not updated for the Search API index, because Drupal is not notified at all.
Clearing the index does not recount the entities available for indexing, so the new items are left out of the index.
There are a couple of options in the interface that force the recount and reindex:
- Destroy the index and create it again from configuration
- Disable and enable the index again
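Programmatically, I am doing:
use Drupal\search_api\Entity\Index;
use Drupal\search_api\IndexBatchHelper;

$index = Index::load('external_records');
// Clear it.
$index->clear();
// Set up the batch that indexes the items.
IndexBatchHelper::setStringTranslation($this->getStringTranslation());
IndexBatchHelper::create($index);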
Recently I changed it to:
$index = Index::load('external_records');
// Disable and re-enable the index so that the available items are counted again.
$index->setStatus(FALSE);
$index->save();
$index->setStatus(TRUE);
$index->save();
// Set up the batch that indexes the items.
IndexBatchHelper::setStringTranslation($this->getStringTranslation());
IndexBatchHelper::create($index);
I would like to have a way to do this from the interface in fewer steps. It would also make it more obvious to people in situations similar to mine how to really start the index from scratch.
Comment | File | Size | Author |
---|---|---|---|
#14 | 2930720-14--rebuild_tracker_button.patch | 11.92 KB | drunken monkey |
Comments
Comment #2
rodrigoaguilera
Comment #3
drunken monkey
That's a good idea, thanks! Something like this has often been requested; we really should make this easier. (Even though, under ideal circumstances, this should never be necessary.)
Does this make more sense as a third button in the "Index status form" (i.e., next to "Queue all items for reindexing" and "Clear all indexed data") or as an additional checkbox for both these buttons' confirm forms? Or how would you have seen the UI for this?
(In any case, I guess we'd segue right into the "Track items" batch after submitting the confirm form.)
However, a note specifically about your case: as explained in the doc block of DatasourceInterface, your custom datasource (or, rather, the module that provides it) is responsible for keeping track of new/updated/deleted items. So it's to be expected that you'll need some custom code for that: if you can't detect CRUD operations on the datasource, you have to provide code that rebuilds the tracking table regularly (or however else you want to handle it). And, by the way, I think this would be the best way to do that (and how I'd implement it in the module):
That way, you avoid any undesired side effects (and the additional overhead) of disabling the index.
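For readers who want to rebuild the tracking information without disabling the index, here is a minimal sketch. It is not necessarily the code originally posted in this comment. It assumes the `search_api.index_task_manager` service and its `stopTracking()`, `startTracking()` and `addItemsBatch()` methods as found in Search API 8.x-1.x, and it reuses the `external_records` index ID from the issue summary.

```php
<?php

use Drupal\search_api\Entity\Index;

// 'external_records' is the index ID from the issue summary; adjust as needed.
$index = Index::load('external_records');

// Tracking only makes sense for an existing, enabled index.
if ($index && $index->status()) {
  /** @var \Drupal\search_api\Task\IndexTaskManagerInterface $task_manager */
  $task_manager = \Drupal::service('search_api.index_task_manager');

  // Drop the tracking information for all datasources of this index.
  $task_manager->stopTracking($index);
  // Schedule tasks that track all currently available items again.
  $task_manager->startTracking($index);
  // Set a batch that executes the pending tracking tasks. This assumes a
  // batch-capable context, such as a form submit handler.
  $task_manager->addItemsBatch($index);
}
```

Outside of a batch context (for example in cron or a Drush command), the last call would need a non-batch alternative; check the task manager interface in your Search API version for what is available.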
Comment #4
drunken monkey
Comment #5
rodrigoaguilera
Thanks for looking into this. I will improve my code with that snippet.
I agree with adding that third button, labeled something like "Clear all indexed data and rebuild tracked items", with a confirmation explaining that this is the slowest and most drastic action you can take to solve problems with an index.
In my situation I have no way to hook into the CRUD operations of the datasource, so rebuilding it on demand is my fastest workaround.
Comment #6
drunken monkey
I don't think we want to force people to also delete all indexed data while doing this. There are situations where you know that items are only missing, with no surplus items indexed, and then throwing away all the indexed data seems like a waste.
If it's an extra button, I'd just add a "Rebuild tracking information" (or whatever, label TBD) button, and people would have to do "Clear all indexed data" in addition to that if that's what they want.
That's why I also think just having additional checkboxes for the two existing actions might be a good alternative.
Since others were interested in this functionality, too, I'll leave this in "Postponed" for another week or two to get additional input on the best UI.
(Also, "Needs work" is only for when there's already a patch, but that needs work. "Active" would have been the correct status.)
Comment #7
Renrhaf
Hey there, for a custom need I developed a datasource that fetches data from an external API and indexes it into an Elasticsearch backend.
I also needed to track the items before being able to index them via the Search API indexing batch, and I used a custom batch to insert all entries into the tracking table. I am also interested in this kind of feature.
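For readers in a similar position, tracking externally fetched items could look roughly like the sketch below. It is not Renrhaf's actual batch code. It assumes `IndexInterface::trackItemsInserted()` from Search API 8.x-1.x; the datasource plugin ID `my_external_datasource` and the ID list are hypothetical placeholders, and the IDs are assumed not to be tracked yet.

```php
<?php

use Drupal\search_api\Entity\Index;

// Hypothetical raw item IDs, as returned by your own code that queries the
// external API. These are datasource-specific IDs, not combined item IDs.
$external_ids = ['101', '102', '103'];

// 'external_records' is the index ID from the issue summary.
$index = Index::load('external_records');

if ($index && $index->status()) {
  // 'my_external_datasource' is a placeholder for your datasource plugin ID.
  // This tells the index's tracker about the new items so that the regular
  // indexing batch or cron run can pick them up.
  $index->trackItemsInserted('my_external_datasource', $external_ids);
}
```

For large data sets, the ID list would typically be split into chunks and processed in a batch or queue, as described in the comment above.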
Comment #8
drunken monkey
Thanks, good to know! But do you have any opinion on the UI, as discussed in the last few comments (#3 ff.)?
Comment #9
patrickfweston
I'm running into an issue similar to the one Renrhaf describes above. We have an external API that we're indexing into a Solr backend. This API is updated and we detect updates. We were using code similar to what rodrigoaguilera originally posted to reset the tracking for the index. I've updated it to your snippet in #3.
As far as the UI goes, I think it makes sense to add a "Rebuild tracking information" link similar to the "Queue all items for reindexing" and "Clear all indexed data" links. I think grouping it here makes a little more sense because these are all actions to take after an index has been built out, i.e., they all involve updating or refreshing the current index.
You mention checkboxes in #6, but I'm not quite sure what you have in mind for those?
Comment #10
Renrhaf
I'm also in favor of a solution using an additional button in the UI that launches a batch to rebuild the whole tracking table.
Maybe some documentation should be added here to explain what this is for, because the tracking system is under the hood and not known by all users.
Comment #11
drunken monkey
OK then, how about this?
I meant having an "Also rebuild tracking information" checkbox on the "Reindex" and "Clear" confirm forms, instead of a separate form for rebuilding the tracker.
Comment #12
drunken monkey
Anyone want to test/review?
Comment #13
borisson_
I'm here with nits!
This can be improved by reversing the if
This is just for readability so feel free to ignore.
/s/item/items/.
This is really unwieldy to read, both in the patch and when applied, but I don't think we can easily improve that. The description does get very long in the UI as well. Do you think it'd be better to change this into HTML with breaks/paragraphs in between the sentences?
This is technically an API break. I think that means we should write a change record for this issue?
Can we change this $count to $maniuplated_number_items? Or something else; I don't think $count sufficiently conveys the meaning here.
Comment #14
drunken monkey
Thanks, I agree with all of those!
Here is the change record.
For 1., it's just a bit weird that the other methods around there follow the other pattern (though your proposed one is of course preferable). I have now also changed it for clear(); I think that makes enough sense to do here even though it's out of scope.
Comment #15
borisson_
Great work Thomas!
Comment #17
drunken monkey
OK then, thanks a lot for your feedback and help!
Committed.
Comment #18
rodrigoaguilera
Great feature!
Thank you Thomas :)
Comment #19
Renrhaf
Thanks!