Currently, the query in usasearch_api_update_index() gets all nodes, up to the search cron limit, and the check for whether a node is of an allowed content type happens only later, while those nodes are being looped through. This can be a problem if a site has, say, 100,000 nodes in total but only 10,000 nodes of allowed content types, because it can take an extremely large number of cron runs before any nodes are actually indexed. For example, if the search cron limit is 10, then in the above situation it might take 9,000 cron runs before any nodes are indexed (the worst case, where all 90,000 nodes of other content types are fetched first, 10 per run).
Proposal: Limit the query in usasearch_api_update_index() to nodes of allowed content types only (i.e., content types that have been configured to be indexed in DigitalGov Search).
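To make the arithmetic concrete, here is a rough standalone simulation in Python. It is not the module's PHP code and not the attached patch. The function name, the content type names ("page", "article"), and the assumption that each cron run simply moves on to the next batch of nodes are all hypothetical, chosen only to reproduce the worst case described above.

```python
from typing import List, Optional, Set


def runs_until_first_indexed(
    node_types: List[str],
    allowed: Set[str],
    cron_limit: int,
    filter_in_query: bool,
) -> Optional[int]:
    """Return the number of the first cron run that indexes a node.

    Hypothetical model: every cron run takes the next `cron_limit`
    nodes from the pending queue and indexes only the allowed ones.
    Returns None if no allowed node is ever found.
    """
    if filter_in_query:
        # Proposed behaviour: the query itself returns only allowed types.
        queue = [t for t in node_types if t in allowed]
    else:
        # Current behaviour: the query returns every node and the
        # content type check happens later, inside the loop.
        queue = list(node_types)

    runs = 0
    position = 0
    while position < len(queue):
        runs += 1
        batch = queue[position:position + cron_limit]
        position += len(batch)
        if any(t in allowed for t in batch):
            return runs
    return None


# Worst case from the issue summary: 90,000 nodes of a type that is
# not allowed come first, followed by 10,000 allowed nodes.
nodes = ["page"] * 90_000 + ["article"] * 10_000
allowed_types = {"article"}

unfiltered = runs_until_first_indexed(nodes, allowed_types, 10, False)
filtered = runs_until_first_indexed(nodes, allowed_types, 10, True)
print(unfiltered, filtered)
```

Under these assumptions, the unfiltered query spends 9,000 runs on nodes that are never indexed and only reaches an allowed node on run 9,001, while the filtered query indexes nodes on the very first run.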
| Comment | File | Size | Author |
|---|---|---|---|
| #3 | usasearch-index-query-allowed-types-2869167-3.patch | 1.88 KB | brockfanning |
Comments
Comment #2
brockfanning commented
Comment #3
brockfanning commented
I messed up the join condition in the query, so this update fixes that.
Comment #5
schiavone commented
The patch has been included in the 7.x-5.7 release.