Indexing your content

Last updated on
4 September 2025

After configuring a Search API Index, choosing fields to index, and configuring processors, the next step is to fill the index with search data, by indexing your content.

If your Index is Read-only (for example, because you want something else, such as the Elastic web crawler, to fill the Index), this documentation page is not for you. Use that other tool to index your content and, when you're done, proceed to the Adding a search page guide.

The rest of this page assumes that your Index is not Read-only.

Start from Search API's configuration at /admin/config/search/search-api, find the table row for an Index, and click the Index name. Or, edit an Index and click the View tab.

For more information, see the section on The "View" Tab in the Search API documentation on Adding an index.

The Index View page

At the top of the View page is an Index status progress bar, a visual indicator of how many pieces of content in the Datasource(s) you selected for that Index have been indexed. The small text below the bar shows the fraction and percentage of content that has been indexed.

In the middle of the View page is a summary of information about the Index:

  • its Status;
  • the Datasource(s) that it is configured to use, with a fraction indicating how many items in each Datasource have been indexed;
  • the Tracker that the Index is configured to use (Default is the only available tracker at time-of-writing);
  • the Server that the Index is on (it should be the Search API Server you set up earlier);
  • the Server index status; and
  • the Cron batch size configured for the Index.

At the bottom of the page is a Start indexing now details-element containing controls to bulk-index the Index's Datasources, and three buttons below it: one to Queue all items for reindexing, one to Clear all indexed data, and one to Rebuild tracking information.

Indexing incrementally

If you checked the Index items immediately option in the Index's basic settings, then Search API will update Elasticsearch as soon as content is created, modified, or deleted.

If your site already has content in it, and you set your Index's Cron batch size to something other than 0, then Search API will index a portion of your content every time that Drupal's Cron process runs.

After setting up a Server and/or Index for the first time, it is worth bulk-indexing 1 item to see if everything is working — if it works, then you can feel reasonably confident that the remaining items will index incrementally over time.
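
As a rough sketch of that one-item check from the command line, the function below uses the --limit option of drush search-api:index. The helper name smoke_test_index and the DRUSH variable are hypothetical (they are not part of Search API or Drush), and node_index is only an example machine name.

```shell
# DRUSH is a hypothetical override for the Drush executable; defaults to "drush".
DRUSH="${DRUSH:-drush}"

# smoke_test_index (hypothetical helper): index exactly one item so that
# errors surface before the remaining items are indexed incrementally.
#   $1 = machine name of the Index
smoke_test_index() {
  index_id="$1"
  if "$DRUSH" search-api:index --limit=1 "$index_id"; then
    echo "OK: indexed 1 item for $index_id"
  else
    echo "FAILED: could not index 1 item for $index_id" >&2
    return 1
  fi
}

# Usage: smoke_test_index node_index
```

If the function reports a failure, check the Index's View page and Drupal's logs before relying on incremental indexing.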

Indexing content is resource-intensive for both Drupal and Elasticsearch: letting Search API index content incrementally (i.e.: with Index items immediately and using Cron) spreads the load of indexing all your content over time.

Indexing incrementally is a trade-off:

  • on the one hand, search will not work properly until everything has been indexed; but,
  • on the other hand, indexing incrementally doesn't require you to take the site offline temporarily to index in bulk.

Understand your client's expectations, and take time to explain the indexing process and the trade-offs therein. Some clients will be willing to schedule a maintenance window to ensure all items are completely indexed; others will accept that it may take a few hours for everything to come back to normal.

You can run cron from the command-line using the drush core:cron command.

Indexing in bulk

If you can't index incrementally, the other option is to index in bulk.

Indexing content is resource-intensive for both Drupal and Elasticsearch. This means that indexing in bulk will cause Drupal and Elasticsearch to respond slowly, time out, or display errors until the indexing is complete. This is often undesirable in a production environment!

If you must index in bulk on production, consider ways to minimize disruption to end-users: for example, schedule a maintenance window, or run the bulk-indexing during off-peak hours.

If you are using a CDN or edge-cache, consider temporarily increasing the time that pages are cached during your maintenance window, to ensure that the site remains responsive for end-users.

Timing how long it takes to index all content on a staging server can help inform how long to schedule a maintenance window or increase the caching time.

However, don't forget to add a buffer: staging sites usually have less traffic load than production sites, and therefore, a staging site will finish indexing faster than a production site under more traffic-load.
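
The buffer arithmetic can be sketched as a small shell function. The helper name maintenance_window_minutes is hypothetical, and the buffer percentages used below are illustrative values only, not recommendations; choose a buffer based on how much busier production is than staging.

```shell
# maintenance_window_minutes (hypothetical helper): add a percentage buffer
# to a duration measured on staging, rounding up to a whole minute.
#   $1 = minutes the full bulk-index took on staging
#   $2 = buffer as a percentage (50 means "add half again")
maintenance_window_minutes() {
  staging_minutes="$1"
  buffer_percent="$2"
  # Integer ceiling of staging_minutes * (100 + buffer_percent) / 100.
  echo $(( (staging_minutes * (100 + buffer_percent) + 99) / 100 ))
}

maintenance_window_minutes 40 50   # prints 60
maintenance_window_minutes 25 30   # prints 33
```

The staging measurement itself could come from prefixing the bulk-index command with the shell's time keyword, e.g.: time drush search-api:index node_index.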

You can index in bulk using the Start indexing now controls on an Index's View page: choose the number of items, and how many to index at a time by filling in the "Index all items in batches of 50 items" phrase, then click the Index now button. This will start a Drupal batch job, which will run until you close the window, all items have been indexed, or it encounters an error.

If Search API thinks there are 0 items left to index, then the Index now button will be disabled, and you must either use the button to Queue all items for reindexing, or use the button to Clear all indexed data.

You can also index items in bulk for a Search API Index from the command-line with drush search-api:index...

  1. drush search-api:index node_index - indexes all items for the Index with the machine name node_index
  2. drush search-api:index --limit=100 node_index - indexes 100 items for the Index with the machine name node_index
  3. drush search-api:index - indexes all items for all enabled indexes
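
One possible way to combine the --limit option with the advice about minimizing disruption is to index in several limited chunks with a pause between them. This is only a sketch: the helper name index_in_chunks and the DRUSH variable are hypothetical, and the function runs a fixed number of chunks rather than detecting when the Index is complete.

```shell
# DRUSH is a hypothetical override for the Drush executable; defaults to "drush".
DRUSH="${DRUSH:-drush}"

# index_in_chunks (hypothetical helper): run drush search-api:index a fixed
# number of times, each limited with --limit, pausing between runs.
#   $1 = machine name of the Index
#   $2 = items to index per chunk
#   $3 = number of chunks to run
#   $4 = seconds to pause between chunks (optional, defaults to 0)
index_in_chunks() {
  index_id="$1"
  chunk="$2"
  chunks="$3"
  pause="${4:-0}"
  n=1
  while [ "$n" -le "$chunks" ]; do
    "$DRUSH" search-api:index --limit="$chunk" "$index_id" || return 1
    # Skip the pause after the final chunk.
    if [ "$n" -lt "$chunks" ]; then
      sleep "$pause"
    fi
    n=$((n + 1))
  done
}

# Usage: index 3 chunks of 100 items, waiting 30 seconds between chunks.
#   index_in_chunks node_index 100 3 30
```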

Invalidating the content of an index

It is possible for the contents of an Index to become out-of-date (i.e.: invalid). This can happen automatically, or you can trigger it manually. Even if Search API considers the content of an Index to be invalid, you can still search the index. However, once the contents of an Index have been invalidated, Search API will try to index its items again (i.e.: "reindex" the items).

Most changes to field settings on an Index (including importing changes using Drupal's configuration management system) will cause Search API to automatically consider the contents of the index invalid.

Because Search API will try to index items again once it considers the contents of an Index invalid, depending on your client's expectations, you may need to schedule a maintenance window or deploy configuration changes to an index during off-peak hours.

It's always a good idea to check the Index status after deploying configuration.
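
A minimal sketch of that check as part of a deployment follows. It assumes the drush config:import and drush search-api:status commands, neither of which is described on this page, so confirm with drush list that both are available in your environment. The helper name deploy_and_check and the DRUSH variable are hypothetical.

```shell
# DRUSH is a hypothetical override for the Drush executable; defaults to "drush".
DRUSH="${DRUSH:-drush}"

# deploy_and_check (hypothetical helper): import configuration, then print
# the Search API status for one Index, so that an Index whose contents were
# invalidated by the import is noticed straight away.
#   $1 = machine name of the Index
deploy_and_check() {
  index_id="$1"
  "$DRUSH" config:import --yes || return 1
  "$DRUSH" search-api:status "$index_id"
}

# Usage: deploy_and_check node_index
```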

There are some ongoing efforts to reduce which configuration changes invalidate the content of an Index: see [#3429647], [#3248665], and Elasticsearch's Update mapping API documentation.

You can manually invalidate the contents of an index by clicking the Queue all items for reindexing button at the bottom of the page. If you click this button, you can still search the index!

You can manually invalidate the contents of an index from the command-line with drush search-api:reset-tracker...

  1. drush search-api:reset-tracker node_index - invalidates the contents of the Index with the machine name node_index
  2. drush search-api:reset-tracker - invalidates the contents of all enabled indexes
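
If you would rather not wait for Cron after invalidating an Index, the two commands can be chained, as in the sketch below. The helper name queue_and_reindex and the DRUSH variable are hypothetical. Keep in mind that the second step is a bulk-index, with the load implications described under Indexing in bulk.

```shell
# DRUSH is a hypothetical override for the Drush executable; defaults to "drush".
DRUSH="${DRUSH:-drush}"

# queue_and_reindex (hypothetical helper): invalidate the contents of one
# Index, then immediately index all of its items again.
#   $1 = machine name of the Index
queue_and_reindex() {
  index_id="$1"
  # Stop if the tracker could not be reset, so nothing is indexed needlessly.
  "$DRUSH" search-api:reset-tracker "$index_id" || return 1
  "$DRUSH" search-api:index "$index_id"
}

# Usage: queue_and_reindex node_index
```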

Clearing all indexed data

It is also possible to clear all indexed data. This can happen automatically or you can trigger it manually. You cannot search the index after clearing all indexed data. After clearing all indexed data, Search API will try to index its items again (i.e.: "reindex" the items).

All indexed data can be cleared automatically as part of most Search API Server Tasks. See [#3529273] for more information.

In practice, Search API Server Tasks seem to be rare occurrences.

You can manually clear all index data by clicking the Clear all indexed data button at the bottom of the page. If you click this button, you won't be able to search until you index content again.

You can manually clear all index data from the command line with drush search-api:clear...

  1. drush search-api:clear node_index - clears all data from the Index with the machine name node_index
  2. drush search-api:clear - clears all data from all enabled indexes
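
Because search stops working once an Index is cleared, a wrapper script might ask for explicit confirmation and reindex straight afterwards. The sketch below is one way to do that; the helper name clear_and_reindex, its "repeat the index name" convention, and the DRUSH variable are all hypothetical.

```shell
# DRUSH is a hypothetical override for the Drush executable; defaults to "drush".
DRUSH="${DRUSH:-drush}"

# clear_and_reindex (hypothetical helper): clear one Index only when the
# caller repeats its machine name as confirmation, then index it again.
#   $1 = machine name of the Index
#   $2 = the same machine name, repeated to confirm
clear_and_reindex() {
  index_id="$1"
  confirm="$2"
  if [ -z "$index_id" ] || [ "$confirm" != "$index_id" ]; then
    echo "Refusing to clear: repeat the index name to confirm" >&2
    return 2
  fi
  "$DRUSH" search-api:clear "$index_id" || return 1
  # Search returns no results for this Index until this step has finished.
  "$DRUSH" search-api:index "$index_id"
}

# Usage: clear_and_reindex node_index node_index
```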

Rebuild tracking information

Rebuilding tracking information clears Search API's knowledge of what has and hasn't been indexed. You can still search the index after rebuilding tracking information. After rebuilding tracking information, Search API will try to index its items again (i.e.: "reindex" the items).

You can manually rebuild tracking information by clicking the Rebuild tracking information button at the bottom of the page. If you click this button, you can still search the index!

You can manually rebuild tracking information from the command-line with drush search-api:rebuild-tracker...

  1. drush search-api:rebuild-tracker node_index - rebuilds tracking information for the Index with the machine name node_index
  2. drush search-api:rebuild-tracker - rebuilds tracking information for all enabled indexes

Next steps

If your site is completely empty, add a piece of content now.

If your site already has content and you want to index incrementally, bulk-index at least 1 piece of content to make sure it works. If your site already has content and you can safely bulk-index all content, then do so now.

Once you have some content on your site, and some content has been indexed, continue to Adding a search UI.
