Adding an index
Click Add index to go to the index creation form. Here, you have to enter some information for your new index:
- Index name
- A human-readable name by which the index will be referenced. A good choice is to say what kind of data it will contain, e.g., Node index or Profile index. The name can be changed later and is only used internally, so mistakes don't matter much.
- ID
- This machine name is used as the unique internal identifier for the index (e.g., in administration paths) and cannot be changed once the index has been created. It is usually generated automatically from the human-readable name you choose, but you can edit it manually by clicking Edit next to the generated name (once you have entered something in the Index name field). It is usually only used internally and in the admin UI.
- Datasources
- Select the type of data you want to index and search with this index. This is the most important setting for an index. At least one datasource has to be selected.
- Datasource configurations
- Some datasources provide (or require) additional configuration to be used. In these cases, additional boxes will appear with the configuration forms for selected datasources.
- Tracker
- Usually hidden (when only one tracker is available), this lets you customize how new and indexed data is internally tracked. It is usually only interesting for advanced use cases.
- Server
- The server to which the index will be initially connected, and which will therefore be used to index and search its data. You can also create an index without a server, and later change an index's server.
- Enabled
- Here you can select whether the index will initially be enabled. Since indexes without a server (or attached to a disabled server) cannot be enabled, this will only take effect if you also select an enabled server for the index.
- Description
- This can help you make some notes on the index's purpose, e.g., for other administrators or for future reference. It will only be displayed on the index's main page and can also be left empty.
- Index options
- This box contains advanced options for the index. The defaults are usually fine for new users, and the settings should only be changed if you are sure of the consequences.
- Read only
- This allows you to create indexes which will only be used for searching existing data on a remote server. No data will be indexed to the server, or deleted from it, via the Search API.
- Index items immediately
- Usually, when new items are created or existing items are changed, they are marked internally as "needs indexing" and indexed later (for example, during cron runs). By enabling this setting, you can change that behavior so that new and changed items are indexed right away and the search index is always up to date with the latest content.
- Cron batch size
- When items are indexed during cron runs, this setting determines how many items are indexed in a single batch. Values that are too low might decrease indexing performance, especially with some types of servers. With values that are too high, on the other hand, cron runs might run out of time or memory, so that no indexing occurs at all.
Use a number of items that can be indexed in about 5 to 10 seconds. You can manually index items (on the index's View tab) to test this out.
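The "5 to 10 seconds" rule of thumb for the cron batch size can be turned into simple arithmetic. The following is a minimal sketch, not part of the Search API: the function name `suggest_batch_size`, its parameters, and the sample measurement are all hypothetical, and it assumes indexing time scales roughly linearly with the number of items.

```python
import math


def suggest_batch_size(items_indexed: int, elapsed_seconds: float,
                       target_seconds: float = 7.5) -> int:
    """Scale a measured manual indexing run to a target batch duration.

    target_seconds defaults to the middle of the 5-10 second range.
    """
    if items_indexed <= 0 or elapsed_seconds <= 0:
        raise ValueError("need a positive item count and duration")
    items_per_second = items_indexed / elapsed_seconds
    # Round down so the batch stays within the target, but never below 1.
    return max(1, math.floor(items_per_second * target_seconds))


# Hypothetical measurement: 50 items indexed manually in 4 seconds.
print(suggest_batch_size(50, 4.0))  # prints 93
```

Treat the result as a starting point only; a real site should still be checked against the time and memory limits of its cron runs.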
Click Save and add fields to finish this first step of the process. You will be automatically redirected to the new index's Fields form.
Selecting the indexed fields
The "Fields" tab lists all fields that will currently be indexed for this index, grouped by datasource (plus, possibly, some fields that will be added for items of any datasource). For a new index, this list is usually empty. Click Add fields to start adding some.
You will see a list of all properties available for each datasource on this index. By pressing the "+" next to a property (where available), you can expand the property's inner properties and thus also browse properties on related entities (for instance, a node's author's fields). Add fields for all properties for which you want to store data on the search server. These fields can then be searched, used for filtering and sorting, and potentially used for other purposes by contrib modules.
If you don't see a property you were looking for, check the "Skipped fields" section at the bottom of the form. It's possible that its underlying data type isn't yet supported by the Search API. In this case, please search the issue queue for an issue discussing this property or type, and otherwise create a new one requesting support.
After adding all desired fields, press "Done". The fields are now listed in the table, but are not yet saved permanently to the index (similar to how editing a view works). You now have the chance to change the fields' settings from the defaults that were selected automatically. Note that changing most of these values (except "Label") after saving the fields will necessitate a complete reindex of the index's content. You should therefore try to choose the right settings right away.
Note that only fields of type Fulltext can be used in fulltext searches. So if you want to find individual words contained in a field, not just the whole field value, use this type. Other types can be used, e.g., for filtering and sorting.
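The difference between a Fulltext field and other types can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the helper names are hypothetical, and real search backends use far more elaborate analysis than splitting on word characters.

```python
import re


def tokenize(value: str) -> list[str]:
    """Split a value into lowercase words (a crude stand-in for analysis)."""
    return re.findall(r"\w+", value.lower())


def matches_fulltext(value: str, keyword: str) -> bool:
    """Fulltext-like behavior: individual words inside the value match."""
    return keyword.lower() in tokenize(value)


def matches_string(value: str, keyword: str) -> bool:
    """String-like behavior: only the whole field value matches."""
    return value == keyword


title = "Installing the Search API module"
print(matches_fulltext(title, "search"))  # True: the word occurs in the value
print(matches_string(title, "search"))    # False: not the whole value
print(matches_string(title, title))       # True: exact value, as in a filter
```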
A few fields, like "Rendered HTML output", also have additional configuration available. These have an additional "Edit" link under "Operations". (You will also have been redirected to their configuration form upon first adding the field.) Other fields lack the "Remove" link there: these fields are required for the index, usually because of a processor, and can't be removed (they're "locked", in Search API terminology). For some of them, even the type is locked and can't be changed.
After you're done configuring all the fields, do not forget to press "Save" to make the changes permanent. Afterwards, you should proceed to the "Processors" tab before finishing with index configuration for now.
Processors
This tab allows you to configure the processors the index will use. Here you can (from top to bottom):
- Enable all processors that you want to use.
- Re-order the processors for the different phases, if you think the defaults aren't right for your use case. Be careful doing this, though, since for some processors a wrong order might lead to bad results. (E.g., "HTML filter" should always run before "Tokenizer" during "Preprocess index".)
- Finally, where available, configure the enabled processors in more detail.
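Why processor order matters can be shown with two toy functions. They are hypothetical stand-ins written for this illustration, not the actual "HTML filter" and "Tokenizer" processors, and they are far simpler than the real ones.

```python
import re


def html_filter(text: str) -> str:
    """Toy HTML filter: replace anything that looks like a tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenizer(text: str) -> str:
    """Toy tokenizer: keep only alphanumeric runs, separated by single spaces."""
    return " ".join(re.findall(r"[A-Za-z0-9]+", text))


def run(processors, text: str) -> str:
    """Apply the processors in the given order."""
    for processor in processors:
        text = processor(text)
    return text


markup = "<p>Hello <b>world</b></p>"
print(run([html_filter, tokenizer], markup))  # Hello world
print(run([tokenizer, html_filter], markup))  # p Hello b world b p
```

In the second order, the tokenizer has already removed the angle brackets, so the HTML filter no longer recognizes any tags and the tag names end up in the indexed text.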
A detailed explanation of the function and capabilities of processors, as well as short descriptions of all known processors can be found in the Processors section of this guide.
Please note that some backends, such as Apache Solr and Elasticsearch, already perform many of these processing steps themselves. You should therefore find out which processors are recommended for your backend. However, backend plugins can also hide processors that are known not to work well with them, so most of the processors you see listed should be applicable to your setup.
The "View" tab
The View tab is not needed while creating an index. However, it can be useful later (or immediately afterwards, to verify that everything went well) for checking the index's status.
Below the index description, if there is one, you will see the current index status. (This applies to enabled indexes; if the index is disabled, many of the elements discussed here won't be present.) The status shows the number of items that are indexed in their latest state (i.e., that were not edited after last being indexed) as well as the total number of items to be indexed. (Note, however, that this doesn't take into account processors or hooks that may filter the items to be indexed.)
Following that is a table with general information about the index and its status.
Below that, unless all items are already indexed, you have the option to manually index items. You can use that to index items without running cron, if you don't want to wait for that (and don't want the additional overhead of running cron manually).
In some cases, after certain changes to the index, you will also see a "Track items for index" form here. It's usually a good idea to run this when you see it; otherwise, it will be taken care of during the next cron run. It is not a warning sign: it merely means that a recent change to the index changed the data set that should be indexed, and the tracking information has not yet been updated to reflect that. This will not cause any data loss or indexing problems, but might slow indexing down a bit in some cases.
Finally, you can mark all items as "dirty" (i.e., unindexed) using the Queue all items for reindexing button. This is done automatically as needed, but might be necessary manually due to external modifications. You can also completely clear the index, which not only marks all items as "dirty" but also deletes all indexed data for this index from the server.
Use these options with care, as re-indexing might take some time, depending on the size of your data set. Clearing the index might be necessary, though, when index data has become corrupt in some way. And re-indexing might be necessary when external changes to the server are made.
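The difference between indexing, queueing for reindexing, and clearing can be summarized with a small in-memory model. This is a conceptual sketch only: the class `ToyIndex` and all of its methods are hypothetical and do not reflect how the Search API tracker or any server backend is implemented.

```python
class ToyIndex:
    """Tiny model of tracking state plus the data held on a server."""

    def __init__(self, item_ids):
        self.needs_indexing = set(item_ids)  # tracked, but "dirty"
        self.indexed = set()                 # indexed in their latest state
        self.server_data = {}                # what the "server" stores

    def index_items(self, limit: int) -> int:
        """Index up to `limit` dirty items, like a manual indexing run."""
        batch = sorted(self.needs_indexing)[:limit]
        for item_id in batch:
            self.server_data[item_id] = f"data for {item_id}"
        self.needs_indexing -= set(batch)
        self.indexed |= set(batch)
        return len(batch)

    def queue_all_for_reindexing(self) -> None:
        """Mark everything dirty; data on the server stays searchable."""
        self.needs_indexing |= self.indexed
        self.indexed.clear()

    def clear(self) -> None:
        """Mark everything dirty and delete the indexed data as well."""
        self.queue_all_for_reindexing()
        self.server_data.clear()

    def status(self) -> str:
        total = len(self.indexed) + len(self.needs_indexing)
        return f"{len(self.indexed)}/{total} indexed"


index = ToyIndex(range(1, 6))
index.index_items(3)
print(index.status())            # 3/5 indexed
index.queue_all_for_reindexing()
print(index.status())            # 0/5 indexed
print(len(index.server_data))    # 3: old data is still on the server
index.clear()
print(len(index.server_data))    # 0: indexed data has been deleted
```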