Problem/Motivation

Clearing and reindexing an ai_search index whose backend uses database: postgres recreates the collection table without the
columns previously added for fields configured as Attributes in ai_search.index.<id>.indexing_options. The next batch
of inserts fails with:

column "<attribute>" of relation "<collection>" does not exist
  

Reproducible with any field marked as Attributes (for example langcode, status) on a multilingual or permissions-aware index.
Tested on Drupal 11, ai 1.3.3, ai_search 1.3.0-alpha1, ai_vdb_provider_postgres 1.0.0-alpha3, PostgreSQL 17 + pgvector 0.8.2.

Steps to reproduce

  1. Configure an ai_search index content_rag with backend postgres.
  2. In the index "Indexing options" tab, mark one or more search_api fields as Attributes (for example langcode
    and status).
  3. Save. The hook ai_vdb_provider_postgres_search_api_index_update() fires and PostgresPgvectorClient::updateFields() adds the
    columns. Indexing succeeds.
  4. Trigger a clear+reindex via any path that funnels through SearchApiAiSearchBackend::deleteAllIndexItems(), for example
    the "Clear all indexed data" button in the UI, or:

    drush search-api:clear && drush search-api:index

  5. The first batch of inserts fails as quoted above.

Manual workaround until a release ships. Run it after each clear, before the items are indexed again:

  ALTER TABLE content_rag ADD COLUMN IF NOT EXISTS langcode VARCHAR;
  ALTER TABLE content_rag ADD COLUMN IF NOT EXISTS status BOOLEAN;

Replace the column names and types with those of the fields marked as Attributes, taking the PostgreSQL types from
PostgresPgvectorClient::DATA_TYPE_MAPPING.
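
For sites with more than a couple of Attributes fields, the statements can be generated rather than typed by hand. The following is a minimal Python sketch, not part of the module: build_workaround_sql() is a hypothetical helper, and the column-to-type pairs must be supplied by you (the two used here are the ones from the workaround above).

```python
from __future__ import annotations


def build_workaround_sql(table: str, attributes: dict[str, str]) -> list[str]:
    """Return one idempotent ALTER TABLE statement per attribute column.

    `attributes` maps a column name to a PostgreSQL type. The types are an
    input here; this sketch does not know the module's DATA_TYPE_MAPPING.
    """

    def quote_ident(name: str) -> str:
        # Standard SQL identifier quoting: double embedded quotes, then wrap.
        return '"' + name.replace('"', '""') + '"'

    return [
        f"ALTER TABLE {quote_ident(table)} "
        f"ADD COLUMN IF NOT EXISTS {quote_ident(column)} {pg_type};"
        for column, pg_type in attributes.items()
    ]


if __name__ == "__main__":
    # Table, columns and types copied from the manual workaround.
    for statement in build_workaround_sql(
        "content_rag", {"langcode": "VARCHAR", "status": "BOOLEAN"}
    ):
        print(statement)
```

The output can be piped to psql. Because every statement uses ADD COLUMN IF NOT EXISTS, running it more than once is harmless.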

Root cause

Drupal\ai\Base\AiVdbProviderClientBase::deleteAllIndexItems(array $configuration, IndexInterface $index, $datasource_id = NULL) receives
$index from SearchApiAiSearchBackend::deleteAllIndexItems() but discards it when delegating to
deleteAllItems($configuration, $datasource_id), which executes:

  • dropCollection(), issuing DROP TABLE IF EXISTS ... CASCADE;
  • createCollection(), issuing
    CREATE TABLE ... (id, content, drupal_entity_id, drupal_long_id, server_id, index_id, embedding vector(N));

The recreated table contains only the native columns; the Attributes columns previously added via updateFields() are gone. The hook
implementation that calls updateFields() runs only on hook_search_api_index_update, i.e. on index config changes, not on a clear+reindex cycle.

PostgresProvider already overrides deleteAllItems() to re-apply ensureVectorIndex() after the drop+create cycle
(HNSW/IVFFlat get restored), but no equivalent restoration exists for Attributes columns.
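
The sequence can be replayed in a small in-memory model. This is a Python toy, not the module's PHP: ToyVectorStore, UndefinedColumn and the method names are hypothetical stand-ins for the calls named above, and only the native column list is taken from the CREATE TABLE statement.

```python
NATIVE_COLUMNS = [
    "id", "content", "drupal_entity_id", "drupal_long_id",
    "server_id", "index_id", "embedding",
]


class UndefinedColumn(Exception):
    """Raised when a row references a column the table does not have."""


class ToyVectorStore:
    def __init__(self):
        self.tables = {}  # table name -> list of column names

    def create_collection(self, name):
        # Like createCollection(): native columns only.
        self.tables[name] = list(NATIVE_COLUMNS)

    def drop_collection(self, name):
        self.tables.pop(name, None)

    def update_fields(self, name, attributes):
        # Like updateFields(): add each missing attribute column.
        for column in attributes:
            if column not in self.tables[name]:
                self.tables[name].append(column)

    def delete_all_items(self, name):
        # Drop + create; nothing here knows about attribute columns.
        self.drop_collection(name)
        self.create_collection(name)

    def insert(self, name, row):
        for column in row:
            if column not in self.tables[name]:
                raise UndefinedColumn(
                    f'column "{column}" of relation "{name}" does not exist'
                )


if __name__ == "__main__":
    store = ToyVectorStore()
    store.create_collection("content_rag")
    store.update_fields("content_rag", ["langcode", "status"])  # index saved
    store.insert("content_rag", {"id": 1, "langcode": "en"})    # works
    store.delete_all_items("content_rag")                       # clear
    try:
        store.insert("content_rag", {"id": 1, "langcode": "en"})  # reindex
    except UndefinedColumn as error:
        print(error)
```

In the model, as in the report, nothing between the clear and the next insert puts the attribute columns back.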

Proposed resolution

Override deleteAllIndexItems() in PostgresProvider so it re-applies
PostgresPgvectorClient::updateFields($index->getFields(), ...) after the parent's drop+create. The fix stays inside
ai_vdb_provider_postgres (the only provider in the AI ecosystem with a rigid schema, hence the only one affected) and leaves
\Drupal\ai\AiVdbProviderInterface untouched. updateFields() is idempotent (ADD COLUMN IF NOT EXISTS), so this is
also safe on the very first create path.
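
The shape of the override, and why idempotence makes it safe, can be sketched in a toy Python model. This is not the attached patch and not the module's PHP: BaseProvider and PatchedProvider are hypothetical stand-ins, and the real updateFields() arguments elided above are not reproduced.

```python
NATIVE_COLUMNS = (
    "id", "content", "drupal_entity_id", "drupal_long_id",
    "server_id", "index_id", "embedding",
)


class BaseProvider:
    """Stand-in for the base behaviour: drop + create, attributes ignored."""

    def __init__(self):
        self.columns = {}  # table name -> list of column names

    def delete_all_index_items(self, table, attribute_fields):
        # attribute_fields is received but unused, mirroring the discarded $index.
        self.columns[table] = list(NATIVE_COLUMNS)

    def update_fields(self, table, attribute_fields):
        # Idempotent, like ADD COLUMN IF NOT EXISTS: existing columns are skipped.
        existing = self.columns.setdefault(table, list(NATIVE_COLUMNS))
        for field in attribute_fields:
            if field not in existing:
                existing.append(field)


class PatchedProvider(BaseProvider):
    """Stand-in for the proposed override."""

    def delete_all_index_items(self, table, attribute_fields):
        # Let the parent drop and recreate the table first...
        super().delete_all_index_items(table, attribute_fields)
        # ...then restore the attribute columns it did not recreate.
        self.update_fields(table, attribute_fields)


if __name__ == "__main__":
    provider = PatchedProvider()
    provider.delete_all_index_items("content_rag", ["langcode", "status"])
    print(provider.columns["content_rag"])
```

Calling update_fields() again after the override changes nothing in the model, which is the property the proposal relies on for the first-create path.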

Patch attached. Verified locally end-to-end on the stack listed above. After applying the patch and running

  drush search-api:clear && drush search-api:index

the table retains all Attributes columns and the 6 test chunks index without errors. Pipeline retrieval against the rebuilt
collection returns expected results.

Remaining tasks

  • Review the patch.
  • Add a kernel/functional test that creates a postgres-backed ai_search index with at least one Attributes field, performs a clear+reindex,
    and asserts the columns survive.

User interface changes

None.

API changes

None. Adds an override in PostgresProvider; the public interface contract is unchanged.

Data model changes

None. The columns the patch reapplies are already part of the documented data model when Attributes are configured; the patch only ensures they are not
silently lost during a clear+reindex.

Comments

ignacio.perez.puertas@gmail.com changed the visibility of the branch 3586862-clearreindex-drops-columns to hidden.

avpaderno

Version: 1.0.0-alpha3 » 1.0.x-dev
Issue tags: -attributes, -clear and reindex, -pgvector, -multilingual