Problem/Motivation

This is copied from elasticsearch_connector issue #3540361.

Currently, the OpenSearch Connector module only supports searching indexes that follow the data structure it created.

However, some organizations (especially larger ones) might want to provide a search UI that is managed by Drupal and searches both the organization's main Drupal site and a number of legacy non-Drupal sites (i.e., sites that haven't been migrated to Drupal yet).

#todo: Determine whether OpenSearch can combine indexes the way Elasticsearch does.

  • Elasticsearch B.V.'s Elasticsearch-as-a-Service products offer several ways to combine indexes from multiple sites into a meta-index and/or crawl legacy, non-Drupal sites.
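
On the #todo above: as far as I know, the OpenSearch search API accepts a comma-separated list of index names in the request path, so one query can span a Drupal-managed index and an externally created one. This is a minimal sketch that only builds the request and does not send it. The host name, index names, and the function `build_multi_index_search` are made up for illustration and are not part of the module.

```python
import json
from urllib.parse import quote


def build_multi_index_search(base_url, index_names, text):
    """Build the URL and JSON body for one search across several indexes.

    Hypothetical helper: nothing here comes from the OpenSearch Connector module.
    """
    # Escape each index name, then join with literal commas for the request path.
    path = ",".join(quote(name, safe="") for name in index_names)
    url = f"{base_url.rstrip('/')}/{path}/_search"
    # A simple full-text query; a real integration would build this from the Search API query.
    body = json.dumps({"query": {"simple_query_string": {"query": text}}})
    return url, body


# Assumed example values: one Drupal-managed index and one legacy index.
url, body = build_multi_index_search(
    "https://search.example.org/", ["drupal_main", "legacy_site"], "annual report"
)
```

Whether this fits the module still needs checking against how it maps Search API indexes to OpenSearch index names.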

Being able to search an externally-created index would be desirable!

Search API does actually support searching an index that it didn't create and/or doesn't manage. In particular, the Search API Solr module provides 2 Search API datasource plugin implementations (src/Plugin/search_api/datasource/SolrDocument.php and src/Plugin/search_api/datasource/SolrMultisiteDocument.php). These datasource plugins make it possible to search externally-created Solr indexes using Search API's Views plugins.

Proposed resolution

Add a Search API datasource plugin.

Remaining tasks

  1. Write a merge request
  2. Review and feedback
  3. RTBC and feedback
  4. Commit
  5. Release

User interface changes

To be determined.

API changes

To be determined. Hopefully only API additions.

Data model changes

To be determined.

Comments

robpowell created an issue. See original summary.

robpowell’s picture

Creating a new datasource plugin is a black box for me. I am trying to go commit by commit through the elasticsearch_connector changes in #3540361.

This commit is my best attempt at matching #3540361's commit.

robpowell’s picture

Here is the commit that adds the datasource, connector, and typed data, following #3540361's commit.

robpowell’s picture

This latest commit fixes some bad copy-paste and updates the logic so that it works on my local environment. I still have more debugging to do: it seems I need to pass the index around in more places, and I get an error when the machine name of the index differs from the real index name.

Setup steps

  1. The new OpenSearch cloud connector extends basic auth. Configure a new search server (/admin/config/search/search-api/add-server). Note: leave Open Search Id blank, or you could delete your index.
  2. Configure a new search index using the new datasource, OpenSearch Document. Note: for now, give the index the same name as the OpenSearch index.
  3. Configure fields for the search index, including _id.
  4. Create a view based on the new index. Fields should include all indexed fields; in the Advanced section, disable access checks.

Chopping block

The original issue's patch adds some functionality that I don't think is necessary here and that should probably be removed to make the review process easier.

  • label and url fields on the datasource
  • date field

Unknowns and gotchas

Here are some oddities I don't quite understand that may or may not require code changes:

  • Missing fields in Index::loadMultiple() can lead to the index being cleared. I ran into this when toggling the new datasource id checkbox on and off.
  • The real_opensesarchcloud_index_id override isn't available everywhere it is needed (upstream), so if you change the real ID name, Search API will still use
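
To make the second gotcha concrete, here is a rough sketch of the fallback I'm after: use the real index name when an override is set, otherwise use the Search API machine name. It is written in Python only to show the logic. `IndexConfig`, `real_index_id`, and `resolve_index_name` are hypothetical names, not the module's actual classes or settings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexConfig:
    """Hypothetical stand-in for a Search API index's settings."""
    machine_name: str
    # Hypothetical override holding the name of the externally created index.
    real_index_id: Optional[str] = None


def resolve_index_name(config: IndexConfig) -> str:
    """Return the index name to send to OpenSearch.

    Prefers the override when it is set and non-blank; otherwise falls
    back to the Search API machine name.
    """
    override = (config.real_index_id or "").strip()
    return override if override else config.machine_name


# No override: the machine name is used as-is.
same_name = resolve_index_name(IndexConfig("legacy_docs"))
# Override set: the real index name wins.
overridden = resolve_index_name(IndexConfig("legacy_docs", "legacy-docs-v2"))
```

The bug described above is what you get when some code paths skip a lookup like this and use the machine name directly.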

Changes from the original patch (elasticsearch_connector #3540361)

Since I can't really provide a diff of the changes, I thought I'd at least call them out here.

  • Add src/OpenSearchFieldManager.php and its interface
  • OpenSearch/Connector
  • Remove key module support
  • ...SearchField helper module now specifically uses the real index if defined
  • Add real index check to src/SearchAPI/BackendClient.php
  • Add real index checks to src/SearchAPI/Query/*

Results

I now have a working Drupal -> OS integration for remote content that doesn't get purged. Attached are a couple of items to follow along.

Attached are my OS search result index, index field config, and view field config.

Next steps

  1. Decide which features/functionality can be removed to make this easier to review
  2. Resolve bugs and other items from the "Unknowns and gotchas" list above
  3. Add tests

robpowell’s picture

Pushed some changes after testing with a single datasource (OS document) and with multiple datasources (content and OS document).