Apache Stanbol is a modular set of components that provide semantic content management features. One of its core capabilities is extracting information from unstructured content, i.e. plain text. This process is called content enhancement. Content enhancement identifies entities such as persons, places, and organizations within unstructured content. Once identified, entities can be linked automatically to linked open data sources on the web, such as DBpedia. Other use cases include direct use from web applications (for example tag extraction and suggestion, or text completion in search fields) and 'smart' content workflows or email routing based on extracted entities and topics. Read the Apache Stanbol overview to learn more about Apache Stanbol.
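
As a rough illustration of content enhancement, the sketch below builds an HTTP request that submits plain text to the Stanbol Enhancer and asks for the enhancements as RDF. It is not part of this module. The base URL `http://localhost:8080/enhancer` assumes a default local Stanbol launcher, and the function names `build_enhancer_request` and `enhance` are hypothetical helpers introduced only for this example.

```python
import urllib.request

# Assumption: a Stanbol instance running locally on the default port.
STANBOL_ENHANCER_URL = "http://localhost:8080/enhancer"


def build_enhancer_request(text, base_url=STANBOL_ENHANCER_URL, accept="text/turtle"):
    """Build (but do not send) a POST request carrying plain text to enhance."""
    return urllib.request.Request(
        base_url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # The RDF serialization we would like back (assumed to be supported).
            "Accept": accept,
        },
        method="POST",
    )


def enhance(text, **kwargs):
    """Send the text to the Enhancer and return the raw RDF response body."""
    request = build_enhancer_request(text, **kwargs)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `enhance("Paris is the capital of France.")` against a running instance should return RDF describing the detected entities. Which entities are found depends on the enhancement engines configured on that instance.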

The Drupal Search API Stanbol project enables the indexing of Drupal entities such as nodes, users, taxonomy terms, and files in the Stanbol EntityHub. The data is sent to Stanbol in RDF format. Once indexed, it can be mixed with data from other sources via EntityHub and Managed Sites, and it can be reused by the Enhancer and exposed to other web applications via a RESTful API or a Java API.
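
To make the data flow more concrete, here is a minimal sketch of what sending one entity to the EntityHub as RDF could look like. It does not reproduce the module's actual implementation or the vocabulary produced by RDF Extensions. The endpoint `http://localhost:8080/entityhub/entity`, the use of Turtle with a single `dcterms:title` triple, and the helper names `node_to_turtle` and `build_entityhub_request` are all assumptions made for illustration.

```python
import urllib.request

# Assumption: EntityHub entity endpoint of a local Stanbol instance.
ENTITYHUB_ENTITY_URL = "http://localhost:8080/entityhub/entity"


def node_to_turtle(uri, title):
    """Serialize a hypothetical node as a one-triple Turtle document."""
    # Escape characters that are not allowed raw inside a short Turtle literal.
    escaped = (
        title.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'<{uri}> <http://purl.org/dc/terms/title> "{escaped}" .\n'


def build_entityhub_request(uri, title, base_url=ENTITYHUB_ENTITY_URL):
    """Build (but do not send) a POST request that would create the entity."""
    return urllib.request.Request(
        base_url,
        data=node_to_turtle(uri, title).encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        method="POST",
    )
```

In practice the module relies on Search API to decide which entities to send and on RDF Extensions to produce the serialization, so a real payload would carry many more properties than this single title.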

Requirements

- Search API keeps track of Drupal entities and ensures they are indexed properly.
- RDF Extensions provides an RDF serialization of Drupal entities which are sent to Stanbol.

Credits

This project is brought to you by the Interactive Knowledge Stack (IKS) European project, and was developed by Stéphane "scor" Corlosquet and Wolfgang "fago" Ziegler.
