The following sections provide an overview of the goals and inner workings of the module. They also describe the environments best suited to Search Lucene API and the cases where other solutions might make more sense.

The Problem with Drupal Search

Although the Drupal Search module provides a solid interface to build on, the Node module's implementation is lacking in features and scalability. Because it is SQL-based, full-text searches are inefficient and limited in functionality. The core search is so resource-intensive that larger sites are often forced to disable it or move the database to another server. Unfortunately, many search alternatives are difficult to install and configure, while others require core patches. Developers are usually left with little choice but to install the Google Custom Search Engine module, which cannot take advantage of metadata such as taxonomy.

What is Lucene?

Apache Lucene is an open source, high-performance, full-featured text search engine library written in Java. The Zend Framework provides a PHP port of Lucene that makes nearly identical functionality available in the same programming language Drupal is written in. Lucene analyzes and parses text into a file-based index specifically designed for full-text searching. This is in contrast to the core Drupal search, which stores its index data directly in the database, an environment that is not optimized for this type of matching. Lucene natively supports advanced query syntax such as wildcard searches (for example, test*), fuzzy searches (roam~), and proximity searches ("jakarta apache"~10). Visit the Lucene query language page for the full range of search capabilities supported by Search Lucene API.
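
To make the idea of a purpose-built full-text index more concrete, the following is a minimal sketch of an inverted index, the general data structure behind engines such as Lucene. It is a toy illustration only and is not how Lucene or Search Lucene API is implemented: the function names (tokenize, build_index, search) and the sample documents are hypothetical, the index lives in memory rather than in files, and wildcard expansion is approximated with Python's fnmatch module.

```python
import re
from collections import defaultdict
from fnmatch import fnmatchcase


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return sorted IDs of documents that match every query term.

    A query term may contain the * and ? wildcards, which are expanded
    against the indexed terms (a rough stand-in for wildcard queries).
    """
    result = None
    for pattern in query.lower().split():
        matches = set()
        for term, doc_ids in index.items():
            if fnmatchcase(term, pattern):
                matches |= doc_ids
        # Intersect so that every query term is required.
        result = matches if result is None else result & matches
    return sorted(result or [])


# Hypothetical sample content, keyed by document ID.
documents = {
    1: "Drupal core search stores its index in the database",
    2: "Lucene stores a file-based index built for full-text searching",
    3: "Taxonomy terms describe Drupal content",
}

index = build_index(documents)
print(search(index, "index"))          # [1, 2]
print(search(index, "drupal search"))  # [1]
print(search(index, "search*"))        # [1, 2]
```

Because each lookup starts from the term rather than scanning every document, this kind of structure suits full-text matching better than general-purpose database tables do.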

Why Search Lucene API?

Search Lucene API provides the advanced functionality the core content search lacks. Lucene is one of the most advanced open source search engine libraries available and is found in many applications and websites. After enabling the bundled Search Lucene Content module, end users will immediately notice that searches return more relevant results, while power users can use the advanced Lucene syntax to refine their queries and target the content they are looking for. Enabling the bundled Search Lucene Facets module exposes a series of filters that refine search results through an intuitive interface.

Compared to alternative solutions, Search Lucene API is the only self-contained module to provide such advanced search capabilities. Other contributed modules rely on applications outside of Drupal that are difficult to set up and maintain. Because Search Lucene API is itself the search application, installation is done entirely through Drupal, making the process familiar and simple. Furthermore, Lucene indexes are file-based, which improves scalability by eliminating the SQL bottlenecks of the core search. Although Search Lucene API and Search Lucene Content are designed to work out of the box with no configuration, site administrators can customize many aspects of the modules via the administrative settings page.

As the name suggests, Search Lucene API exposes an application programming interface which allows developers to build and extend Search Lucene API modules. The interface was designed with extensibility in mind, so custom solutions can often be built with little programming. There are many contributed modules that extend the API, so check drupal.org to see if a solution already exists to help you meet your goals.

A Word on Acquia Search and Apache Solr Search Integration

Apache Solr is an open source search server based on Java Lucene that uses a Java servlet container such as Tomcat to provide an HTTP interface. The Apache Solr Search Integration module is a well-supported, full-featured project that communicates with the servlet to index and search your site. One advantage of Apache Solr Search Integration is that the search service runs outside of the Drupal installation, so the search load is separated from the Drupal page request, which improves scalability. However, the separation of services is also a disadvantage because the servlet can be difficult to set up and maintain.
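
As a rough illustration of what an HTTP search interface means in practice, the sketch below builds the URL a client might request from a Solr server. It makes several assumptions: a default local installation at http://localhost:8983/solr, the standard select handler, and a hypothetical helper named build_solr_query_url. It does not show the requests that the Apache Solr Search Integration module actually sends.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, terms, rows=10):
    """Build the URL for a search request against a Solr select handler."""
    params = {
        "q": terms,    # the search query
        "rows": rows,  # maximum number of results to return
        "wt": "json",  # ask for a JSON response
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Assumed address of a default local Solr installation.
url = build_solr_query_url("http://localhost:8983/solr", "drupal search")
print(url)  # http://localhost:8983/solr/select?q=drupal+search&rows=10&wt=json
```

The search itself runs in the separate Java process that answers this request, which is why the load is kept away from Drupal but also why there is an additional service to install and maintain.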

Acquia Search addresses the complexities of Apache Solr by offering a web service that hosts and maintains your Lucene indexes remotely. To date, it is the only enterprise-grade search solution for Drupal, and it will help bring the platform into exciting new environments. The downside of Acquia Search is that it is a paid service, which makes it unattractive to projects with tight budgets.

Search Lucene API is not a universal solution, and the two modules described above may best achieve the search requirements of your site. However, the goal of Search Lucene API is to provide a better search solution where Solr-based options are unavailable or not cost-effective. Because Search Lucene API is an integrated PHP solution, Solr-like search functionality is freely available to all Drupal sites without the drawbacks of its Java-based alternatives.

Maintainer

Search Lucene API is developed and maintained by Chris Pliakas of CommonPlaces e-Solutions, LLC, located in Hampstead, New Hampshire. Chris is an open source enthusiast who has been tinkering with Drupal since March of 2008. He can be found on drupal.org under the handle cpliakas, or you can follow him as @cpliakas on Twitter. Chris is a Zend Certified PHP 5 Engineer and a MySQL Certified Developer and Administrator, and he holds a Linux Professional Institute Level 1 Certification.