- better support for non-English languages
- support for multilingual search
- cross-language information retrieval (CLIR)
- an easy-to-use administration interface
The projects mentioned above provide a (more or less) easy way to use Apache Solr as a powerful search engine for Drupal. Unfortunately, the only language that works well with them, out-of-the-box, is English.
So if you run a non-English website, you need to tweak all the configuration files by hand or you lose some of the advantages that Solr provides compared to Drupal's built-in database-driven search. Doing so requires a deep knowledge of Solr and search technology in general.
The entire process gets even more complicated if you run a multilingual website.
That is why we started thinking about an additional module called Apache Solr Multilingual, to hide most of the complexity from a Drupal website's administrator.
- Stop Words
Words you want to exclude from your search index are called stop words. The list of words strongly depends on the focus of your website and, of course, on your site's language.
- Stemming
Every word in the search index is stored in a reduced form called a word stem. This strategy enables the user to find content independent of the keyword's inflection, e.g. singular or plural. Unfortunately, the stemming algorithm differs from language to language.
- Protected Words
In some cases, you'll want to exclude certain words from the stemming described above. These protected words are language-specific, like stop words.
- Compound Word Splitting
Languages like German frequently combine words into compounds (e.g. "Dampfschifffahrt"). To make the individual parts searchable, such words need to be split into their components using language-specific word catalogs.
- Spell Checking
Spell checking, too, needs to be language-specific.
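The concepts listed above can be illustrated with a small, self-contained sketch. It is not Solr code and not how the module is implemented; the word lists, the naive suffix stripping, and the function names (`stem`, `split_compound`, `analyze`) are all assumptions made for illustration. It only shows why each step needs per-language data.

```python
# Toy analysis chain illustrating stop words, stemming, protected words and
# compound splitting. All word lists below are hypothetical examples.
STOP_WORDS = {"en": {"the", "a", "of", "and"}, "de": {"der", "die", "das", "und"}}
PROTECTED_WORDS = {"en": {"news"}, "de": set()}
COMPOUND_PARTS = {"de": ["dampf", "schiff", "fahrt"]}  # stand-in for a word catalog


def stem(token, lang):
    """Very naive suffix stripping; real stemmers are far more elaborate."""
    suffixes = {"en": ["es", "s"], "de": ["en", "e"]}.get(lang, [])
    for suffix in suffixes:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def split_compound(token, lang):
    """Greedy split against the catalog; returns [token] if no full split exists."""
    parts, rest = [], token
    catalog = sorted(COMPOUND_PARTS.get(lang, []), key=len, reverse=True)
    while rest:
        for part in catalog:
            if rest.startswith(part):
                parts.append(part)
                rest = rest[len(part):]
                break
        else:
            # No catalog entry matches the remainder: keep the word whole.
            return [token]
    return parts if len(parts) > 1 else [token]


def analyze(text, lang):
    """Turn text into index terms using the language-specific resources."""
    terms = []
    for token in text.lower().split():
        if token in STOP_WORDS.get(lang, set()):
            continue  # stop words never reach the index
        for piece in split_compound(token, lang):
            if piece in PROTECTED_WORDS.get(lang, set()):
                terms.append(piece)  # protected words bypass stemming
            else:
                terms.append(stem(piece, lang))
    return terms
```

In this sketch, `analyze("The news of the ships", "en")` keeps "news" untouched because it is protected, while "ships" is reduced to its stem. Running the same text through the German resources would give different results, which is the reason every step has to be configured per language.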
Apache Solr Multilingual
Apache Solr Multilingual either solves the language-specific problems described above out of the box or supports the site administrator with a user interface that hides some of the complexity.
Additionally, Apache Solr Multilingual provides a way to offer language-specific searches for several languages at once on multilingual websites. To that end, it integrates with Drupal's standard multilingual features provided by core modules and by the Internationalization and Entity Translation modules.
As a special feature, Apache Solr Multilingual can be configured to deal with the translations of nodes and taxonomies on multilingual sites. That means that you can find content in any language, no matter which language was used to enter the search phrase. It's currently a simple implementation of CLIR, but our plan is to extend this feature.
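One simple way to picture this kind of cross-language retrieval is to index every document under the terms of all its translations. The sketch below assumes exactly that; it is not taken from the module's code, and the data layout (`tset` as a translation-set identifier), the sample documents, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: documents sharing a "tset" are translations of each other.
DOCS = [
    {"id": 1, "tset": 1, "lang": "en", "text": "steam boat trip"},
    {"id": 2, "tset": 1, "lang": "de", "text": "dampfschiff fahrt"},
    {"id": 3, "tset": 3, "lang": "en", "text": "genetic analysis"},
]


def build_index(docs):
    """Index each document under its own terms and those of its translations."""
    terms_by_set = defaultdict(set)
    for doc in docs:
        terms_by_set[doc["tset"]].update(doc["text"].split())
    index = defaultdict(set)
    for doc in docs:
        for term in terms_by_set[doc["tset"]]:
            index[term].add(doc["id"])
    return index


def search(index, docs, query, lang):
    """Return ids of documents in `lang` that match every query term."""
    hits = None
    for term in query.lower().split():
        ids = index.get(term, set())
        hits = ids if hits is None else hits & ids
    by_id = {doc["id"]: doc for doc in docs}
    return sorted(i for i in (hits or set()) if by_id[i]["lang"] == lang)
```

With this toy index, a German search phrase such as "dampfschiff" finds the English document of the same translation set, and an English phrase finds the German one. Content without a translation is only returned in its original language.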
Apache Solr Multilingual 7.x-1.x is considered stable. The feature list is mostly complete for European languages. Features for different languages might be added on demand.
Apache Solr Multilingual 7.x depends on or uses some other modules:
Apache Solr Multilingual supports content translation and entity translation and therefore interacts with these modules if available:
If you're interested in helping us, there are several ways to do so. Besides code contributions, we need native speakers of each language as testers and to help us define default settings for the different languages.
An integration with Search API 7.x doesn't exist and isn't planned as long as there's no sponsor for its development.
For Drupal 8 we'll switch from Apache Solr Search to Search API as the base for Apache Solr Multilingual 8.x-1.x. This branch is currently in development. To have a look at the current state, you need to install
- the latest release of Search API 8.x-1.0
- the still pending required patches for Search API, applied on top of that release
- the latest development snapshot of Search API Solr from GitHub
- the latest development snapshot of Apache Solr Multilingual 8.x-1.0
- at least Solr 5.2
To get it to work, you need to deploy the Solr configuration shipped with Apache Solr Multilingual.
The only supported branch of Apache Solr Search is 6.x-3.x.
6.x-1.x is not supported anymore.
- bio.logis offers a comprehensive spectrum of genetic analyses.
- pgsbox.com - Personal Genomics Services
- Oct 2013: "Deutsche und mehrsprachige Volltextsuche mit Apache Solr" at Drupal Camp Essen (German slides).
- Sep 2011: "Language-Specific and Multilingual Full-Text Searching" at Drupal City Berlin (English slides).
- Feb 2011: "Non-English and Multilingual Search with Apache Solr Multilingual" at Drupal Dev Days Brussels (English slides).
- May 2010: Apache Solr Multilingual session at Drupal Dev Days Munich (German slides).
- Maintenance status: Actively maintained
- Development status: Under active development
- Module categories: Administration, Multilingual, Search
- Reported installs: 495 sites currently report using this module.
- Downloads: 34,983
- Automated tests: Enabled
- Last modified: August 31, 2015