I'm using a site with content in Spanish, and I have installed the Apache Solr Multilingual module, but accents are not working.

If I search for the word "división", I get no results. If I search for "division", I get results.

It's kind of funny: thanks to the Multilingual module, searching for "division" returns results along with "Did you mean: división", while searching for "división" returns nothing.

Comments

lelizondo’s picture

Category: bug » support
Status: Active » Fixed

Adding URIEncoding="UTF-8" to the Connector definition in my /etc/tomcat[version]/server.xml file solved it.

<Server ...>
  <Service ...>
    <Connector ... URIEncoding="UTF-8"/>
    ...
  </Service>
</Server>

http://drupal.org/node/443980
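
To illustrate why the attribute matters, here is a minimal Python sketch of what probably happens to the query string. It assumes the client percent-encodes the search term as UTF-8 and that the servlet container decodes the URL as ISO-8859-1 when URIEncoding is not set (the default in older Tomcat versions). The helper names encode_query and decode_query are hypothetical and only model the behaviour; they are not part of Solr, Tomcat, or Drupal.

```python
from urllib.parse import quote, unquote


def encode_query(term: str) -> str:
    """Percent-encode a search term as UTF-8, as a client would in a URL."""
    return quote(term, encoding="utf-8")


def decode_query(raw: str, uri_encoding: str) -> str:
    """Decode a percent-encoded term using the container's URI encoding."""
    return unquote(raw, encoding=uri_encoding)


raw = encode_query("división")
print(raw)                              # divisi%C3%B3n
print(decode_query(raw, "iso-8859-1"))  # divisiÃ³n (what Solr would receive)
print(decode_query(raw, "utf-8"))       # división
```

With the wrong decoding, the index is searched for "divisiÃ³n", which matches nothing, while the unaccented "division" contains only ASCII and survives either decoding.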

jpmckinney’s picture

Project: Apache Solr Search » Apache Solr Multilingual
Component: Language » Code
Category: support » bug
Status: Fixed » Active

Moving to Multilingual queue.

This thread may have an answer: http://www.mail-archive.com/solr-user@lucene.apache.org/msg06145.html

mkalkbrenner’s picture

Status: Active » Fixed

As described in #1, the problem lies in the servlet container that runs Apache Solr, which needs to be configured correctly. There's nothing we can do within a Drupal module, but I added some information to the troubleshooting section of README.txt:

Searching for words containing accents or umlauts does not work!
You need to verify the configuration of your servlet container (tomcat, jetty, ...)
to support UTF-8 characters within the URL. For tomcat you have to add an attribute
URIEncoding="UTF-8" to your Connector definition. See Solr's documentation for details:
http://wiki.apache.org/solr/SolrInstall
http://wiki.apache.org/solr/SolrTomcat

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

wmostrey’s picture

Not only does URIEncoding="UTF-8" need to be added to the Connector, you also need to make sure that useBodyEncodingForURI is removed from it.
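
A quick way to check both conditions is to scan server.xml for Connector elements. The following is only a sketch: connector_problems is a hypothetical helper, the sample XML is made up for illustration, and a real server.xml would be read from wherever your Tomcat installation keeps it.

```python
import xml.etree.ElementTree as ET
from typing import List


def connector_problems(server_xml: str) -> List[str]:
    """Return a list of UTF-8 related issues found on <Connector> elements."""
    problems = []
    root = ET.fromstring(server_xml)
    for i, conn in enumerate(root.iter("Connector"), start=1):
        label = f"Connector #{i} (port={conn.get('port', '?')})"
        # URIEncoding must be present and set to UTF-8.
        if (conn.get("URIEncoding") or "").upper() != "UTF-8":
            problems.append(f"{label}: URIEncoding is not UTF-8")
        # useBodyEncodingForURI should not be present at all.
        if "useBodyEncodingForURI" in conn.attrib:
            problems.append(f"{label}: useBodyEncodingForURI is set")
    return problems


# Illustrative input only; not taken from a real installation.
SAMPLE = """<Server>
  <Service name="Catalina">
    <Connector port="8080" useBodyEncodingForURI="true"/>
    <Connector port="8009" URIEncoding="UTF-8"/>
  </Service>
</Server>"""

for problem in connector_problems(SAMPLE):
    print(problem)
```

For the sample above, the first Connector is reported twice (missing URIEncoding, and useBodyEncodingForURI present), and the second passes.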

mkalkbrenner’s picture

Component: Code » Documentation
Category: bug » task
Status: Closed (fixed) » Needs review

wmostrey’s picture

I wrote a Very Small Blog Post about this subject: Configuring Tomcat to provide UTF-8 support for Solr.

wmostrey’s picture

Status: Needs review » Fixed

I updated the documentation to reflect this new information, since a few other support issues have confirmed that it fixes the problem.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.