It makes no sense to index the same value multiple times in a multi-valued field.
Here's a patch that avoids duplicates, no matter how many times contrib modules add the same value.
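
For illustration only, the idea can be sketched as follows. This is not the attached patch (which is PHP and not reproduced in this thread); the function name `unique_values` is hypothetical, and the sketch assumes values are compared by simple equality and that the first-seen order should be kept.

```python
def unique_values(values):
    """Return the values with duplicates removed, keeping first-seen order.

    Hypothetical helper: dict keys are unique and preserve insertion
    order, so round-tripping through a dict drops repeated values.
    """
    return list(dict.fromkeys(values))


# Example: two contrib modules both added "drupal" to the same field.
field_values = ["drupal", "solr", "drupal"]
deduplicated = unique_values(field_values)
```

Whether dropping the repeats is actually desirable is the open question here, since repeated values can influence relevance scoring.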

#1 1917400_avoid_multiple_identical_values.patch (522 bytes, mkalkbrenner)
PASSED: [[SimpleTest]]: [MySQL] 513 pass(es).


mkalkbrenner’s picture

Status: Active » Needs review
Nick_vh’s picture

Status: Needs review » Reviewed & tested by the community

Makes sense

Nick_vh’s picture

Version: 7.x-1.x-dev » 6.x-3.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

needs backport to 6.x-3.x, committed to 7.x-1.x

pwolanin’s picture

Version: 6.x-3.x-dev » 7.x-1.x-dev
Status: Patch (to be ported) » Needs work

I think we should revert - it's not the module's job to clean up your data or guess what you meant.

Nick_vh’s picture

Reverted in code. We should figure out whether this makes a difference for Solr (e.g. boosting). For faceting it certainly doesn't make a big difference.

pwolanin’s picture

Status: Needs work » Closed (won't fix)

the number of times the value appears affects scoring, so I don't think we should be trying to guess the intent at this level.
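
To make the scoring point concrete, here is a minimal sketch. It assumes Lucene's classic TF-IDF similarity, where the term-frequency factor is the square root of the number of occurrences. The function names are made up for illustration, each field value is treated as a single untokenized term, and other factors such as length normalization are ignored.

```python
import math


def term_frequency(values, term):
    """Count how often `term` occurs among the field values (hypothetical helper)."""
    return sum(1 for value in values if value == term)


def classic_tf(freq):
    """Term-frequency factor of the classic TF-IDF similarity: sqrt(freq)."""
    return math.sqrt(freq)


with_duplicates = ["drupal", "drupal", "solr"]
without_duplicates = ["drupal", "solr"]

# The tf factor for "drupal" is larger when the value is indexed twice.
tf_dup = classic_tf(term_frequency(with_duplicates, "drupal"))
tf_unique = classic_tf(term_frequency(without_duplicates, "drupal"))
```

Under these assumptions, removing the duplicate lowers the tf factor for that term, so deduplicating at index time silently changes ranking.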

mkalkbrenner’s picture

After our conversation I agree with Peter and will solve the duplicate entry issue within Apache Solr Multilingual.

heacu’s picture

note that this can also be done effectively in solrconfig.xml using

you'll need Solr 4.0 for this, though.