When indexing multilingual content (pages) created with the i18n module using language prefixes, both the site and url fields in the Solr index are wrong: they always point to the node in the default language instead of the language specified for the node in question. This can be fixed by explicitly passing the entity language in all url() calls in _apachesolr_index_process_entity_get_document() (in apachesolr.index.inc).
I changed it as follows (this might not be the most performant way to do it, but it seems to work):
function _apachesolr_index_process_entity_get_document($entity, $entity_type) {
  list($entity_id, $vid, $bundle) = entity_extract_ids($entity_type, $entity);
  $document = new ApacheSolrDocument();

  // Build url() options that carry the entity's own language, so that the
  // language prefix of the entity is used instead of the default language.
  $languages = language_list();
  $urlOptions = array('absolute' => TRUE);
  // language_list() has no entry for LANGUAGE_NONE ('und'), so check first.
  if (!empty($entity->language) && isset($languages[$entity->language])) {
    $urlOptions['language'] = $languages[$entity->language];
  }

  $document->id = apachesolr_document_id($entity_id, $entity_type);
  $document->site = url(NULL, $urlOptions);
  $document->hash = apachesolr_site_hash();
  $document->entity_id = $entity_id;
  $document->entity_type = $entity_type;
  $document->bundle = $bundle;
  $document->bundle_name = entity_bundle_label($entity_type, $bundle);

  $path = entity_uri($entity_type, $entity);
  // A path is not a requirement of an entity.
  if (!empty($path)) {
    $document->path = $path['path'];
    // Options returned by entity_uri() take precedence over $urlOptions.
    $document->url = url($path['path'], $path['options'] + $urlOptions);
  }

  if (empty($entity->language)) {
    // 'und' is the language-neutral code in Drupal 7.
    $document->language = LANGUAGE_NONE;
  }
  else {
    $document->language = $entity->language;
  }

  // Path aliases can have important information about the content.
  // Add them to the index as well.
  if (function_exists('drupal_get_path_alias')) {
    // Add any path alias to the index, looking first for language specific
    // aliases but using language neutral aliases otherwise.
    $output = drupal_get_path_alias($document->path, $document->language);
    if ($output && $output != $document->path) {
      $document->path_alias = $output;
    }
  }

  return $document;
}
| Comment | File | Size | Author |
|---|---|---|---|
| #21 | 1807552-ss-fix-21-D6-do-not-test.patch | 3.55 KB | pwolanin |
| #20 | 1807552-ss-fix-20.patch | 3.56 KB | pwolanin |
| #14 | 1807552-14.patch | 2.47 KB | pwolanin |
| #8 | 1807552-8.patch | 2.18 KB | nick_vh |
| #5 | 1807552-5.patch | 2.29 KB | nick_vh |
Comments
Comment #1
nick_vh
Can you highlight the changes or post a patch with your changes?
How to make a patch -> http://drupal.org/node/707484
Comment #2
wimvds commented
Will do. I also fixed a small related issue on the results page (URLs were regenerated there instead of using those already stored in the Solr index, causing the same problem).
Comment #3
wimvds commented
Comment #4
nick_vh
Updated the code to be a bit more robust and Drupal API friendly.
Comment #5
nick_vh
Whitespace issue.
Comment #6
nick_vh
Does this apply to 6.x-3.x? I assume i18n is not as developed in D6 as it is in D7?
Comment #7
nick_vh
Comment #8
nick_vh
Comment #9
nick_vh
Fixed.
Comment #10
nick_vh
Closing to clear out the issue queue a bit.
Comment #11
marc angles commented
Hi,
I'm running 7.x-1.1.
I still don't get the right URL for non-English nodes. When a non-English node appears in the results, the URL displayed for it is the URL of the source (English) node.
In brief, I still do not get the right URL in search results.
Is this supposed to work correctly? If so, I'll look elsewhere.
Thanks
Comment #12
nick_vh
This is supposed to work correctly. Could it be that you are working with a different translation system?
If you can reproduce this problem, please tell us what you are working with and give us as much information as possible. I'd also check out the apachesolr_multilingual project to see if they have some additional information. I'm closing this; please open a follow-up ticket.
Comment #13
pwolanin commented
This is a serious regression: using the absolute URL from the index breaks things for anyone who indexes over HTTP and views over HTTPS, or who indexes on a back-end server.
Comment #14
pwolanin commented
This rolls back part of the change and renders the URL from the path at display time, which is the only correct way.
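To illustrate the display-time approach described in this comment, here is a minimal standalone sketch. It is not the code from the attached patch: example_display_url() and its parameters are hypothetical names made up for this illustration, and the host names are placeholders. Inside Drupal 7 the equivalent step would presumably be a call to url() with the 'absolute' and 'language' options made when the result is rendered, rather than when it is indexed.

```php
<?php
/**
 * Hypothetical helper, not part of the apachesolr module.
 *
 * Builds an absolute URL at display time from the relative path stored in
 * the index, using the base URL of the request that shows the result.
 *
 * @param string $path
 *   Relative path as stored in the index, e.g. 'node/12'.
 * @param string $base_url
 *   Base URL of the current request, e.g. 'https://www.example.com'.
 * @param string $prefix
 *   Language path prefix, e.g. 'fr'. Empty string for no prefix.
 *
 * @return string
 *   The absolute URL for the current request context.
 */
function example_display_url($path, $base_url, $prefix = '') {
  $segments = array();
  if ($prefix !== '') {
    $segments[] = trim($prefix, '/');
  }
  $segments[] = ltrim($path, '/');
  return rtrim($base_url, '/') . '/' . implode('/', $segments);
}

// The same indexed path gives a usable link in each request context.
echo example_display_url('node/12', 'http://backend.internal', 'fr'), "\n";
echo example_display_url('node/12', 'https://www.example.com', 'fr'), "\n";
```

Because only the relative path is taken from the index, the scheme and host always come from the request that displays the result, so a site indexed over HTTP or on a back-end server can still show correct HTTPS links.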
Comment #15
nick_vh
Good to go. This also fixes https://drupal.org/node/1852088
Comment #16
cilefen commented
#14 works for me in terms of returning https links when searching on https.
Comment #17
pwolanin commented
Committed to 7.x and 6.x-3.x.
Comment #19
pwolanin commented
Doh, that was wrong too: the common schema won't index $doc->language.
Comment #20
pwolanin commented
Comment #21
pwolanin commented
Comment #22
pwolanin commented
Committed.