I am learning to avoid the "millions of nodes" approach to Solr integration. So if I were to have a separate Solr server with millions of targets indexed already, how best might I integrate that into a Drupal site? Should I just hit it completely outside using PHP code in some page nodes, or should I go ahead and still use the Apache_Solr_Search_Integration mod with a custom schema/config? Would that cause problems with hooks and such that expect the default configs?
Some eventual goals are:
- Fast results with millions of indexed docs
- Responsive site page loads and user experience
- Ability to "import" a doc found in the external Solr index as a node
- Some visibility/communication between the index collection and the Drupal system.
Comments
Comment #1
pwolanin commented: Are you planning to index your nodes into Solr or not?
The framework module provides general abilities to connect to Solr. You might just want to write a custom search module to replace apachesolr_search.
Comment #2
Todd Young commented: No, I am not looking to index my nodes in Solr. I am hoping to search a Solr/Lucene index filled with "other stuff" and only import items returned from that "external" Solr index as non-indexed nodes. Basically I would like a faceted Solr search with advanced syntax on non-node data, but within the Drupal framework so I have access to all the tools, tricks, hooks, etc. that will be necessary to create those nodes based on clicked items in the return set.
The actual index content is going to come from the Data Import Handler hitting a totally separate MySQL database.
Comment #3
Scott Reynolds commented: As I understand it, this is how I would approach the problem.
You will basically have to write your own 'apachesolr_search.module' that handles only your documents. It implements hook_search(), which is where most of the work happens: there you build the query just as the apachesolr_search module does, execute it, and return the results.
I imagine hook_search() is all you need.
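The build, execute, return flow described above can be sketched independently of Drupal. The following is a minimal, language-neutral illustration in Python of the request such a custom search module would send to Solr's standard select handler and of how the returned documents could be turned into search results. The server URL and the `title`, `url`, and `body` field names are assumptions for illustration; they depend entirely on the external index's schema and are not taken from the apachesolr module.

```python
from urllib.parse import urlencode

# Hypothetical location of the external Solr server (assumption).
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_search_url(keys, page=0, rows=10, facet_fields=()):
    """Build a Solr select URL for the given search keys and result page."""
    params = [
        ("q", keys),
        ("start", page * rows),  # offset of the first document to return
        ("rows", rows),          # number of documents per page
        ("wt", "json"),          # ask Solr for a JSON response
    ]
    if facet_fields:
        params.append(("facet", "true"))
        params.extend(("facet.field", name) for name in facet_fields)
    return SOLR_SELECT_URL + "?" + urlencode(params)


def to_search_results(response):
    """Map a decoded Solr JSON response to simple search-result records.

    The field names read from each document are assumptions about the
    external schema.
    """
    results = []
    for doc in response.get("response", {}).get("docs", []):
        results.append({
            "title": doc.get("title", ""),
            "link": doc.get("url", ""),
            "snippet": doc.get("body", "")[:200],
        })
    return results
```

In a Drupal module the same two steps would live inside the hook_search() implementation, with the records shaped into whatever array structure the search page expects.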
Comment #4
pwolanin commented: Right. You may also need to create facet blocks based on the relevant fields, etc.
So essentially you would start with apachesolr_search, but remove the node indexing part and a lot of other node-related code.
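As a rough sketch of what a facet block needs, here is how facet counts could be read from a Solr JSON response and turned into filter queries. It assumes Solr's default flat facet layout, in which each field's values and counts alternate in a single list. The function names and the `type` field are hypothetical, and the code is Python rather than Drupal PHP, purely for illustration.

```python
def facet_items(response, field):
    """Return (value, count) pairs for one facet field, skipping empty facets.

    By default Solr returns facet counts as a flat list:
    ["book", 12, "map", 3, ...].
    """
    flat = (
        response.get("facet_counts", {})
        .get("facet_fields", {})
        .get(field, [])
    )
    pairs = zip(flat[0::2], flat[1::2])
    return [(value, count) for value, count in pairs if count > 0]


def facet_filter(field, value):
    """Build the fq (filter query) string applied when a facet link is clicked."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '%s:"%s"' % (field, escaped)


# Example: items for a hypothetical "type" facet block.
example = {"facet_counts": {"facet_fields": {"type": ["book", 12, "map", 0, "photo", 3]}}}
for value, count in facet_items(example, "type"):
    print("%s (%d) -> fq=%s" % (value, count, facet_filter("type", value)))
```

Each block would render one link per pair, and clicking a link would re-run the search with the corresponding filter added as an `fq` parameter.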
Comment #5
robertDouglass commented: The above info from Scott and Peter is correct.
Comment #6
Todd Young commented: Thank you all, I will take a ball-peen hammer to ApacheSolr SI next week. It will be my first foray into modding a mod. I will report back if it turns out to be something cool.
Comment #7
mausolos commented: I was trying my hand at a similar approach before I had to put it aside for other stuff. I'd be very curious to see what you do with this. For reference, my somewhat different need/approach is here: http://drupal.org/node/631836
Comment #9
alexmc commented: How did you get on with this? I'd love to know. I'm about to do the same sort of thing: use Drupal as a front end for a Solr index of a totally different site (not Drupal's own nodes).
Comment #10
zacho commented: Ditto that, I'm also curious how you got on. My university uses Drupal as a frontend to digital repository objects which are themselves already indexed in Solr. I'd like to use the apachesolr module to search those objects in tandem with Drupal's own content.
Comment #11
scott.whittaker commented: Bump?
Comment #12
fr34ck commented: What I am trying to do is similar to post #1.
I have an external index, created with Solr, and I would like to integrate it into Drupal. Is the best way to do this to modify schema.xml and solrindex.xml, mapping them to Drupal's own?
Is it wrong to use Nutch to take data from Solr and then put it into Drupal?
Comment #13
Raul Cano commented: Hi all,
I'm having the same issue here (D6, Solr). It looks like all possible solutions belong to the realm of D7, but for the moment, migration is not possible in my case.
So basically, I have a Solr index with data coming from Drupal and from other non-Drupal sites, and I would like to show all of that data in Drupal search results. The non-Drupal content should just be displayed and linked to its external URL.
I have found some interesting resources, but I'm still missing a working example.
https://www.palantir.net/blog/remote-data-drupal-museums-and-web-2009
https://dev.acquia.com/blog/bridge-gap-between-drupal-non-drupal-content...
Any ideas will be much appreciated!
Cheers
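For the mixed-index case described in this last comment, one possible approach is to choose each result's link based on which site the document came from. The sketch below is written in Python for illustration only, not as a working D6 module. It assumes each indexed document carries `site`, `path`, and `url` fields; those names and the example hostnames are hypothetical and would need to match the real schema.

```python
def result_link(doc, local_site, base_path="/"):
    """Pick the link for one search result.

    Documents indexed from the local Drupal site link to their internal path.
    Everything else links to the external URL stored in the index.
    """
    if doc.get("site") == local_site and doc.get("path"):
        return base_path + doc["path"].lstrip("/")
    return doc.get("url", "")


# Example with one local and one external document (made-up values).
docs = [
    {"site": "http://drupal.example.org/", "path": "node/12", "url": "http://drupal.example.org/node/12"},
    {"site": "http://other.example.com/", "url": "http://other.example.com/item?id=7"},
]
for doc in docs:
    print(result_link(doc, "http://drupal.example.org/"))
```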