I am learning to avoid the "millions of nodes" approach to Solr integration. So if I were to have a separate Solr server with millions of targets indexed already, how best might I integrate that into a Drupal site? Should I just hit it completely outside using PHP code in some page nodes, or should I go ahead and still use the Apache_Solr_Search_Integration mod with a custom schema/config? Would that cause problems with hooks and such that expect the default configs?

Some eventual goals are:

- Fast results with millions of indexed docs
- Responsive site page loads and user experience
- Ability to "import" a doc found in the external Solr index as a node
- Some visibility/communication between the index collection and the Drupal system.

Comments

pwolanin’s picture

Are you planning to index your nodes into Solr or not?

The framework module provides general abilities to connect to Solr. You might just want to write a custom search module to replace apachesolr_search.

Todd Young’s picture

No, I am not looking to index my nodes in Solr. I am hoping to search a Solr/Lucene index filled with "other stuff" and only import items returned from that "external" Solr index as non-indexed nodes. Basically I would like a faceted Solr search with advanced syntax on non-node data, but within the Drupal framework so I have access to all the tools, tricks, hooks, etc that will be necessary to create those nodes based on clicked items in the return set.

The actual index content is going to come from the Data Import Handler hitting a totally separate MySQL database.

Scott Reynolds’s picture

As I understand it, this is how I would approach the problem.

Basically, you will have to write your own version of 'apachesolr_search.module' that handles only your documents. You provide the hook_search() for it, which is where most of the work happens. There you build the query just as the apachesolr_search module does, execute it, and return the results.

I have to imagine all you need is the hook_search().
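To make the "build the query, execute it, return the results" steps concrete, here is a minimal, language-neutral sketch of the Solr HTTP side, written in Python rather than as Drupal PHP. It is not the apachesolr module's code. The base URL, the function names, and the `id` and `title` fields are assumptions; an external index will have its own schema.

```python
import json
from urllib.parse import urlencode


def build_select_url(base_url, keywords, facet_fields=(), rows=10, start=0):
    """Build a Solr /select URL for a keyword search with optional field facets."""
    params = [
        ("q", keywords),
        ("wt", "json"),    # ask Solr for a JSON response
        ("rows", rows),    # page size
        ("start", start),  # offset of the first result
    ]
    if facet_fields:
        params.append(("facet", "true"))
        params.extend(("facet.field", name) for name in facet_fields)
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def parse_results(raw_json):
    """Return (total hits, list of simple result dicts) from a Solr JSON response.

    The "id" and "title" field names are assumed; adjust them to the schema
    of the external index.
    """
    data = json.loads(raw_json)
    response = data.get("response", {})
    items = [
        {"id": doc.get("id"), "title": doc.get("title", "")}
        for doc in response.get("docs", [])
    ]
    return response.get("numFound", 0), items
```

A custom hook_search() would do the equivalent in PHP: fetch the URL, decode the response, and map each document onto the result array Drupal expects.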

pwolanin’s picture

Right - you may also need to create facet blocks based on the relevant fields, etc.
So essentially you would start with apachesolr_search, but remove the node indexing part and lots of other node-related code.
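As a rough illustration of what a facet block needs from Solr: with the default JSON output, each faceted field comes back under `facet_counts.facet_fields` as a flat list of alternating values and counts. The sketch below (Python, not Drupal code) pairs them up. The function names, the `type` field, and the `min_count` filter are assumptions made for the example.

```python
def facet_pairs(flat):
    """Pair up Solr's flat facet list: [value, count, value, count, ...]."""
    return list(zip(flat[0::2], flat[1::2]))


def facet_blocks(data, min_count=1):
    """Map each faceted field to its (value, count) pairs, dropping rare values.

    `data` is a decoded Solr JSON response; a response without facets
    yields an empty dict.
    """
    fields = data.get("facet_counts", {}).get("facet_fields", {})
    return {
        field: [(value, count) for value, count in facet_pairs(flat) if count >= min_count]
        for field, flat in fields.items()
    }


# Hypothetical response fragment for a facet on an assumed "type" field.
example = {"facet_counts": {"facet_fields": {"type": ["article", 12, "image", 3, "video", 0]}}}
blocks = facet_blocks(example)
```

Each entry in `blocks` could then back one facet block, with every (value, count) pair rendered as a link that adds a filter on that value to the current search.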

robertDouglass’s picture

Status: Active » Fixed

Above info by Scott and Peter is correct.

Todd Young’s picture

Thank you all, I will take a ball-peen hammer to ApacheSolr SI next week. It will be my first foray into modding a mod. I will report back if it turns out to be something cool.

mausolos’s picture

I was trying my hand with a similar approach before I had to put it aside for other stuff. I'd be very curious to see what you do with this. For reference, my somewhat different need/approach is here: http://drupal.org/node/631836

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

alexmc’s picture

How did you get on with this? I'd love to know. I'm about to do the same sort of thing: use Drupal as a front end for a Solr index of a totally different site (not Drupal's own nodes).

zacho’s picture

Ditto that, I'm also curious how you got on. My university uses Drupal as a frontend to digital repository objects which are themselves already indexed in Solr. I'd like to use the apachesolr module to search those objects in tandem with Drupal's own content.

scott.whittaker’s picture

Bump?

fr34ck’s picture

What I am trying to do is similar to post #1.
I have an external index, created with Solr, and I would like to integrate it into Drupal. Is the best way to do this to modify schema.xml and solrindex.xml, mapping them to Drupal's?

Is it wrong to use Nutch to take data from Solr and then put it into Drupal?

Raul Cano’s picture

Hi all,
I'm having the same issue here (D6, Solr). It looks like all possible solutions belong to the realm of D7, but for the moment migration is not possible in my case.
So basically, I have a Solr index with data coming from Drupal and from other non-Drupal sites, and I would like to show all the data in Drupal search results. The non-Drupal content should just be displayed and linked to its external URL.
I have found some interesting resources, but I am still missing a working example.
https://www.palantir.net/blog/remote-data-drupal-museums-and-web-2009
https://dev.acquia.com/blog/bridge-gap-between-drupal-non-drupal-content...

Any ideas will be much appreciated!
Cheers