To support Drupal 6.x-3.x and Drupal 7.x-1.x

What is needed:

  • Use the site and hash information in Solr both to facet on and to filter queries (deletions, selects if needed)
  • Add metadata for D6 and D7
    • Content types
    • Bias information
    • Site Hash (already included from the module)
    • Site Url (already included from the module)
    • Vocabulary names
  • Add a facet on the hash that outputs the name of the site
  • Modify the bias pages to include bias for content from the other sites.

I propose that we open up a new branch for D7 and for D6 and we start developing.
Let's keep this in line with apachesolr: 7.x-1.x for the Drupal 7 version and 6.x-3.x for the Drupal 6 version.

Comments

webchick’s picture

Title: Upgrade to Drupal 7 » Upgrade Apache Solr Multisite Search to Drupal 7

Actually, this title will make my community initiatives page make more sense. ;)

webchick’s picture

Title: Upgrade Apache Solr Multisite Search to Drupal 7 » Port Apache Solr Multisite Search to Drupal 7

One more. I'm really done now. :P

ogi’s picture

subscribe

gorillaz.f’s picture

cool, subscribe
and drupal.org already uses it on D7 core, how come?

jpmckinney’s picture

Category: task » feature

Upgrade path will be 6.x-1.x -> 6.x-2.x -> 7.x-1.x

jpmckinney’s picture

Category: feature » task
jpmckinney’s picture

Noting that apachesolr 7.x now has significant schema changes.

Ravi.J’s picture

Category: feature » task

Hi,
Is anyone working on this? Any progress?

Ravi.J’s picture

sub

chymz’s picture

sub

anavarre’s picture

Subscribe

hernani’s picture

Sub

Refineo’s picture

subscribe

wmostrey’s picture

I'm taking this on, expect an initial patch soon.

wmostrey’s picture

Status: Active » Needs review
Attached file (status: new): 35.11 KB

Here's the first basic version. Everything works except for the blocks/facets. Any help getting this last bit fixed with the new Facet API would be very much appreciated.

(Go to admin/config/search/settings and make sure the checkbox for Multisite is checked. Possibly clear cache afterwards.)

wmostrey’s picture

I'm actually curious if we need to add any other filter than "Filter by site". Filters provided by apachesolr (like "Filter by content type") also work in this D7 multisite version.

wmostrey’s picture

Attached file (status: new): 31.75 KB

Here's a new patch. All it needs now is the "Filter by site" facet, and probably cleaning up some legacy code afterwards.

pwolanin’s picture

Thanks for the patches.

re: $document->entity_id = 1;, seems like instead it should be the hash?

Or maybe we should make that a non-required field?

pwolanin’s picture

We should discuss the architecture - I had thought that we might actually merge this into the main module, depending on what's left after we remove the facet code.

wmostrey’s picture

Making entity_id a non-required field seems like a good idea. Also, entity_id is type long, so it can't be the hash.

Making the multisite functionality part of the main module makes sense to me, we're still using a lot of semi-duplicate code anyway. What's the best way to discuss this?

Refineo’s picture

Issue tags: +d7 ports

adding d7 ports tag

pwolanin’s picture

@wmostrey - I hope to have a better handle on this architecture by late next week. Maybe we can have a call on Sept 23? I'll look at changing the schema in advance of that.

wmostrey’s picture

Let's do that, great!

wmostrey’s picture

An updated version, with all d6 facet code removed and a clean settings page. Tested with both Drupal sites using the apachesolr module and non-Drupal sites crawled with Nutch.

synbaxp’s picture

wmostrey’s picture

Here are the instructions: http://drupal.org/node/666606/git-instructions/6.x-1.x

In short:

1. Setting up repository for the first time

git clone --branch 6.x-1.x http://git.drupal.org/project/apachesolr_multisitesearch.git
cd apachesolr_multisitesearch

2. Applying a patch
Download the patch to your working directory. Apply the patch with the following command:
git apply -v [patchname.patch]

synbaxp’s picture

stijn.vanden.brande’s picture

Can you create the 7 branch with this patch?
That way it is easier to get the module.

wmostrey’s picture

The patch will most likely need to pass review first before Peter Wolanin creates the branch. So if you want to help move this forward: review the patch and get the status to RTBC. Thanks!

wmostrey’s picture

I have now also added a site/hash facet, so you can again filter the search results per site.

pwolanin’s picture

Looks like a good start, especially if it's moving toward facetapi integration.

We'll need to figure out how to expose the appropriate multi-site facets there, however.

synbaxp’s picture

I had an issue with the site metadata: when I go to the "Multisite settings" section, under "Delete data from sites using this index" I can only see two sites (there are more than 10 subsites in total). How do I get the correct information from all subsites into the list?

Thanks a lot!

wmostrey’s picture

Since the schema has changed a bit to include an entity_id (instead of entity), you need to index each individual subsite again using this module. That should fix your problem.

pwolanin’s picture

Note that we don't yet have a 6.x version of apachesolr compatible with the 7.x version. That will be the 6.x-3.x branch.

mgifford’s picture

Subscribe.. As always reminding folks about using http://drupal.org/project/coder to look over patches and help review them before release.

ebremer’s picture

Subscribe

pwolanin’s picture

Maybe I will create the branch if this is a generally working basis for progress.

pwolanin’s picture

For 7.x I would like to figure out how to meld the multisite search functionality with the search environments concept we added in 7.x apachesolr module, as well as with the custom search pages.

I feel like we should be able to make this module even smaller, since I always work to have the support for multi-site search pretty well baked into the main module.

wmostrey’s picture

I agree. I'll see what I can do to integrate the concepts of apachesolr_multisite into the apachesolr modules.

pwolanin’s picture

Ok, I may take a crack at it this weekend myself.

pwolanin’s picture

Status: Needs review » Active

Note: I committed the patch above to a new branch, so setting back to active.

pwolanin’s picture

Also, I think we should potentially remove the use of the core search hooks (especially for the search page), and just leverage the user defined search pages.

pwolanin’s picture

Status: Active » Needs work
Attached file (status: new): 13.81 KB

Starting to reduce this down to the essence.

synbaxp’s picture

Hi Guys,

I have a question here and would really appreciate your help! I have a Drupal 7 multisite setup and use the Solr multisite search module to do the Drupal multisite search. I also have a non-Drupal (simple HTML) site running on another web server. How can I search across the many Drupal sites and the non-Drupal site and get results from ALL sites?

Thanks a lot!

pwolanin’s picture

@synbaxp - off topic. This issue is about the code update for 7.x.

Open a separate support request or try IRC.

wmostrey’s picture

I've been talking through this with Nick Veenhof. I did some testing with the latest Apache Solr dev module, and since it now also takes the hash into account, every page is actually ready to support multisites. I believe we need the following functions in Apache Solr to get it working:

  • apachesolr_multisitesearch_facetapi_facet_info() to create the site/hash facet; we could add the metadata settings to the facet configuration.
  • apachesolr_multisitesearch_apachesolr_query_alter() with an option per Search Page to enable or disable the addFilter

We might need to work out the details as to what configuration goes where, but this will bring us a long way.

Your patch is good to go, except that the function should be apachesolr_multisitesearch_facetapi_facet_info() and not apachesolr_multisitesearch_facet_info().
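
A rough sketch of the second item, assuming the per-search-page option is available as a boolean. ExampleQuery, example_restrict_to_site() and the $multisite_enabled flag are hypothetical stand-ins so that the snippet runs outside Drupal; in the module the query object and the site hash would come from apachesolr itself.

```php
<?php
// Hypothetical stand-in for the Solr query object. Only an addFilter()-style
// call is modelled here; the real object has a much larger interface.
class ExampleQuery {
  public $filters = array();

  public function addFilter($name, $value) {
    $this->filters[] = $name . ':' . $value;
  }
}

// Restrict the query to the current site unless multisite search has been
// enabled for this search page (assumed to be a simple boolean setting).
function example_restrict_to_site(ExampleQuery $query, $multisite_enabled, $site_hash) {
  if (!$multisite_enabled) {
    $query->addFilter('hash', $site_hash);
  }
}

$query = new ExampleQuery();
example_restrict_to_site($query, FALSE, 'abc123');
// $query->filters now holds the single filter 'hash:abc123'.
```

With the option switched on, no hash filter is added and results from every site in the shared index can come back.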

Nick_vh’s picture

Status: Needs work » Needs review

If the complete module is replaced with this code it is already working between different Drupal 6 and 7 sites. What is left is to make node access integration work between Drupal 6 and 7.

<?php
/**
 * @file
 * Extends Apache Solr Search module to provide multisite support.
 * This includes
 * 1) A facet that allows filtering per site
 * 2) changes the links so they redirect to the appropriate site
 */

/**
 * Implements hook_facetapi_facet_info().
 *
 * @param array $searcher_info
 * @return array
 */
function apachesolr_multisitesearch_facetapi_facet_info($searcher_info) {
  $facets = array();
  $facets['site'] = array(
    'field' => 'site',
    'label' => t('Site Name'),
    'description' => t('Filter by Site Name'),
  );
  return $facets;
}

/**
 * Implements hook_apachesolr_process_results().
 *
 * Make sure that the links in our search results link to the website of origin.
 */
function apachesolr_multisitesearch_apachesolr_process_results(&$results, DrupalSolrQueryInterface $query) {
  foreach ($results as $id => $result) {
    $results[$id]['link'] = $results[$id]['fields']['url'];
  }
}
?>
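
A slightly more defensive variant of the link rewriting, as a sketch only: it keeps the local link when a result carries no url field. example_rewrite_links() is a hypothetical name, and it works on plain arrays so that it can run outside Drupal.

```php
<?php
// Point each result at its site of origin, but only when the indexed
// document actually provides a url field.
function example_rewrite_links(array &$results) {
  foreach ($results as $id => $result) {
    if (!empty($result['fields']['url'])) {
      $results[$id]['link'] = $result['fields']['url'];
    }
  }
}

$results = array(
  array('link' => 'node/1', 'fields' => array('url' => 'http://a.example/node/1')),
  array('link' => 'node/2', 'fields' => array()),
);
example_rewrite_links($results);
// The first result now links to http://a.example/node/1, the second keeps node/2.
```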
Nick_vh’s picture

Attached file (status: new): 23.46 KB

I propose a much bigger change and make it easier for all of us to build it from scratch again

Nick_vh’s picture

Attached file (status: new): 23.9 KB

Fixing the right package so it shows up in search toolkit now

Nick_vh’s picture

Attached file (status: new): 23.9 KB

Some namespace issues. I think this one should be good to go in and let's follow up with other functionality later on? What do you think?

wmostrey’s picture

The patch in #52 is good to go. I would already prefer to see a dev release based on this patch to continue working on.

Nick_vh’s picture

I pinged pwolanin to take a look at this issue. Afaik he will do that asap.

kattekrab’s picture

I reckon this might benefit from a good issue summary too...

pwolanin’s picture

Status: Needs review » Needs work

Trying to figure out all the deletions

function apachesolr_multisitesearch_map_hash() becomes a no-op? You removed hook_facetapi_facet_info()?

We certainly still need the hook_apachesolr_query_alter(), but it should be looking to a per-environment setting.

Also, all the metadata functionality seems to be removed. I'm not sure what's going on - is this the right patch?

Nick_vh’s picture

To support Drupal 6.x-3.x and Drupal 7.x-1.x

What is needed:

  • Use the site and hash information in Solr both to facet on and to filter queries (deletions, selects if needed)
  • Add metadata for D6 and D7
    • Content types
    • Bias information
    • Site Hash (already included from the module)
    • Site Url (already included from the module)
    • Vocabulary names
  • Add a facet on the hash that outputs the name of the site
  • Modify the bias pages to include bias for content from the other sites.

I propose that we open up a new branch for D7 and for D6 and we start developing.
Let's keep this in line with apachesolr: 7.x-1.x for the Drupal 7 version and 6.x-3.x for the Drupal 6 version.

(added this to the opening post)

Nick_vh’s picture

Status: Needs work » Needs review
Attached files (status: new): 26.8 KB, 36.58 KB

These patches should include the metadata + the corrected hash to sitename mapping that comes from the metadata.

As I mentioned before, I would prefer that the 7.x patch be added to the 7.x-1.x branch and the 6.x patch to a new 6.x-3.x branch.

I've tested these patches on a 6.x site and on a 7.x site and multisite between 6.x and 7.x is working perfectly. The regular module takes care of indexing fields with their machine name so a D6 and a D7 site can easily create a facet that is using content from both.

I also added the content types/bundles to the meta information but I'd like to have some more input how we could handle bias information for content types/bundles that are not part of the site where the search was executed
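
The hash-to-site-name mapping mentioned above can be sketched roughly as follows. This assumes the metadata has already been read into an array keyed by site hash; example_site_label() and the sample data are hypothetical and not taken from the patches.

```php
<?php
// Map a site hash to a human-readable site name using metadata gathered from
// the shared index. Falls back to the raw hash when no metadata is known for
// that site, so the facet still shows something.
function example_site_label($hash, array $site_names) {
  return isset($site_names[$hash]) ? $site_names[$hash] : $hash;
}

// Hypothetical metadata: site hash => site name.
$site_names = array(
  'abc123' => 'Site A',
  'def456' => 'Site B',
);
$label = example_site_label('abc123', $site_names);
// $label is 'Site A'; an unknown hash would be returned unchanged.
```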

Nick_vh’s picture

Attached files (status: new): 28.24 KB, 39.51 KB

  1. Merged in the suggestions of pwolanin from #45
  2. You have to explicitly state that you want a specific environment to be multisite capable. If not, you won't see multisite search results. This setting was added to the environment configuration.

The patch for 6 was diffed with the current 6, the patch for 7 was diffed with the current 7

Would this be a good starting point for all?

Nick_vh’s picture

Attached file (status: new): 28.22 KB

Forgot to remove a dsm...

pwolanin’s picture

Let's change this:

$document->entity_type = 'multisite_meta';

and use a string that cannot be a valid Drupal entity type.

e.g. 'multisite.meta' or 'multisite/meta' or 'multisite-meta'
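
To illustrate why those strings are safe, here is a small check, assuming entity type names follow the usual Drupal machine-name convention of lowercase letters, digits and underscores. That convention is an assumption of this sketch, and example_is_machine_name() is a hypothetical helper, not part of the module.

```php
<?php
// Returns TRUE when the name consists only of lowercase letters, digits and
// underscores, i.e. when it could plausibly collide with a real entity type.
function example_is_machine_name($name) {
  return preg_match('/^[a-z0-9_]+$/', $name) === 1;
}

$candidates = array('multisite_meta', 'multisite.meta', 'multisite/meta', 'multisite-meta');
foreach ($candidates as $candidate) {
  // Only 'multisite_meta' passes the check; the other three contain a
  // character outside the allowed set, so they cannot clash.
  $is_possible_entity_type = example_is_machine_name($candidate);
}
```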

You moved a bunch of functionality like apachesolr_multisitesearch_generate_metadata() into the .module instead of leaving it in the admin.inc. If it's not used on most page loads, I think better to keep in the .inc file?

klaasvw’s picture

How should we go about testing this? I ran into several issues so I might be doing something wrong.

I tried this with apachesolr 7.x HEAD and 3 sites sharing the same solr core. I enabled multisite support for the solr server and cleared the index and reindexed every site. I ran into these issues:

  • All search results share the same website facet. The facet is always equal to the site you're currently on.
  • $results[$id]['fields']['hash'] doesn't exist (line 77 in apachesolr_multisitesearch.module)
  • When using the same facets for entities that don't exist on the current site, the raw value of the facet is displayed. For example, if a node from another site has a term with tid 6 attached that is only available on that other site, there is a facet "6".
Nick_vh’s picture

Attached files (status: new): 39.42 KB, 28.22 KB

@pwolanin : I moved it because I felt some of this code did not belong in an admin.inc. The code that is used is not only for the admin pages but could be used as an API (crud of the metadata) for those that need it.
I only included functions in the admin file that are directly related to the admin configuration. Which one do you want to move to the admin.inc?
Patch attached with multisite.meta as entity type

@klaasvw
This is still very much a work in progress. I suggest that you try to find the broken part, correct it and upload the patch. Does that work for you?

RobLoach’s picture

Status: Needs review » Needs work

Although I'm not that familiar with the module, here's a quick Dreditor scan! This is off drupal7.patch. I really like the LOC I/D ratio :-) .

+++ b/apachesolr_multisitesearch.info
@@ -2,5 +2,5 @@ name = Apache Solr Multisite Search
-package = Apache Solr
+package = Search Toolkit

Should it be "Apache Solr Search Toolkit" instead? It's usually good to namespace module names.

+++ b/apachesolr_multisitesearch.module
@@ -23,133 +23,89 @@ function apachesolr_multisitesearch_menu() {
+  return $data;
+}
+
+function apachesolr_multisitesearch_apachesolr_process_results(&$results, DrupalSolrQueryInterface $query) {
+  $env_id = $query->solr('getId');

Might like some doc block for apachesolr_multisitesearch_apachesolr_process_results().

+++ b/apachesolr_multisitesearch.module
@@ -23,133 +23,89 @@ function apachesolr_multisitesearch_menu() {
+ *
+ * @param string $query
+ *   Defaults to *:*
  */
-function apachesolr_multisitesearch_cron() {
-  apachesolr_multisitesearch_refresh_metadata();
+function hook_apachesolr_delete_by_query_alter($query) {
+  // use the site hash so that you only delete this site's content
+  if ($query == '*:*') {
+    $query = 'hash:' . apachesolr_site_hash();

Is this supposed to be "hook_apachesolr_delete_by_query_alter()"? Maybe we should move that to apachesolr_multisitesearch.api.php instead?
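
For illustration, the logic in that hunk written as a module-style implementation might look roughly like this. It assumes the alter hook hands the query string over by reference (otherwise the assignment would have no effect); example_site_hash() is a hypothetical stand-in for apachesolr_site_hash() so that the sketch runs outside Drupal.

```php
<?php
// Hypothetical stand-in for apachesolr_site_hash().
function example_site_hash() {
  return 'abc123';
}

// Narrow a "delete everything" query down to the documents of this site.
// Any other query is left untouched.
function example_delete_by_query_alter(&$query) {
  if ($query == '*:*') {
    $query = 'hash:' . example_site_hash();
  }
}

$query = '*:*';
example_delete_by_query_alter($query);
// $query is now 'hash:abc123', so only this site's content would be deleted.
```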

pwolanin’s picture

@Nick - I was using admin.inc as a generic include file, despite the name.

Nick_vh’s picture

@rob loach - That hook is clearly wrong, should be fixed indeed. The search toolkit is a general package name so this module will appear in the same list as apachesolr and its derivatives.

@pwolanin, Are you ok with moving them to an apachesolr_multisite.index.inc (similar to apachesolr?). We could even call it meta.inc or something similar.

pwolanin’s picture

@Nick - having an index.inc file is fine if you like it - I was just lazy when I wrote it and found it easier to have just one .inc file to look in.

Nick_vh’s picture

Status: Needs work » Needs review
Attached files (status: new): 36.85 KB, 31.68 KB

This patch should have an index.inc + the fix with the delete hook. Tested out most of the functionality with a D6 and D7 site. Also the D6 and the D7 module are now very similar when compared to each other so I'll include a small diff of that also

ignore this one

Nick_vh’s picture

This patch should have an apachesolr_multisitesearch.index.inc + the fix with the delete hook. Tested out most of the functionality with a D6 and D7 site. Also the D6 and the D7 module are now very similar when compared to each other so I'll include a small diff of that to show the differences.

pwolanin’s picture

Looks better. I think we still need to e.g. alter the author facet for a multisite environment, but that can be a follow-up.

@Nick - I added your commit access if you want to get these patches into git.

Nick_vh’s picture

Status: Needs review » Fixed

Committed to 7.x-1.x

Nick_vh’s picture

Created a branch 6.x-3.x and applied the patch for the 6.x-3.x branch

webchick’s picture

Oh, rock!!! Thanks so much guys! :D

Nick_vh’s picture

The cool thing is that you can now do a multisite between D6 and D7 sites ;-) Still a work in progress though!

Status: Fixed » Closed (fixed)
Issue tags: -Needs issue summary update, -d7 ports

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary: View changes

Changed opening post