Once #986718: Add support for sandbox projects lands, we're going to want to add a setting that tells project_solr to exclude sandbox projects from the Solr index so that they don't appear in search results and project browsing pages.

Luckily, apachesolr provides us a nice hook for exactly this purpose, hook_apachesolr_node_exclude(), so this should be relatively painless.
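
A minimal sketch of what such a hook implementation might look like. Several things here are assumptions rather than anything stated in this issue: that the hook receives the node plus a namespace string and that returning TRUE keeps the node out of the index, that the setting is called 'project_solr_exclude_sandboxes', and that the sandbox flag lives in $node->project['sandbox']. variable_get() is stubbed only so the sketch runs outside Drupal.

```php
<?php
// Stub for Drupal's variable_get() so this sketch runs outside Drupal.
if (!function_exists('variable_get')) {
  function variable_get($name, $default) {
    return $default;
  }
}

/**
 * Sketch of hook_apachesolr_node_exclude() for project_solr.
 *
 * Assumed: a TRUE return value excludes the node from the index.
 * Hypothetical names: 'project_solr_exclude_sandboxes' (setting) and
 * $node->project['sandbox'] (sandbox flag).
 */
function project_solr_apachesolr_node_exclude($node, $namespace) {
  // Setting disabled: never exclude anything.
  if (!variable_get('project_solr_exclude_sandboxes', TRUE)) {
    return FALSE;
  }
  // Exclude only project nodes that are flagged as sandboxes.
  return $node->type == 'project_project' && !empty($node->project['sandbox']);
}

// Usage: a sandbox project is excluded, an official project is not.
$sandbox = (object) array('type' => 'project_project', 'project' => array('sandbox' => 1));
$official = (object) array('type' => 'project_project', 'project' => array('sandbox' => 0));
var_dump(project_solr_apachesolr_node_exclude($sandbox, 'apachesolr_search'));
var_dump(project_solr_apachesolr_node_exclude($official, 'apachesolr_search'));
```

Release or issue nodes attached to a sandbox would need their own checks under this approach.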

Since I'm the most familiar with project_solr among the Project* maintainers, I'll be the one to implement this. It's definitely needed for Git phase 2, and we aim to have it committed during sprint #6.

Comments

dww’s picture

Issue tags: +sandbox projects

Oh, we should probably also exclude issues attached to sandbox projects from the solr index (see #991486: Add a setting to hide sandbox projects from global issue queues).

And, given that #994260: Add a setting to disable releases for sandbox projects is a setting, not a hard-coded requirement, it's possible that sandboxes will have release nodes. We could also hit that case if a full project with releases is demoted to a sandbox for some reason. So, we probably also want to have a setting to exclude releases from sandboxes from the solr index, too...

Also, tagging for 'sandbox projects'

dww’s picture

Status: Postponed » Active

#986718: Add support for sandbox projects landed. This can now happen.

eliza411’s picture

Issue tags: +git sprint 7

Tagging for Git Sprint 7.

mikey_p’s picture

Another option is to expose the sandbox bit as a facet which would let users choose whether or not to include sandboxes in their search.

eliza411’s picture

Issue tags: +git sprint 8

Tagging for Sprint 8.

eliza411’s picture

Title: Add a setting to exclude sandbox projects from the Solr index » Expose the sandbox bit as a facet in Solr index

Changing the title to reflect the decision on this topic. I swear there's another issue requesting the facet, but I can't find it at the moment. When/if I find it, this one could be closed as duplicate.

For discussion of the overall Sandbox visibility strategy, see #1031578: [Meta] Reflect a consistent strategy for sandbox visibility

damien tournoud’s picture

Status: Active » Needs review
Attachment (new): 2.45 KB

Starter patch.

dww’s picture

Status: Needs review » Needs work
Attachment (new): 2.75 KB

Thanks! That's similar to the patch I had started. ;) I merged the two and cleaned up a few things. However, when I tried it on git-dev, the UI isn't working at all. The sandbox filter isn't getting added to the underlying solr query (even though it's showing up in the URL) so a) nothing about the query changes as you alter the knob and b) the default value of the knob doesn't match what's in the search URL. I have to run to class now, so here's the patch I've got on git-dev for now. @DamZ: if you can give this a look anytime today, that'd be great. Thanks!

-Derek

damien tournoud’s picture

// If not defined, we want to *hide* sandboxes, hence 0.
$query->add_filter('is_project_sandbox', 0, TRUE);

Isn't sandbox 1?

dww’s picture

Re: #10: From your own #options array:

      0 => t('Official projects'),
      '*' => t('All projects'),
      1 => t('Only sandbox projects'),

If they didn't say they wanted "All projects" or "Only sandbox projects", we should default to "Official projects", as the comment explains.

That said, still wish I knew why this doesn't seem to work on git-dev at all. :/

dww’s picture

Status: Needs work » Needs review
Attachment (new): 3.11 KB

Okay, weird. The filter dropdown and underlying query logic started working once a critical mass of project nodes were re-indexed. So, that's sort of working now.

However, using '*' for 'All projects' leads to all kinds of grief. You ended up with an unescaped * in the URL, and I think that confused everything. You ended up with a weird page with no filter form at all, no search results, etc.

So, I changed that value to be the string 'all' in the form drop-down, and then added a clause inside project_solr_apachesolr_modify_query() (right when we're already looking for that filter anyway, to decide to add the default to hide sandboxes) that if it sees 'all' as the value it simply removes the filter from the underlying query. That seems to be working so I'm going to go with it for now. If it turns out we need something else, we can always change it later. ;)
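
A minimal sketch of that clause, assuming a simplified stand-in for the apachesolr query object. SketchQuery and sketch_apply_sandbox_default() are hypothetical names for illustration only; the real logic lives in project_solr_apachesolr_modify_query() and works against the real query API.

```php
<?php
/**
 * Hypothetical stand-in for the apachesolr query object (sketch only).
 */
class SketchQuery {
  private $filters = array();

  public function add_filter($field, $value) {
    $this->filters[$field] = $value;
  }

  public function remove_filter($field) {
    unset($this->filters[$field]);
  }

  public function get_filter($field) {
    return isset($this->filters[$field]) ? $this->filters[$field] : NULL;
  }

  public function get_filters() {
    return $this->filters;
  }
}

/**
 * Applies the sandbox default described above to a query.
 */
function sketch_apply_sandbox_default(SketchQuery $query) {
  $value = $query->get_filter('is_project_sandbox');
  if ($value === NULL) {
    // Nothing chosen in the form: default to official projects only.
    $query->add_filter('is_project_sandbox', 0);
  }
  elseif ($value === 'all') {
    // 'all' is a form-only value: drop the filter so nothing is hidden.
    $query->remove_filter('is_project_sandbox');
  }
  // Any other value (0 or 1) is passed through to Solr untouched.
}

// Usage: 'All projects' ends up with no sandbox filter at all.
$query = new SketchQuery();
$query->add_filter('is_project_sandbox', 'all');
sketch_apply_sandbox_default($query);
print_r($query->get_filters());
```

Using a plain string such as 'all' keeps special characters like * out of the URL, at the cost of one extra branch when the query is built.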

Other than the inconsistent UI terminology (see #1031578-12: [Meta] Reflect a consistent strategy for sandbox visibility) which we can also make incremental tweaks to down the road, I think this is basically ready as a first step.

This doesn't yet handle issues attached to sandbox projects. I'll work on that next. Frankly, that should probably just be a separate issue entirely.

dww’s picture

Status: Needs review » Fixed

Split off the issue stuff here: #1045714: Expose the sandbox bit of the project an issue belongs to as a facet in Solr index

The UI is still yucky as per #1031578-12: [Meta] Reflect a consistent strategy for sandbox visibility but that's an easy fix once we figure out what we want the text to be.

Since the logic itself appears to be sound, I went ahead and committed this to HEAD of project_solr.

I also created #1045722: Either remove the 'project_solr_show_sandbox_ui' variable or configure it to TRUE on d.o when deploying Git and tagged that for "git deployment" so we don't forget to expose this UI on the project browsing forms once we're live (the $conf override is already in settings.local.php on git-dev).

Given all that, calling this particular issue fixed. ;)

pwolanin’s picture

Status: Fixed » Needs review
Attachment (new): 4.15 KB

I think it would be better to index this as a boolean, per discussion w/ sdboyer.

I also see what I think is 1 real bug in the code:

// If not defined, we want to *hide* sandboxes, hence 0.
$query->add_filter('is_project_sandbox', 0, TRUE);

The code seems to do the opposite of what the comment says - it hides official projects.

Committed this fix to CVS HEAD per IRC discussion w/ sdboyer.

pwolanin’s picture

Discussed some with dww in IRC also. Please mark this back as fixed if it's working right.

One change my patch made was that in project filtering, selecting 'All' projects sends [* TO *] as the filter, and ideally should populate the correct default value on the next page.

pwolanin’s picture

Status: Needs review » Fixed
Attachment (new): 1012 bytes

This wasn't working as desired since the code is always adding a filter for projects that have releases:

$query->add_filter('bs_project_has_releases', '1');

This one-line change should fix that.

Committing to HEAD.

pwolanin’s picture

Status: Fixed » Needs work

For some reason this doesn't seem to work after sdboyer pushed it to the dev site. The result is not right in the UI, plus I see this as the query in the log:

/solr/do-git-dev/select?fl=id%2Cnid%2Ctitle&start=0&rows=4&fq=bs_project_sandbox%3A1&fq=type%3Aproject_project&fq=im_vid_3%3A14&fq=bs_project_has_releases%3A1&fq=hash%3A1hbejm&sort=sis_project_release_usage+desc&version=1.2&wt=json&json.nl=map&q=

Which looks like this when passed through urldecode():

/solr/do-git-dev/select?fl=id,nid,title&start=0&rows=4&fq=bs_project_sandbox:1&fq=type:project_project&fq=im_vid_3:14&fq=bs_project_has_releases:1&fq=hash:1hbejm&sort=sis_project_release_usage desc&version=1.2&wt=json&json.nl=map&q=
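
A minimal sketch for pulling just the filter queries (the repeated fq parameters) out of a logged request like the one above, which makes it easier to see which filters actually reached Solr. solr_log_filters() is a hypothetical helper name, not part of project_solr or apachesolr. The query string is split by hand because PHP's parse_str() keeps only the last of several parameters that share a name.

```php
<?php
// The raw request line as it appears in the Jetty log above.
$logged = '/solr/do-git-dev/select?fl=id%2Cnid%2Ctitle&start=0&rows=4&fq=bs_project_sandbox%3A1&fq=type%3Aproject_project&fq=im_vid_3%3A14&fq=bs_project_has_releases%3A1&fq=hash%3A1hbejm&sort=sis_project_release_usage+desc&version=1.2&wt=json&json.nl=map&q=';

/**
 * Returns the decoded value of every fq parameter in a logged Solr request.
 *
 * Hypothetical helper for this sketch.
 */
function solr_log_filters($request) {
  $query = (string) parse_url($request, PHP_URL_QUERY);
  $filters = array();
  foreach (explode('&', $query) as $pair) {
    // Split each pair on the first '=' only.
    $parts = explode('=', $pair, 2);
    if ($parts[0] === 'fq' && isset($parts[1])) {
      $filters[] = urldecode($parts[1]);
    }
  }
  return $filters;
}

print_r(solr_log_filters($logged));
```

For this request the list includes both bs_project_sandbox:1 and bs_project_has_releases:1.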

pwolanin’s picture

Note: you can watch the log in real time by doing something like this on civicspace:

 tail -f /usr/share/jetty6/logs/2011_02_16.request.log | grep do-git-dev

pwolanin’s picture

Status: Needs work » Fixed
Attachment (new): 981 bytes

Would help if I used the right method.

dww’s picture

Status: Fixed » Needs work

Awesome, thanks pwolanin! This whole question of the filter on releases was on my mind over at #1019300: Extend "filter by compatibility" -- funny I didn't think of that when we were looking at this in IRC. ;)

However, one slightly weird thing about the current behavior I'd like to fix:

Compare:

http://git-dev.drupal.org/project/modules

"7512 Modules match your search"

And what happens when you press "Search" without changing any of the exposed filters:

http://git-dev.drupal.org/project/modules?filters=bs_project_sandbox:0&s...

"7402 Modules match your search"

So, the default case is adding the wrong filter (any) which disagrees with the #default_value for the exposed filter (official-only). Should be an easy fix. Stay tuned.

Thanks again!
-Derek

dww’s picture

Status: Needs work » Needs review
Issue tags: +git sprint 10
Attachment (new): 1.5 KB

See attached. Yeah, #16 was the culprit. I hope the code comments are clear enough, so I won't repeat myself here.

Tried this on git-dev and now both of these give 7402 modules:

http://git-dev.drupal.org/project/modules
http://git-dev.drupal.org/project/modules?filters=bs_project_sandbox:0

Any objections?

Thanks,
-Derek

p.s. since we're doing a fair bit of work on this issue here in sprint 10, figured I should tag accordingly...

dww’s picture

Committed to HEAD.

dww’s picture

Status: Needs review » Fixed

pwolanin’s picture

Oh, I see - so now you are adding the filter on has_releases in 2 places?

Guess that's ok, but post-release we should do some cleanup and optimization of this code.

Among other things, it seems to be getting facets but not using them?

dww’s picture

@pwolanin: Yeah, this code is a mess, which is a result of me never really fully understanding apachesolr during both the original project_solr development and then the d.o redesign. I'd love to make some time post-launch to go through all this with you and clean up/simplify/optimize it. But yeah, post-launch, and in another issue. ;) Thanks!

pwolanin’s picture

Status: Fixed » Needs work

I think this needs work - I think it will end up filtering out most results on the normal keyword search page.

dww’s picture

Status: Needs work » Needs review
Attachment (new): 947 bytes

How right you are.

Before:
http://git-dev.drupal.org/search/apachesolr_multisitesearch/signup
Found 57 results containing the words: signup

After:
http://git-dev.drupal.org/search/apachesolr_multisitesearch/signup
Found 5417 results containing the words: signup

;)

This seems like the right fix. @pwolanin, care to review and comment? Thanks!

dww’s picture

In IRC, pwolanin pointed out we want to keep the generic filter for excluding sandbox content (e.g. to keep sandbox projects out of general search term results). So, fixed the logic (and comments) accordingly. Pretty sure this does what we want.

dww’s picture

Status: Needs review » Fixed

I tried and failed to re-write #28 to only handle the release filter inside project_solr_browse_page(). I don't have the time or energy to mess with this any further right now. Tested and verified #28 is working as expected. We'll have to revisit this whole mess later.

dww’s picture

p.s. and meant to say... committed #28 to HEAD of project_solr: http://drupal.org/cvs?commit=502692

Status: Fixed » Closed (fixed)
Issue tags: -git phase 2, -git sprint 6, -sandbox projects, -git sprint 7, -git sprint 8, -git sprint 10

Automatically closed -- issue fixed for 2 weeks with no activity.