Once #986718: Add support for sandbox projects lands, we're going to want to add a setting that tells project_solr to exclude sandbox projects from the solr index so that they don't appear in search results and project browsing pages.
Luckily, apachesolr provides us a nice hook for exactly this purpose, hook_apachesolr_node_exclude(), so this should be relatively painless.
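A minimal sketch of what such a hook implementation might look like, assuming the hook receives the node plus an index namespace and that returning TRUE keeps the node out of the index. The module prefix `example`, the setting name `project_solr_exclude_sandboxes`, and the `$node->project['sandbox']` property are all hypothetical, and `variable_get()` is stubbed here only so the snippet runs outside Drupal:

```php
<?php
// Stand-in for Drupal's variable_get() so this sketch runs on its own.
if (!function_exists('variable_get')) {
  function variable_get($name, $default = NULL) {
    global $conf;
    return isset($conf[$name]) ? $conf[$name] : $default;
  }
}

/**
 * Sketch of hook_apachesolr_node_exclude() for a hypothetical 'example' module.
 *
 * Returns TRUE when the node should be left out of the solr index.
 */
function example_apachesolr_node_exclude($node, $namespace) {
  // Hypothetical setting name; defaults to excluding sandboxes.
  if (!variable_get('project_solr_exclude_sandboxes', TRUE)) {
    return FALSE;
  }
  // Hypothetical property: assumes the sandbox bit lives on $node->project.
  return $node->type == 'project_project' && !empty($node->project['sandbox']);
}
```

With the setting switched off, the hook returns FALSE for every node, so sandboxes would be indexed like any other project.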
Since I'm the most familiar with project_solr among the Project* maintainers, I'll be the one to implement this. It's definitely needed for Git phase 2, and we aim to have it committed during sprint #6.
Comments
Comment #1
dww
Oh, we should probably also exclude issues attached to sandbox projects from the solr index (see #991486: Add a setting to hide sandbox projects from global issue queues).
And, given that #994260: Add a setting to disable releases for sandbox projects is a setting, not a hard-coded requirement, it's possible that sandboxes will have release nodes. We could also hit that case if a full project with releases is demoted to a sandbox for some reason. So, we probably also want to have a setting to exclude releases from sandboxes from the solr index, too...
Also, tagging for 'sandbox projects'
Comment #2
dww
#986718: Add support for sandbox projects landed. This can now happen.
Comment #3
eliza411 commented
Tagging for Git Sprint 7.
Comment #4
mikey_p commented
Another option is to expose the sandbox bit as a facet, which would let users choose whether or not to include sandboxes in their search.
Comment #5
eliza411 commented
Tagging for Sprint 8.
Comment #7
eliza411 commented
Changing the title to reflect the decision on this topic. I swear there's another issue requesting the facet, but I can't find it at the moment. When/if I find it, this one could be closed as duplicate.
For discussion of the overall Sandbox visibility strategy, see #1031578: [Meta] Reflect a consistent strategy for sandbox visibility
Comment #8
damien tournoud commented
Starter patch.
Comment #9
dww
Thanks! That's similar to the patch I had started. ;) I merged the two and cleaned up a few things. However, when I tried it on git-dev, the UI wasn't working at all. The sandbox filter isn't getting added to the underlying solr query (even though it's showing up in the URL), so a) nothing about the query changes as you alter the knob and b) the default value of the knob doesn't match what's in the search URL. I have to run to class now, so here's the patch I've got on git-dev for now. @DamZ: if you can give this a look anytime today, that'd be great. Thanks!
-Derek
Comment #10
damien tournoud commented
Isn't sandbox 1?
Comment #11
dww
Re: #10: From your own #options array:
If they didn't say they wanted "All projects" or "Only sandbox projects", we should default to "Official projects", as the comment explains.
That said, still wish I knew why this doesn't seem to work on git-dev at all. :/
Comment #12
dww
Okay, weird. The filter dropdown and underlying query logic started working once a critical mass of project nodes was re-indexed. So, that's sort of working now.
However, using '*' for 'All projects' leads to all kinds of grief. You ended up with an unescaped * in the URL, and I think that confused everything. You ended up with a weird page with no filter form at all, no search results, etc.
So, I changed that value to be the string 'all' in the form drop-down, and then added a clause inside project_solr_apachesolr_modify_query() (right when we're already looking for that filter anyway, to decide to add the default to hide sandboxes) that if it sees 'all' as the value it simply removes the filter from the underlying query. That seems to be working so I'm going to go with it for now. If it turns out we need something else, we can always change it later. ;)
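The logic described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SketchQuery` is a tiny stand-in for the apachesolr query object (not the real class), `sketch_apply_sandbox_filter()` is a hypothetical helper rather than the actual body of project_solr_apachesolr_modify_query(), and the field name `is_project_sandbox` is an assumption.

```php
<?php
// Minimal stand-in for the apachesolr query object; NOT the real class.
class SketchQuery {
  private $filters = array();

  public function add_filter($field, $value, $exclude = FALSE) {
    $this->filters[] = array('#name' => $field, '#value' => $value, '#exclude' => $exclude);
  }

  // Returns all filters, or only those on the given field.
  public function get_filters($name = NULL) {
    if ($name === NULL) {
      return $this->filters;
    }
    $matches = array();
    foreach ($this->filters as $filter) {
      if ($filter['#name'] == $name) {
        $matches[] = $filter;
      }
    }
    return $matches;
  }

  // Drops every filter on the given field.
  public function remove_filter($name) {
    foreach ($this->filters as $i => $filter) {
      if ($filter['#name'] == $name) {
        unset($this->filters[$i]);
      }
    }
    $this->filters = array_values($this->filters);
  }
}

// Hypothetical helper showing the 'all' handling and the default.
function sketch_apply_sandbox_filter($query) {
  $existing = $query->get_filters('is_project_sandbox');
  if (empty($existing)) {
    // Nothing chosen: default to hiding sandboxes.
    $query->add_filter('is_project_sandbox', 0);
    return;
  }
  foreach ($existing as $filter) {
    // Strict comparison so that the integer 0 never matches 'all'.
    if ($filter['#value'] === 'all') {
      // 'all' means no restriction, so drop the filter entirely.
      $query->remove_filter('is_project_sandbox');
      return;
    }
  }
}
```

Treating 'all' as "remove the filter" keeps the special value out of the underlying solr query, which sidesteps the escaping trouble with a literal '*'.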
Other than the inconsistent UI terminology (see #1031578-12: [Meta] Reflect a consistent strategy for sandbox visibility) which we can also make incremental tweaks to down the road, I think this is basically ready as a first step.
This doesn't yet handle issues attached to sandbox projects. I'll work on that next. Frankly, that should probably just be a separate issue entirely.
Comment #13
dww
Split off the issue stuff here: #1045714: Expose the sandbox bit of the project an issue belongs to as a facet in Solr index
The UI is still yucky as per #1031578-12: [Meta] Reflect a consistent strategy for sandbox visibility but that's an easy fix once we figure out what we want the text to be.
Since the logic itself appears to be sound, I went ahead and committed this to HEAD of project_solr.
I also created #1045722: Either remove the 'project_solr_show_sandbox_ui' variable or configure it to TRUE on d.o when deploying Git and tagged that for "git deployment" so we don't forget to expose this UI on the project browsing forms once we're live (the $conf override is already in settings.local.php on git-dev).
Given all that, calling this particular issue fixed. ;)
Comment #14
pwolanin commented
I think it would be better to index this as a boolean per discussion w/ sdboyer.
I also see what I think is 1 real bug in the code:
// If not defined, we want to *hide* sandboxes, hence 0.
$query->add_filter('is_project_sandbox', 0, TRUE);
The code seems to do the opposite of what the comment says - it hides official projects.
Committed this fix to CVS HEAD per IRC discussion w/ sdboyer.
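The bug described in this comment can be sketched in isolation, assuming the third argument to add_filter() is an "exclude" flag that negates the filter. `sketch_filter_to_fq()` is a hypothetical helper that only mimics how such a filter might be rendered into a solr filter string; it is not part of apachesolr.

```php
<?php
// Hypothetical helper: renders a (field, value, exclude) triple the way a
// negated or plain solr filter would look. Not an apachesolr function.
function sketch_filter_to_fq($field, $value, $exclude = FALSE) {
  return ($exclude ? '-' : '') . $field . ':' . $value;
}

// With the exclude flag set, everything matching is_project_sandbox:0
// (the official projects) is filtered OUT.
echo sketch_filter_to_fq('is_project_sandbox', 0, TRUE), "\n";  // -is_project_sandbox:0

// Without the flag, only official projects are kept, which is what the
// code comment says it wants.
echo sketch_filter_to_fq('is_project_sandbox', 0), "\n";        // is_project_sandbox:0
```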
Comment #15
pwolanin commented
Discussed some with dww in IRC also - please mark this back as fixed if it's working right.
One change my patch made was that in project filtering, selecting 'All' projects sends [* TO *] as the filter, and ideally should populate the correct default value on the next page.
Comment #16
pwolanin commented
This wasn't working as desired since the code is always adding a filter for projects that have releases:
This one-line change should fix that.
Committing to HEAD.
Comment #17
pwolanin commented
For some reason this doesn't seem to work after sdboyer pushed it to the dev site. The result is not right in the UI, plus I see this as the query in the log:
Which looks like this when passed through urldecode():
Comment #18
pwolanin commented
Note, you can watch the log in real time by doing something like this on civicspace:
Comment #19
pwolanin commented
Would help if I used the right method.
Comment #20
dww
Awesome, thanks pwolanin! This whole question of the filter on releases was on my mind over at #1019300: Extend "filter by compatibility" -- funny I didn't think of that when we were looking at this in IRC. ;)
However, one slightly weird thing about the current behavior I'd like to fix:
Compare:
http://git-dev.drupal.org/project/modules
"7512 Modules match your search"
And what happens when you press "Search" without changing any of the exposed filters:
http://git-dev.drupal.org/project/modules?filters=bs_project_sandbox:0&s...
"7402 Modules match your search"
So, the default case is adding the wrong filter (any) which disagrees with the #default_value for the exposed filter (official-only). Should be an easy fix. Stay tuned.
Thanks again!
-Derek
Comment #21
dww
See attached. Yeah, #16 was the culprit. I hope the code comments are clear enough, so I won't repeat myself here.
Tried this on git-dev and now both of these give 7402 modules:
http://git-dev.drupal.org/project/modules
http://git-dev.drupal.org/project/modules?filters=bs_project_sandbox:0
Any objections?
Thanks,
-Derek
p.s. since we're doing a fair bit of work on this issue here in sprint 10, figured I should tag accordingly...
Comment #22
dww
Committed to HEAD.
Comment #23
dww
Comment #24
pwolanin commented
Oh, I see - so now you are adding the filter on has_releases in 2 places?
Guess that's ok, but post-release we should do some cleanup and optimization of this code.
Among other things, it seems to be getting facets but not using them?
Comment #25
dww
@pwolanin: Yeah, this code is a mess, which is a result of me never really fully understanding apachesolr during both the original project_solr development and then the d.o redesign. I'd love to make some time post-launch to go through all this with you and cleanup/simplify/optimize it. But yeah, post-launch, and in another issue. ;) Thanks!
Comment #26
pwolanin commented
I think this needs work - I think it will end up filtering out most results on the normal keyword search page.
Comment #27
dww
How right you are.
Before:
http://git-dev.drupal.org/search/apachesolr_multisitesearch/signup
Found 57 results containing the words: signup
After:
http://git-dev.drupal.org/search/apachesolr_multisitesearch/signup
Found 5417 results containing the words: signup
;)
This seems like the right fix. @pwolanin, care to review and comment? Thanks!
Comment #28
dww
In IRC, pwolanin pointed out we want to keep the generic filter for excluding sandbox content (e.g. to keep sandbox projects out of general search term results). So, fixed the logic (and comments) accordingly. Pretty sure this does what we want.
Comment #29
dww
I tried and failed to re-write #28 to only handle the release filter inside project_solr_browse_page(). I don't have the time or energy to mess with this any further right now. Tested and verified #28 is working as expected. We'll have to revisit this whole mess later.
Comment #30
dww
p.s. and meant to say... committed #28 to HEAD of project_solr: http://drupal.org/cvs?commit=502692