We should add a "drupal.org issues" link in the primary navigation block (/project/issues/drupal-org?). This page would include:

- A form that looks like an issue submit form that's really an advanced search form, with a few changes to the "Project information" fieldset. "Project:" would be hard-wired to point to the core, webmasters, and infra queues, along with the issue queues of all the projects running on d.o: project*, cvs, drupalorg.module, comment_upload, etc. Maybe "Project" wouldn't even be a selector. There'd definitely be no "Version" selector. The "Component" drop-down would include all of the components from all projects in the list, in subsections labeled by project. The whole "Issue information" fieldset could be identical. No file attachments, log message, or other node-form related stuff. There's no "submit" button, only "preview", which is really "search". ;) The help text at the top of the page would read something like "Please describe the issue you would like to report about this site."

- If you preview the form, it displays any matches for the query.

- Whether or not matches are found, at the bottom of the results page, there's the real issue submit form, pre-populated with what you entered already (and it'd know the right project based on the component you selected). If you didn't select a component, it could give you a project selector restricted to the hard-wired list of projects, or it could make the mega-component field required so it'd be a validation error after you preview. If any search results are found, there'd be a required checkbox on this issue form, defaulted to not checked, which says "I have reviewed the search results and believe this is a distinct, unreported issue".
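A rough sketch of the form logic described in the list above, written in Python rather than as Drupal code. Everything here is hypothetical: the `DO_PROJECTS` table, the placeholder component names, and the function names are made up for illustration and are not part of project_issue or drupalorg.module. It only shows how a merged component selector can still identify the right project, and when the "I have reviewed the search results" checkbox becomes required.

```python
# Hypothetical stand-in for the hard-wired project list; the component
# names are placeholders, not the real components of these queues.
DO_PROJECTS = {
    "webmasters": ["Component A", "Component B"],
    "infrastructure": ["Component A"],
    "project_issue": ["Component C"],
}


def component_options(projects):
    """Build the merged "Component" selector, one subsection per project.

    Option values carry the project name, because the same component
    name can appear in more than one queue.
    """
    return {
        project: [f"{project}:{component}" for component in components]
        for project, components in projects.items()
    }


def project_for(option):
    """Recover the project from a selected option value."""
    return option.split(":", 1)[0]


def validate(values, search_results):
    """Return validation errors for the real submit form."""
    errors = []
    if not values.get("component"):
        errors.append("Component is required.")
    # The checkbox only matters when the search actually found something.
    if search_results and not values.get("reviewed"):
        errors.append("Please confirm that you reviewed the search results.")
    return errors
```

This takes the "make the mega-component field required" option; the alternative of falling back to a restricted project selector would replace the first check in `validate`.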

This would drastically reduce the number of duplicate issues submitted. I waste a huge amount of time marking stuff duplicate, since issues are spread out all over the place (especially between d.o-specific queues, and the various project* + cvs* queues). I'm one of the few (possibly only?) people somewhat familiar with nearly every issue from any of those queues, so I can easily spot dups just by reading titles, but it's a big drain on my effort.

This could start life as a custom menu item/page/form provided by the http://drupal.org/project/drupalorg module, just to get it working quickly. Then we could look into folding some (most?) of the functionality back into project_issue once we've gained some experience with it in our specific case. Therefore, the patch should probably be rolled against drupalorg.module. Once someone posts a patch, this issue can move into that queue. I'm starting it here in the webmasters queue for better visibility.

My hands are full. Whoever writes this becomes my life-long friend (and a hero to the d.o community). Who will it be?

Thanks,
-Derek


Comments

pwolanin’s picture

this sounds like a good idea - but it would seem to require a rather advanced search algorithm in order to return meaningful results?

oadaeh’s picture

Considering how poor the search mechanism currently is, the preview page will always have tons of false-positives.

If you're going to only use the title to search with, how often has someone submitted an issue with a totally unrelated title, or a non-descriptive title like, "found a bug!"

If you're going to search on the description, some words need to be eliminated from the search (like "a", "the", "to" and "of", for starters) to reduce the number of false-positives.

I spend (waste?) countless hours searching for related issues before I post, simply because I have to wade through all the false-positives that come up. I'm sure I'm not the only one. Now imagine that shows up every time a new issue is posted. How long will it be before people automatically click on the check box w/o reading through the list of issues that are presented to them?

greggles’s picture

@oadaeh - without a doubt issues need better titles. I try to make it a rule to improve titles whenever I follow up (and whenever they can be improved...). But that weakness doesn't reduce the value of this feature so much that we should avoid it.

Some related information:
I believe this is very similar to an issue from zen.

Another idea to help this problem (copied from bugzilla) is to create a report of "most commonly duplicated issues". If we present those titles to users on the submit page that should help reduce duplicates. Of course, that requires keeping track of duplicated issues in a structured way.
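To illustrate what "keeping track of duplicated issues in a structured way" could make possible, here is a minimal Python sketch. It assumes each duplicate is recorded as a pair of (duplicate issue id, original issue id); that record format and the function name are assumptions for the example, not something the issue tracker stores today.

```python
from collections import Counter


def most_duplicated(duplicate_links, limit=10):
    """Rank original issues by how many duplicates point at them.

    duplicate_links is an iterable of (duplicate_nid, original_nid) pairs.
    Returns a list of (original_nid, duplicate_count), most duplicated first.
    """
    counts = Counter(original for _duplicate, original in duplicate_links)
    return counts.most_common(limit)
```

The titles of the top few results are what would be shown on the submit page.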

oadaeh’s picture

without a doubt issues need better titles. I try to make it a rule to improve titles whenever I follow up (and whenever they can be improved...).

You seem to be missing the point. If you have to come along and change the title on YAD, then the search feature did not do its job. If I put in a title that doesn't relate to the issue, then the ten previous submissions of the same issue will never show up. (I'm a master at exaggeration. :^)) Changing the title later doesn't solve the original problem, and we'll never get everyone to comply with the "Make a better title" rule. That's just the nature of people, especially people for whom English is not their primary language.

But that weakness doesn't reduce the value of this feature so much that we should avoid it.

That was only one of the weaknesses I mentioned, and, granted, the weakest weakness at that.

I'm not at all saying we should avoid the problem. I'm pointing out where I believe the proposition, as stated, will fail (and therefore where more thought needs to be placed). If this gets implemented only as proposed, I don't believe it will solve the problem.

dww’s picture

@oadaeh:

A) Of course we'll filter out bogus words. We'll need some logic to strip the title and body of common words and just look for "potential keywords" (however we want to define that). I guess that mostly answers pwolanin, too.

B) The most important part of my proposal, that you seem to have totally missed, is the behind-the-scenes hard-coding to search *all* of the issue queues related to drupal.org, and only those queues. A huge part of the problem now is that, for example, people search in the webmasters queue, but are really asking about project_issue functionality, so they never get a chance to "wade through all the false positives" of the right queue(s).

C) I said "This would drastically reduce the number of duplicate issues submitted." I never said "This will completely eliminate duplicate issues".

D) I'm not (yet) proposing this is the default node/add/project-issue page. I'm only talking about something specific for people complaining about drupal.org itself.

E) The search query is only going to search for "open" issues. If things are marked duplicate, they'll never show up in the query results. Only the earliest still-active issue will show.
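Points A, B, and E together can be sketched roughly as follows, in Python rather than Drupal code. This is only an illustration under stated assumptions: the stop-word list, the queue names in `DO_QUEUES`, the status strings, and the dictionary shape of an issue are all hypothetical, and simple keyword overlap stands in for whatever search backend would really be used.

```python
import re

# Assumed values for illustration only.
STOP_WORDS = {"a", "an", "the", "to", "of", "and", "in", "is", "it", "on", "for"}
DO_QUEUES = {"drupal", "webmasters", "infrastructure", "project", "project_issue", "drupalorg"}
OPEN_STATUSES = {"active", "needs review", "needs work"}


def keywords(text):
    """Strip common words and keep the potential keywords (point A)."""
    words = re.findall(r"[a-z0-9_]+", text.lower())
    return {word for word in words if word not in STOP_WORDS and len(word) > 2}


def possible_duplicates(title, body, issues):
    """Return open issues from the hard-wired queues that share keywords."""
    wanted = keywords(title + " " + body)
    matches = []
    for issue in issues:
        if issue["project"] not in DO_QUEUES:      # point B: only d.o queues
            continue
        if issue["status"] not in OPEN_STATUSES:   # point E: only open issues
            continue
        overlap = len(wanted & keywords(issue["title"]))
        if overlap:
            matches.append((overlap, issue))
    # Best match first; the sort is stable, so ties keep their input order.
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [issue for _overlap, issue in matches]
```

Restricting by queue before scoring is what keeps the false positives limited to the queues that actually matter for drupal.org.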

My goal is not just to save *myself* time fending off all the duplicates, it's to make it easier for people to actually find the right issues so they can just reply there with any new information, instead of forking the discussion.

@greggles: Thanks for the info and support. Yes, zen's issue was part of what inspired this, but the big difference is the hard-coded set of projects stuff. The suggest-dups-on-preview thing is just part of the implementation details for how this could work. A report of "most dup'ed issues" would be nice, but yeah, that'd be a different project entirely, which would require a bunch of additional work as you point out.

oadaeh’s picture

A) Okay. You didn't mention it, so I didn't assume it would happen.

B) I didn't miss it. I believe my comments still apply. Just because the search isn't being done on the entire site doesn't mean that invalid results won't show up.

C) Okay.

D) Okay.

E) Hmm. Yes, you're right. Even the current advanced issue search does that.

dww’s picture

A) I wrote up a summary of the proposal. I didn't just implement it all myself nor mention every possible detail.

B) Just because some false positives can show up doesn't make this not worth doing.

chx’s picture

This is what Sphinx (www.sphinxsearch.com) is made for. Configure the indexer, and then for the search, set attributes, specify keywords and get results. Instantly. Drupal.org has so few documents to index... Sphinx is made for 10-100M documents. There is a stemmer, too, for better results. NowPublic has been running this for weeks now. Before, we had like two-minute page loads coming from a complex query builder and a fulltext index; now we have... nothing. Both the code and the search time just... disappeared. The code is like a dozen lines now, and the search time I was unable to measure; it's too small.

If we go the traditional route, again we need a query builder, either our own or from Views, and then you have a complex SQL query to run. Slow, complex code. The much improved search in D6 core is built to search regular Drupal documents, beautifully linked together, with few attributes. We have a different problem here: tons of attributes, independent documents.

catch’s picture

It might be possible to do a similar thing with pivots as well, at least for very common reports.

catch’s picture

dww’s picture

Status: Closed (duplicate) » Active

No, it's not.

Please read the first main point in the original post, or where it was restated as point (B) from #5.

catch’s picture

So this is project-issue specific, and the other one includes forum topics? Well, I think if we were going to do both, we'd need an extremely similar mechanism (like Sphinx), then configure it based on different node types. But considering this issue has much better discussion, let's leave it open either way.

dww’s picture

This is d.o specific about searching the particular set of issue queues you need to search when looking for a problem you're experiencing on d.o (which, admittedly is a confusing tangle of seemingly unrelated issue queues that I can't expect anyone to keep track of). There would be a block or link visible (but not prominent) in the site navigation somewhere, that brought you to the page where you maybe see the list of top known d.o-related issues but mostly where you see a form to submit a new issue that searches the *right* issue queues on preview and suggests duplicates. (More or less).

#19386 is about always searching for duplicates before being allowed to create any new issue (and perhaps forum topics if there's some way to do both, but I'm not sure this can be done generically outside of both project_issue.module and forum.module in a way that requires no changes to either one).

If it turns out that #19386 has no knowledge of issue queues at all, and always searches all issues ever submitted on the site, then this issue *might* just evolve into something about a page displaying a nodequeue of top d.o issues or something. However, I have serious concerns about the usability of #19386 if it's always searching all issues -- I fear a lot more false positives if we go that route.

moshe weitzman’s picture

apachesolr actually has a built-in 'more like this' feature. We should show people possible dupes during the bug creation flow. Launchpad does this: they look for dupes based on just a title, and then the user proceeds if none of the proposed matches are dupes.

moshe weitzman’s picture

For inspiration, check out these screenshots of the Launchpad bug submission process (in reverse order cuz why would anyone care to reorder their uploads. hmmph.).

Some notes:

  1. All you are asked to provide at first is a title. That's good - make it low effort for those who are going to find a duplicate anyway
  2. Review a bunch of possible dupes in the same project. Dupes can be suggested by apachesolr More Like This or some other algorithm
  3. If you click the button, 'No, I just need to report a bug', the current page reveals the bug form. No page reload needed.

dww’s picture

@moshe: That's nice, but it's more relevant at #19386: Automatically search for duplicate issues/questions before submitting new issue/question than here. ;) I already explained the difference in comment #13, so I won't repeat myself.

p.s. I don't understand your gripe about reordering your comment attachments. It puts them in the right order by default now. Are you just showing off that you can reorder them and then complaining about the flexibility, or what? ;)

moshe weitzman’s picture

I'm a fool and missed the drag feature for uploads. As penance, I copied my comment over to #19386: Automatically search for duplicate issues/questions before submitting new issue/question. Back to discussing adding a link.

Gerhard Killesreiter’s picture

Is this still an issue with improved search, redesign, and what not, that has happened in between?

catch’s picture

This is still valid (and thanks for bumping it killes, was looking for it the other day).

Related: http://groups.drupal.org/node/136179 (Prairie project duplicates discussion)

Also nearly a dupe but I think it's different enough from the proposal here to be in its own issue: #1060798: Enable apachsolr 'related posts' block for issues (and forums?).

greggles’s picture

This issue is specifically about adding a link for "drupal.org issues" into the navigation block. I don't think we should add a new link for it, but we could improve our current navigation.

Currently under "About" in the footer we have a link for "Drupal.org" which links to projects that are tagged with "Drupal.org" which seems really not useful.

Maybe we could create a new landing page for that and on that page create a list with links to pages like:
* Webmasters queue
* Infrastructure queue
* How to get involved in improving drupal.org

dww’s picture

To clarify, one key problem I'm trying to solve is that there's no single queue for drupal.org stuff. In addition to infra + webmasters (and drupalorg module), there are the separate issue queues for all the other moving pieces: project*, vcapi*, solr, etc, etc. I was hoping for a place where you could search and it'd automatically restrict to the issue queues for code running on d.o.

Charles Belov’s picture

How about just being able to find the issues queue in the first place? At the very least, add it to the mini-site map that appears at the bottom of each page. I stumbled around the site, trying to find the issues queue, only to have to search for it in Google.

Also, since Drupal seems to be unusual in calling them "issues" and not "bugs," I would think that newcomers to Drupal would be looking for the word "bug" and not for the word "issue." So it would be even friendlier to newcomers to list it as "Issues (bugs)".

It's fine with me if the link lands at /projects/issues, just point me to somewhere specific instead of forcing me to go to Google.

Actually, now that I've searched for the issue I came for and gotten no search results, I don't see a button or link to log a new issue. I know I've logged issues before, so it must be there, but it's logical to have a "Log a new issue" link at the bottom of the search results.

klonos’s picture

Title: Avoid duplicate issues: "drupal.org issues" link in navigation block » Add a "Report a problem" link in d.o navigation block (in order to help avoid duplicate issues)

"Issues" is drupal.org specific, others use "bugs", but that's geeky too. How about "Report a problem" or perhaps a friendlier "Having trouble?" link then?

PS: I edited the issue's title in order to emphasize the actual requested task (to add a link) instead of the goal (help reduce dupes). I hope you don't mind.

tvn’s picture

Status: Active » Closed (won't fix)

Closing old issues. Please re-open if needed.

mgifford’s picture

Project: Drupal.org site moderators » Drupal.org customizations
Version: » 7.x-3.x-dev
Component: Site organization » Code
Status: Closed (won't fix) » Active
Issue tags: +duplicate content
Related issues: +#19386: Automatically search for duplicate issues/questions before submitting new issue/question, +#1060798: Enable apachsolr 'related posts' block for issues (and forums?)

@tvn - Totally understand why you closed this issue. I'm just going to re-open it though now that we're using D7 finally.

We don't really have a good solution for managing duplicates.

Adding related issues.