Problem/Motivation

The Project Browser module enables users to search for and install modules and themes from within their Drupal site's admin area, instead of having to go to drupal.org and copy/paste the download link each time. This is a huge UX improvement. The goal is inclusion in Drupal 8 core.

Proposed resolution

To get the Project Browser module working, we need to deploy the following two modules on Drupal.org:

project_browser_server
----------------------
This module provides an API as well as endpoints so that you can serve search results to the Module / Theme Browser module.

drupalorg_pbs
-------------
This is an implementation of the Project Browser Server API specifically for Drupal.org, which takes advantage of the Apache Solr framework to serve results to queries. Think of it as the same thing as doing a search on Drupal.org, with all the caching and performance that provides.

Remaining tasks (in order of importance)

  1. Getting Apache Solr 6.x-3.x with a decent sized projects index working on the Dev Environments so that coding/testing can resume. Waiting for the Infrastructure team.
  2. Porting Project Browser Server and drupalorg_pbs to Drupal 7 (This still needs testing and further work once the Dev Environments are updated to use Drupal 7 and Solr)
  3. Some documentation pages could be written describing how to implement the server hooks. Done - See http://drupal.org/node/1612058
  4. Thorough code review and security audit of Project Browser Server.
  5. Thorough performance assessment of Project Browser Server.
  6. Deploy to Drupal.org

User interface changes

N/A

API changes

N/A

Original report by wildkatana

I need to get Project Browser Server and drupalorg_pbs deployed on d.o for the Google Summer of Code 2011 Project Browser module. This has been tested for months on a Sandbox site, and there haven't been any issues. Development on these two modules has been completed for over a month without any changes.

These two modules will allow Project Browser to pull project information from the live drupal.org site so that users of Drupal can easily browse and install new modules and themes straight from their Drupal sites.

Deployment Checklist:
---------------------

1. Install and Enable Project Browser Server 6.x-1.1 (http://drupal.org/project/project_browser_server)
-http://drupal.org/node/1842230 (6.x-1.1)
2. Install and Enable drupalorg_pbs 6.x-2.0 (http://drupal.org/project/drupalorg_pbs)
-http://drupal.org/node/1842228 (6.x-2.0)
3. In Permissions, enable 'access project browser server' for anonymous and optionally authenticated users

Risks
------

-Performance: There will be a lot of new traffic from the Project Browser module, fetching project data. This will only take place when admins are setting up their sites and/or installing new modules and themes, and it uses the existing update.module functionality where available, so I don't foresee a big issue here. In addition, the endpoints use GET parameters and are cacheable.
-I don't know of any impacts, performance or otherwise, on any other subsystems.

The project page on Drupal Groups: http://groups.drupal.org/node/145159

Comments

drumm’s picture

I'm not a fan of having another custom module. Please find a logical place within http://drupal.org/project/drupalorg and add the code there.

killes@www.drop.org’s picture

While I appreciate improvements to d.o, this one is a bit of a surprise.

Is there a dev site where people can look at what this does?

This at the very least needs a review by a member of the security team.

And it does need some consideration regarding performance. Drupal.org can be quite a difficult beast.

Anonymous’s picture

There is a Sandbox site, though the main functionality can't be seen from it, since the modules are simply used to communicate with Project Browser. I wouldn't mind adding the functionality in to the drupalorg module, either way works as long as we can get this deployed on drupal.org somehow. Let me know what you think would be best.

Sincerely,
Leighton Whiting

dmitrig01’s picture

Here's a look at what this will enable: http://www.youtube.com/watch?v=mFDBQqTfG-8

bojanz’s picture

subscribing.

killes@www.drop.org’s picture

Really, I hate screencasts. Spent 4 minutes watching this and still don't have an idea what's under the hood...

Anonymous’s picture

Basically, it is a Project Browser that will reside on the drupal site installation, and it will hopefully make it into Drupal 8 Core to provide an 'out of the box' way for admins and developers to easily install new modules from their admin backend. This is similar to how WordPress does it, and it is similar to the old Plugin Manager module.

The modules that need to go on Drupal.org will expose a callback page (which uses json) which can be queried to retrieve the projects to display in the project browser. This can easily be cached using HTTP Caching, so it shouldn't be too much of a performance hit.
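As a rough illustration of the HTTP caching idea, here is a minimal Python sketch (not the module's actual PHP code) of a JSON reply that a reverse proxy could cache. The function name `json_response` and the 12-hour default lifetime are assumptions made for this example only.

```python
import json


def json_response(projects, max_age_seconds=12 * 60 * 60):
    """Return (headers, body) for a JSON reply that shared caches may store.

    The 12-hour default is an assumption for illustration, not a value
    taken from the module.
    """
    body = json.dumps(projects)
    headers = {
        "Content-Type": "application/json",
        # "public" lets shared caches keep the reply; max-age is in seconds.
        "Cache-Control": f"public, max-age={max_age_seconds}",
    }
    return headers, body
```

Because the reply depends only on the request URL, a cache in front of the site could answer repeat requests without reaching Drupal at all.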

In addition, Project Browser (on the client side) has a number of caching mechanisms to reduce unneeded queries.

Hope that helps clear things up a little.

If it would be better to include the drupalorg_pbs module with the drupalorg project, then that would be fine with me. Who should I speak with to get this done?

The Project Browser Server module is simply an API module (which drupalorg_pbs then implements) which allows Drupal.org to function as a 'Server' for the Project Browser module. This allows everyone to have easy access to find and install the modules and themes on Drupal.org.

See http://www.bojhan.nl/module-theme-browser-interface for some screenshots of what we are doing and a better explanation.

Sincerely,
Leighton Whiting

killes@www.drop.org’s picture

I am not sure if including this within the drupalorg module makes sense. Maintenance may be easier if it is not. Regarding maintenance: How committed are you? We can't have stuff on d.o that goes stale.

What I am interested in is the JSON that is generated. Is that for all modules at once, per module, per user? How does it deal with security releases?

dmitrig01’s picture

It does a Solr search similar to the one at project/modules and formats the results as JSON. It is only for downloading new modules, so I don't think security releases are relevant (though it doesn't show unpublished releases, or at least it shouldn't)

Anonymous’s picture

Sorry for the delay in my reply, my brother's wedding was today.

I don't mind maintaining it myself. It would only need updating if there are changes to Project Browser Server, which there might be in the next month or so if the 'Project Dependencies' module gets deployed to drupal.org. I fully intend to maintain this for the foreseeable future, at least a year more from now.

Sandbox projects are not shown, only regular full projects.

The JSON generated depends on the 'filters' that are passed in, as well as the page number. It returns about 10 projects per page, in JSON format. The filter sets can be cached, so that once a query has been run, the cached version can be served the next time the same query and page are requested. The cache should probably only be valid for 12 or 24 hours. Caching it for this long shouldn't be a problem, because if a user wants to actually install a project, the releases are pulled in courtesy of the update module, using the existing infrastructure for that, so those will be the latest ones.
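To make the caching described here concrete, the following Python sketch keys a cache on the filter set plus the page number and expires entries after 12 hours. It is an illustration under assumptions, not the drupalorg_pbs implementation: `cache_key`, `get_results`, and the in-memory dictionary are hypothetical names chosen for this example.

```python
import hashlib
import json
import time

CACHE_TTL_SECONDS = 12 * 60 * 60  # assumed: the lower bound suggested above

_cache = {}  # cache key -> (stored_at, json_payload); hypothetical in-memory store


def cache_key(filters, page):
    """Build a stable key so the same filter set and page always match."""
    # sort_keys makes the key independent of the order the filters arrive in.
    canonical = json.dumps({"filters": filters, "page": page}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def get_results(filters, page, run_query, now=None):
    """Return cached JSON for (filters, page); run the query only on a miss."""
    now = time.time() if now is None else now
    key = cache_key(filters, page)
    entry = _cache.get(key)
    if entry is not None and now - entry[0] < CACHE_TTL_SECONDS:
        return entry[1]
    payload = json.dumps(run_query(filters, page))
    _cache[key] = (now, payload)
    return payload
```

Two requests with the same filters and page hit the same entry, while a different page number produces a different key and therefore a separate query.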

This solution is much more performance friendly than the old Plugin Manager module, since it doesn't need to download the entire projects list every time someone wants to install a new module.

Sincerely,
Leighton Whiting

drumm’s picture

I like having all custom code in one place, so it is easier to see what exactly we have. And we don't need a bunch of Drupal.org-specific modules on Drupal.org. The project is already split into sub-modules for code organization.

Contribution to drupalorg is a lot like any other module. Ideally, submit patches, get reviews, things get committed, and maybe you become a co-maintainer.

Anonymous’s picture

I will try and get this included with the drupalorg modules, but we still need to get the Project Browser Server module deployed at the very least.

Any issues with deploying it that I can take care of?

Sincerely,
Leighton Whiting

Anonymous’s picture

As a note, I opened an issue here to get this included in drupalorg modules: http://drupal.org/node/1248606

Sincerely,
Leighton Whiting

killes@www.drop.org’s picture

Is the JSON always limited to about 10 projects? Or could it be expanded to include all projects? (That's something we wouldn't want, at least not initially.)

Anonymous’s picture

The limit is set via a parameter passed in. I will change this to hard code the 10 project ceiling in the drupalorg_pbs module.

Sincerely,
Leighton Whiting

Anonymous’s picture

I've hard coded a cap of 10 to the 'requested' parameter so that no one can go over that. I also released 6.x-1.0-alpha1 which is stable.
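For readers who want to see what such a cap looks like, here is a small Python sketch of clamping a 'requested' parameter to a ceiling of 10. It is not the module's PHP code; the function name, the lower bound of 1, and the fallback for unparsable input are assumptions made for the example.

```python
MAX_PROJECTS_PER_REQUEST = 10  # the ceiling described above


def clamp_requested(raw_value):
    """Parse the 'requested' parameter and keep it within 1..10.

    Falling back to the ceiling on missing or non-numeric input is an
    assumption for this sketch.
    """
    try:
        requested = int(raw_value)
    except (TypeError, ValueError):
        return MAX_PROJECTS_PER_REQUEST
    return max(1, min(requested, MAX_PROJECTS_PER_REQUEST))
```

With this in place a client asking for 500 projects still receives at most 10 per request.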

What else would be needed before we can get this included into the project?

Sincerely,
Leighton Whiting

Anonymous’s picture

While we wait for drupalorg_pbs to be included in the drupalorg project, can we get Project Browser Server (http://drupal.org/project/project_browser_server) deployed? It is simply an API module which has no functionality on its own. It is a dependency of drupalorg_pbs. Let me know if there is anything that I need to do there to get this deployed. It is stable code, and hasn't changed in months.

Sincerely,
Leighton Whiting

Bojhan’s picture

subscribe, thanks for being on top of this - I am sure wildkatana is not familiar with all the processes for this yet, any guidance on it would be appreciated.

drumm’s picture

Where is the issue for this being included in Drupal 8?

wildkatana - this certainly needs plenty of review since Drupal installations might be pointing plenty of traffic to this in the future. And we can't readily change the behavior of the clients; we have to wait for people to upgrade. Many people who are capable of reviewing have been busy with DrupalCon.

For review, somewhere to start is knowing what queries are run and when.

Anonymous’s picture

@drumm - Project Browser Server doesn't run any queries on its own. It simply invokes a hook to get the project results from other modules, in this case: drupalorg_pbs (http://drupal.org/node/1248606). It then serves these results (10 results per query) using JSON. The query pages can be cached in the same way that the update module handles the update callback. In many ways, this module is more efficient than the old Plugin Manager module, because it doesn't need to download the ENTIRE listing of modules when the admins want to install new modules.
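The division of labor described here (an API layer that runs no queries itself and only collects results from implementing modules) can be sketched in Python as follows. This is an analogy to Drupal's hook mechanism, not the actual PHP code; `register`, `serve_query`, and the shape of the JSON document are all assumptions for illustration.

```python
import json

_implementations = []  # hypothetical registry standing in for Drupal's hook system


def register(implementation):
    """Register a callable that returns project dicts for (filters, page)."""
    _implementations.append(implementation)
    return implementation


def serve_query(filters, page, limit=10):
    """Collect results from every implementation and serve them as JSON.

    The server layer runs no queries of its own; the registered
    implementations do the searching.
    """
    projects = []
    for implementation in _implementations:
        projects.extend(implementation(filters, page))
    # The JSON layout here is assumed, not the module's documented schema.
    return json.dumps({"page": page, "projects": projects[:limit]})
```

An implementing module would register one function that performs the actual search (Solr, in the drupalorg_pbs case) and returns a list of projects.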

Also, there is 'client' side caching as well from the drupal installation, so that any duplicate queries that are tried will just pull from the local cache.

The drupalorg_pbs module runs queries using Apache Solr, so it is performance friendly. And it wouldn't need to run any duplicate queries for as long as the HTTP Caching lasts (I recommend 12 hour caches or more).

I built the system with performance in mind, but am definitely open to any suggestions you may have on how to improve it.

I'm trying to split this up into only deploying project_browser_server from this thread, and I am working on including drupalorg_pbs in the drupalorg modules.

Thanks for taking a look at this drumm and Bojhan.

Sincerely,
Leighton Whiting

ggamba’s picture

subscribe

Anonymous’s picture

Just giving this a small bump.

I submitted a patch for the drupalorg modules to get drupalorg_pbs included here a couple weeks ago: http://drupal.org/node/1248606#comment-4913360

We also need to deploy the Project Browser Server module (http://drupal.org/project/project_browser_server). I want to do that from this thread if possible. What needs to happen next to get this in the queue for deployment?

Thanks again for your help and patience :)

Sincerely,
Leighton Whiting

sun’s picture

Status: Active » Postponed

Let's postpone this on some clean-ups to go into http://drupal.org/project/project_browser_server and http://drupal.org/project/drupalorg_pbs first.

David_Rothstein’s picture

Subscribing.

Are there issues for the clean-ups somewhere? There didn't seem to be much in those modules' issue queues.

Anonymous’s picture

Sun has been made co-maintainer of the modules so that he could do some code clean-ups. I don't think there are any issues existing for them now....

-Leighton

dmitrig01’s picture

Status: Postponed » Active

*bump* cleanups have been made...

Bojhan’s picture

Ok, so I spoke to drumm. In general we can move this forward, but for d.o to start moving on this we need a core issue with relatively good movement. We can have a test environment on d.o that implements this module.

Anonymous’s picture

Looking to get this moving again now that it is 2012. Cleanups have been made, so can/should we update this post: #1248606: Include drupalorg_pbs for deployment on Drupal.org ?

What can I do to get the test environment going? We already had it deployed at http://wildkatana-drupal.redesign.devdrupal.org and it was working there. Do we need to set up a different one?

I'm ready and willing to do what needs to be done to get this deployed and live :)

Sincerely,
Leighton Whiting

Bojhan’s picture

@wildkatana We need a core patch see #28

Anonymous’s picture

Hmmm, I'm somewhat of a noob when it comes to the workflow here. Does actual code need to be written or do we just need to open an issue somewhere? If so, where should I open it, and what should I ask for?

I thought that the post mentioned above was what we were waiting on (deploying drupalorg_pbs on drupal.org).

Sorry about the trouble, and thanks for your help.

Sincerely,
Leighton Whiting

drumm’s picture

I'd like a clear indication that this will be used and maintained. If Drupal core is definitely planning to depend on it, that's good motivation to be sure the server end is reviewed and deployed. And helps us know what sort of load to plan for.

Bojhan’s picture

@wildkatana You need to write a patch against core - in the Drupal core issue queue. Which basically uses your implementation in D8.

Anonymous’s picture

@Bojhan & @drumm,

There are basically two parts to the module. One is the drupalorg_pbs module which is only written for Drupal 6 (since that is what drupal.org is running). That module goes with the project_browser_server module, also written for Drupal 6. Both of those need to be deployed on Drupal.org before project_browser will really work.

As for load, the queries can be cached similar to what the current update module does, and rather than requesting the full module list each time, the module only serves about 10 modules per query, so the load should be considerably less than the current update module (and the plugin manager module which uses the update module to get its project list). One of the reasons I made the project_browser_server module was to get better performance than the current system, which is starting to get old.

I'm definitely committed to maintaining this, and I have lots of features planned for it as well but I want to get it deployed before I start putting those out. I work with Drupal all day every day and will be probably for the rest of my career (I run a Drupal-based business). I always want to give back to Drupal and I think this is a great way, and it's something I'm proud of.

What I was shooting for is a two-step process. First, I want to get the two modules mentioned above deployed so that people can start using the project_browser for Drupal 7 as a contrib module. This will both help to flesh out the project and also test the load and make sure we get all the bugs and kinks worked out. I want to do that before going for Drupal 8 Core inclusion.

Either way, we would need to get the drupalorg_pbs and project_browser_server modules deployed on Drupal.org before project_browser gets into Drupal 8 Core - Optional, so I think we really need to get that moving first. Deploying those two modules to d.o now will not increase the load by any considerable amount simply because not many people will be using the project_browser at this point, mostly just people working on it and people who like to try it out.

Once it is deployed, we can test it and tweak it and I will port it to Drupal 8 for testing there, and then I can be ready to push it for Drupal Core inclusion with a patch.

Is that a reasonable workflow?

Sincerely,
Leighton Whiting

Bojhan’s picture

@wildkatana I think you are aiming for something that will be really hard to accomplish; generally d.o issues do not get traction unless there is a clear incentive (as you may have noticed, since this didn't move in the last year).

This module will not get priority from the infrastructure team unless there is some clear incentive. It being in core, or close to inclusion in core, does create that incentive. Given the fact that you can set up a third-party source, testing the core patch should be no trouble (for your contrib, you could consider asking http://drupalmodules.com/ to roll this out).

I am only saying this because of my experience dealing with d.o improvements/changes, in terms of how long they take, how tough the optimization process is, and then the final roll-out, which can take months. Your effort is really best spent on making an awesome core patch; when that is close, it should be far less of a problem to ask the infrastructure team to spend time reviewing this and rolling it out.

drumm’s picture

I agree with what Bojhan said.

Technically, I don't think Drupalmodules.com can roll this out, since they don't have our DB. But the Drupal.org dev site does. When it is just too outdated, we can update it: http://drupal.org/node/1182848.

Anonymous’s picture

Thanks guys, I'll focus my efforts on porting the project_browser module to Drupal 8 and making a core patch. For the core patch, it will just be including the project_browser module in the modules folder, from what I can tell. Does that sound right to you guys?

I'll get a dev site running the latest Drupal 8 code and get it going.

I will need to get the test environment updated to use the latest DB, so I'll get that rolling as well by opening an issue.

Sincerely,
Leighton Whiting

dmitrig01’s picture

Hey, just saw this discussion here. I've discussed this issue with dww a few times and he seemed to think (though I may be remembering wrong) that the way we should get this into core is by first deploying to drupal.org. Otherwise, it's hard to test out a core patch. We could just make the core patch talk to the sandbox, and before it's committed, deploy this to d.o. However, it's still somewhat tricky then, because that requires writing the patch, having it go through the whole review/revising process, and then right before committing (1-2 months process), coming back to this issue (probably another 1-2 months), deploying this, then going back and changing the other issue -- at which point it may be out of date, etc. That's why I think it'd be better to deploy it on d.o first. If it ends up not getting included in D8, we can always take it down from d.o. However, the project browser is an important project for Drupal 8 so I think it's likely to get in assuming there are some people to put some work into it.

Anonymous’s picture

It would certainly be a lot easier to write the patch if it could use the real drupal.org database instead of a dev one (which requires the http authentication each time and gets out of date quickly). Also, it would allow us to get people using and testing the 7.x version of project_browser, building momentum and getting kinks worked out there at the same time.

rodych’s picture

subscribe

webchick’s picture

FWIW, I support this whole-heartedly.

The last UX study we did demonstrated the absolute train-wreck it is propelling people from their website to some other website in order to find things, and the search tools on Drupal.org utterly befuddle people.

Providing a module-browsing interface "in-app" is what a lot of other similar things do (WordPress, Firefox, etc.) and we should as well.

Is anything else needed in terms of "blessings" for this to move forward? Or is it just code review at this point?

killes@www.drop.org’s picture

I think we'd need a somehow validated guesstimate of the extra server load that this will cause...

Anonymous’s picture

@killes - I don't have actual numbers, just theory at this point.

Project Browser Server should have a much smaller server load than the current 'update' module does. The way the update module works is it spits out a huge file (several megabytes) that contains all of the project information for all of the modules on drupal.org. Plugin Manager used this to figure out when there were new updates available. I could have used this for Project Browser as well, but I felt that we needed something much more performance friendly if it was going to be included in Drupal Core eventually.

That was why I made Project Browser server. Not only does it reduce the size of the files returned from megabytes to kilobytes, it also still supports caching in the same way that the update module did, so the queries don't have to be run each time a request is made.

dmitrig01 guided me through that part to make sure that it would be as performance friendly as possible.

Doing a search on Drupal.org's module pages would cause more server load than doing a project browser lookup.

The only way to really tell is to deploy it and let people start using it as the Drupal 7 Contrib, and monitor the server load. This way it will be fine tuned and perfected by the time it is ready for inclusion with Drupal 8 Core.

Sincerely,
Leighton Whiting

Anonymous’s picture

Also wanted to include that the Project Browser module utilizes caching mechanisms on the client side as well, so that data is only fetched if the cached data is sufficiently old. I expect this to cut down the server load by a great deal as well.

Sincerely,
Leighton Whiting

Bojhan’s picture

@wildkatana Can't you throw some imaginary users at it? Even in contrib, you can get 10,000 users easily.

skottler’s picture

I love the idea, and we clearly need something like this in core, but I have some pretty serious performance concerns. I'm not fond of running this off of a dev site unless we decide that it makes sense to update the site's data often enough for the module information to be current.

I apologize if I missed this in the cruft of this issue and the surrounding ones - is there an issue currently open to get this module included in core?

I plan on profiling the modules soon, but do you have any kind of data as to how many calls they make back to Drupal.org or related infrastructure on each invocation of the module's primary functionality (the listing page)?

Anonymous’s picture

@skottler You're looking at it. The dev server is just there to test how it would do on a drupal type of site.

If you could provide some numbers from your profiling, that would certainly be useful.

There are a few calls. One is the categories call, which fetches categories. These are then cached and aren't updated very often at all. So you'd be looking at maybe one call per day at the most for that one.

Then there are the queries for available projects. These are only run when the user is interested in finding a module to install, so not an every-day occurrence. Usually people install the modules they need in a few days when they are setting up a site, and then they don't install new modules for months or years.

Contrast that to the update module which currently requests the full multi-megabyte file every day or even more often for each installation of drupal that has it enabled (probably most do).

The queries will definitely be the bulk of the server load though. If a user performs a query they have done recently, then the cached version is served. If the query is new or the cache is too old, then an HTTP request is sent to drupal.org, similar to drupal.org/project_browser/server/query/module/7?text=views&page=2

This would then serve data in about 10 project chunks. It uses Solr so it is fast and can be cached. The results are served in JSON and will be cached since the requester is an 'anonymous user'.
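As a sketch of how a client might assemble such a request, the Python below rebuilds the example URL from its parts. Reading the path segments as project type and core version is an assumption based on the example, as are the `http://` scheme and the function name `build_query_url`; the real client is the PHP Project Browser module.

```python
from urllib.parse import urlencode

# Assumed base: the example above omits the scheme, so http:// is a guess.
BASE_URL = "http://drupal.org/project_browser/server/query"


def build_query_url(project_type, core_version, text=None, page=None):
    """Build a GET URL like .../query/module/7?text=views&page=2."""
    params = {}
    # Parameters are always added in the same order, so identical queries
    # produce identical URLs and can share one cached response.
    if text:
        params["text"] = text
    if page is not None:
        params["page"] = page
    url = f"{BASE_URL}/{project_type}/{core_version}"
    if params:
        url += "?" + urlencode(params)
    return url
```

Keeping the whole query in the URL is what lets the response be cached for anonymous requests, since the URL alone identifies the result set.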

So there are multiple points of caching that are being taken advantage of.

-Leighton

dww’s picture

@wildkatana: Thanks for moving this forward. I haven't been able to look at any of the proposed code here, and I'm not sure when I'll have a chance to. However, I want to clarify some things about how update.module in core currently works:

- It does *not* fetch a multi-megabyte file for all projects. Ever. That file is just there for things like drush and previous attempts at a package manager. Update manager only fetches the available releases for individual projects.

- All the data being served to update.manager clients are static XML files which are heavily cached by squid (or whatever our web caching infra is these days). The only time there are ever queries that touch the DB are when we regenerate these XML files whenever the packaging script updates release tarballs for a given project.

You seem to misunderstand the performance load generated by the current system, so your comparisons relative to it aren't valid. That doesn't mean what's here is definitely going to melt our servers, but you seem to be arguing "can't be worse than what we have now" and that's not a compelling case if you misrepresent the current load. Just sayin'... ;) So, I think we need a more thorough and accurate assessment of the load before we plow ahead with this.

Also, my main questions from reading this thread relate to how this whole codebase interacts with other attempts to expose Project* data via JSON callbacks, e.g. #112805: JSON menu callback for project issues and #669910: Expose list of projects to external services (via JSON, XML, etc). Is the data exposed by this service documented anywhere? What's the schema? If we're going to expose all this data to external systems (and we definitely should) I'd like it to be as generic, standardized, and useful as possible.

Finally, it seems like probably most of this functionality should just be exposed directly by the Project* modules instead of separate add-ons like this (although again, I haven't looked at any of the code). However, that's not a blocker for this moving forward. I'm just saying that if all this can be done in a sane way directly in project*, that's probably better for code maintenance in the future.

Thanks,
-Derek

drumm’s picture

@dww - thanks for the detailed response. This is a lot of what I was thinking, but didn't have the knowledge offhand to say as well as you did.

Anonymous’s picture

@dww Thanks for the details and for correcting my incorrect assumptions. I was under the impression that the update module had to fetch that file each time it checked for updates. Looking at the code now I see that you are correct that it checks only for individual releases. I actually knew this last year because the Project Browser uses that function (and therefore depends on the update module). Don't know why I forgot about it haha.

I agree that we need a standardized way to expose Project Data. I originally tried to get this feature built in to the Project project, but it didn't go anywhere and they essentially suggested I do what I did, making it into a separate project.

Regarding the xml one, I originally used XML-RPC to make the Project Browser but then dmitrig01 advised against it because of performance issues (JSON can be cached at a higher level, using the squid method or whatever else is being used on Drupal.org).

#112805: JSON menu callback for project issues is a good one, don't know how I missed that. They are similar but Project Browser Server is intended for querying and searching for projects, rather than just displaying the information about a single project. Project Browser relies on the current update module method fetching and parsing the XML to get the releases for an individual project. I would love to use something like this instead, since it seems cleaner. If it ever got deployed, I could use that instead of the update module. EDIT: Apparently that issue doesn't expose release data. So I wouldn't be able to use it and will have to stick with the update module for now.

Thanks for taking a look at this drumm and dww. I'm no expert at profiling server load, but from what I can tell, it shouldn't be too much load because of the ability to heavily cache the results using squid. That coupled with the infrequent use over a site's lifetime, should keep it relatively low. I'm always ready to make any changes that are needed to improve speed.

Anonymous’s picture

I'm going to attempt to put some numbers to see if we can get an idea of the impact. Please correct me if I am wrong about anything.

10,000 Drupal sites

Update module checks for updates every day or so, right? So for every enabled module (say 10) it issues a page request to drupal.org: 10,000 x 10 = 100,000 page requests every day, most of which is simply serving a cached static file, no db activity except when a new release comes out. Am I correct? So over the course of 180 days, that would be 18 million page requests.

Project Browser queries for modules/themes whenever someone is setting up a new Drupal (8) site or adding modules to an existing Drupal site. Let's say this takes place for 10 days during a 180 day period. Let's say they perform 30 queries a day when they use it. So 10,000 sites x 30 page requests x 10 days = 3 million page requests over the course of 180 days, again most of which is simply serving a cached static file.
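The two estimates above can be checked with a few lines of Python. Every input is one of the assumed figures from this comment (10,000 sites, 10 enabled modules, 30 queries a day on 10 days out of 180), not a measurement.

```python
SITES = 10_000
PERIOD_DAYS = 180

# Update module: one request per enabled module per site per day.
MODULES_PER_SITE = 10
update_requests = SITES * MODULES_PER_SITE * PERIOD_DAYS

# Project Browser: 30 queries a day on the 10 days a site is being built out.
QUERIES_PER_ACTIVE_DAY = 30
ACTIVE_DAYS = 10
browser_requests = SITES * QUERIES_PER_ACTIVE_DAY * ACTIVE_DAYS

# Share of Project Browser traffic relative to update traffic.
ratio = browser_requests / update_requests

print(f"update module:   {update_requests:,} requests")
print(f"project browser: {browser_requests:,} requests")
print(f"ratio:           {ratio:.3f}")
```

Under these assumptions Project Browser would add about one sixth of the request volume that the update module already generates over the same 180 days.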

Most of the common modules like Views and CCK benefit greatly from the caching, while other less common modules don't get as much benefit. This also applies to the queries from Project Browser. Most people are going to be performing the same types of queries, for views, token, etc. Also, all of the most commonly used modules will be on the first query page, so that reduces a few common queries.

The pages that the update module uses would need to be regenerated each time there is a new release for the project in question.

The pages that Project Browser Server serves would need to be regenerated every few days (since it would only change when there is a new project added, we could either regenerate them each time a new project is added, or just set it to a fixed amount of 5 days for example - not many people will be searching for projects that are brand-new).

I tried to be liberal with the Project Browser estimates and conservative with the Update module estimates. I'm not trying to say that since it won't be as heavy as the update module it is okay, I'm merely trying to give something to compare it to.

I hope this helps people get a better idea of the kind of load we are discussing.

patcon’s picture

@wildkatana I've been watching this pretty excitedly from a distance for a while, and just wanted to say that you're AWESOME for keeping at this. Hopefully it doesn't seem like people are throwing up roadblocks just for the sake of it -- everyone's just trying to show due diligence, and you're contributing to an area of the community that, for good reason, is the most cautious in accepting code.

Anyhow, just wanted to reaffirm that this is a totally amazing and exciting piece of contrib :)

EugenMayer’s picture

I like the idea of a project browser, there are similar apporaches like the plugin-browser an kinds. For a different kind of purpose we build http://drupal.org/project/update_feed_api which basically does what you are looking for, without needing an extra module installed on drupal.org. It parses update.drupal.org project feed, and later the release feed of each project to fetch and store different meta data. On the base of this, http://drupal.org/project/drush_make_ui provides autocompletition for releases and projects while creating makefiles in the UI. One additional and interesting feature is, you can also parse other feeds of other module providers, which e.g. have a feature-server (which also provides such an update feed ).

So basically, you can browse drupal.org for modules, or any other websites providing Drupal modules, such as Open Atrium or other distributions. update_feed_api supports a theoretically unlimited number of such "sources".

update_feed_api does lazy fetching (e.g. you can let it fetch the releases for a project on the fly, when they are actually demanded) or you can fetch everything up front, whichever you like. With this approach you get a good "index" of projects while fetching metadata on demand. You could extend the data model and practically use it as a backend to query for your results / sets. Parsing the feed takes around 25 seconds (using a SAX XML stream parser, not a DOM parser) in a batch run (e.g. in a cron). We have been using this for quite a long time now and it works out pretty well. E.g. our system tells us when updates for specific make files (and their pages) are available (we configured it like "tell us about minor updates"), so we use this on a daily basis / daily cron level.
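
To make the SAX-versus-DOM point concrete, here is a minimal sketch in Python of a streaming parse that keeps only a small record per project instead of building the whole tree in memory. The element names (`project`, `title`, `short_name`) and the sample data are assumptions for illustration; this is not update_feed_api's code.

```python
import io
import xml.sax


class ProjectHandler(xml.sax.ContentHandler):
    """Collects one small dict per <project> without building a full tree."""

    def __init__(self):
        super().__init__()
        self.projects = []
        self._current = None  # dict for the <project> being read, if any
        self._field = None    # child element whose text is being collected
        self._text = []

    def startElement(self, name, attrs):
        if name == "project":
            self._current = {}
        elif self._current is not None and name in ("title", "short_name"):
            self._field = name
            self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._field is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "project" and self._current is not None:
            self.projects.append(self._current)
            self._current = None
        elif self._field is not None and name == self._field:
            self._current[self._field] = "".join(self._text).strip()
            self._field = None


def parse_project_feed(stream):
    """Parse a file-like object holding the (assumed) project feed."""
    handler = ProjectHandler()
    xml.sax.parse(stream, handler)
    return handler.projects


# Hypothetical two-project feed, standing in for the real one.
SAMPLE = b"""<projects>
  <project><title>Views</title><short_name>views</short_name></project>
  <project><title>Media</title><short_name>media</short_name></project>
</projects>"""

projects = parse_project_feed(io.BytesIO(SAMPLE))
```

The handler only ever holds the project currently being read, which is why this style copes with a feed of many thousands of entries.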

Information about all releases, where the packages can be downloaded, which are recommended / security releases and so on (for different cores) is included in the index and can be interesting for an "update" part of the browser, if needed.

Whatever approach you take, I am glad to see this idea getting into a more realistic shape. Thanks for putting so much effort into this :)

Anonymous’s picture

Project Browser also allows for multiple 'repositories' of projects. They just need to implement the hooks in the project_browser_server module. I think that is an important feature for the future when other people start to maintain their own repositories and, who knows, maybe an App Store type of interface in the future.

For now though, this is intended to make it streamlined and easy for users to find and install modules from their own admin backend. I'm also hoping to see this included in Drupal 8 Core, so that's why I went to the trouble of making the project_browser_server module because I was concerned about performance if we continue using the update module's methods to parse the entire projects list for each site.

Plugin Manager does a similar thing by having to download the entire projects list. I wanted to provide an interface where instead of downloading the whole list you can just do query searches. Should save time and bandwidth on both sides, client and server. Building off of the existing update module functionality is just not scalable in my opinion (imagine every Drupal installation using it).

Thanks for the link though, looks like an interesting project you've got. Maybe I can learn some things from it :)

-Leighton

NROTC_Webmaster’s picture

I think this is a great addition and the perfect answer to the problem http://drupal.org/node/1164760

Keep up the good work and I hope this project makes it.

Anonymous’s picture

Thanks for sharing the link, I wasn't aware of that issue. I added a link there to this post, hopefully it can get some more traction and we can get it deployed :)

spM-1’s picture

@wildkatana: I wanted to take this up for my GSoC 2012 project. I also wanted to suggest a few other improvements for the Project Browser module, like recursively installing dependencies (this needs brainstorming on how to handle external libraries), etc. Given your Drupal experience, could you mentor me and help me with the proposal?

Anonymous’s picture

@spm : Dependencies are one thing that I definitely want to include in the project. I already have Project Browser Server exposing dependencies, so this won't be very hard to do. I put it on hold until we get this issue resolved and get Project Browser Server and Drupal.org PBS deployed on Drupal.org. This project is pretty much at a stand-still until we can get that taken care of.

I myself am still a student, so I was thinking of doing another proposal for this year to add new features to Project Browser and further its development.

Either way, we need to get it deployed before it can get much further along or get new features added.

Bojhan’s picture

I think it's wise to get this deployed before GSoC 2012; it would be an awful dependency to have in your proposal. You should try to get in direct contact with dww or drumm and get this moving - otherwise we can be discussing this in the issue queue for years to come.

dww’s picture

I'm here, and listening. I asked a number of questions in #48, only some of which have been partially answered.

Meanwhile, #112805: JSON menu callback for project issues is deployed (although there are a few minor follow-up changes to make). A lot of thought went into the JSON in that issue. As much as possible, I'd like to re-use the results of that discussion as we expose more Project* data via JSON. Yes, that's only for issues at this point, but many of the principles would apply to releases and/or projects.

Speaking of releases, we could make a JSON representation of the release history XML feeds, although at this point, it's not clear that'd be much of a win over the XML. I know it's a bit wonky that clients might parse JSON for some things and XML for others, but that doesn't seem like the end of the world. If we wanted those feeds as JSON, the best way to do that would be to revisit #715582: Use DOM to create release history feeds such that we just build up a structured PHP array of the release history (which would also allow us to do #1003764: Add an alter hook invoked while generating release-history XML files which would be a win for other reasons) and finally we could take the structured PHP array and spit it out as XML and JSON.
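
The "build one structured array, then emit both formats" idea can be sketched as follows. This is Python rather than PHP, purely for illustration, and the field names and release values are hypothetical; it is not the actual release-history schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical structured release history; the real feed has more fields.
release_history = {
    "short_name": "example_module",
    "api_version": "7.x",
    "releases": [
        {"version": "7.x-1.1", "status": "published"},
        {"version": "7.x-1.0", "status": "published"},
    ],
}


def to_json(history):
    """Serialize the structure as JSON."""
    return json.dumps(history, sort_keys=True)


def to_xml(history):
    """Serialize the same structure as XML."""
    root = ET.Element("project")
    ET.SubElement(root, "short_name").text = history["short_name"]
    ET.SubElement(root, "api_version").text = history["api_version"]
    releases = ET.SubElement(root, "releases")
    for release in history["releases"]:
        node = ET.SubElement(releases, "release")
        for key, value in release.items():
            ET.SubElement(node, key).text = value
    return ET.tostring(root, encoding="unicode")
```

Because both outputs come from the same structure, anything that alters the structure before serialization (the alter hook mentioned above, for instance) would affect the XML and the JSON alike.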

I agree that providing a single listing of all projects isn't scalable (or useful). So, something like the approach here (at least as described, again, I haven't looked at any of the code) is probably necessary.

It'd be interesting to contemplate (although totally off-topic from this effort, and I don't mean to distract or derail anything) to do something similar with issue listings. Again, obviously a full JSON listing of every issue for a project isn't going to help (nor scale) so some kind of targeted queries that we then return the results via JSON could be good. I only raise it here to show the parallels with the likely approach we'd need to use to make that work.

As I see it, things that need to happen for this to move forward:

A) Thorough code review and security audit of the proposed code.

B) Thorough performance assessment of the proposed code.

C) Clear explanation of the JSON schema that this code is producing and discussion/possible refinement of that schema to reflect the emerging best practices for exposing Project* data via JSON.

I put a fair bit of work into documenting the process for getting changes deployed on d.o and the criteria the infrastructure team uses to assess code. If you haven't already, you should definitely read these two pages:
Process for getting changes deployed on Drupal.org
Drupal.org development guidelines

@wildkatana: You can definitely get started on C. You could also send an email to security [at] drupal [dot] org to see if anyone from the sec team would be willing to work on A. B would most likely require someone from the infra team, although there are certainly lots of performance-minded developers around our community and a review by any of them would go a long way. I'd be a perfect person to do all these reviews and assessments, but I've got way too many responsibilities and commitments right now, so I can't just volunteer to do it all myself. I'd rather be honest that I'm not actually going to have time to do it than to suggest I will and then it doesn't happen. Sorry, but that's just my reality these days.

I hope that helps keep this moving!

Cheers,
-Derek

EugenMayer’s picture

I don't exactly get why you are trying to reinvent the wheel.

1. Feed / JSON
Fserver and Fserver Bonus provide the same feeds as d.o, meaning releases (fserver) and projects (fserver bonus), utilizing views. Why is that useful? Because you can easily extend the fields printed by simply adding them to the view. You can provide different formats by reusing views styles... and JSON is already a part of it; the XML type is included in fserver and can be reused. So you are done with that issue in a second while being flexible for new approaches. Activate views caching or, as I expect, d.o has a proxy cache anyway, and performance will work out just fine.

But what is really cool about that approach? Filters! So update.drupal.org/..../projects/a gives you all projects starting with a... so you can chunk those things up or get only specific parts. This is a huge plus for the views approach, as this is built in. And... you can exclude those sandbox projects, which are currently flooding the list.

We are really actively using the update feeds as mentioned above for our update / maintenance infrastructure.

And the good news is, you don't need additional security checks of the code, as views is pretty well covered already. So you can save your time here. As for performance, views can be an overhead depending on the number of relations needed to get the metadata; for fserver bonus, that's not perfect, as it's based on CCK, which adds a number of joins.
But generally, I think the performance will work out, and views is a well-maintained project, so you jump on a very active wagon; no need to do it all by yourself.

2. JSON Project feed
For d.o, that will be complete nonsense. As there is no implementation for JSON like an XML SAX parser, you have to load the whole thing into memory, most probably into the client's browser memory. I am not sure, but we have 7k projects, or even more due to the sandbox ones. A DOM parser needs around 30 minutes to parse that; a SAX parser needs less than 1 minute. JSON will (at least most probably) be used on the JS side, so this won't work out until we implement chunks / filters.

3. JSON Release feeds
Big plus for that, as this is an OK amount of data which can even be useful to fetch on the client (autocompletion). Still, I think the only way to really scale here is using the local server cache method implemented in update_feed_api. The Drupal infrastructure can't handle 200k sites trying to have live autocompletion on the release JSON representation. Anyway, I think that's a good addition.

4. Feed format overall
As we are currently building a new project dashboard for l.d.o, we need to utilize the project / release feeds to determine the recommended release for each major release. This ends up being a pain for d.o projects due to the size of the index: 7k projects means 7k HTTP socket requests to fetch the release information / recommended release. For an update (maintaining the local index) process this gets pretty slow due to the normal delay of HTTP sockets. So I suggest the project feed metadata should have a new format (just an additional callback) called something like 'more details', which additionally adds a short comma-separated list of releases (no further metadata, just the release string) plus, for each major, the recommended version for EACH project entry in the project feed. This would not only help the l.d.o project... but also this project browser.

Overall, this is zero work when using views for the feeds as stated above: simply add those fields and you're done. We are doing the same with the fserver bonus feed for projects, adding the SCM information (if available) for a project, which can then be reused for make files (developer make files).

I don't want to hijack this issue, but it feels like problems are getting solved in a sandbox, not globally. The need for the project / release feeds has been there for much longer than the project browser, and there are a number of issues regarding that already. Fixing them would nearly make the server part of project browser obsolete, and that's the core point of the idea. Minimum effort, minimum changes, maximizing the result for all.

dww’s picture

@EugenMayer: Thanks for sharing your perspective. There's some useful stuff in comment #61. However, I think you're missing some important considerations:

a) updates.drupal.org isn't a drupal site, and it's definitely not running views. It doesn't talk to the drupal.org database at all. It is a collection of static XML files. That's the only way it can scale to the immense load it receives. It gets 350 million hits and serves a total of nearly 5TB of data a month (which is over 3 times the bandwidth used for drupal.org).

b) You don't need a SAX XML parser if you're not trying to process a feed with 10K projects. ;) The problem this service is trying to solve is "show me the 10 most installed projects that match the keyword 'media'" -- JSON is a perfectly reasonable representation for the answer to that. Neither updates.d.o nor the project browser wants to process a feed of all 10K projects, nor even all 500 projects starting with the letter 'm'.

c) If you *are* trying to process a feed of all projects, you're basically doomed. ;) But, if you are, and there are problems with the current monster feed, feel free to open issues about improving it. Especially since update manager in core doesn't use the monster project listing feed at all, I'm very happy to provide other feeds that would help other clients. But, that's off-topic from this issue.
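
As a rough illustration of the kind of query described in (b), the sketch below filters by keyword, sorts by install count and returns only the top few matches as JSON. It is written in Python with made-up project names and install numbers; the response shape (`total`, `projects`) is an assumption, not the schema that project_browser_server actually emits.

```python
import json

# Made-up sample data; names and install counts are purely illustrative.
PROJECTS = [
    {"short_name": "media_alpha", "title": "Media Alpha", "installs": 120},
    {"short_name": "gallery", "title": "Gallery", "installs": 900},
    {"short_name": "media_beta", "title": "Media Beta", "installs": 450},
    {"short_name": "media_gamma", "title": "Media Gamma", "installs": 30},
]


def search_projects(projects, keyword, limit=10):
    """Return the `limit` most-installed projects matching `keyword` as JSON."""
    needle = keyword.lower()
    matches = [
        p for p in projects
        if needle in p["title"].lower() or needle in p["short_name"].lower()
    ]
    matches.sort(key=lambda p: p["installs"], reverse=True)
    return json.dumps({"total": len(matches), "projects": matches[:limit]})


response = json.loads(search_projects(PROJECTS, "media", limit=2))
```

The client only ever receives `limit` records, so the size of the response does not grow with the size of the full project list.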

Cheers,
-Derek

EugenMayer’s picture

thanks for considering my post dww:

It was clear to me that update.drupal.org is not a Drupal website and is statically cached. I was mainly asking about what actually creates the XML file which then gets cached. This can and should be a Drupal site with views, with the approach outlined (which is never called live, of course). For myself, I actually can't see why we don't just use a Drupal instance with a pretty well configured Squid -- no need to build reverse proxies yourself, or am I missing something? It's all anon, so pretty perfect.

But I see some of the pros are beaten by your comment, which I simply did not think through deeply enough, as you stated. Filters cannot be used, as it's not dynamic, I admit. We could work around it of course: create several XML files for A / B / ... utilizing views as a backend and a script saving those indexes on the update.drupal.org server to be statically cached for some while. But yes, it's not exactly the same, I admit.

We are parsing the complete projects feed of u.d.o and fetching all releases lazily later (that's what the http://drupal.org/project/update_feed_api project does), so it works even with the current approach. It will not work as nicely for l.d.o due to the current "flush and recreate" process and the need to "fetch all releases" on l.d.o (not lazily)... but well, different topic. After I have discussed the topic with Gabor I will consider opening such a ticket and leave the discussion for the project browser here, so I don't hijack it. I still think, nevertheless, the needs are linked and should be addressed the same way... as you will get an enormous number of hits with the project browser approach, actually even more than u.d.o... so a Drupal site serving that information is no approach either, is it?

My opinion is that the number of hits on u.d.o can be reduced a lot by not querying it "live" as a database from Drupal instances or the project browser, but rather using a cache approach like update_feed_api, but well, just my thoughts (we tried the live one with upate_feed_cck for autocompletion of releases / projects for make files, and it did not work out very well). This would scale to any number of clients, as every client is a distributor of the information.

dww’s picture

@EugenMayer: If you read this thread, wildkatana claims that the project browser client does its own local caching, too. That's pretty orthogonal to the question of how the server is setup. Obviously, any clients for *any* of this data should do as much caching as possible.

And no, I don't think the project browser will generate "even more" traffic than the update manager is generating on u.d.o. wildkatana's assertion (which makes sense to me) is that site builders will only use the project browser for short windows of time during site development/building, while update manager is going to keep hitting u.d.o every day as long as the site is online. And, definitely not all sites will even use project browser. Many will be built via Drupal distributions, drush, etc.

In terms of creating the static release history XML files -- views isn't really a good fit here. I mean, with a lot of effort we could make it work, but there's not really much benefit. Those files only need to be regenerated at very specific times (namely, whenever the packaging script creates any new releases or updates any -dev releases), so a drush command that runs on live d.o when we need to rebuild one of these XML files makes much more sense to me than a separate special site (which has read-only access to the live DB?) with views trying to generate these very specific feed files. I just don't buy your assertion that views would automatically solve a bunch of problems for us. And the current plumbing works (although it would need some retooling to also provide JSON representations of the same data if we wanted that).

Anyway, we're totally off-topic here. If you really want to keep arguing for a views-based solution for generating those XML files, let's move that to a separate issue (although I'm probably going to mark it "won't fix" for all the reasons I've already explained). Let's leave this poor issue alone so the project_browser_server has a chance to see the light of day. ;)

Thanks,
-Derek

EugenMayer’s picture

will do, thanks for your explanations

dmitrig01’s picture

Sorry if I'm reiterating -- I haven't had a chance to read the entire issue. But, my view on the issue, which was what I had in mind from the time we designed this, was that it shouldn't be any more expensive than doing a normal module search on Drupal.org. In fact, it probably is even less expensive, because we don't have to go through all the page-rendering cruft. Furthermore, because they're rendered as JSON, the search results are cacheable for some period of time (granted, shortish -- one day maybe? have to be careful to invalidate the cache when there are security releases)
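
A minimal sketch of that caching idea, in Python: results are kept for a fixed lifetime (one day here, following the guess above) and the whole cache can be flushed when a security release goes out. The class and method names are hypothetical and are not taken from project_browser_server.

```python
import time


class SearchCache:
    """Caches search results per query for a fixed time-to-live."""

    def __init__(self, ttl_seconds=86400, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so expiry can be tested
        self._entries = {}   # query -> (stored_at, result)

    def get(self, query, compute):
        """Return the cached result, calling compute(query) on miss or expiry."""
        now = self.clock()
        entry = self._entries.get(query)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        result = compute(query)
        self._entries[query] = (now, result)
        return result

    def invalidate_all(self):
        """Drop everything, e.g. when a security release is published."""
        self._entries.clear()
```

A server would call `invalidate_all()` from whatever code path publishes a security release, so that stale results are never served for a full day.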

Anonymous’s picture

Sorry I haven't been as active on this thread lately, had midterms and now finals are coming up. I will be able to spend more time on it though in a couple more weeks. I definitely haven't given up on this :D

webchick’s picture

Yay! Great to hear, Leighton! :) Please let us know how we can be of help!

patcon’s picture

I second that "yay!". Double-yay!

kristiaanvandeneynde’s picture

Chiming in to say that I am excited about this project and truly respect your dedication as patcon said in #52.
Amazing job so far.

patcon’s picture

I've forgotten quite a bit of the details atm, but I'll try to get started on an issue summary if I can find some time:
http://drupal.org/node/1155816

Anyone can feel free to beat me to the punch if this is fresher in their mind :)

patcon’s picture

Issue summary: View changes

Updated issue summary with empty template.

Anonymous’s picture

Hey guys, I've updated the Issue Summary at the top with an overview of what is happening and what needs to happen for this issue.

I'll be working on item 3, which will also include a "Clear explanation of the JSON schema that this code is producing and discussion/possible refinement of that schema to reflect the emerging best practices for exposing Project* data via JSON."

I'll post back here when that is done. We also need to get the code reviewed and performance inspected by someone from the infrastructure team. Any takers?

Anonymous’s picture

Issue summary: View changes

Added some Issue Summary notes

Anonymous’s picture

I added some documentation pages just now that outline the JSON schemas: http://drupal.org/node/1612058

I tried to get the test environment running again so I could fix some bugs in the modules and get them all up to date, but Apache Solr isn't working on the dev servers, so I wasn't able to do much testing :(

webchick’s picture

Priority: Normal » Major

I'm escalating the importance of this; this consistently comes up in usability testing as a complete WTF and fixing this in D8 would be amazing.

Anonymous’s picture

I just committed a bunch of changes to get things all working together. sun and dmitrig had gone in and changed a lot of the API in Project Browser Server to get it more secure and performance friendly, but it broke things with Project Browser. So I fixed things and got them working again, while keeping the changes they made to Project Browser Server. It may take a few hours for the new dev versions to show.

So to reiterate, we need to get the following done:

1. Thorough code review and security audit of the proposed code.
2. Thorough performance assessment of the proposed code.

As I mentioned earlier, sun and dmitrig both looked over it and got things cleaned up. Do we need another pair of eyes as well?

Another note: since Apache Solr isn't currently working on the dev sites, I can't fully test the changes I made. It would be nice to figure out some other way of testing it, or perhaps we could just deploy it to drupal.org after the security reviews above and test it there? I tested it a couple months ago and it was working but I just wanted to test it again today and couldn't. drumm said in IRC that it may be 1 to 3 weeks before Apache Solr gets fixed on the dev sites.

kristiaanvandeneynde’s picture

In browser server, you can probably fix this @todo by specifying X-Content-Type-Options: nosniff

# Lines 73-75
// @todo msie9 might break with this content-type.
// @see http://stackoverflow.com/questions/111302/best-content-type-to-serve-jsonp/111306#111306
drupal_set_header('Content-Type: application/javascript');

The rest looks okay at first glance.

Anonymous’s picture

@kristiaanvandeneynde I am not familiar with that. Should we just set it as an additional header?

kristiaanvandeneynde’s picture

By setting that header, you tell IE it should not guess what content-type you're serving, but just accept what you say it is.
If the content then doesn't add up to the MIME type (e.g.: application/json for JSONP), IE will throw an error.

That way, you eliminate the risk of:

  • IE setting a wrong MIME type, potentially opening a security hole
  • IE not accepting your MIME type, provided you serve it right
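
A small sketch of what serving JSONP with that header might look like, written in Python rather than as the module's PHP. The `jsonp_response` helper is hypothetical, and the callback-name check is an extra precaution added here, not something discussed above. In the Drupal 6 snippet quoted earlier, the equivalent would presumably be one more `drupal_set_header()` call.

```python
import json
import re

# Conservative pattern for a JavaScript identifier used as the JSONP callback.
_CALLBACK_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def jsonp_response(callback, payload):
    """Build (headers, body) for a JSONP response with sniffing disabled."""
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError("invalid callback name")
    headers = [
        ("Content-Type", "application/javascript; charset=utf-8"),
        # Tell the browser to trust the declared type instead of guessing.
        ("X-Content-Type-Options", "nosniff"),
    ]
    body = "%s(%s);" % (callback, json.dumps(payload))
    return headers, body


headers, body = jsonp_response("handleResults", {"total": 0})
```
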
Anonymous’s picture

Checked back today to see if the Apache Solr was running on the http://wildkatana-drupal.redesign.devdrupal.org/ development instance but it looks like it's still down :( Can't really do any more development and testing until that gets fixed.

Also, does anyone from the security team have time to review the code? We still need:

1. Thorough code review and security audit of the proposed code.
2. Thorough performance assessment of the proposed code.

Anonymous’s picture

Hate to be a nuisance, but it looks like the issue with Apache Solr on the dev environments hasn't yet been fixed. I can't do any more testing or work on this until we get it running again.

My environment is here: http://wildkatana-drupal.redesign.devdrupal.org

The error I am getting is this:

No Solr instance available when checking requirements.

It is also throwing a beanstalk error at the same time:

Socket error 111: Connection refused in /var/www/dev/wildkatana-drupal.redesign.devdrupal.org/htdocs/sites/all/libraries/pheanstalk/classes/Pheanstalk/Socket/NativeSocket.php on line 36.

Is there any hope of getting the Apache Solr instance running on the dev environment again? If not, what should I do to get this issue moving again? I should note that we still need some code review to get this ready to deploy. Who should I ask for that?

tvn’s picture

Hi wildkatana. One important thing to consider while moving further with this issue is the Drupal.org upgrade to Drupal 7, which is in progress right now. I just looked at both the project_browser_server and drupalorg_pbs project pages and there are only Drupal 6 releases. Anything which gets deployed on Drupal.org right now should have a D7 version ready so that it does not become a blocker for the upgrade. Considering the speed of this issue and that everyone is busy with the upgrade, there is a high chance that this won't be deployed before the upgrade is done. So it might make sense to concentrate on the D7 version of the modules now.
I'll check the Solr issue and comment later.

Anonymous’s picture

@tvn - I wasn't aware of the upgrade plans, but that is good news, since D7 is so much better than D6 :)

I'll start a new branch for those for 7.x, but I won't be able to test any of the Apache Solr things until the dev environment is fixed and updated to D7... Sounds like this might be a while :(

kristiaanvandeneynde’s picture

If it's any consolation: I think you've done a great job on this and definitely should keep at it.

tvn’s picture

We should have development sites on D7 available sometime in September. Solr is fixed, it just does not index all dev sites. So when we have a D7 dev site for you, we'll add it to the index.

Anonymous’s picture

Awesome! When is Drupal.org planning on launching on Drupal 7?

chx’s picture

Priority: Major » Critical

I can't see how this can be less than critical if #1164760: [meta] New users universally did NOT understand that you can extend Drupal is critical, and that is. This needs to happen ASAP so we can drive core further. With the PHP writing issue getting ready, we finally have the groundwork laid in core to make it easy to add and upgrade modules and even upgrade core through the UI. We will get this ported to D7 in no time. If no one else does, ping me, I will do it. That should not hold this up.

So. What holds this?

dww’s picture

Re: "What holds this?"

Internally: Very little in this issue has changed since the last time I answered this question in comment #60. As far as I can tell, those are still the necessary next steps, plus the additional step of getting a viable solr index on a dev site we can test this stuff with. But, even if that last part doesn't happen immediately, everything I wrote in #60 is still basically valid.

Externally: What holds this is a lack of resources. Most of the people who can review/deploy this code are busy with the D7 port of d.o. @chx: If you want to spend your energy on this, that'd be fantastic, since I certainly don't have the bandwidth to move this forward myself. You could start with a thorough security audit. Then a performance audit would be useful. Anyway, I won't repeat myself -- please read #60. ;)

Cheers,
-Derek

chx’s picture

I will do a sec audit but #75 "sun and dmitrig had gone in and changed a lot of the API in Project Browser Server to get it more secure and performance friendly" -- that's why I asked what to do. But let me look at this.

Anonymous’s picture

@dww - Of the 3 points you made in #60, two of them involve someone from the security team reviewing the code and making sure it looks okay. Item C has already been done; there is a documentation page for the project now and it is referenced in the first post of this thread (which I try to keep up to date with status and what needs to be done).

I would like to resume coding on the module but I really can't do it until the dev environments are fixed so that Solr works again. That was a fundamental part of the drupalorg_pbs server module and the client can't really work without it.

I will get to work on making a 7.x branch of Project Browser Server, but I can't port drupalorg_pbs until the Solr issue is fixed and the dev servers are updated to Drupal 7. Once this happens, please update here and I will get that taken care of immediately.

@chx - I agree that this is a critical issue. WordPress has had this for years and we really need this in Drupal 8. We can start testing it and ironing it out in Drupal 7 now, since the prototype has been proven to work, but it does need to be deployed to drupal.org before that can happen. The reason I was wanting it deployed to drupal.org as soon as possible was so that we could start improving it and iterating on it in preparation for Drupal 8.

I would definitely love to get things moving on this. I'll see if I can get the Drupal 7 branch of Project Browser Server started today, but as I said earlier, won't be able to really test it until the Solr is working on the Dev sites.

Thanks dww and chx for your help and input :)

Anonymous’s picture

Issue summary: View changes

Updated the documentation link

Anonymous’s picture

Issue summary: View changes

Added some new todo items

Anonymous’s picture

I updated the code for project_browser_server and drupalorg_pbs to Drupal 7, but I am not able to test them yet because of the lack of a working dev environment. But the branches are there and that is a step in the right direction. Once the dev environments are fixed, I will test them and fix any issues, then they should be ready for code review.

On a side note, any idea on when Drupal.org is switching to Drupal 7? If it is too far in the future, maybe it would be better to get the Drupal 6 version tested and deployed first, and updating the Drupal 7 versions with any fixes? This would still require the dev environment to get Solr fixed so that any changes can be made and the modules need to be reviewed, but at least we can get the Project Browser for Drupal 7 working and start iterating on it and improving it.

Also, please see the topic post at the top with an updated list of remaining tasks.

dww’s picture

Thanks for starting the porting and the new branches, that's great!

For the record, I'd be totally fine deploying this on D6 and not blocking it on the D7 launch. I agree it'd be better to deploy this sooner rather than later so there's a chance to use the experience while working on D8 core features. To me, the inter-dependency with the D7 launch is only that the infra team is small and busy, and the D7 port is huge and urgent. So, no one has a bunch of "free" time to audit this, deploy, be available to put out any fires that it causes, etc. But, technically, I don't think we need to postpone this until the D7 launch.

In typical fashion around here, the Drupal.org D7 launch will happen When It's Ready(tm). See https://drupal.org/community-initiatives/drupalorg/drupal7 for more. No one is putting a firm date on it yet, since there's still much to do.

Thanks also for keeping the issue summary updated, that's a big help.

Cheers,
-Derek

drumm’s picture

Status: Active » Postponed

As part of Drupal.org's upgrade to Drupal 7, we are upgrading our apachesolr module. See #1549374: Upgrade drupalorg_crosssite to support 6.x-3.x version of Apachesolr (Prep for D7 d.o), #1549496: Upgrade project to support 6.x-3.x version of Apachesolr (prep for d.o. upgrade to D7), and #1548064: Upgrade drupalorg to support 6.x-3.x version of Apachesolr (Prep for D7 d.o). This upgrade is happening now so that we don't have a major upgrade of that along with the D7 deployment; fewer things to go wrong.

Until those issues are done and we are running smoothly on apachesolr 6.x-3.x, I don't really want to deploy anything else that might delay it. As part of that deployment, we'll get staging.devdrupal.org creating an index which all dev sites will have read-only access to by default.

In the meantime, help on those 3 issues would be appreciated. The solr-drupal.redesign.devdrupal.org dev site is being used to test. You could even use the same index as that site, but please do not write to it.

chx’s picture

Status: Postponed » Active

This might need to wait on the apachesolr 6.x-3.x deployment (which is really a shame! it's been a year already!) but meanwhile we can test. Please go to http://solr-drupal.redesign.devdrupal.org/admin/settings/apachesolr/sett... (you can get a one-time login URL by going to /var/www/dev/solr-drupal.redesign.devdrupal.org/htdocs and running drush uli) and copy everything to your site, but make very very sure to check the Read only flag.

Anonymous’s picture

@drumm - Does this mean that there won't be many (any?) changes that need to be made to the Solr functions in drupalorg_pbs? That would be great! :)

@chx - I'll give this a shot, but I am not much of a Unix guy, so if I run into any issues I'll post back here. Just to make sure I am reading this right, this means that I will be able to run Solr queries on my dev site, allowing me to continue testing and improving the 6.x version of drupalorg_pbs, and continue improving the Project Browser module as a result? That would be awesome! I've been in limbo for a while and this would mean I can start doing some coding again. :)

chx’s picture

Yes, it means you can run Solr queries on your dev site, but you can't change the Solr index.

Anonymous’s picture

@chx - I won't need to write anything. I've accessed the settings page you linked but I seem to have the old version of the Solr module installed in my instance. Should I try to install the 3.x branch? Where can I get those files? Just copy them from the other instance?

drumm’s picture

We just use regular modules from Drupal.org, so those should be good. Be sure to get all the dependencies and patches/branches from #92 here. The other dev site should be a good guide for what works.

We still want to be sure the 7.x version of project browser server is doing well, we don't want more blockers for Drupal.org on D7.

Anonymous’s picture

I've been muddling around a bit but can't get them to install without errors (update.php runs into errors). Is there any way I could just get a copy of the solr-drupal site imaged to my wildkatana site?

chx’s picture

I am not sure about a site copy, but I upgraded your apachesolr, which wasn't trivial because it seems the module lacks an update function to install the new tables in 3.x (wtf??). I also installed facetapi. Please let me know whether this works as it should.

Anonymous’s picture

You are awesome chx. There were a few other drupalorg_ modules that needed updating and I did those, but I am stuck at the drupalorg_crosssite module. I can't delete or edit it because you are the owner, and it looks like it is out of date. I get this error:

Fatal error: Call to undefined function apachesolr_multisitesearch_has_searched() in /var/www/dev/wildkatana-drupal.redesign.devdrupal.org/htdocs/sites/all/modules/drupalorg_crosssite/drupalorg_crosssite.module on line 690

Any way you could delete that module for me so I can update it to the latest one from the other dev site (I checked and it doesn't have the undefined function in it).

EDIT - I put the new version in the drupalorg folder and drupal is using that now instead, but would still be nice to get the old one removed.

Anonymous’s picture

Okay, getting closer to getting it to work again. It looks like some of the Project Solr things have changed. Can anyone point me in the right direction here?

Fatal error: Class 'ProjectSolrQuery' not found in /var/www/dev/wildkatana-drupal.redesign.devdrupal.org/htdocs/sites/all/modules/drupalorg_pbs/drupalorg_pbs.module on line 68

Line 68:
$query = new ProjectSolrQuery(apachesolr_get_solr(), $text_query, $solr_filters_string, $sort, '');

Has the class just been renamed? Or should I be doing this a different way now? Any help would be much appreciated.

chx’s picture

Googling that class name shows me #1549496-9: Upgrade project to support 6.x-3.x version of Apachesolr (prep for d.o. upgrade to D7). It is being deleted, but then again it never was much. Just use Solr_Base_Query.

Anonymous’s picture

EDIT - I'm having trouble with loading the node. It isn't loading anything for the nodes. Is it possible that the nodes table that is in my database isn't updated and the nodes I am trying to load don't exist?

Now I am having trouble with some of the filters. I'll keep digging...

Anonymous’s picture

By the way, I made a new branch 6.x-2.x which I am using for the support for Apache Solr 3.x. This is for drupalorg_pbs

chx’s picture

I talked to drumm and either you or I could just dump-restore the solr devdrupal db as you yourself suggested above. Such a thing took 37 minutes according to him. I can do it -- hop on IRC perhaps to discuss a good time to do it?

Anonymous’s picture

Would it be possible to just install the drupalorg_pbs and project_browser_server modules on the solr-drupal environment? I wouldn't need to do anything to the other modules, and it doesn't write anything to the database or affect other modules. It would speed up the iteration time if there are issues that need to be worked out with Apache Solr 3.x. Just a thought. Otherwise, we could see about just getting a copy of the solr-drupal environment and I can work with that. I'll hop on IRC now and see what people are on.

chx’s picture

Alas, this has been voted against. Sorry.

chx’s picture

Database is being copied over. I will do files next, set the solr env to read-only and you will be good to go in a few hrs.

Edit: you now have a copy of the solr site.

Anonymous’s picture

@chx - Thanks, I was able to get some code changed and get it half-way working again. There seem to be some errors with the project_solr module (it is happening on my site and on the http://solr-drupal.redesign.devdrupal.org/project/modules site as well). I think updating to the latest dev version (6.x-3.x) will fix it, but I can't delete the files because you are the owner. I've been borrowing some of the Apache Solr code from project_solr_browse_page() in the project_solr module, since lots of things have changed in 6.x-3.x and I don't know where to find documentation of the changes.

Currently the code from the project_solr module is only returning 19 matches instead of the expected thousands. Is there a working version of project_solr 6.x-3.x somewhere that I could take a look at?

$query->addFilter('bundle', 'project_project');
//$query->addFilter('bs_project_sandbox', '0');
//$query->addFilter('im_vid_' . _project_get_vid(), $project_type->tid); // If this is uncommented, I get 0 results
watchdog('debug', '<pre>'. print_r($project_type, TRUE) .'</pre>');
// We can only filter on bs_project_has_releases for just official projects,
// since sandbox projects can never have releases.
if (module_exists('project_release') && $query->hasFilter('bs_project_sandbox', '0')) {
  $query->addFilter('bs_project_has_releases', '1');
}

This is the page with blank queries that I am using for testing: http://wildkatana-drupal.redesign.devdrupal.org/project_browser/server/q...

It should be returning all modules that have releases and aren't sandboxed. But it's only returning 19 for some reason... Could the solr index just need to be rebuilt?
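
One side effect of the snippet above may be worth keeping in mind: while the bs_project_sandbox line is commented out, the hasFilter() guard is false, so the bs_project_has_releases filter is never added either. Here is a minimal sketch of that behaviour. StubQuery and build_project_filters() are hypothetical stand-ins written for illustration, not the apachesolr query class, and the module_exists() check is left out:

```php
<?php
// Hypothetical stand-in for the Solr query object. It only mimics the
// addFilter()/hasFilter() calls used in the snippet above.
class StubQuery {
  private $filters = array();

  public function addFilter($field, $value) {
    $this->filters[] = array($field, $value);
  }

  public function hasFilter($field, $value) {
    return in_array(array($field, $value), $this->filters, TRUE);
  }

  public function getFilters() {
    return $this->filters;
  }
}

// Builds the filter list. $with_sandbox_filter toggles the line that is
// commented out in the snippet above.
function build_project_filters($with_sandbox_filter) {
  $query = new StubQuery();
  $query->addFilter('bundle', 'project_project');
  if ($with_sandbox_filter) {
    $query->addFilter('bs_project_sandbox', '0');
  }
  // Same guard as the original code, minus module_exists().
  if ($query->hasFilter('bs_project_sandbox', '0')) {
    $query->addFilter('bs_project_has_releases', '1');
  }
  return $query->getFilters();
}

// Without the sandbox filter only the bundle filter is applied.
print count(build_project_filters(FALSE)) . "\n";
// With it, the sandbox and has_releases filters are applied as well.
print count(build_project_filters(TRUE)) . "\n";
```

So with the current commenting, the query is effectively filtered on bundle alone.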

chx’s picture

Fixed permissions. solr I can't answer.

Anonymous’s picture

I think it must be that the index just isn't complete. Going to this page http://solr-drupal.redesign.devdrupal.org/search/site shows only 19 projects...

dww’s picture

Correct, dev sites have truncated data sets (although it's surprising there are only 19 project nodes on that site). So, maybe it's a combo of truncated node/project data in the first place, and then a stale solr index or something. Anyway, don't get hung up on the absolute #s of projects in search results. Hopefully you should be able to ensure the basic functionality is working given the data you have, and once this is closer to live deployment we can put it on a staging server that has a more-or-less full copy of the d.o DB and test again with real data.

Cheers,
-Derek

Anonymous’s picture

@dww - Thanks for the explanation.

The Project Browser and Project Browser Server modules aren't really changing, it is just the drupalorg_pbs module that is undergoing changes. Part of the problem I am hitting is that there aren't enough projects to test the filters out properly. When all of the filters from the project_solr module are added, I get 0 results. I think the Projects that are on the dev site aren't projects which have releases, so this makes it so I can't test some of the other functionality and/or see the fields as they would be for real, active projects. The filters I am using are:

$query->addFilter('bundle', 'project_project');
$query->addFilter('bs_project_sandbox', '0');
$query->addFilter('im_vid_' . _project_get_vid(), $project_type->tid);
// We can only filter on bs_project_has_releases for just official projects,
// since sandbox projects can never have releases.
if (module_exists('project_release') && $query->hasFilter('bs_project_sandbox', '0')) {
  $query->addFilter('bs_project_has_releases', '1');
}

I'm not sure if it is showing 0 results because some of the filters are incorrect, or because there aren't enough projects to filter from. I got that code from the project_solr module, but it doesn't seem to be working correctly either...
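
One way to tell those two cases apart is to apply the filters one at a time and watch where the count drops to zero. The sketch below does this against a small in-memory array instead of Solr. The sample documents and the count_after_each_filter() helper are made up for illustration, and the im_vid_ vocabulary filter is left out because its field name depends on the site:

```php
<?php
// Hypothetical sample documents standing in for indexed nodes.
$docs = array(
  array('bundle' => 'project_project', 'bs_project_sandbox' => '1', 'bs_project_has_releases' => '0'),
  array('bundle' => 'project_project', 'bs_project_sandbox' => '0', 'bs_project_has_releases' => '0'),
  array('bundle' => 'project_project', 'bs_project_sandbox' => '0', 'bs_project_has_releases' => '1'),
  array('bundle' => 'page'),
);

// Applies each filter in turn and records how many documents survive it.
function count_after_each_filter(array $docs, array $filters) {
  $counts = array();
  foreach ($filters as $field => $value) {
    $kept = array();
    foreach ($docs as $doc) {
      if (isset($doc[$field]) && $doc[$field] === $value) {
        $kept[] = $doc;
      }
    }
    $docs = $kept;
    $counts[$field] = count($docs);
  }
  return $counts;
}

$filters = array(
  'bundle' => 'project_project',
  'bs_project_sandbox' => '0',
  'bs_project_has_releases' => '1',
);
foreach (count_after_each_filter($docs, $filters) as $field => $count) {
  print $field . ': ' . $count . "\n";
}
```

If the count is already tiny after the first filter, the dataset is the problem; if it only collapses at one specific filter, that filter is the one to look at.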

So far things seem to be working, but without more projects to draw from, I can't test much more than I have now.

On an unrelated note, are we planning on storing the dependencies in Fields so that modules can specify if they have any dependencies on the Project pages? That would be great because then I could use that information in project browser to display dependencies and even automatically install missing dependencies.

dww’s picture

Sure, I can relate to the troubles of trying to test things with truncated data. Sometimes that gives me grief, too.

The project_solr code is a mess, and is hopefully going away entirely in the D7 port.

In terms of dependencies, that's what Project dependency is for. Sadly, that was built as a d.o-specific solution, so I can't build interesting things on top of it in Project*, only in drupalorg*. :/ But, if this drupalorg_pbs module can benefit from it, knock yourself out. ;)

Cheers,
-Derek

Anonymous’s picture

Cool, I could probably use that in the future once I get things fully working again.

If the project_solr module is going away, what is going to replace it for using ApacheSolr to search projects on Drupal.org?

dww’s picture

The current plan is to use Search API + Views. We need to put this stuff into an issue, but we had a great call with DamZ, myself, Senpai and others on Friday morning. Some of the discussion is at #1699164: [Meta] Port drupal.org's solr search functionality to D7 but we'll want separate issues for a lot of this stuff. Anyway, we're getting off topic here...

Cheers,
-Derek

Anonymous’s picture

So should I still be trying to use Solr, or should I be just loading the queries from the database using SQL? Or try and use Search API?

dww’s picture

- Please continue to use Solr, not direct SQL queries.
- You can use either direct Solr queries or Search API, whatever's easier for you.
- Beware that Search API is not yet deployed on d.o, so that'd be an additional step (although it's something we want anyway, and there are already stable releases for both D6 and D7).
- Beware that I don't yet know much about the internals of Search API, so I'm not going to be much help if you go that route. ;)

Cheers,
-Derek

dww’s picture

Actually, upon further consideration, I'd pretty strongly advise *not* using Search API for D6. From what I understand, it likes to have separate indexes for whatever you're searching. I don't think it's easy to have Search API query against our existing Solr configuration, and at this point, there's no way we're going to backport the Search API stuff to D6.

So, just keep hacking away with D6 doing direct Solr queries to try to get this live ASAP.

Once we've got a working Search API-based (Solr-backed) project browsing system in D7, we can look at how to leverage that for this purpose. Perhaps we won't need any of this code at all, since all the project-browsing pages will be views, and we could just attach a "render this view as JSON" display directly to each view. Or something. ;) Point being, I wouldn't invest too much energy in D7 ports yet since that stuff is all about to change. Just focus on getting the D6 version working and solid so we can deploy. By then, the D7 project browsing stuff will be far enough along that we'll know how it's going to impact all of this.

Thanks,
-Derek

Anonymous’s picture

I'll keep at it with Solr then. Not much else I can test though without more data. So far it looks like it is working and it is just limited by the dataset.

Also, the 6.x-1.x branch already works with the current ApacheSolr that is on Drupal.org now, if we want to see how it does under live conditions. Then when d.o moves to use ApacheSolr 3.x, we can use the drupalorg_pbs 6.x-2.x branch, which is already working with that.

The Project Browser Server module hasn't changed and won't need to change, so that is ready for immediate deployment as well.

The Drupal 7 ports are done, but I didn't invest much time into it. There were only a few changes needed.

Anonymous’s picture

Issue summary: View changes

Updated the remaining tasks

Anonymous’s picture

Status: Active » Needs review

I'm changing status to Needs Review to see if we can get the code reviewed as listed in the opening post.

Project Browser Server is ready for deployment on drupal.org right now. drupalorg_pbs 6.x-1.x is also ready, but I don't know if you guys wanted to deploy it now or if you wanted to wait until Apache Solr 3.x is deployed first.

When D.o moves to using Apache Solr 3.x, drupalorg_pbs 6.x-2.x is mostly ready for that (I say mostly because I would have liked to do more extensive testing but the development server solr index is too limited).

Deploying Project Browser Server and drupalorg_pbs would enable me to start iterating on the actual Project Browser module with improvements and fixes, as well as begin development of an 8.x branch that will be ready for Drupal 8 Core inclusion. As it stands now, I can't code the 8.x branch until I get a working server to use for the data.

drumm’s picture

As I said in #92, Solr 3.x should be deployed first. We are cleaning up and testing it at drupalcon Munich and deploying when people are back online afterward.

Anonymous’s picture

We still need to get the code review done by someone. Who should I ask for this?

killes@www.drop.org’s picture

chx seems to be interested in this project, so I propose you ask him.

Anonymous’s picture

@chx - Mind giving the Project Browser Server module a quick run-through? It is ready for deployment, even if drupalorg_pbs 6.x-2.x is not yet. Project Browser Server could also use a review, since nothing has changed in it for a while.

Drupalorg_pbs is working but there are things that still need tweaking I am sure, but I can't really do them without a proper server to test on. To do that, I'll need to get access to a server running Apache Solr 3.x with a decent sized index of projects. I'll add this to the tasks up in the opening post.

Anonymous’s picture

Issue summary: View changes

Updated tasks

chx’s picture

Reviewing now. Will post tomorrow.

chx’s picture

I am so very sorry for not doing it -- I promise I will get back to this -- but there's almost zero chance of this happening before drupal.org on D7 it seems :(

Anonymous’s picture

@chx - No worries, I was busy too (my first kid was born 2 weeks ago). Can you or someone update this thread once the Apache Solr is working properly in the dev environments and they are using Drupal 7? Then I can continue testing and get this ready for deployment on d.o

Can't wait til d.o is running D7, it will be a big step up.

Anonymous’s picture

@chx, @dww or @bojhan - Is there any way we could get a status report on this, or a link to the issue to port Drupal.org to D7? I still would like to push this and am just wondering if there is anything I can do while we wait, or help on other issues that are holding this one up.

Also, have the development sites been updated to have a more complete Apache Solr project database table so I can continue testing the filters?

Senpai’s picture

Priority: Critical » Major

@wildkatana, the drupal.org D7 upgrade is indeed nearing completion. The site is in the code integration and final QA phase now, and should be done before the end of the month. Here are a few links to things which may be of interest to this issue:

  1. The Drupal.org D7 Upgrade Initiative: https://drupal.org/community-initiatives/drupalorg/drupal7
  2. The roadmap for Project module on D7: [#1551120]
  3. ApacheSolr has to be upgraded to 6.x-3.x on drupal.org (D6) first: #1700572: Manually test all existing ApacheSolr functionality in D6 to determine what, if anything, is still broken followed by #1793968: Deploy Apache Solr Search Integration 6.x-3.x
  4. The ApacheSolr A/B checklist: https://docs.google.com/a/transparatech.com/spreadsheet/ccc?key=0Ao3Eq4z...
  5. Our ApacheSolr dev server URL: http://solr-drupal.redesign.devdrupal.org

I hope that helps you continue forward with this awesome issue! And super-duper congrats on your first child! Girl or boy, it's a future coder for sure. ;)

webchick’s picture

I think what makes that timing tricky is that wildkatana is targeting a feature for D8 feature freeze, which is 12/1. If he doesn't get a dev environment set up until after 10/31, that makes the likelihood of completing it in time very small. :\

If the whole team is going balls-out crazy on D7 migration right now though, and there's no way of providing wildkatana with a Solr index, that's fair enough, and I guess we'll just have to hope 30 days is enough time for wildkatana and chx to sprint on this. But since he's been waiting for this dev environment for literally months, it'd be really nice if we could help him out.

Senpai’s picture

As @drumm stated in #122 above, there's a desperate need to get ApacheSolr 6.x-3.x completed and deployed to the live production site before anything else can happen D7-wise. When the following issue goes green, we can move forward with this Project Browser Server issue, but not until then:

#1793968: Deploy Apache Solr Search Integration 6.x-3.x

It's a day-to-day battle to get ApacheSolr 3.x out the door, and if we can test it and finalize it before this week is over, we can deploy it at 1pm PDT on Thursday, Oct 18th during the weekly push-to-prd. Once ApacheSolr 6.x-3.x is scheduled for deployment, all attention can turn towards this drupalorg_pbs issue because we will then be in a position to offer a 3.x Solr core with a full drupal.org dataset with which to test the P.B.S.

webchick’s picture

Ok awesome, thanks for an actionable step I can point people toward who want to see this happen! :)

Anonymous’s picture

@Senpai - You rock! Thanks for all of the links and detailed status updates. I'll give them all a read and see if I can help out on any of them. Fingers crossed about that Oct 18th goal, that would be absolutely awesome if I could resume coding on this again and get things ready for D8.

@webchick - I agree the sooner the better. I am willing to prioritize this though so whether it's 45 days or 15 days I'll be pushing to get this into D8 core, since I really feel that it is ESSENTIAL to helping beginners both find and install new modules to their sites :)

Anonymous’s picture

Issue summary: View changes

updated tasks

Anonymous’s picture

Issue summary: View changes

Added some links for the status

Anonymous’s picture

Issue summary: View changes

Moved link

Anonymous’s picture

I've been doing more testing and I have hit a wall in that the node table on my dev environment is missing the vast majority of project_project nodes. This makes it impossible to test the filters effectively. Here are the projects it has:

Array
(
    [1676180] => Media: Imgur
    [1676184] => Crowd SSO
    [1676234] => Twitter Bootstrap Front Page
    [1676236] => Ranks API
    [1676276] => Call Me
    [1676284] => GraphMath CAPTCHA
    [1676644] => Drupal - Talo
    [1676658] => Aviberry
    [1676676] => Rules Batch Loop
    [1676734] => Website Thương Mại Điện Tử
    [1676742] => Commerce Immediate Login
    [1676950] => Image Base64 Formatter
    [1676994] => Opinion Monitor
    [1677008] => first2
    [1677060] => Experian Cheetahmail
    [1677152] => Conditional Scripts
    [1677186] => Facebook Field
    [1677198] => Http Client Custom Delegates
    [1677250] => ActiveHelper LiveHelp
    [1677398] => Views Content Cache - Taxonomy Term Name
    [1677430] => Library Catalog Administration
    [1677456] => Valve
    [1677460] => Cron Last
    [1677600] => Mail Statistics
    [1677864] => Webform DynamicFields
    [1678050] => Email Update on Login
    [1678054] => Mica
    [1678086] => Diff compare
    [1678114] => Simple Help
    [1678126] => Context Disable Context
    [1678168] => Change theme
    [1678188] => Parent Child Menu
    [1678224] => Drupal Commerce Connector for AvaTax Calc™
    [1678226] => phpunit2
    [1678268] => whami Map Toolkit
    [1678276] => Media: Kaltura D7 Port
    [1678362] => Node templating enhancement
    [1678454] => Commerce Promo Codes
    [1678466] => Queue Manager
    [1678476] => Devel PHP Exec Extra
    [1678548] => Browscap Block
    [1678576] => Ubercart Connector for AvaTax Calc™
    [1678636] => Teasers
    [1678674] => Role activity
    [1678764] => coderwall2
    [1678766] => Search API Mutli Index Facets
    [1678780] => White Label
    [1678824] => CSV Importor
    [1678836] => Testing callbacks (fork)
    [1678848] => Date pager select
    [1678950] => MailUp Integration
    [1679122] => Administer menu permissions
    [1679152] => Geofield Gmap
    [1679154] => File Checker
    [1679420] => Urtak
    [1679496] => LineClears
    [1679534] => Advanced scheduling field
    [1679556] => timepicker
    [1679598] => Giant Bomb
    [1679608] => Login Terms and Conditions
    [1679682] => Fieldable panels panes feeds
    [1679706] => IRCname
    [1679800] => Exam
    [1679816] => Debut RedHen
    [1679826] => Blockexport
    [1679844] => Entityreference Instead of Product Reference
    [1679936] => Better Revisions
    [1679952] => Basic Shortcodes Library
    [1679958] => Jquery Week Calendar
    [1679964] => PDF to Image Converter
    [1679966] => Puzzle Formatter
    [1679982] => rtsudoku
    [1680086] => Commerce Simple Invoice
    [1680130] => Views Contextual
    [1680564] => yandex-pinger
    [1680888] => Workflow Fields port to D7
    [1680972] => Metatag paths
    [1681202] => Instant search - Node
    [1681270] => ATAC
    [1681302] => SoShake
    [1681366] => Buckaroo | Ubercart payment
    [1681408] => Book Clone
    [1681416] => Core office hours tasks
    [1681452] => Drupal Hybrid Computing
    [1681532] => NodeMaker
    [1681616] => Search Module Enhancement
    [1681620] => AuthNewzware
    [1681622] => Patch Drupal Core
    [1681684] => ergonlogic's copy of Remote import - Provision
    [1681736] => Epsilon Greedy
    [1681742] => Bartik Responds
    [1681762] => California Sales Tax Calculation
    [1681770] => UC Gift Certificate (D7)
    [1681812] => Library Catalog Mail
    [1681842] => Sites Access
    [1681848] => Comment Antispam
    [1681866] => PagSeguro API
    [1681932] => Ti Mama
    [1681934] => MediaElement Poster
    [1681970] => Awkward Showcase
    [1682028] => uc_nab_payment
)

Most of those projects don't even have any releases, which makes it doubly hard.

Is there any way to get the node table for my dev environment (http://wildkatana-drupal.redesign.devdrupal.org) updated to a more complete copy, so I can test the filters effectively? I think the http://solr-drupal.redesign.devdrupal.org environment has a more complete node table, could we just copy it over?

webchick’s picture

You should be able to get at the settings.php files for any of these sites so feel free to do whatever database manipulation you need to on your own site.

Anonymous’s picture

@webchick - I was able to import the node table successfully with some drush commands, thanks.

webchick’s picture

Awesome. :D Keep on keepin' on! :D

Anonymous’s picture

I was able to get a lot of coding done today (about 12 hours solid haha). I got things working with the new Apache Solr 6.x-3.x, which will make changing to 7.x very easy when it is needed (hopefully right after d.o gets Drupal 7).

I also fixed a bunch of usability issues in the Project Browser module itself, and added a new function that will attempt to find missing dependencies for modules you are installing and install them recursively. This will greatly help users who are new to Drupal and don't know what's going on with the whole missing dependencies thing. I made a quick screencast that shows how that works.

http://www.youtube.com/watch?v=ddC7M6sy5lA
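
For anyone curious about the general idea, recursive dependency lookup can be sketched roughly as follows. This is not the Project Browser code. The module names and find_missing_dependencies() are hypothetical, and the dependency map is hard-coded here, whereas on a real site it would come from each module's .info file:

```php
<?php
// Hypothetical dependency map (module => modules it requires).
$dependency_map = array(
  'example_gallery' => array('example_images', 'example_loader'),
  'example_images' => array('example_files'),
  'example_files' => array(),
  'example_loader' => array(),
);

// Returns the dependencies of $module that are not yet installed, ordered
// so that each module comes after the modules it depends on.
function find_missing_dependencies($module, array $map, array $installed, array &$seen = array()) {
  // Remember visited modules so circular dependencies cannot loop forever.
  $seen[$module] = TRUE;
  $missing = array();
  $dependencies = isset($map[$module]) ? $map[$module] : array();
  foreach ($dependencies as $dependency) {
    if (isset($seen[$dependency])) {
      continue;
    }
    // Resolve the dependency's own dependencies first.
    $missing = array_merge($missing, find_missing_dependencies($dependency, $map, $installed, $seen));
    if (!in_array($dependency, $installed, TRUE)) {
      $missing[] = $dependency;
    }
  }
  return $missing;
}

$to_install = find_missing_dependencies('example_gallery', $dependency_map, array('example_loader'));
print implode(', ', $to_install) . "\n";
```

Listing dependencies before the modules that need them means the result can be installed in order.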

Anonymous’s picture

Now that (I presume) Apache Solr 6.x-3.x has been deployed on Drupal.org, and drupalorg_pbs has been updated to use Apache Solr 6.x-3.x (no changes were needed for Project Browser Server module itself), is it possible to get these two modules deployed on Drupal.org, and then make an upgrade issue for when Drupal.org gets 7.x?

I don't anticipate many changes being needed to the modules just for upgrading to Drupal 7 (just hook_perm to hook_permission, and a few other small things).

I am assuming the Apache Solr 6.x-3.x API isn't going to change between Drupal 6 and 7. Am I correct in this assumption?

If we have to wait until the Drupal.org 7.x update is done, I understand. Either way, I will need a dev environment with Drupal 7 running on Drupal.org so I can update the modules to Drupal 7.

Anonymous’s picture

Any updates on this? I take it Drupal.org is not running D7 yet, from what I read here: http://drupal.org/community-initiatives/drupalorg/drupal7

Is there any chance of it running it before the feature freeze for D8? I have both the Drupal 6 and Drupal 7 versions of Project Browser Server and drupalorg_pbs ready. Could we just deploy the D6 version for now so I can proceed with getting the Project Browser module ready for D8? When drupal.org switches to D7, we can just switch the modules. They should work just fine.

Senpai’s picture

Assigned: Unassigned » sdboyer
Issue tags: +drupal.org D7, +porting

@wildkatana, the drupal.org site will not re-launch on D7 *before* the D8 code freeze, as it's too disruptive to core developers to improve upon known workflows in the middle of heated development progress. Poll and discussion at http://groups.drupal.org/node/265698.

Pinging @sdboyer about deploying the D6 version first, and substituting the D7 version into the nightly automated builds. Gotta make sure it won't break anything on the new http://d7demo.devdrupal.org D7 site.

Anonymous’s picture

@Senpai - Thanks for the update on this. If we could go ahead and deploy the 6.x version to Drupal.org so we can get started on real development for D8 and get something ready for Core, that would be great. It is working on my staging server just fine.

There is no upgrade path required for these modules (no .install files), so we can either deploy the D7 versions on the d7demo staging site now as well or wait, either way is fine with me.

I haven't actually tested the D7 versions yet because I don't have access to a D7 staging site, but the code *should* work (I changed all of the functions to their D7 counterparts).

dww’s picture

I hope whatever we're trying to build in D8 core that talks to this isn't going to be deeply tied to the notion that d.o is using solr. I still haven't seen the interface this server is providing. So long as everyone involved in building this service keeps a clean generic API between d.o and core in mind, I'm cool with this proceeding with the D6 version deployed now. I also wish there was time for someone to do a real code review and security/performance audit, which AFAIK hasn't happened, right?

Sorry for all the delays on this over the months/years. :/ d.o is an awful lot of work with a lot of moving pieces...

-Derek

Anonymous’s picture

@dww - Sun and dmitrig both did code reviews and we got a lot of things changed and tightened up. The Project Browser Server is the API module which is very loose, and the drupalorg_pbs module is the implementation of that API, using ApacheSolr to get the results. If the methods change, it wouldn't be hard to write another implementation module using the new way of getting projects, whether that be Views, direct database queries or something else entirely.
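
To illustrate the split between the API module and its implementation, here is a rough sketch. It is not the actual Project Browser Server API (the real hooks are documented at the link in the issue summary). ProjectSource, ArrayProjectSource and serve_results() are hypothetical names, and returning JSON is an assumption made for the example:

```php
<?php
// Hypothetical contract that any backend (Solr, Views, SQL, ...) would satisfy.
interface ProjectSource {
  public function search($text);
}

// Hypothetical backend that searches a plain array instead of Solr.
class ArrayProjectSource implements ProjectSource {
  private $projects;

  public function __construct(array $projects) {
    $this->projects = $projects;
  }

  public function search($text) {
    $matches = array();
    foreach ($this->projects as $project) {
      // An empty search string matches every project.
      if ($text === '' || stripos($project['title'], $text) !== FALSE) {
        $matches[] = $project;
      }
    }
    return $matches;
  }
}

// The serving side only knows about the interface, not the backend.
function serve_results(ProjectSource $source, $text) {
  $results = $source->search($text);
  return json_encode(array('total' => count($results), 'projects' => $results));
}

$source = new ArrayProjectSource(array(
  array('title' => 'Example Gallery', 'type' => 'module'),
  array('title' => 'Example Theme', 'type' => 'theme'),
));
print serve_results($source, 'gallery') . "\n";
```

Swapping Solr for another backend would then mean writing one more class like ArrayProjectSource, while the serving code stays the same.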

I know you guys are busy, and I appreciate the attention this has gotten. I really do see this as a critical usability issue based on my dealings with Drupal users who more often than not have no idea how to find or install modules.

Senpai’s picture

Status: Needs review » Needs work

"I really do see this as a critical usability issue" Let's get this *fully* tested in a D7 dev server before we consider it deployable to the D6 production site.

Anonymous’s picture

@Senpai - Can I get a Drupal 7 dev server then so I can test it? Or can we try installing it on the d7demo one? It won't interfere with anything else.

tvn’s picture

D7 dev site has been created: http://pbs-drupal_7.redesign.devdrupal.org/
Solr seems to be working as well.

Anonymous’s picture

@tvn - Thanks for this. I've been trying to get the code to work and it looks like there are some differences. I'm working through them now but the project_solr module code (which the drupalorg_pbs module was based off of) seems to have changed a lot, so it's slow going.

Are we going to even still use ApacheSolr to power the project search pages? The project_solr module was disabled and when I enabled it, it didn't seem to work.

Anonymous’s picture

Here's a link to show what I mean about the project_solr module not working: http://pbs-drupal_7.redesign.devdrupal.org/search/project/project_module

Anonymous’s picture

Okay I managed to get things mostly working, with just a few hiccups left to resolve:

1)

Modules and Themes are now their own content types, which is good, but I can't filter based on this. The apacheSolr facet 'bundle' only seems to work with 'project_project'. I tried setting it to 'project_module', 'module', etc but it didn't work. This means that I am not able to filter out themes or modules, so the results combine both. Am I doing this wrong?

In 6.x-2.x this was done using the $query->addFilter('im_vid_' . _project_get_vid(), $project_type->tid); filter. But the _project_get_vid function no longer exists (supposedly because we are using Content Types now).

2)

The images that used to be attached to projects are missing. Will these be imported later? See:
Live: http://drupal.org/project/token
Sandbox: http://pbs-drupal_7.redesign.devdrupal.org/project/token

I was using those, so in the meantime I have switched it to send the screenshots if there are any, but it is less than ideal.

tvn’s picture

Are we going to even still use ApacheSolr to power the project search pages? The project_solr module was disabled and when I enabled it, it didn't seem to work.

There is an active discussion right now about the search options for D7. Particularly this comment #949372-33: Port issue views to Search API so we can have a performant backend is not promising anything good for project_solr module.

The images that used to be attached to projects are missing. Will these be imported later?

D7 upgrade is in progress, and a lot of things just aren't done yet. Here is an issue for images: #1599154: Migrate d.o project image field to D7

Anonymous’s picture

@tvn - Thanks for the links. I've seen talk of wanting to kill solr as well and move to a Views or Search API backend. In the end they are all the same to me, and I can use whatever method is decided on if and when it changes.

This issue is at a critical crossroads right now. D8 code freeze is just a couple of weeks away. If Project Browser is going to make it into Drupal 8 core, drupalorg_pbs and Project Browser Server need to be deployed to drupal.org NOW. This issue is over a year old and most of it has been complying with this or that new requirement and then waiting for people to review it. The requirements keep changing because Drupal.org keeps changing.

At this point I am convinced that it will never stop changing, and that isn't necessarily a bad thing. The point is that I will always need to be maintaining the drupalorg_pbs module as the methods and requirements change. I have been doing this for the past year+ and I am willing to continue doing this for as long as needed.

We now have a completely working version of both Project Browser and drupalorg_pbs available to be deployed, and a mostly working Drupal 7 version of the same. Drupal 7 looks like it is still a long way off at this point, judging by how broken the staging sites I've seen are.

Is there any way that we can get Project Browser and drupalorg_pbs deployed to drupal.org now while it is running 6, and then when Drupal 7 is ready, I will update drupalorg_pbs (Project Browser Server won't need any changes) to use whatever new method we are using at that time?

This will let me focus on getting Project Browser ready for D8 Core before the code freeze so that this has a shot at actually being used by people.

Project Browser Server and drupalorg_pbs don't affect any other systems on the site. They are self-contained, so I don't see what the harm would be in putting it on drupal.org now. They have been reviewed for security and performance already. If you have to upgrade Drupal.org to 7 and drupalorg_pbs is not ready yet (though I don't see why it wouldn't be since I will continue updating it as the new methods are decided on) then it can simply be disabled until it is ready. Disabling it won't affect any other systems.

At this point I just want to get something done with this issue so that we can make some progress. I have done all that I can do right now with the 7.x branch of the modules since we are waiting on other issues to be resolved now.

In the meantime, I am going to continue development of the Drupal 8 version of Project Browser and just use the dev7 server for now, even though it isn't fully functional because of the issues I mentioned. I can't use the dev6 server now anyway, because it looks like the ApacheSolr server it was using has been shut down. It would be great to have a fully functioning drupal.org server to query during development, but I'll start with what I have now.

Anonymous’s picture

Well, after a day of coding, I was able to get Project Browser working on a fresh Drupal 8 install using the latest dev version. So far so good, only a few changes were needed. I'm sure more will be needed before the end.

@webchick or any other Core maintainer, what do I need to do to get the Project Browser module included in Drupal 8 Core? Should I open an issue somewhere? Should I provide it as a patch?

David_Rothstein’s picture

In the meantime, I am going to continue development of the Drupal 8 version of Project Browser and just use the dev7 server for now even though it isn't fully functional because of those issues I mentioned.

As a very interested observer (but one who hasn't really participated in this discussion until now), I just wanted to chime in and say that I think this makes perfect sense. To be honest, I would think it should even be possible to get an initial patch committed to Drupal 8 core in that state, if necessary. Obviously that would require some discussion and a critical followup to switch over to the real server... but issues get committed to core all the time that require critical followups, and usually with much, much, much less of a good reason than you have here :)

So I would think the thing to do at this point is indeed to create an issue in the core queue with a patch. (There's an existing issue at #395478: Plugin Manager in Core: Part 3 (integration with installation system) I've seen which might be relevant, not really sure though and it's kind of old so a new one wouldn't hurt either.)

Anonymous’s picture

I have opened an issue to get Project Browser put into Core: #1841788: Add project browser. Crossing fingers!

dww’s picture

A few quick replies based on some recent comments:

- Thanks again to wildkatana for your patience and endurance dealing with d.o hassles. I know it sucks to be in your shoes. You're handling it really, Really well. Thanks!

- Solr on d.o isn't dying. It's just not a good solution for the *issue queue* listings. That doesn't have any impact on the project browsing pages, which is what you care about. project_solr.module has gotten a ton of work in the last couple of weeks, so be sure you're looking at the very latest code before you get too worked up about any problems you're encountering. ;)

Again, I'd really be opposed to all of this if the API between the client and server knew anything about the backend the server is going to use to retrieve the answers the client wants to know. So long as the API is clean, and we can change the server without breaking the clients, I'm happy. Sounds like you're mostly just banging your head against solr for drupalorg_pbs, which makes sense from a system architecture standpoint. That *should* be the pain point. If nothing else cares, that's a good sign the rest of the system is well designed.

- Not having actually reviewed any code here, I support the tactic of deploying the D6 version on the live site so that you can continue to get the client ready for core. Probably by the time d.o is actually ready to launch on D7, you'll have the D7 port fully working. If not, so long as you're happy with this system being disabled, I don't see any reason to keep blocking this. My primary concern would be deploying this on D6 and then having it be Yet Another Blocker(tm) for the D7 launch, but if it can just be turned off if it's not ready, it's not a blocker. ;)
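The client/server separation described in the points above can be sketched roughly as follows. This is a minimal Python illustration only, not the actual project_browser_server code (which is a Drupal module written in PHP); `Project`, `ProjectBackend`, `StaticBackend`, and `handle_query` are hypothetical names invented for the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Project:
    """One search result as the client would see it (hypothetical shape)."""
    machine_name: str
    title: str
    project_type: str  # "module" or "theme"


class ProjectBackend(Protocol):
    """Anything that can answer a project search: Solr, Views, a fixture..."""

    def search(self, text: str, project_type: str) -> list[Project]:
        ...


class StaticBackend:
    """Hypothetical backend that serves a fixed list, handy for testing."""

    def __init__(self, projects: list[Project]) -> None:
        self._projects = projects

    def search(self, text: str, project_type: str) -> list[Project]:
        needle = text.lower()
        return [
            p for p in self._projects
            if p.project_type == project_type and needle in p.title.lower()
        ]


def handle_query(backend: ProjectBackend, text: str, project_type: str) -> dict:
    """Build the response sent to the client; it never names the backend."""
    results = backend.search(text, project_type)
    return {
        "total": len(results),
        "projects": [
            {"machine_name": p.machine_name, "title": p.title, "type": p.project_type}
            for p in results
        ],
    }
```

Because `handle_query` depends only on the `search` method, swapping a Solr-backed implementation for another one would not change the response the client parses, which is the property being asked for here.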

Cheers,
-Derek

Anonymous’s picture

@dww - Thanks for the input, and thanks for the work on project_solr. Without it, I would really have been lost (I haven't been able to find any real documentation for it, so your module was about all I had haha). If ApacheSolr is here to stay for the Project Listings, that's fine with me. And yes, the project_browser_server module itself hasn't changed at all really, just the drupalorg_pbs (which uses ApacheSolr to serve the queries). So you are correct about that.

And YES, please feel free to disable the module if it is blocking anything. I am going to continue working on the D7 version and can guarantee a working version by the time of the D7 launch, but like you said, it shouldn't be seen as a blocker if it is not. :)

And getting a fully working server in place will be VERY helpful in getting work on the client completed, especially since there is an issue to get it put into Drupal Core now. So if we can do that, I am all for it! :)

dww’s picture

Assigned: sdboyer » Unassigned
Status: Needs work » Needs review

Great. Given that, I'm not sure 'needs work' and assigned to sdboyer makes sense. Seems like this is basically ready, huh?

The next question is when should deployment actually happen? Normally, we try to deploy stuff on Thursdays. I don't think it's reasonable to do this right now, but next week is Thanksgiving. I should probably hop on IRC and try to find some other infra team members to discuss this.

The other question is how available you are to support this once it's live on the production site. If anything starts going wrong, we'll need your help to debug and resolve problems. I can't take this on myself, since a) I don't have time and b) I don't know anything about the code. So, I can't deploy it in good conscience if there's no explicit agreement from you to be available for follow-up support once it's deployed. I'm sure you'll say "of course" but I needed to officially ask. ;)

-Derek

Anonymous’s picture

@dww - If we could get it deployed sooner that would be better IMO. If there are any issues or errors (there shouldn't be, it worked on the staging site just fine), just disable it for now. I'm not going anywhere, Drupal is what I do all day every day, so I fully intend to continue maintaining this indefinitely. And I'm particularly available this week and weekend. Next week is Thanksgiving like you said, so I won't be as available.

Note that the latest dev versions are what should be used.

dww’s picture

Status: Needs review » Needs work

Cool, thanks. Generally, we prefer to deploy from official releases when possible. Could you tag some official releases of whatever we need to deploy? It doesn't have to be 6.x-1.0 -- it could be a beta1 or even an alpha1. Just something tagged is good.

Also, can you update the issue summary with an ordered list of all the steps that need to happen to actually deploy this? Links to the relevant project pages (and releases, if possible), any configuration that needs to happen, permissions, etc. Basically, assume that the person deploying this on the live site doesn't know anything about the module(s) and spell out exactly what they should do. We're not stupid, just ignorant about your code. ;) Make sense?

Please set this back to 'needs review' when the deployment instructions are done, and ideally pointing to some tagged releases.

Thanks!
-Derek

dww’s picture

Issue summary: View changes

Cleaned up the motivation message

Anonymous’s picture

Status: Needs work » Needs review

I have updated the issue summary with a Deployment Checklist that links to the proper releases (I tagged them). Thanks for pushing this through! :)

Anonymous’s picture

It would be nice if we could get some security reviews for this so we can get it deployed. It's been reviewed by sun, but I think we might need another security review? This has been sitting ready for the past few weeks now.

#1841788: Add project browser will need this. The 7.x version still has some bugs while we wait for related 7.x issues to be resolved, and those bugs are confusing the people reviewing the Add Project Browser issue. I think we should get this deployed as soon as we can, as in yesterday :)

Anonymous’s picture

Since #1599154: Migrate d.o project image field to D7 is marked as fixed now, how can I get at those image fields on http://pbs-drupal_7.redesign.devdrupal.org? I didn't see the image field when I did a print_r on the $node object. Has it not been moved to my dev environment yet? Can I get it moved, or just a reimage or something?

webchick’s picture

Yeah, needs to be re-imaged, the dev environments don't automatically pull in upstream improvements, in order to allow for stable development environments.

tvn’s picture

wildkatana, instructions on re-imaging dev sites: http://drupal.org/node/1182848

Anonymous’s picture

Thanks tvn: #1858426: I want my drupal.org development site re-imaged

This isn't a blocker though and this issue is still ready to be deployed right now (as per the instructions in the OP).

sdboyer’s picture

@wildkatana - the only question wrt the D7 upgrade process really is: what sort of an upgrade path is there from 6 to 7? we do need to try to keep the build from breaking with this...if possible.

i'm not exactly sure how to get you a good environment to test that upgrade path, but if you've done it to a reasonable degree of satisfaction, i don't want to block getting this out there and deployed. so long as you understand that we may be making intense demands on your time for a couple days if the upgrade path DOES break :)

jonhattan’s picture

Here're my 2 cents

@sdboyer, wildkatana said in #143:

There is no upgrade path required for these modules (no .install files)

@wildkatana great work!

Anonymous’s picture

@sdboyer - jonhattan is correct, nothing is going to break on Drupal.org. The 7.x version is already half-working (it is being used as the current server for #1841788: Add project browser), and I am going to get it fully working once I get my dev environment re-imaged (the images field was missing, along with a few other things that need to be pulled in).

cweagans’s picture

Priority: Major » Critical

I'm not sure why this was demoted to major in #130. There was no explanation. This is still really important. If the project browser is not going into core, we need to let it mature in contrib, and that can't happen without deploying the project browser server first. @Infra team, is there anything that anyone can do to help move this along?

cweagans’s picture

To clarify, I will spend some time on this if there's anything that can be done to move this along quickly.

douglasmiller’s picture

@cweagans there is another issue in the infrastructure queue that is related to getting the d.o site functional again. #1897088: I need a dev server to test the d8 project browser module

The summarized version is that Wildkatana requested that his d.o dev site be re-imaged and all of his configurations were wiped in that process. I was given access to the d.o dev server and have enabled and configured the modules that are necessary for the project_browser module to work. The Solr index that the d.o dev server uses does not include the Project listings though.

I don't know much about how the dev servers work, but presume that there is a default Solr server (stagingsolr.drupal.org) that excludes the Project listings. Discussion about a similar problem starts around comment #90 in this issue.

For now, I've enabled an example module that returns static results.

webchick’s picture

Note that in order to try and pre-emptively address catastrophic failures like this issue, we are aiming to establish a Drupal.org Software Working Group governance structure. People involved in this issue might have thoughts to share over there, so cross-linking.

Anonymous’s picture

Actually in this case I had complete backups before requesting reimaging. So there wasn't any issue here with that. The problem has been trying to get ApacheSolr working on the dev environment with a good project database to draw from.

tvn’s picture

Issue tags: -drupal.org D7, -porting

removing D7 upgrade tags

tvn’s picture

one more time

sabotagenl’s picture

this would be a neat feature! Subscribing.

cweagans’s picture

@sabotagenl, You don't have to comment to subscribe anymore. You can just click the "Follow" button at the top of the page.

douglasmiller’s picture

Deployment instructions for server running Drupal 7

  1. Download and enable the Project Browser Server module to provide the API.
  2. Download and enable the Drupalorg PBS module which implements the API.
  3. Grant the access project browser server permission to anonymous and optionally authenticated users.

Dependencies

douglasmiller’s picture

Issue summary: View changes

Updated the Deployment Checklist

groovedork’s picture

Sooo... is this going into core or not? It would increase the usability of Drupal quite a bit in my opinion (as an interaction designer).

Project: Drupal.org infrastructure » Drupal.org customizations
Component: Drupal.org module » Miscellaneous
webchick’s picture

Version: » 7.x-3.x-dev

Hm. Is this some kind of issue bulk mover script, and if so, when do the rest of us get access? :) *drool*

tvn’s picture

webchick, it's public! Just write a function like this:
http://cgit.drupalcode.org/drupalorg/tree/drupalorg/drupalorg.install#n1680

webchick’s picture

Aw. ;( I was hoping there was a UI. :)

webchick’s picture

Bumped #1093650: Provide VBO support for issue management again. And now shutting up. ;) Sorry for the noise.

chx’s picture

Issue tags: +sad chx
Anonymous’s picture

Since this has stalled out, I'm going to push for a stable release of Project Browser as a Contrib module, and host the server myself. I still would love to see something like this put into Drupal core eventually, but maybe having a good solid contrib would be a better first step.

I spent a few days getting it cleaned up and working. Look for a beta release of the Project Browser module later today: https://www.drupal.org/project/project_browser

drumm’s picture

Issue tags: -

Drupal.org now has a generic API, using the RestWS module: https://www.drupal.org/api. A big advantage of this is that it involves no Drupal.org-specific code and covers all content types, not just projects.

However, it doesn't handle Solr searches. I think the best way forward could be a RESTful API for Apache Solr Search in general. If there is a safe way to expose Solr directly, or a module for this, that could work.

Anonymous’s picture

@drumm

That is very good news, but like you said, if Solr isn't supported, there may not be much that can be done. The release for Project Browser is now up, so we can at least start iterating on that module's bugs/features while we work simultaneously on a better/more permanent server solution.

I'll be reviewing the API docs to see if I can implement any of that in the current server or client modules.

Anonymous’s picture

I took a look at the API and it looks very neat, but doesn't quite do what we need it to.

A Solr API would be awesome. Also, it would be great to begin exposing other data points, like number of installs.

drumm’s picture

Giving this some more thought: I think a way to add RestWS query parameters that Solr handles for filtering and ordering would be ideal, assuming RestWS has hooks to do that and something roughly equivalent can be done in D8. The API results would then be formatted the same.
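As a rough illustration of what filtering and ordering through query parameters could look like from the client side, here is a small Python sketch that only builds a URL and makes no request. The endpoint path and the parameter names (`type`, `sort`, `direction`, `limit`, `page`, and the `field_project_machine_name` filter) are assumptions made for the example; this thread only links to https://www.drupal.org/api, so check them against the API documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed listing endpoint; the thread itself only gives https://www.drupal.org/api.
BASE_URL = "https://www.drupal.org/api-d7/node.json"


def build_project_query(project_type, sort=None, direction="DESC",
                        limit=10, page=0, **filters):
    """Return a listing URL that filters and orders projects via parameters."""
    params = {"type": project_type}
    params.update(filters)  # extra field filters, e.g. a project machine name
    if sort is not None:
        # Ordering is only added when the caller asks for it.
        params["sort"] = sort
        params["direction"] = direction
    params["limit"] = limit
    params["page"] = page
    return BASE_URL + "?" + urlencode(params)
```

For example, `build_project_query("project_module", sort="changed", limit=5)` produces a URL asking for five module nodes ordered by the assumed `changed` property. A Solr-backed filter would ideally reuse the same parameter style so that results are formatted identically.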

Also, it would be great to begin exposing other data points, like number of installs.

That would be a patch for the project_usage module that either exposes the custom data store or, more drastically, moves to a non-custom data store. A good first step might be creating a project field to keep the project totals. All fields are in RestWS without additional work.

sarmiliboyz’s picture

Is there any update on this? It would be great if Drupal modules and/or themes could be installed via the admin area, just like WordPress does.

AaronMcHale’s picture

It's possible this may no longer be needed, as I'm pretty sure you can search packages.drupal.org and get results back. I can't remember the exact URL/syntax to query, though.