This is a simple module that takes the search results from an ApacheSolr search and uses them to generate a CSV file. The CSV file and a basic control (an on/off checkbox) are displayed in a custom block that must be placed on the search page. When the box is checked, every search generates a new CSV. The files are saved in the default/files/ folder and are named with an MD5 hash of the ApacheSolr request string used to generate the search results.

The results saved into the CSV aren't just what you see on the page; they are ALL results for the query. Because of this, queries that return large numbers of results can kill the server, which is why the checkbox was implemented. To further aid the user, I added some simple code that displays the total number of search results next to the search field after a search has been performed. I've also extended the PHP time limit for the function that does all of the work.

I'm hoping that, with the help of others, the above problem can be mitigated more gracefully. I'd also love to see some new features added to this, like permissions control, for example.

Please test, use, and enjoy!

thanks,

Sunil

http://drupal.org/sandbox/dayer4b/1822412

http://drupalcode.org/sandbox/dayer4b/1822412.git/shortlog/refs/heads/7....

This is for Drupal 7

I've contributed code to a couple of other contrib modules:
http://drupal.org/project/ldap (developed LDAP Authorization Organic Groups and helped debug LDAP Authorization module)
http://drupal.org/project/datavizwiz (made several fixes and did significant styling work)

Comments

fr3shw3b

Status: Needs review » Needs work

Hello dayer4b,

First, the automated review lists a number of essential code formatting issues to fix:
http://ventral.org/pareview/httpgitdrupalorgsandboxdayer4b1822412git

Manual Review:

1. The module file has no @file doc block, which the PAReview tool also picked up on.
http://drupal.org/node/1354#files

2. You need to remove all commented-out dpm() calls.

3. Because the module creates the custom variable apachesolr_csv_generate, it needs a .install file that implements hook_uninstall() and calls variable_del() to remove that variable on uninstall.

4. Function and variable names must use only lowercase letters and underscores.

5. Comments need to end with a full stop: they explain what the code is doing and should be complete, grammatically correct English sentences.

6. files[] = apachesolr_csv.module in the info file is unnecessary, because the module file is found and loaded automatically.
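
Point 3 could look roughly like the sketch below. It assumes the module's machine name is apachesolr_csv (taken from the file name in point 6) and that apachesolr_csv_generate is the only variable the module sets; the file would be saved as apachesolr_csv.install.

```php
<?php

/**
 * @file
 * Install, update and uninstall functions for the apachesolr_csv module.
 */

/**
 * Implements hook_uninstall().
 */
function apachesolr_csv_uninstall() {
  // Remove the module's persistent variable (assumed to be the only one).
  variable_del('apachesolr_csv_generate');
}
```

If the module later stores more variables, each one would need its own variable_del() call here.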

dayer4b

Thanks for the review. I made the changes that you asked about, including running it through ventral.org. That was very helpful.

Any more points?

fr3shw3b

Status: Needs work » Needs review

You need to set the status to "Needs review" so that reviewers can find it. I might have time to review it later today, though.

hhhc

Hi,

#1:
You should add the checkout code or link to your project page.

#2:
When I first searched and then clicked the checkbox, a spinner indicated that something was happening, but there was no output; the link appeared only after a page reload. The on/off toggle also does not seem to work properly: I'd expect the link to disappear once I uncheck the checkbox.

#3:
The exported file, when opened in Excel, does not display special characters correctly (umlauts such as ü and ö come out garbled).

#4: manual review:
apachesolr_csv.module: if you save the settings of the CSV file in the settings table (line 50), then this module is not thread safe: two users with different settings will interfere with each other. Use the settings table only for global configuration.
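
On #3, one common cause is that Excel falls back to a legacy code page when a CSV file carries no byte order mark. A minimal sketch of a possible fix, assuming the module holds its data as UTF-8 strings and writes rows with fputcsv() (the writing code is not shown in this thread, and example_write_csv() is a hypothetical name):

```php
<?php

/**
 * Writes rows to a CSV file that starts with a UTF-8 byte order mark.
 *
 * Hypothetical helper for illustration, not taken from the module.
 */
function example_write_csv($path, array $rows) {
  $handle = fopen($path, 'w');
  if ($handle === FALSE) {
    return FALSE;
  }
  // UTF-8 byte order mark; Excel uses it to detect the encoding.
  fwrite($handle, "\xEF\xBB\xBF");
  foreach ($rows as $row) {
    // Delimiter, enclosure and escape character are passed explicitly.
    fputcsv($handle, $row, ',', '"', '\\');
  }
  fclose($handle);
  return TRUE;
}
```

Whether this is enough depends on the Excel version and on how the file is opened, so it is worth testing with real exported data.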

klausi

We are currently quite busy with all the project applications and I can only review projects with a review bonus. Please help me review and I'll take a look at your project right away :-)

alexmoreno

dayer4b, as you point out, you have contributed to some other projects, but it would help your own approval process if you reviewed at least three other project applications, as we are doing. You can find them here: http://drupal.org/project/issues/search/projectapplications

heddn

apachesolr_csv.info

  • dependencies should probably include apachesolr_search

apachesolr_csv.module

  • line 89/295 and comment line 77: perhaps use drupal_hash_base64() or drupal_hmac_base64() instead of md5. http://drupal.org/node/845876
  • line 128: htmlspecialchars() duplicates check_plain()
  • lines 160-258 seem to need some work: lots of commented-out code and a @todo
  • variable apachesolr_csv_generate should probably be boolean TRUE/FALSE
  • apachesolr_csv_block_view() could probably use a render array to make things cleaner/easier so you don't even need the form implementation. http://drupal.org/node/930760
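
For the md5 point above: drupal_hash_base64() returns a URL-safe, base-64 encoded SHA-256 hash, so it could stand in for md5() when naming the file. A plain-PHP sketch of the same idea that runs outside Drupal; example_csv_filename() is a hypothetical name, and the ".csv" suffix is an assumption about how the module names its files:

```php
<?php

/**
 * Builds a CSV file name from a Solr request string.
 *
 * Hypothetical helper; inside Drupal 7, drupal_hash_base64($request) could
 * replace the two hashing statements.
 */
function example_csv_filename($request) {
  // Raw SHA-256 digest, base-64 encoded.
  $hash = base64_encode(hash('sha256', $request, TRUE));
  // Swap or drop the characters that are awkward in URLs and file names.
  $hash = strtr($hash, array('+' => '-', '/' => '_', '=' => ''));
  return $hash . '.csv';
}
```

As with the md5 naming, identical request strings still map to the same file name.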

wuinfo

Status: Needs review » Needs work

dayer4b

Edit: this is a reply to hhhc's comment

I'm sorry for the slow responses. To address your concerns:

#1: I'll add this soon.

#2: This is actually the expected behavior. I know it's strange; I'll try to come up with a better method.

#3: That's a problem. I'll look into it. Do you have any recommendations on proper character encoding?

#4: I can understand how that would be a problem. This is related to the problem in #2. If the CSV could be generated safely on every search then neither of these would be a problem. Do you have any recommendations on where to store this variable data to make it thread safe?

As an alternative to the problems in #2 and #4, I suppose the CSV could be generated at the time that the link is clicked, but that would introduce a large delay after the link is clicked. Perhaps a different interface is called for.

dayer4b

@heddn

I've addressed a few of these issues. I agree that these are problems, especially the @todo. The node type could easily be placed into a configuration page. I'll work on that.

thanks!

PA robot

Status: Needs work » Closed (won't fix)

Closing due to lack of activity. Feel free to reopen if you are still working on this application.

I'm a robot and this is an automated message from Project Applications Scraper.