Does this module support Windows? If so, how do I configure the path to the helper applications?

Thanks
Arda

Arda,

I am using Windows 2003 and IIS6, and I got it to work.

helper path: D:\xpdf\pdftotext.exe %file% -
Directory: D:/root_site_folder/files

I am running 5.x and I noticed in another issue someone suggested using escape characters like this: D:\\xpdf\\pdftotext.exe %file% - but I didn't have to do that. And you do need to give your internet guest account access to cmd.exe.

Also, the helper application is difficult to set up, so you should test it separately to make sure it works. I had trouble because pdftotext came with instructions for Linux/Apache.
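One way to test the helper separately is to build the command outside Drupal and look at what comes back. The sketch below is only an illustration: build_helper_command() and helper_returns_text() are hypothetical names, not part of search_files, and example.pdf is a made-up file name. It expands the %file% token with the path in quotes, as the helper setting above expects.

```php
<?php
// Hypothetical helper (not part of search_files): expand the %file% token
// in a helper setting, quoting the path in case it contains spaces.
function build_helper_command($helper_template, $file_path) {
  $quoted = '"' . $file_path . '"';
  return str_replace('%file%', $quoted, $helper_template);
}

// Hypothetical helper: run a command and report whether any text came back.
function helper_returns_text($command) {
  $output = shell_exec($command);
  return is_string($output) && trim($output) !== '';
}

// Helper setting taken from the post above; example.pdf is an assumed file.
$template = 'D:\xpdf\pdftotext.exe %file% -';
$command = build_helper_command($template, 'D:/root_site_folder/files/example.pdf');

// Paste the printed command into a cmd window, or pass it to
// helper_returns_text() on the server itself.
echo $command, "\n";
```

If the command prints text in a cmd window but helper_returns_text() reports nothing when run through the web server, the problem is more likely permissions than the helper itself.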

Maybe 6.x is different from 5.x, but I think it will work.

Brendan

Sorry if this is a stupid question, but how can I give the internet guest account access to cmd.exe?
I am using Apache on Windows.

I finally got it to work on Apache on Windows (with version 6.x-1.6 of search_files).
My main problem was that the function shell_exec didn't return anything for me, so I had to use WshShell->Exec.
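When shell_exec returns nothing, it can help to see the error stream before switching mechanisms. The following is a minimal diagnostic sketch using PHP's proc_open; run_and_capture() is a hypothetical name, and this is not what the module does. It reads stdout and stderr one after the other, which is fine for short helper output but could stall on very large error output.

```php
<?php
// Hypothetical diagnostic (not part of search_files): run a command and
// return what it wrote to stdout and stderr, plus its exit code.
function run_and_capture($command) {
  $descriptors = array(
    1 => array('pipe', 'w'), // stdout
    2 => array('pipe', 'w'), // stderr
  );
  $process = proc_open($command, $descriptors, $pipes);
  if (!is_resource($process)) {
    return array('exit_code' => -1, 'stdout' => '', 'stderr' => 'proc_open failed');
  }
  $stdout = stream_get_contents($pipes[1]);
  fclose($pipes[1]);
  $stderr = stream_get_contents($pipes[2]);
  fclose($pipes[2]);
  $exit_code = proc_close($process);
  return array('exit_code' => $exit_code, 'stdout' => $stdout, 'stderr' => $stderr);
}

// Usage (the path is an assumption, replace it with your own helper command):
// $result = run_and_capture('c:\pdftotext.exe "C:/path/to/file.pdf" -');
// print_r($result);
```

An empty stdout together with an "access denied" or "not recognized" message in stderr would point at permissions or the helper path rather than at the PDF.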

I had to modify the hook_update_index function as follows:

/**
* Implementation of hook_update_index()
*
* lists all the files in the director(y/ies) and puts the files
* into the "search_files_files" table
*
* then indexes X(configurable) number of these files
*/
function search_files_update_index() {
  $helpers = search_files_get_helpers();

  // only update the list of files in the directories once per day
  if (TRUE) { // if (variable_get('search_files_last_index', 0) < (time() - 86400)) {
    variable_set('search_files_last_index', time());
    $result = db_query('SELECT * FROM {search_files_directories}');
    while ($directory = db_fetch_object($result)) {
      search_files_list_directory($directory->directory, $directory->id);
    }
  }

  $index_number = (int) variable_get('search_cron_limit', 100);
  $sql = "
    SELECT
      *
    FROM
      {search_files_files}
    LEFT JOIN
      (
        SELECT
          *
        FROM
          {search_dataset}
        WHERE
          type = 'search_files'
      ) AS dataset ON {search_files_files}.id = dataset.sid
    WHERE
      (
        dataset.reindex IS NULL OR
        dataset.reindex != 0
      ) AND {search_files_files}.index_attempts <= 5
    LIMIT %s
  ";

  $WshShell = new COM("WScript.Shell");
  $result = db_query($sql, $index_number);

  while ($file = db_fetch_object($result)) {
    $full_path = $file->full_path;
    $file_name = explode('/', $full_path);
    $file_name = $file_name[count($file_name) - 1];
    $file_extension = explode('.', $file_name);
    $file_extension = $file_extension[count($file_extension) - 1];

    if (in_array($file_extension, array_keys($helpers))) {
      // record that we are attempting to index the file in case it hangs
      $increment_sql = " UPDATE {search_files_files} SET index_attempts = index_attempts + 1 WHERE id = '%s' ";
      $increment_result = db_query($increment_sql, $file->id);
      if ($file->index_attempts >= 5) {
        // indexing failed too many times, record this to the log and continue
        watchdog('Search Files', t('failed to index %full_path after %attempts attempts', array('%full_path' => $file->full_path, '%attempts' => $file->index_attempts)), array(), WATCHDOG_ERROR);
        continue;
      }

      // %file% is a token that is placed in the helper's parameter list to
      // represent the file path to the attachment.
      // We need to put the filename in quotes in case it contains spaces.
      $quoted_file_path = '"'. escapeshellcmd($full_path) .'"';
      $helper_command = preg_replace('/%file%/', $quoted_file_path, $helpers[$file_extension]);

      // the following command doesn't work on windows ??
      // $file_text = shell_exec($helper_command);
      // then using WshShell->Exec instead...
      try {
        $file_text = $WshShell->Exec($helper_command)->StdOut->ReadAll;
      }
      catch (Exception $e) {
        watchdog('Search Files', t('Exception caught during the indexing of %full_path : %exception_string', array('%full_path' => $file->full_path, '%exception_string' => $e->getMessage())), array(), WATCHDOG_ERROR);
        continue;
      }
      $file_text = search_files_convert_to_utf8($file_text);
      // echo "$file_text";

      search_index($file->id, 'search_files', $file_text);
    }
    else {
      search_index($file->id, 'search_files', '');
    }
  }
}

Once you have replaced this function in the search_files.module file, you have to:

1. First, be sure to start clean by reinstalling the search_files module.
2. Add the directory you want to index using "/" (e.g. C:/xampp/htdocs/drupal/sites/default/files).
3. Delete the TXT helper.
4. Edit the PDF helper to have this path: c:\\windows\\system32\\cmd.exe /c c:\\pdftotext.exe %file% -
5. Move the pdftotext.exe file to c:\
6. Go to Site configuration/Search settings and click on "Re-index site".
7. Run cron manually enough times to index your files.
8. You can now try a search :)

Hi,

I want to use the search_files module with Drupal 6.9 on Windows. I installed it, and searching PDF attachments works. But I want to search .doc and .xls files too, and this doesn't work: the helpers for doc and xls are not for Windows. Is there a solution for Windows?

There are a number of helpers for Windows/DOS. Have you found the ones you need?

Can anyone please tell me how to install the helpers (pdf, doc, xls, ppt, rtf) on a Windows machine? Please give a detailed link or document. Thanks in advance.

Regards,
Yuva

Hi,

This is my configuration:

Directory list (absolute path):
D:/xampp/htdocs/observatorios/sites/default/files

Helpers:
PDF: D:\xampp\htdocs\helpapps\pdftotext.exe %file% -
DOC: D:\xampp\htdocs\helpapps\catdoc.exe %file%
XLS: D:\xampp\htdocs\helpapps\xls2csv.exe %file%
PPT: D:\xampp\htdocs\helpapps\catppt.exe %file%

In the case of the PPT, DOC and XLS helper apps, be very careful to avoid directory names that contain spaces or are longer than 8 characters (e.g. use ../helpapps/.. rather than ../helperapps/..), because they cause trouble.

Regards

I'm confused. So I can't just put the helper files in my website directory under "/drupal/default/files/"helperfile"" and have search find words in the doc and pdf files? Keep in mind it would not be on localhost at all, but hosted with Dreamhost.
Hi, I wonder if you can help me, please. I installed the search_files module under Windows. What I need is to be able to search a directory of txt files for a person's name or other info in the documents and return the results (files) for a person to open. My txt helper is as follows:

c:\\windows\\system32\\cmd.exe type %file%

and my directory is:

C:\xampp\htdocs\dot\sites\all\files

The problem is that it is not returning any results. Once it returns results, we will have a report viewer that opens with an interpreter and displays the reports as PDF. Please help.

Hi,

I understand the settings should be something like this:

Directory list (absolute path):
C:\xampp\htdocs\dot\sites\all\files

Helper apps:
Why are you using "c:\\windows\\system32\\cmd.exe %file%"? Shouldn't you use catdoc if you want to search txt files? I would use the following TXT helper app instead (using your actual path):

D:\xampp\htdocs\helpapps\catdoc.exe %file%

The first thing to check is whether your helper app is working. To do so, install catdoc, open a cmd window and type the following command (using your actual path) to check whether you get some results back, e.g.:

D:\xampp\htdocs\helpapps\catdoc.exe C:\xampp\htdocs\dot\sites\all\files\example.txt

Hope it helps.

PS: I am no longer using Search Files. I have switched to solr + solr attachment.

Hi,

I wasn't able to make catdoc/catppt/catxls work until I changed the path to use names of eight characters. pdftotext, being a more recent application, causes no problems.

Regards

Subscribing.

Status: Active » Fixed

ayyurek, please try 6.x-2.x-beta4; it has been tested on WAMP, see #559414: search_files compatibility with Windows AMP.

Also, the helper configuration described in #340013: configuring helpers on Windows environment, comment #7, works well.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.
I installed the search_files module and also installed the pdftotext helper application for Windows, with the helper path

C:\xampp\htdocs\helpapps\pdftotext.exe %file%

and the directory path

C:/xampp/htdocs/websites/epicdev/sites/default/files

but nothing is working for me. Could anybody please help me solve this problem? I don't know where I went wrong.

Try adding a '-' after your pdftotext path:

C:\xampp\htdocs\helpapps\pdftotext.exe %file% -

Regards

Version: 6.x-6.x-dev » 6.x-1.6
Priority: Critical » Normal
Status: Closed (fixed) » Active

Thanks for the information here on how to configure Windows helpers.

I followed your instructions and installed the pdftotext helper on my Windows 2003 server. I have also confirmed that it works from the Windows CMD prompt: the content of any PDF is shown in the CMD window when I enter the pdftotext path and the path to a PDF file.

I've also installed WAMP 2.0, Drupal 6.14 and the search_files 6.x-1.6 module on the server. The module is activated and has been given user rights (the option "server files" shows when you search on the site). I also triggered the search_files index function by starting cron manually in Drupal admin.

The problem is that I get no hits for any content in the PDF files when searching...

To isolate the error I checked the database and could see that there is file data in the table search_files_files: full_path = "C: wamp www sites default files/Anstallningsbekr_arb_0.pdf" and index_attempts=1. However, the column for searchable content (search_dataset.data) is empty for all records of type file_search.

Configuration

pdf path
C:\wamp\www\helpapps\pdftotext.exe %file% -

Directory path (same as used by attached items)
C:\wamp\www\sites\default\files

Any advice? Is it possible to activate some kind of error log for the search_files index process? It seems like something goes wrong here, and the result is empty rows in the search_dataset.data column.
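The stored full_path quoted above has spaces where the directory separators would be. If, and this is only an assumption, the backslashes are being lost somewhere between the settings form and the database, entering the directory with forward slashes (as in the earlier working configurations) would sidestep that. A small sketch of the normalization, with normalize_directory() as a hypothetical name:

```php
<?php
// Hypothetical helper (not part of search_files): turn a Windows path into
// the forward-slash form used by the working configurations in this thread.
function normalize_directory($path) {
  // Replace every backslash with a forward slash, then drop any trailing slash.
  return rtrim(str_replace('\\', '/', $path), '/');
}

// Directory path from the post above.
echo normalize_directory('C:\wamp\www\sites\default\files'), "\n";
```

The result is the value to type into the directory setting; the helper path itself was entered with backslashes in the configurations reported to work.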
Finally I found an error in this code:

function search_files_update_index() {
....
$sql = "
SELECT
*
FROM
{search_files_files}
LEFT JOIN
(
SELECT
*
FROM
{search_dataset}
WHERE
type = 'search_files'
) AS dataset ON {search_files_files}.id = dataset.sid
WHERE
(
dataset.reindex IS NULL OR
dataset.reindex != 0
) AND {search_files_files}.index_attempts <= 5
LIMIT %s
";


it should be like this:

function search_files_update_index() {
....
$sql = "
SELECT
*
FROM
{search_files_files}
LEFT JOIN
(
SELECT
*
FROM
{search_dataset}
WHERE
type = 'search_files'
) AS dataset ON {search_files_files}.id = dataset.sid
WHERE
(
dataset.reindex IS NULL OR
dataset.reindex = 0
) AND {search_files_files}.index_attempts <= 5
LIMIT %s
";


Focus on this line of code: dataset.reindex = 0

Thanks.

Wow!! Good job. It's a very good solution. The problem is solved now. Thank you...

Hi there!

I've tried to configure search_module by following your instructions, but no files are indexed.
I really don't understand why...

First I installed search_module 6.x-1.6.
I configured the module with:

pdf path
C:\wamp\www\website\sites\default\files\pdftotext.exe %file% -

Directory path (same as used by attached items)
C:\wamp\www\website\sites\default\files

Finally I cleared cache & ran cron.

Did I miss something?

Nope, just try another helper for Windows. I found that catdoc works well for both .doc and .txt.

here my configs:
- .doc
D:/xampp/htdocs/helpapps/catdoc.exe %file%
- .txt
D:/xampp/htdocs/helpapps/catdoc.exe %file%

Also note that the Windows helpers do not cope with file names more than 8 characters long.
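A quick way to spot such names before configuring a helper is to scan each part of the path. This is a rough sketch; find_risky_path_parts() is a hypothetical name, and the 8-character and no-spaces rules are simply the ones reported in this thread for catdoc, catppt and xls2csv. pdftotext was reported to work regardless, so a flagged part is a hint rather than a verdict.

```php
<?php
// Hypothetical check (not part of search_files): list the path components
// that contain a space or whose name, without extension, exceeds 8 characters.
function find_risky_path_parts($path) {
  $risky = array();
  // Split on backslashes and forward slashes alike.
  $parts = preg_split('#[\\\\/]+#', $path, -1, PREG_SPLIT_NO_EMPTY);
  foreach ($parts as $part) {
    if (preg_match('/^[A-Za-z]:$/', $part)) {
      continue; // skip the drive letter, e.g. "D:"
    }
    $base = pathinfo($part, PATHINFO_FILENAME);
    if (strpos($part, ' ') !== FALSE || strlen($base) > 8) {
      $risky[] = $part;
    }
  }
  return $risky;
}

// Helper path from this thread; prints an empty array if nothing is flagged.
print_r(find_risky_path_parts('D:/xampp/htdocs/helpapps/catdoc.exe'));
```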

If it doesn't work (I have forgotten exactly where the other bugs are), you can try my alternative new module at http://drupal.org/node/800664#comment-2980354

Finally, let me know if my module works for you.

thank you.