I think my move to CentOS 6 caused pdf_to_imagefield to stop working. I am getting this error:

pdf_to_image_shell_exec did not create image .../sites/default/files/453-0.jpg as expected. sh: gs: No such file or directory convert: Postscript delegate failed `.../sites/default/files/files-webpage/2015/220/pdf-shorty.pdf': No such file or directory @ pdf.c/ReadPDFImage/611. convert: missing an image filename `.../sites/default/files/453-0.jpg' @ convert.c/ConvertImageCommand/2800.

This may be a related issue: https://www.drupal.org/node/826194.

The exact command the module is trying to run is

/usr/bin/convert -density 50x50 -trim +repage -colorspace sRGB '.../sites/default/files/files-webpage/2015/220/pdf-shorty.pdf[0]' '.../sites/default/files/453-0.jpg' 2>&1

So maybe there is a problem with my convert / gs / Apache configuration. I did notice that in the command above the location of the PDF file is right. However, the destination for the resulting JPG image (if that is the second filename in the command) is not where my chosen image field would store the image. If I am understanding the command syntax correctly, that would be a problem.

Any help?

Comments

dman

Step 1 is to try that command yourself, when logged in as the web user. Even that is not conclusive (the $PATH of an interactive shell often differs from the one the web server process actually gets), but if it fails there, then we can expect the rest to fail.

As to the file path names - if you are using some other mechanism such as filefield_paths to define your image storage - that can only happen *after* the image is created. It's a bit convoluted, as the two processes happen at different times, but we have to make the image first, then let filefield_paths say where to store it later.

I think what you are seeing is the usual problem with 'gs' not being in your (web shell user's) PATH: "gs: No such file or directory". The rest of the messages are just cascaded from that.
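A rough way to see how a missing PATH entry hides 'gs' is to look it up under a deliberately stripped environment. This is only a sketch: visible_with_path is a hypothetical helper name, and the PATH value passed to it is an assumption, not what any particular web server sets.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if "$2" can be found when PATH is exactly "$1".
# env -i clears every inherited variable, roughly imitating a web server that
# hands its child processes a minimal environment.
visible_with_path() {
    env -i PATH="$1" /bin/sh -c 'command -v "$1" >/dev/null 2>&1' sh "$2"
}

# The PATH value here is an assumption; substitute what your web process reports.
if visible_with_path "/usr/bin:/bin" gs; then
    echo "gs found"
else
    echo "gs not found"
fi
```

If 'gs' is found from your login shell but not under the PATH your web process reports, that points to the environment rather than to the module.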

webservant316

In pdf_to_image.module, in the function pdf_to_image_shell_exec($command), the line of code below is zeroing out my $PATH so that convert and gs are not accessible. Actually, in my image toolkit setup I specified '/usr/bin/convert', so that command is found; however, when 'convert' tries to call 'gs' it cannot find the executable.

// This looks like a no-op, but it's actually sometimes neccessary
// (maybe just under dev desktop)
// in order to allow gs to be found! Mad.
putenv('PATH=' . getenv('PATH'));

In my configuration exactly the opposite is happening because this line of code zeros out the $PATH!

I recently moved from CentOS 5 to CentOS 6, so perhaps there is a setting in Apache / Litespeed / PHP that makes this line of code fail in its purpose.

Also, most importantly: when I comment out the line, the function succeeds!

Any thoughts?

dman

Yeah well, if you found that specifying /usr/bin/convert was necessary because it would not find 'convert' on its own, then it follows that when 'convert' internally goes looking for 'gs', that will not be found either: the same (ineffective) PATH lookup is failing there.
OTOH, I have also seen versions of convert that were compiled '--with-gs' or something, and either just knew where to find it already, or had it built in or something. I don't really know the internals there.

That mad code there was what (eventually) made an install of Acquia Dev Desktop 1 (on OSX with fink I think) work, and didn't have any bad side effects on the Ubuntu boxes I also tested on.
As you can tell by looking at it, you would not expect it to make any difference one way or the other, but it does, so it is near-inexplicable black magic.

The real trick is to just ensure that your web process has the full and necessary PATH set in the process environment at a higher level. Then no messing around with paths here would even be necessary. However I found that wacky to document authoritatively because server setups differ so much.

webservant316

More information here.

I use the Litespeed alternative to Apache because it is faster. Apparently, when Litespeed is run in simple suEXEC mode, getenv() fails to deliver, but when it is run in suEXEC daemon mode, getenv() works properly. Who knows why! In my use case I turned on suEXEC daemon mode and restored your module code to the original, and all works well.

Sigh... I guess when a module depends upon system executables one could expect challenges of this nature. I also saw a Litespeed article that discouraged the use of the getenv() function for whatever reason. However, I am not sure that can be a Drupal PHP guideline, as the function is there for our use.

In the end your module printed a decent error message and one day later I am happy with a solution, though I am not sure someone without system and PHP knowledge would be able to swim through that.

Thanks again.

dman

Status: Active » Closed (works as designed)

suEXEC stuff? Right, well, don't get me started on the vagaries found there!
So yeah, sounds about normal to me:

I found that wacky to document authoritatively because server setups differ so much.

Glad you found a way through!
I guess other work-arounds would be possible. For example, instead of asking you for path_to_executable, we could ask you to define your preferred $PATH (because we can't always trust that your webserver will give us a meaningful one), but that all gets a little more mad each time. Something to think about though.
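To illustrate what supplying an explicit $PATH would amount to at the shell level, here is a minimal sketch. It is not what the module does; run_with_path is a hypothetical helper, and the directories listed are assumptions about where the tools might live.

```shell
#!/bin/sh
# Hypothetical helper: run a command with PATH replaced by "$1".
# The subshell keeps the change from leaking into the caller, and exec looks
# the command up on the new PATH, as do any programs that command starts.
run_with_path() (
    PATH="$1"
    export PATH
    shift
    exec "$@"
)

# Assumed directories; adjust to wherever convert and gs live on your system.
run_with_path "/usr/local/bin:/usr/bin:/bin" sh -c 'command -v gs || echo "gs not found"'
```

Because the new PATH is exported, a program such as 'convert' started this way would pass it on to the helpers it launches, which is the part the path_to_executable setting alone cannot cover.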

I'd pretty closely followed the same approach that imagemagick.module, htmltidy.module, and a few other commandline-using utilities took. Basically, start by hoping the command is found, but provide heavy diagnostics if not, then let the user override their own system-specific path to the executable if needed.
I guess this time there just happened to be two paths needed (and I don't get to set the other one even as a parameter).

webservant316

I think if you start requiring $PATH as a configuration parameter, you cross a line into functionality that is supposed to be managed by the server. Where would you stop then? I think this just comes with the territory for modules like yours that depend upon system resources. Like you said, print out as much detail in the error message as you can.

Perhaps if the convert function fails you could add even more detail by checking whether both 'convert' and 'gs' are available on the $PATH. You could also print the $PATH in the error message. Knowing that 'gs' was not available on my path would have led me to the problem faster.
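The kind of check meant here can be sketched at the shell level as follows. The module itself is PHP, so this only shows the logic; report_missing_tools is a hypothetical name, not an existing function.

```shell
#!/bin/sh
# Hypothetical helper: list which of the given tools cannot be found on PATH,
# then print the PATH that was searched. Returns 1 if anything is missing.
report_missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    if [ -n "$missing" ]; then
        printf 'not found on PATH:%s\n' "$missing"
        printf 'PATH=%s\n' "$PATH"
        return 1
    fi
    return 0
}

# Check both executables the conversion depends on.
report_missing_tools convert gs || true
```

Run under the same environment as the web process, output naming 'gs' together with the PATH that was searched would point straight at the cause.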

webservant316

Wow, I ran into this again after a move to a new server!

Once again the cure is running Litespeed in suEXEC daemon mode.