First of all let me say THANK YOU to shenzhuxi for this awesome module.
This is something I came across myself and think is useful to share.
What I observed is that the PNG directories which File Viewer produces by default are extremely large in size with respect to the original PDFs. A small pdf (mostly text, 36 pages, ~120 kB) would yield a directory of ~10 MB!
After following another comment (I don't remember exactly where) I saw that the png files are created by the fileviewer_pdftopng function located in fileviewer.module, line 211:
function fileviewer_pdftopng($filepath, $destination) {
  $cmd = 'pdftohtml -xml "' . $filepath . '" "' . $destination . '/meta/text.xml"';
  shell_exec($cmd);
  $cmd = 'pdftoppm -png "' . $filepath . '" "' . $destination . 'page"';
  shell_exec($cmd);
}
In particular, the default command is pdftoppm -png, which uses the default values of pdftoppm (150 dpi, color image, anti-aliasing, etc.; for more info see man pdftoppm). These defaults create extremely large png files.
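To get a feel for how the resolution feeds into those sizes, here is a small shell sketch that computes the pixel dimensions of one rendered page. It assumes a US Letter page (8.5 x 11 in), which is my own assumption and not something the module or this post specifies, and page_pixels is a hypothetical helper name.

```shell
# Pixel dimensions and pixel count of one rendered page at a given resolution.
# Assumption: US Letter page (8.5 x 11 in); the post does not state the page size.
page_pixels() {
  local dpi="$1" width height
  width=$(( 85 * dpi / 10 ))   # 8.5 in, kept in tenths to stay in integer math
  height=$(( 11 * dpi ))
  echo "${width}x${height} $(( width * height ))"
}

# Compare the pdftoppm default (150 dpi) with 300 dpi.
for dpi in 150 300; do
  echo "$dpi dpi: $(page_pixels "$dpi")"
done
```

Under that assumption a page is 1275x1650 pixels (about 2.1 million) at 150 dpi and 2550x3300 pixels (about 8.4 million) at 300 dpi. Doubling the resolution quadruples the pixel count, so resolution alone can move the output size a lot before PNG compression is taken into account.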
To address this we need to hack the sites/all/modules/fileviewer/fileviewer.module file on our Drupal installation and use additional options for the pdftoppm command:
- add the -mono option to return monochrome files (the -gray option did not decrease file size by much for mostly black-and-white texts).
- the -aa no option disables anti-aliasing and also decreases file size dramatically (the respective file size was 1.7 MB). The downside is that at 150 dpi the fonts are somewhat jagged.
- the -mono -aa no combination has no additional effect. It seems that anti-aliasing is the main cause of the file size increase.
- a fair compromise is -aa no -r 300, which increases the resolution to 300 dpi and decreases the jaggedness of the fonts without an extreme size increase (4.2 MB).
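The comparison above can be reproduced with a short shell sketch along these lines. The preset names and the preset_flags and render_preset helpers are hypothetical and only for illustration; the pdftoppm options themselves are the ones discussed in this post. The sizes you get will depend on your PDF.

```shell
# Sketch: render one PDF with each option set and report the output size.
# Preset names and helper functions are hypothetical, not part of File Viewer.

# Print the pdftoppm options for a named preset.
preset_flags() {
  case "$1" in
    default)    echo "-png" ;;
    mono)       echo "-png -mono" ;;
    noaa)       echo "-png -aa no" ;;
    compromise) echo "-png -aa no -r 300" ;;
    *)          echo "unknown preset: $1" >&2; return 1 ;;
  esac
}

# Render a PDF with one preset into a temporary directory and print its size.
render_preset() {
  local pdf="$1" preset="$2" out flags
  flags=$(preset_flags "$preset") || return 1
  out=$(mktemp -d)
  # $flags is intentionally unquoted so it splits into separate options.
  pdftoppm $flags "$pdf" "$out/page"
  du -sh "$out" | sed "s|$out|$preset|"
  rm -rf "$out"
}

# Run the comparison only when a PDF path is given as the first argument.
if [ -f "${1:-}" ]; then
  for p in default mono noaa compromise; do
    render_preset "$1" "$p"
  done
fi
```

Saved as a script and called with a PDF path, this prints one size per preset, which makes it easy to pick the options before editing the pdftoppm line in fileviewer.module.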
Comments
Comment #1
thanasis57 commented:
On second thought, I think it would be great if there were a GUI for users to select which pdftoppm options should apply to each specific file they upload,
...maybe in the form of tick boxes, where a particular selection overrides the defaults.
Unfortunately, I am not a developer, so I cannot implement such a feature myself...
Comment #2
konrad_u commented:
Thanks for this info, thanasis57.
Settings that worked for us:
pdftoppm -png -r 120
@ https://inkbok.com
References:
http://linux.die.net/man/1/pdftoppm
It's worth noting that version 3 has -jpeg as an option: http://www.manpagez.com/man/1/pdftoppm/
Comment #3
konrad_u commented
Comment #4
thanasis57 commented:
Yes, I saw that, but I have not had time to test whether File Viewer can display JPEGs.
I will try it out and maybe save some file size. I will report back when I do.
Comment #5
thanasis57 commented:
When I tried to test the -jpeg flag, I found out that the module no longer worked!
After several attempts to fix it, I tried out the pdf module and it works A LOT better, without the need to create png/jpeg files.
So I abandoned File Viewer in favor of the pdf module.
Comment #6
konrad_u commented:
Both of those modules were created by the same author, and which one fits depends on the functionality you need. If you simply want to embed a PDF within your website, the pdf module is great.
Our case was different and we need to stick with fileviewer, as it prevents users from downloading or copying text.
And yes, jpeg is a no go.
Comment #7
thanasis57 commented:
Thanks for the input. Any ideas as to why the module suddenly stopped working? It is not critical any more, but it keeps bugging me.