CVS edit link for Peritus
I'm PHP programmer and use Drupal about 1,5 from now. Recently I finished my last module, which converts a PDF file to ImageFields and I think its good idea to share it with Drupal community. The work was performed with collaboration of user OnkelTem (
http://drupal.org/user/239962) who is requesting CVS account too since he will be the co-maintainer of the module.
Module is implemented as a widget for FileField CCK field. It takes PDF file on its input, renders PDF on pages using IM, and then places result to associated ImageField. Actual task on conversion is perfomed by cron, which takes setting like how many PDFs are processed at once.
This functionality was initially requested by OnkelTem in this thread http://drupal.org/node/773050
Nikita
Comments
Comment #1
elamanAttaching my module...
Comment #2
avpadernoHello, and thanks for applying for a CVS account. I am adding the review tags, and some volunteers will review your code, pointing out what needs to be changed.
As per http://drupal.org/cvs-application/requirements, the motivation message should be expanded to contain more details about the features of the proposed module, and it should include also a comparison with the existing solutions.
Comment #3
dman commentedI like the idea. So much that I want to try it.
But first:
Documentation: pretty good.
1* Code style: Needs the normal amount of trivial tidy-up: (from coder.module)
pdf_to_imagefield.module
pdf_to_imagefield_convert_pdf($pdf['filepath'], $path .'/'. $pdf['fid'] .'.jpg', array(), array('-density '.$file->density_x.'x'.$file->density_y));foreach(filefield_get_file_references($pdf) as $nid => $vid) {} else {$command = implode(' ', $extra) .' '.escapeshellarg($source) .' '. implode(' ', $args) .' '. escapeshellarg($dest);pdf_to_imagefield.install
pdf_to_imagefield_file.inc
$pdf->density_x = preg_replace('/^([0-9]+)x[0-9]+$/', '\1' ,$file->field['widget']['density']);$pdf->density_y = preg_replace('/^[0-9]+x([0-9]+)$/', '\1' ,$file->field['widget']['density']);sites/all/modules/modified/pdf_to_imagefield/pdf_to_imagefield_widget.inc
pdf_to_imagefield_widget.inc
$imagefields[$key] = $filefield['widget']['label'] .' ('.$filefield['field_name'].')';It would be best if you fix that on a first module for review.
2*
Nowadays we use module_load_include()
3*
hook_init() runs for every page in the site, every time. Not a great place to check for something that might just once ever be needed. Could you move this check into somewhere better, like an admin page or hook_requirements()?
Looks like a code error there - $resolution is undefined. (not a code review thing, just a possible bug and PHP notice)
Good Drupal code by creating a widget for your special field. Generally fits with the Drupal way well. I didn't see much code that should have been Drupal-ed more. Builds on top of imageAPI/Imagemagick, despite it being a bit awkward. It does access an ImageAPI_ImageMagick _private function - but the reason for that is documented. ImageAPI needs to be prodded like that a bit still. Using that func is still better than reinventing an equivalent one.
I tried it out! Basic UI (content type & field config) seemed to work pretty well, with adequate validation and error-prevention. I could figure out what was required from your description, but a 4-step simple 'Setup/Install' guide in the README as well would be handy. No need for "download and enable" steps, but saying that you have to create the unlimited imagefield before the pdf field would be nice.
4* Module did not work for me at first run, and the cron message didn't suggest what could be done about it.
No pdf previews generated to the book node from emerald.pdf file. But it is still in cron list.(In my case it seems my imagemagick path hadn't been configured yet - a suggestion to check the imagemagick configs would be appropriate in case of an error)
I then enabled some debug, got a bit further and encountered
... Which I eventually solved by Setting the path to gs in the imagemagick delegates.xml file because I couldn't set my server $PATH.
And then it worked Most excellently!.
So installing isn't 100% smooth, though I know that's out of your control - we just need more error-checking and troubleshooting clues. 4*
Suggestion A note to the user explaining why the PDF isn't immediately converted, and maybe a link to cron or an option to update immediately using batch API would be really nice.
Anyway, that was a full module test, not just a code review...
Security uses normal file uploads through Drupal correctly. It sanitizes and quotes the filenames that will be used on the commandline for conversion. Uses existing Drupal roles etc to control access.
If you can address #1, #2, #3, #4 above, I'd be really happy to see this really useful module in use!
Comment #4
dman commentedX-post removed tags, sorry.
FYI, Folk have been Wanting this sort of functionality for a while now. I don't think there is anything else in this space.
Comment #5
elamanComparsion:
There is no other modules which provide creation of raster copy of PDF.
Features:
Module is based and supposed to function with modules:
The module relies on ImageMagick toolkit and is supposed to function via 'convert' CLI utility.
Comment #6
elamandman, thanks a lot. I'll do my best to fix this problems in short time.
Comment #7
elamanFixed.
Fixed. I've used ImageField as an example, so this piece of code is from there.
What if i delete this check? This was also copied from ImageField. I've no idea, how it's possible to disable/uninstall FileField module while dependent modules are enabled?
Still thinking about how to notify user about not found IM...
Comment #8
elamanOops, forgot to attach file.
Comment #9
avpadernoComment #10
dman commented#2 Yeah, you'd see that code in earlier modules. It works, but this is a new part of the API.
#3 delete it totally. If it's from imagefield it's only there for imagefields benefit and would run anyway. I thought it was for some special case that you had encountered. And no, I wouldn't think you could disable that module by accident.
#4 just need some "what do I do now" text. Tell user to check imageapi_imagmagick, (how to) check dependencies on the server, see the README. what does something being left in the cron list mean? For me it didn't seem to mean it would try again so..?
Comment #11
OnkelTem commentedHi, I'm co-developer of the module.
dman, first, thank you for interesting in this and for quick and extensive replies.
#4 makes me think we should do some extra tests either on widget configuration page or on module's settings page.
1) Test if "-density" will work on a server. I have a setup atm, where specification of -density breakes everything - convert generates crappy JPG file while Drupal with watchdog keep silence. Maybe some version check? Yet to find out, i'm working on this.
2) Test that issue with delegates.xml
Comment #12
dman commentedThe normal case with convert and gs would be debugging on a case-by-case basis. If you have ghostscript installed, it should be available in the $PATH. In my case it was for users, but not for the Apache daemon. It's fixable if you are able to restart the machine and have root etc.
But fixing it is not your problem, just testing if gs works at all would help installation - is all I'm saying.
These things are normal when you require other libraries on random servers. All that's expected is making it a bit easier to find where the problem is. I just documented my travails as a part of that!
Comment #13
elamanChanges:
- New file: imagemagick_test.pdf (6,5kb)
- Changes in pdf_to_imagefield_widget.inc
New function:
and changed existing function:
Comment #14
elaman:)
Comment #15
dman commentedThanks, those are great ideas. An actual test run even.
I'm really liking this!
Lets get CVS on so we can start supporting it!
Due diligence has been done on the CVS application.
And it's a kick-ass feature!
Comment #16
avpadernoThe code is accessing a not defined variable; this would cause the error message to not appear associated with a form field, and the user could not understand which form field contains the not correct value.
Does one of the modules from which the proposed module depends require filefield.module? If filefield.module is not required, then the line after the one I reported will raise an error.
It is better to merge the include files into the module file, if they are always being included. To split the code into different files make sense when the files are included in some specific moments, as it happens for the files containing menu callback, which are automatically included from Drupal when the specific path is visited.
Comment #17
elamanFixes:
1. File Deleted.
2. Fixed.
3. This module depends on FileField module, so it's required.
4. Merged into the module file.
Changes:
New:
Comment #18
OnkelTem commentedfuture TODO:
* GS (Ghostscript) version check. For example, having problem with version 8.15.x - couldn't parse latest .PDFs, created in Adobe Acrobat 9 Pro. Anyway running software released 5 years ago is not very good idea. (GS 8.15.x was released in 2004)
* An option to generate images right now, upon node save. This could be a checkbox, say "[ ] Generate now", turned on/off by default according to the module settings. New access permission for this could be added too.
* Option to choose target format. Currently JPEGs are created, while in certain cases PNG images might be preferable or required.
Comment #19
avpadernoThank you for your contribution! I am going to update your account.
These are some recommended readings to help with excellent maintainership:
You can find more contributors chatting on the IRC #drupal-contribute channel. So, come hang out and stay involved.
Thank you, also, for your patience with the review process.
Anyone is welcome to participate in the review process. Please consider reviewing other projects that are pending review. I encourage you to learn more about that process and join the group of reviewers.
I thank all the dedicated reviewers as well.
Comment #20
pupil commenteddoes it realy work?
Comment #21
dman commentedhttp://drupal.org/project/pdf_to_imagefield
Comment #24
avpaderno