As I worked on #2149207: Can't save node or create derivitive with image field using s3fs destination and #2109545: Image style generation is very slow, I realized that Drupal reads the same image file several times during certain requests, especially when uploading new images and creating derivatives. Many of these reads are done through repeated calls to getimagesize(), which reads in the first few kilobytes of an image to pull its EXIF data.
I'm thinking about whether it would be a worthwhile performance improvement to keep an in-memory cache of all images (up to a certain max size, say 5MB), which get read from S3. That will make repeated reads of the same image during a single request many, many times faster, since they'll be read from RAM rather than re-downloaded from S3.
This may or may not be especially easy to actually implement, though. If it's possible to just keep a reference to an S3fsSeekableCachingEntityBody object, and load that object whenever the same URI is requested, it would be super easy. But I'm not sure if that'll actually work.
| Comment | File | Size | Author |
|---|---|---|---|
| #13 | imageinfo_cache-2153991-13-cache-fs.patch | 13.06 KB | mikeytown2 |
Comments
Comment #1
mikeytown2 commentedThis might help
https://drupal.org/project/imageinfo_cache
Comment #2
coredumperror commentedHmmm, that wouldn't really remedy the problem that this issue references. It would just push the performance lost to a different request.
Comment #3
mikeytown2 commentedIn the D6 version it cached the results of getimagesize(); that was the module's original purpose. We were using a S3 FUSE mount and this helped drop our page load times by over 10x.
Something similar could be done in D7.
Comment #4
coredumperror commentedHow did it do that? I'd love to be able to cache just the results of getimagesize(). I looked through the code, but there's so much other stuff going on that I couldn't find the place where it actually did the getimagesize() cache.
Comment #5
mikeytown2 commentedIn D6 the vast majority of the calls to getimagesize came from theme_imagecache() & theme_imagefield_image(). imageinfo_cache_theme_registry_alter implements hook_theme_registry_alter in order to modified the theme layer so that imageinfo_cache_theme_imagecache and imageinfo_cache_theme_imagefield_image run in front and cache the data; and then they call the original theme functions.
In D7 this issue was supposed to be fix in core but if you can point to where multiple calls to getimagesize is happening in core we might be able to patch core or work around the issue.
Comment #6
mikeytown2 commentedWhere getimagesize gets used in core:
image_gd_get_info
color_scheme_form_submit
Popular contrib modules that uses it.
https://drupal.org/project/imagemagick - image_imagemagick_get_info
https://drupal.org/project/imce - imce_image_info depending on configuration
https://drupal.org/project/print - theme_print_pdf_tcpdf_header
Plan of attack would be to implement a fake image_toolkit so that calls to image_get_info will try our cache first.
Other modules that use getimagesize according to gotta download them all via
grep -l -r -i "getimagesize(" ./Comment #7
coredumperror commentedIIRC, it was
image_gd_get_info()which was calling getimagesize() multiple times on the same image, during image upload and derivative creation requests. It's been a while since I last looked into it, though.Comment #8
mikeytown2 commentedImplementing our own image_toolkit shouldn't be too hard. We would modify the global $conf variable in hook_init except on the toolkit admin page.
Other contrib modules implementing hook_image_toolkits
So manual testing wouldn't be that hard to do.
Comment #9
mikeytown2 commentedRelated info #1822320: IMCE should call Drupal's image_get_info() rather than getimagesize() directly
Comment #10
coredumperror commentedI don't currently have time to work on s3fs, since I'm busy with other high-priority tasks at work right now. If you want to take a crack at implementing this, I would certainly appreciate it!
Comment #11
mikeytown2 commentedMoving this issue over to imageinfo_cache. Seeing how pantheon has some custom magic going on in here in order to cache the data, I can see how a generalized option will be useful in D7.
Comment #12
mikeytown2 commentedWill be working on this tomorrow
Comment #13
mikeytown2 commented