Problem/Motivation
To quote @deviantintegral:
For Drupal 7, we found that the S3 API was very slow for image styles - on the order of several seconds by the time all of the stat and write calls were done. What I ended up doing was initially serving the image style from a local temp file, and then uploading the file to S3 in a shutdown handler. We should investigate something similar here:
https://github.com/justafish/drupal_amazons3/blob/7.x-2.x/amazons3.modul...
Setting 'cache' => TRUE
should help for some of this, but image styles are indeed still slow to generate.
The original issue came up in the flysystem_s3 issue queue, #2528630: Support image style generation.. It was later also reported in the flysystem_gcs issue queue, #3285371: Saving and loading content takes way too long when multiple image styles are used.
Steps to reproduce
Taken from #3285371: Saving and loading content takes way too long when multiple image styles are used:
1. set one image field to have multiple values and GCS as storage. Also, for faster testing, you can disable the required alt field.
2. set an image style or responsive image style for displaying the image field value.
3. upload at least 10 images. (this is the first place where the slowdowns occur)
4. click save.After clicking save the user is redirected to the canonical node page, so the image styles are generated (probably). The user has to wait until every image is generated to get a response from the server. This takes more than 30 seconds sometimes. If a site has responsive image styles, then this process will run even longer because of the multiple image styles for multiple breakpoints. Also, if someone generates jpg and webp variants (with imageapi_optimize_webp), then again this needed time multiplies until infinity.
Proposed resolution
Don't wait until the remote upload is complete before sending the generated image derivative to the client:
- If an non-existent image derivative is requested (any path starting with
styles/
), a temporary URL is generated to a Flysystem Drupal route. This URL is returned to the browser instead of the direct image URL. - The browser requests this URL to get the image.
- A check is done to make sure the source image exists. If it doesn't, a 404 is returned and no further action is taken
- If in the meantime the image derivative was created (should be rare), redirect to the remote adapter URL.
- If the image still doesn't exist, generate the derivative and store it locally in a temporary folder. The temporary file is now sent to the browser without having to go through the remote adapter. Once the response is sent and the image is displayed in the browser, Drupal will copy the temporary image to the remote adapter.
- If the image is requested again and the temporary file still exists, that file is returned, which saves us a trip to the remote adapter.
- Once the temporary file is copied to the remote adapter, that URL will be sent to the browser instead of the Flysystem Drupal route.
Remaining tasks
- Review the proposed implementation
- Write tests
- Update the documentation
User interface changes
None.
API changes
None.
Data model changes
None.
Release notes snippet
None.
Comment | File | Size | Author |
---|
Issue fork flysystem-2661588
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
twistor CreditAttribution: twistor commentedComment #3
deviantintegral CreditAttribution: deviantintegral at Lullabot commentedComment #4
deviantintegral CreditAttribution: deviantintegral at Lullabot commentedI've got this most of the way locally - TODOs are registering and cleaning up temp files, and copying derivatives from temp to S3 instead of generating the image again.
I filed #2685905: Refactor ImageStyleDownloadController so derivatives can be generated by contrib modules which significantly simplifies our code, as we don't have to re-implement image generation in it's entirety. This patch copies that class with a small addition to add a method added in 8.1.x.
Comment #5
deviantintegral CreditAttribution: deviantintegral at Lullabot commentedHere's an update that:
* Saves temporary files to {file_managed} so cron can clean them up.
* Adds the routes and path processor for temp-then-redirect images.
* Refactors image generation to the controller.
* Adds a bunch of docblock comments.
Still a few TODOs on this, so not quite ready yet.
Comment #6
deviantintegral CreditAttribution: deviantintegral at Lullabot commentedThis patch:
I think this is ready for an initial review, though it needs tests. I'm not particularly happy with how large the ImageStyleRedirectController class is, but we're kind of stuck with it if we want to use the upstream controller class for lock and image handling. Any ideas?
Comment #7
deviantintegral CreditAttribution: deviantintegral at Lullabot commentedOh, and I forgot to mention, this removes the generateImageStyle() method in favour of a generateImageUrl() method. Generating images in serial as URLs are generated is a significant performance issue, and any adapters implementing it should probably be refactored to delay image generation until an inbound request is received.
Comment #8
twang CreditAttribution: twang commentedMay be not a good idea - but is that possible to have a queue here?
Does the sequence of calling the two method matter?
Why have to do this?
I saw this string duplicated several times in this method. Assign it to a var may make the code look cleaner.
Comment #9
deviantintegral CreditAttribution: deviantintegral at Lullabot commentedEach image upload is a separate HTTP thread, so I don't think a queue will save us anything. We could queue the actual S3 uploads off to a drush background process, but in high traffic scenarios I think it's best to make sure that S3 is populated as soon as possible. I recently discovered imageinfo_cache which is 7.x only but lets you select specific styles to pre-generate. That might be worth investigation (but not as a part of this patch).
The flush call sequence does matter - you have to clear the output buffer before calling flush().
This patch removes the $this assignment for the anonymous function. You need to do that in javascript as the scope of $this changes. PHP 5.4+ allows access to $this so the $t variable isn't needed. However, phpcs is throwing an inspection error on it, but I think that's just a bug in the inspection.
Good call on the string constant - this update adds one.
Also, this fixes not creating directories for image derivatives.
Comment #10
deviantintegral CreditAttribution: deviantintegral at Lullabot commentedThis fixes an extra slash in a regex and a missing slash in the path processor, breaking image style routes.
Comment #11
twistor CreditAttribution: twistor commentedI think we want this to happen on the kernel.terminate event. That should save us from having to flush ourselves. If that doesn't work, we should look and see how symfony does the flush. There are some tricks to make it work with fcgi and whatnot. For Apache, I don't think this works quite the way as expected, but I've only tested that when delivering HTML, not images.
Can we keep generateImageStyle() and deprecate it for removal before 1.0?
Shouldn't this also be conditional on the image module?
This doesn't seem to address caching, but I'm ok with that. We can do that in a follow up. Thanks!
Comment #12
deviantintegral CreditAttribution: deviantintegral at Lullabot commentedI was all excited about using the terminate event, but I've found two issues:
Any ideas? The existing method is working for me with mod_php and fast cgi on Apache (under an Acquia environment), but we're having trouble with an nginx setup (that I'm testing a fix for now).
Comment #13
deviantintegral CreditAttribution: deviantintegral at Lullabot commentedAfter testing this out with mod_php, it looks like responses actually do get flushed with mod_php, even though the docs say otherwise. I filed an issue up at
https://github.com/symfony/symfony-docs/issues/6520
This patch:
Comment #14
twistor CreditAttribution: twistor as a volunteer commentedThis is looking awesome!
Thanks for chasing down the flushing weirdness. I remember this problem before when trying to get poormans cron to run after flush. I also remember running into it when trying to get testbot to pass with fcgi. I'm surprised that it's working for you. I'll have to set up an apache instance to test for myself.
Is this needed?
Can this be a more specific exception, \RuntimeException() would work?
Can this try block be only the minimal amount of code? If send() can throw an exception will it be caught here? Unless that's the whole point, and Drupal would explode if we let the exception leak.
I don't think this needs to be static, since this is a service. Are we worried about container changes?
This gets skipped for local images, correct? Because of path processing order.
Comment #15
deviantintegral CreditAttribution: deviantintegral at Lullabot commentedAh! There's a FileNotFoundException both in HttpFoundation and in Filesystem. That explains a good bit of confusion I had. I think since it's all thrown in the controller I'll move it all to the HttpFoundation exception.
Agreed, fixing.
Also cleaned up the try / catch block to a single line in the try.
Good call - fixed in this patch.
Yes, that was intentional to not expose this code to local images. I didn't see any value for the local adapter where a redirect would lead back to yourself. Is there something else here?
Comment #16
juampynr CreditAttribution: juampynr at Lullabot commentedflysystem, not image. However, @file docblocks are not needed any more in classes, interfaces and traits.
Comment #17
slasher13re-roll
Comment #18
slasher13addresses #16
Comment #19
twistor CreditAttribution: twistor as a volunteer commented@slasher13, thanks for the re-roll. #16 wasn't about removing all of the @file references, just the ones introduced by the patch. Can we remove those changes? It's adding 16kb of noise to the patch.
Comment #20
juampynr CreditAttribution: juampynr at Lullabot commentedRemoved unrelated
@file
removals from the patch.Comment #21
twistor CreditAttribution: twistor as a volunteer commentedRe-roll
Comment #22
twistor CreditAttribution: twistor as a volunteer commentedoops. Had some other crap in there.
Comment #23
twistor CreditAttribution: twistor as a volunteer commentedDid some more reviewing. Mostly minor cleanups/changes.
Do we have a testing strategy planned?
Comment #24
twistor CreditAttribution: twistor as a volunteer commentedI removed the return value and the thrown exception from
copyToAdapter
. It seems to me that if one image fails, it still makes sense to try the next one. At least in the case of a network error.Comment #25
twistor CreditAttribution: twistor as a volunteer commentedWelp. I've run into a hell of a bug. At certain times the redirect is cached incorrectly. This causes old images to be served. I spent quite a while debugging.
I think the path modification in FlysystemImageStyleRedirectProcessor is causing the problem. The dynamic page cache is serving a cached response based on the route. This is problematic since the route only has image_style and scheme after FlysystemImageStyleRedirectProcessor runs.
I dug around in the context/render caching system, but haven't solved it yet.
For now, I'm just disabling caching on the redirect.
I've removed the temp file saving. I'm not sure what that's for. If we're worried about stale files, wouldn't it be easier to just remove them on cron? Also, what about deleting the temp files after the copy?
Comment #26
deviantintegral CreditAttribution: deviantintegral at Lullabot commentedI've been struggling with this one too. Mocking all of the HTTP calls is going to be pretty complex. While this is really for remote adapters, realistically it should work for the local adapter too. Perhaps that's how we can do functional tests for this?
Oy! That is nasty. I'm OK with disabling the cache for now, and fixing that as a followup if the performance side matters.
I don't think we want to delete files after the copy, because other threads could be in progress caught between the file_exists() call and actually returning the file. We could manually add a cron worker to clean up old files, but isn't that the point of registering a file as temporary anyways?
Comment #27
twistor CreditAttribution: twistor as a volunteer commentedSounds like we need a lock.
I suppose it's a case of familiarity breeds contempt. I honestly despise Drupal's database handling of temp files. My main problem is hammering the db 4 times-per-second per-thread.
Also, there was a race condition in your last patch that removing Drupal temp storage eliminates.
Comment #28
mcrittenden CreditAttribution: mcrittenden at Phase2 commented(Closed #2805037: Stale image styles served from temporary:// under some circumstances and moved the info here)
We're using the patch from #25 and having an issue that seems to be related with it. Basically, it seems like stale image styles can be served from temporary:// under some circumstances.
To reproduce:
0. Use S3 (not sure if this is necessary to reproduce or not)
1. Update an image style
2. View an image of that image style
3. Change the image style again
4. View the image again - you will see the stale image from #2 if you did it quickly
Looks like the issue is that editing an image style DOES correctly flush those images from S3, but it DOES NOT clear them out from temporary://, which is where Flysystem put them before they got sent to S3 in the first place. So what happens is that in
ImageStyleRedirectController.php
, it sees that the image doesn't exist on S3, so it goes todeliveryTemporary()
which itself goes togenerateTemporaryImage()
, which finally callsgenerate()
.In that function, you see this line:
Note that
$derivative_url
here is the URL for the file intemporary://
, which does still exist if you did it too quickly for the OS to clear out your /tmp directory. Thus, you get a stale image. I've worked around this locally like so:Comment #29
twistor CreditAttribution: twistor as a volunteer commentedI've managed to hack around the cached redirect, I'm not sure if it's correct, but it's working. The hack is to add a hash of the filepath back on the URL, so that Drupal realizes this is a different URL.
#28 should be fixed as well, but I'm not excited about it. As was stated in #26, we don't want to delete the file, since another thread could think it still exists. What we're doing here is reading the file into memory, if it exists, and serving it that way. That seems to work fine, we loose some optimizations regarding X-Sendfile. The problem is that it's possible threads could generate the image style multiple times.
Comment #30
twistor CreditAttribution: twistor as a volunteer commentedThis should fix the caching issue.
Comment #31
twistor CreditAttribution: twistor as a volunteer commentedComment #32
DamienMcKennaRandom crazy idea - how about creating items in the queue and then add an AJAX callback to trigger the queue on the next page request? That way the items go into a queue so e.g. an S3 outage / slowdown wouldn't break the save process, but the callback would trigger the file(s) to be pushed up so there wouldn't be this delay until the next cron cycle.
Comment #33
Dave ReidI've been having an issue with this patch with an S3 private bucket. The image derivatives display once correctly with the redirect path, and then once the derivative is saved to S3, when it attempts to generate a URL, I keep getting the 'itok' query string appended to the URL, and Amazon rejects the signed URL because of the additional query string data in the URL. If I remove the itok query, then the generated URL works. I wonder if we should just always route through the redirect path instead. I wonder if other providers would have the need to strip the itok value from the URL.
Comment #34
twistor CreditAttribution: twistor as a volunteer commentedI haven't been around lately, but wanted to mention an idea I found that might work as a stopgap for S3.
https://github.com/thephpleague/glide/issues/170#issuecomment-272943966
Comment #35
twistor CreditAttribution: twistor as a volunteer commentedAlso, if anyone in this thread wants commit access, I'd be happy for the help.
Comment #36
MiSc CreditAttribution: MiSc commentedComment #37
lhridley CreditAttribution: lhridley as a volunteer and commentedThought I'd chime in on this, as I've been going round and round on the issue of successfully applying image styles and storing derivative images in object storage successfully (we are using GCS for object storage, and using Flysystem and Flysystem_s3 to interact with GCS).
To get a solution that successfully writes images to the object store via the streamwrappers provided by the above two modules AND also write the derivative images (thumbnails and other images generated with image styles), we finally had to resort to patching a dependent library.
There is an outstanding PR on github for twistor/stream-util (https://github.com/twistor/stream-util) which is a dependency for this module. By patching this library for the outstanding PR on the project (https://github.com/twistor/stream-util/pull/1), we were able to resolve the issues we were having generating and writing derivative images to the remote object storage, and subsequently retrieving and displaying those images with Drupal.
Because the patch is explicit for version v1.0.1 of twistor/stream-util, I also explicitely declared that version as a dependency in compser.json, pinning the project to this partcular version. Hopefully the outstanding PR will get merged soon, in the meantime this did resolve the issue I was having on two different projects with derivative images.
Patch is attached, please note this is NOT for this module, but for the twistor/stream-util library.
Comment #38
lhridley CreditAttribution: lhridley as a volunteer and commentedAdding an updated patch to replace the one from #37
Comment #39
ThomasDik CreditAttribution: ThomasDik at Synetic commentedAdded patch for 2.x beta version.
Comment #42
DieterHolvoet CreditAttribution: DieterHolvoet at Minsky commentedComment #43
DieterHolvoet CreditAttribution: DieterHolvoet at Minsky commentedComment #44
DieterHolvoet CreditAttribution: DieterHolvoet at Minsky commentedI created a MR based on the last patch. I had to fix a bunch of stuff to get it working for me:
I also opened an MR in the flysystem_gcs project, integrating the new code here in an actual adapter: #3285371: Saving and loading content takes way too long when multiple image styles are used.
Now both MR's are in a working state, this one still needs tests.
Comment #45
lhridley CreditAttribution: lhridley as a volunteer and commentedComment #46
lhridley CreditAttribution: lhridley as a volunteer and commented@dieterholvoet Would you be kind enough to change your MR to apply against the 2.1.x-dev branch? I'd like to see if the automated tests will pass, it appears that your MR does apply to 2.1.x as well as 8.x-1.x.
Comment #47
lhridley CreditAttribution: lhridley as a volunteer and commentedComment #48
lhridley CreditAttribution: lhridley as a volunteer and commentedComment #49
lhridley CreditAttribution: lhridley as a volunteer and commentedComment #50
DieterHolvoet CreditAttribution: DieterHolvoet at Minsky commentedI rebased the branch against 2.1.x, there weren't any conflicts. Changing back to Needs work because there aren't any tests yet - unless you decide that's not necessary.
Comment #51
lhridley CreditAttribution: lhridley as a volunteer and commentedhttps://www.drupal.org/pift-ci-job/2730529
Existing test coverage appears to be adequate, however this patch does fail one existing tests. In looking at the test it appears that perhaps the test needs to be updated.
Can you take a look at the failing test, determine if the failing assertion on `src/Unit/Routing/FlysystemRoutesTest.php:161` needs to be revised, and submit a revision as part of this MR?
If the assertion is still valid, note that here as well, and we'll need to work on this a bit more.