Calls to Image::isValid() return false when the file doesn't exist on the local environment, regardless of whether or not it exists in the origin.
Expected behaviour:
stage_file_proxy should intercept calls to Image::isValid() and try to obtain the image from the origin if it doesn't exist locally.
| Comment | File | Size | Author |
|---|
Issue fork stage_file_proxy-2928564
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
geek-merlinInteresting. Solving this would mean re-implementing SFP as a stream wrapper.
Can you tell the use case, why this was queried in the first place?
Comment #3
nicrodgersAfter enabling SFP on our site, it worked well most places, but in a few places images weren't showing up. On further investigation, the failures were coming from a few places in our custom modules and themes.
They were all doing this type of thing:
We've since refactored it so that width and height are set to NULL if isValid returns FALSE, which works fine, but it does mean we no longer have a way to tell whether or not an image is *actually* valid or not.
Comment #4
geek-merlinYeah, so this is a limitation of the current design. I'd like to see a stream wrapper (or equivalent solution) but this needs someone to volunteer.
Comment #5
_Archy_ commentedComment #6
_Archy_ commentedHere is an initial implementation for this. The method is open for debate.
I have made two additional modes of operation:
* CORE_STREAMS: this adds functionality for the core streams
* BRUTE_STREAMS: this adds functionality for all stream wrappers added by all modules
To enable the stream wrappers add in settings.php $settings['stage_file_proxy']['proxy_mode'] = 1; // 2 for BRUTE
If you want to test the functinality of stream wrappers use the above described image->isValid
Minimally tested. Seems to work. This is a WIP patch. Don't review comments / coding standards.
Comment #7
geek-merlinWow, i'm impressed! 😎
From a first look: Code looks solid and well-crafted.
Setting to needs-rewiew so we hopefully get feedback from other maintainers.
We also need to discuss how we release a change like this.
Comment #8
_Archy_ commentedJust taking some notes on things to do:
* have to add some more description on the settings page for the proxy modes and how they can be set
* generic configuration for different origin scheme paths is missing, so that has to be done too
* need some error handling in the stream trait so that in certain cases won't break the whole website (a special case I have also encountered is when the request is timing out it will clog the primary request)
* documentation / code quality / commenting
We have to discuss:
* the fact that I am using eval to proxy all streams possible (I think we can keep it as this mode will only be used in development)
* for the moment I have dumped common functionalities in service called FetchService, I am not quite sure that it is a good name, maybe SfpService?
Comment #9
_Archy_ commentedNaively taught that somehow private files gonna work magically through just the browser. Anyway I have found a way to make it work seamlessly. I am taking advantage of the fact that the database will be the same and thus the users data will be the same (**):
0. Generate private key used to encrypt passwords on origin. This will be available on the local env because (**)
...
n+0. Log in locally as the user.
n+1. If the file being fetch is private set double encrypted users password as authentication header on the request.
n+2. On the host site the authentication is intercepted. Decrypt the password and if the UID and password match in database then allow login.
...
This almost works like Basic auth with the difference that the actual password sent to authenticate is in fact the already encrypted
password of the user. For additional security we encrypt it one more time with AES. Add HTTPS above this and I think there is strong security.
I am using AES for encryption so I have added a composer dependency: "phpseclib/phpseclib": "^2.0".
I have not yet addressed the issues described in #8. Also have to do lots of more testing.
Sidenote: testing this feature is a nightmare with docker environments, can't get XDEBUG work with guzzlehttp.
Comment #11
_Archy_ commentedWrong patch.
Comment #12
_Archy_ commentedComment #14
yobottehg commentedI think we should limit this issue to get a file_stream_wrapper inside stage_file_proxy with all needed settings and automated tests and tackle the problem with private files in #2879046: Compability with private files
Comment #15
_Archy_ commentedYou cannot make it work with streamwrapperS without making it work with private stream, as the two primary streams are public and private.
It would take a lot of effort to separate at this point. Both the stream wrappers and the private fetch mechanism have common code dependency.
Comment #16
_Archy_ commentedEventually I will take out the private fetch mechanism and disable the private stream for the moment. Then when merged I ll rebase upon the patch conataining both and add it to the other issue.
Comment #17
yobottehg commentedI thought for some time about this solution and in general i like it.
One problem is see is in the time of GDPR most people will only have a sanitized database locally without the same user but a user created over drush uli. The current approach will not work then.
Perhaps we can default to use the session from the remote site by default with a message if you are not logged in remotly (if the stream wrapper return 403 or 301 redirect) and then add an option to use the local user for authentication against the remote.
What do you think?
Comment #18
_Archy_ commentedI am not quite sure what this will mean exactly. Will we have atleast the IDs of the users? Can you elaborate.
I would like to add that having someones encrypted password is far from a security issue since they are barely breakable, I don't think that it would expose anything about privacy, but that's my two cents.
I am not sure about this but I think sessions are deleted from DB. Also even if they are not then if a user was never logged in we cannot use that because he never had a session. Adding here about GDPR, having a users session is WAY WORSE than having it's encrypted password.
I don't quite understand what you mean by this. It does not in any way affect the fetch mechanism for private files if you are or not logged in on origin on any browser or whatnot. The guzzlehttp has to authenticate on total separate connection. Either way if you are then how would the use case you are thinking about differ from the currently implemented one in the eyes of GDPR?
Comment #19
_Archy_ commentedI have removed the private fetch functionality. Now it only contains stream wrapper logic. This patch is untested as I had no time, uploading for structure review only. Refactored code a bit. Changed proxy modes. Added stage file proxy streams as services so now any module can integrate with SFP. Moved a lot of logic from fetch info service to SFP stream wrapper wrappers (new interface).
Still have not addressed #8. Also adding @TODO for some basic unit tests.
Comment #20
_Archy_ commentedDid not upload interdiff as there were too many changes.
Comment #21
yobottehg commentedA little bit more explanation from my side.
There are multiple ways developing locally when you don't have the same users as on remote:
- Installing via config_installer profile when only the structure is created
- content is imported via default_content
- Importing a sanitized database (deactivated users without passwords or no users at all) where you create an user via drush uli.
I'm just saying the approach with the login to remote is really nice but i think it will not suite all cases.
Thats why i'm saying default to just fetch the files from remote which will result in 403 when you are not logged in remotly. But should be 200 ok if you are logged in and have the permissions.
Comment #22
_Archy_ commentedAdded commenting and cleaned codesniffer issues.
Little bit more refactoring. Renamed classes, added interfaces.
Added error logging.
Added descriptions on settings page for proxy modes and states.
Added install and readme documentation.
Updated drush command to work with new services.
I will add some tests after this patch is validated.
@yobottehg About private files. Maybe we could log in through guzzle with user ID combined with site hash salt. But I think that it would reduce the security. Please NOTE that SFP has to be installed on the origin site otherwise the method I have implemented won't work. The SFP authentication protocol could be a target for hackers.
Comment #24
geek-merlinMy gut feeling as maintainer is this grows a bit large without having some more eyes on it.
(I'd espesially like to have @greggles put an eye on this.)
If the changes can be split into multiple issues, i'd be for it.
And if the private files feature needs some (erm...) "backdoor", i'd be for putting that into a separate sub-module.
That said: great stuff!
Comment #25
_Archy_ commentedFixes PHP lint issues. Hopefully remains on review this time.
@axel.rutz Yeah we need more eyes on this. I don't think that it's a good idea to break it into more issues. Note that I have removed the private file fetch authentication mechanism from it. At this point the patch is only about refactoring SFP to use stream wrappers.
Comment #27
_Archy_ commentedComment #29
_Archy_ commented...
Comment #30
yobottehg commentedI'm wondering. Is there any possible way to really end to end test stage file proxy with drupalci?
Is there an interest in moving the code to github and setup end to end testing there with travis / circleci?
Comment #31
_Archy_ commentedI am thinking of a mock like http client service replacement injected in the fetchmanager used with browser test.
Comment #32
_Archy_ commentedAdded end to end test. The idea is to replace the client service with a mock one that returns fake responses. There are two simple tests: one for request drupalGet and one for direct stream.
Comment #34
_Archy_ commentedFixed some issues with coding standard. I think the test is breaking because no dependency declared on test module.
Comment #35
_Archy_ commentedComment #36
_Archy_ commentedCan we get a review on this? Kinda want to continue working on private fetch mechanism, but it depends on this patch.
Comment #37
yobottehg commentedi think it should say public here instead of *scheme*
Perhaps an additional comment on what types are available?
Does this work? in_array returns a bool ?
LEtting this on needs_review as this needs review of one of the maintainers
Comment #38
yobottehg commentedSome more after trying out this patch.
The configuration regarding settings.php and so on is to difficult now. It's not very clear. We should set sensitive defaults which correspond with stage_file_proxies current behaviour and then let the rest for configuration.
Comment #39
geek-merlin@_Archy_: Thanks for PMing me on this and driving this thing forward! Alas, with a big change like this it's always the risk of sites breaking and mainenance nightmares. So special appreciation for your tests!
Unfortunately i'm verrrry low on capacity these days so i can moderate this but can not provide deep reviews. What always helps (and what i do for complex patches i want in, especially if they contain some refactorings) is rebase the work into a series of as-simple-as-possible patches and upload as a git multipatch.
So @yobottehg, great you're chiming in with reviewing, and my gut feeling is like this: If at least both of you say the code is solid and does its job, i'll commit it to dev and roll a RC. Likewise with the private-files followup. As soon as we have some 100 adaptors without big breaks, we'll go stable.
So after a 4-eyes code review and RTBC, feel free to PM me. Rock on! ✌
Comment #40
_Archy_ commented@axel.rutz Thanks for the input. I will try to get some more eyes on this from other maintainers or developers. I would make a git multipatch but have not found anything on this matter on how to do it. I guess it would mean breaking up the changes into multiple commits that introduce different parts. Till now I have come up with the following: core mechanics, stream wrappers, configuration and tests. But unfortunately I don't know how to create a multipatch. So if you can point me to some documentation or tutorial of how to do a multipatch I'd be glad to do it.
@yobottehg Addressed your concerns:
1. This I wrote like that as the scheme is variable (*scheme* can be replaced by public, private or any other supported stream wrapper id)
2. I have added a link towards the class that defines them and has thorough description of each.
3. Yes in_array return bool.
4. The default was set but not working correctly as I was not checking the existence right. Updated in StageFileProxyServiceProvider.php
+ I have also added 3 more tests for: styles root, hotlinking and different origin base path.
Comment #41
geek-merlinIndeed! The handbook does not cover multipatches. Shame. So i rolled a page on this:
Making maintainer-friendly "git am" single and multi-patches
It's made up just from my head, so the commands there may carry dumb mistakes (and correcting those gives extra karma points).
Comment #42
andypostRe-roll for drush command, it needs a bit of work to use drush's
--useror core'saccount.switcherservice to login but not really sure that local and remote users could be same, moreover GDPR does not allow itBtw previous comment was
// @TODO: Need a way to login user with this command for private files.minor nit - trailing white space at the end of line
Comment #43
deviantintegral commentedThere is an issue with accessing images on https, where the port is set to port 80 instead of 443. This fixes that by calling setPort() when you call setDomain().
Comment #44
juampynr commentedLooks good! Just a few notes on docblocks and inline comments:
I don't understand this phrase. Can you review it?
servers?
by stage file proxy.
Comment #45
_Archy_ commentedSo I have been thinking about the private files issue and it seems to me that as of the GDPR issue and the fact that I cannot really see an actual use case to have the actual private file locally, we could just simply generate a sample file for the private files. We have that functionality in core already for FileItem and ImageItem.
Comment #46
crzdev commentedPatch seems quite solid, just found one problem, when origin site is installed in a subfolder, port is concatenated to the end of the "domain" value, despite that scenario. Just adding a re-rol of the patch having that into account.
Example: "http://www.domain.com/subfolder/site..." will request: "http://www.domain.com/subfolder/site:80..." instead of "http://www.domain.com:80/subfolder/site..."
Maybe a better solution will be to have an specific variable to define site subfolder.
Attaching patch.
Comment #47
geek-merlinLast interdiff looks good (untested). I'd really love if we can roll this home soon.
Comment #48
geek-merlinLooks like this needs a re-roll.
Comment #49
wellsAdding an attempt at a re-roll, but this is a pretty complex rebase...
I found this issue by way of some problem we are having involving SFP, Imagemagick, and Blazy text format filters. I believe I was able to get this rerolled correctly and it does apply cleanly to 8.x-1.x, but I have not actually been able to successfully test it and unfortunately I have run out of time to spend on the issue.
Also, annoyingly, the interdiff is failing (it looks like potentially because of the renaming of INSTALL.txt to INSTALL.md?).
We'll need someone who was previously using this patch to test the new one and see if it still resolves other use cases. It's possible that ours is just a case that is not fully resolved by this patch alone.
Comment #51
_Archy_ commentedComment #52
wellsAttaching an updated patch that removes the no longer relevant test case. Looks like the tests added with the patch are working, though.
Comment #53
geek-merlinThanks a lot for moving this forward!
> [Patch] that removes the no longer relevant test case
Can you elabarote why no longer relevant? I don't get why.
Comment #54
wellsRe #53:
That case only tested the method
\Drupal\stage_file_proxy\FetchManager::styleOriginalPath, which is removed by this patch. I verified that was indeed removed by the patch in #46.Comment #55
kevineinarsson commentedTook a stab at rerolling #52 latest dev and fixed an issue in src/FetchManager.php where a use statement for
UrlHelperwas missing leading to crashes.Comment #57
kevineinarsson commentedFixed most coding standards issues. No clue why the test is failing. Manually verifying that the trailing slash is removed works.
Comment #58
wellsWhy is this reroll needed? Does #52 not apply to dev or are you just cleaning up some standards issues? Please add an inter diff against #52.
Comment #60
kevineinarsson commentedYep, #52 does not apply to dev. I fixed what I could, but the Drush commands need to be looked over in the rerolled patch.
Comment #61
geek-merlinJust to re-trigger the bot.
Comment #63
thomwilhelm commentedTested #60 and all works really nicely for me with responsive image styles after I enable the proxy stream wrappers. However I found when the file isn't found on the remote server, it generates a bunch of warnings as the error logging line needs updating from the original method signature.
I've also found some problems when trying to load files with spaces in the filename. For example the raw file on the server is:
"Map 01 Regal Peacock.mp3"
However the URL requested by stage file proxy with this patch and proxy mode enabled is:
"Map%252001%2520Regal%2520Peacock.mp3"
It's like the URL is being double encoded somewhere or another.
Comment #64
thomwilhelm commentedOK so I've figured out the spaces problem. From the original patch this line was removed from ProxySubscriber::checkFileOrigin.
$uri = rawurldecode($uri);Which meant later on when
FileFetchInfo::getUrlPathwas called, the original URL was still URL encoded, but by callingUrlHelper::encodePathit got double encoded.From my testing, I actually think it makes more sense not to call rawurldecode in the first place, so I've actually removed
UrlHelper::encodePath. I'll test further on my local but I can't think of a case why this decode/encode would be needed. As this module is a proxy it makes more sense to me to just pass the original request directly to the origin server.Will report back in a couple of weeks with any findings.
Comment #65
thomwilhelm commentedOK looks like I jumped the gun with my previous patch. Whilst not decoding the request path in ProxySubscriber and not encoding it when requesting the file from the origin server works just fine, it causes a lot of problems in other areas! (Hence it being decoded in the current dev release).
I added rawurldecode in the most logical place I could find in the new patch. As soon as I do this it fixes all the previous issues.
Comment #66
edurenye commentedThe patch does no longer apply.
Comment #67
grndlvl commentedRe-rolled against latest release 8.x-1.1. Hiding since not against dev.
Comment #68
edurenye commentedThis service gives me this weird error:
The service "stage_file_proxy.fetch_info_service" has a dependency on a non-existent service "stage_file_proxy.stream.private".When I remove the service the error is gone. BTW where is this service used?
Comment #69
mibfire commentedHow does this patch work? Do i have to install the patched SFP module on remote site too? Thanks
Comment #70
smustgrave commentedComment #71
smustgrave commentedComment #72
geek-merlinWow, if there's traction for a 2.x release, getting this in will allow us to throw out quite some stuff.
Comment #73
smustgrave commentedCan say I’ll keep an eye on the queue and can merge stuff as they pop up
Comment #74
smustgrave commentedNot sure if it matters but #3282542: Add a lock around retrieving upstream files was merged
Comment #75
trakoczi commentedI tried patch #67 with version 1.1 and private images fail to download from origin server. Since there is no patch for version 2, is there any other approach to make private files work?
Comment #76
kunalkursija commentedPatch #67 on stage_file_proxy 8.x-1.1 worked for me. However, The application that I am working on serves the private files from the overridden path(not
system/files). Hence I had to save the patch locally and make below changes:$settings['file_private_path'], So I updated the FetchManager.php file's code$file_dir = $this->fetchInfoService->getLocalSchemePath($fetchInfo->getScheme());to$file_dir = \Drupal\Core\Site\Settings::get('file_private_path');. This ensures that stage_file_proxy saves the private files at the location from which they are read/served.getUrlBasePath()&getFetchInfo()of SfpPrivateStream.php class has hardcoded the pathssystem/files&system/files/stylesrespectively. However, In my case, the files were being served from different paths and hence I updated it to suit my need.The above changes made the patch #67 work for me.
Comment #77
smustgrave commentedThis is the last ticket needed for the 2.0.3 release fyi.
Comment #78
nicrodgers@smustgrave it may be worth carrying on with 2.0.3 without this one, and adding this in when it's ready.
Comment #79
johnle commentedI did my best to try to reroll this to version 2.1.1, there is quite a bit of code refactoring to 2.0 that made it a challenge. This is working, but probably could use some more tweaking + testing + cleanup. One thing I've notice is that some of the code from 1x. is no longer in 2.x like the webp support code.
Comment #80
smustgrave commentedComment #81
smustgrave commentedStarted a new branch for just this feature
Rerolled what I could #3175045: Error retrieving images from styles that use ConvertImageEffect will have to be tested again as that was using the variables we removed.
Also removed all deprecated functions and interfaces.
Comment #83
smustgrave commentedNeeds some work for upgrade paths but StageFileProxyServiceProvider needs some work as it's breaking the module from being installed and I can't see why.
Comment #84
deviantintegral commentedWe just ran into a case where images failed to load when using the file_mdm module. Our theme needs to load the source image first to pull out some exif information, but the call to
\Drupal\file_mdm\Plugin\FileMetadata\FileMetadataPluginBase::loadMetadataFromFilefails because it doesn't exist.I have a horrible, horrible hack in place which works by swapping out the exif plugin implementation. But realistically, this would be better if Stage File Proxy swapped out the public stream wrapper so it could intercept file_exists() calls and download as needed.
Comment #86
eelkeblokI merged in 3.x and one thing led to another... I *think* there was still a problem with trying to add dependency injection to the stream wrappers, unless I misunderstood how things are supposed to work. The stream wrappers end up being registered with PHP as actual stream wrappers, and Drupal/Symfony dependency injection doesn't kick in there, so I was getting errors of the constructor being called without the proper arguments. So, I removed the arguments and replaced them with some methods calling the Drupal global object.
I did my best capturing the spirit of the changes in the 3.x branch, I hope I didn't break anything.
Comment #87
eelkeblokThe config validation that is enabled on tests is tripping up saving the settings form. I tried turning the origin_path into a sequence, but that wasn't the (entire?) solution. FWIW (not sure if this is documented anywhere), I got the config validation to kick in on my dev environment by putting this in an active services.yml (e.g. development.services.yml, *if* that's enabled), to make debugging easier:
Comment #88
eelkeblokSlowly chipping away at test failures. Seem to have arrived at a pretty fundamental one; the test is using the public stream wrapper to test if the original file was created. Not confident this is a great idea, maybe we need to introduce an "unwrapped" version of the original stream wrapper for the test.
Comment #89
andypostRe #88 sounds good 👍 idea
Comment #90
geek-merlinGreat how this is moving forward!!
Comment #91
eelkeblokI restructured the branch a little because it seems some previous merge conflicts were not resolved optimally.I rebased pervious commits on that merge.
Comment #92
smustgrave commentedThanks for keeping up with it. I’ll admit I wanted to get this into 3.1 but little over my head
Comment #93
penyaskitoThis looks great so far!
Comment #94
eelkeblokTests should now be green 🤞
The last hurdle was the assertion on the derived image being created. For some reason, the test couldn't assert the existence of the file, eventhough manual testing seems to work fine. Since we are basically depending on core doing its thing after we've downloaded the original, I opted to remove the check for the style image.
Comment #96
eelkeblokApplied the review points from @penyaskito and created a new MR because the old one's target was 3.0.x.
Comment #97
eelkeblokWRT the origin_path configuration, there seems to be some doubt as to whether support for private files belongs in this issue. I would say, no. There already is a lot of value in converting only public files to using stream wrappers (e.g. the use case with responsive images, where the responsive image module tries to open the physical file to get information on its size), and adding private file support into the mix complicates this issue. I think my next task will be removing support for private files from the merge request. Maybe we should consider re-opeing #2879046: Compability with private files and postponing it on this?
Comment #100
smustgrave commentedI could open a 4.0.x branch and we can merge this in if we think it's ready? That way we still have 3.1.x stable just in case?
Comment #101
firewaller commentedOpening up a new branch would be ideal since this feature has been such a long time coming and it would be nice to have something.
Comment #102
smustgrave commentedI'll start a new branch but one part I'd like to revert is the origin_path being a sequence
Comment #103
eelkeblokIt's been a while for this issue, sorry about that. I'd like to clarify what you meant by reverting the orgin_path being a sequence, @smustgrave. Having looked at it again, I think the point of it being a sequence is that the idea is that each wrapper (public and private) get their own setting. Quoting from the Configuration Schema cheatsheet, "There are two base list types defined by configuration schema. If all keys can be predefined in the schema structure, then you should use a mapping, otherwise you should use a sequence (even if the list’s keys are strings)." Since the settings form loops over the stream wrappers, I do believe the second situation applies (even if the module itself only defines two wrappers, it should be possible for other modules to extend this - although I'm not 100% sure this single settings makes sense for all situations; I am actually looking into extending this for S3 right now).
So, I'm a little confused what you meant. Another interpretation might be that you meant to actually revert the entire private file system support, which I would not find a bad idea, as mentioned before. From #2879046: Compability with private files I did get a sense this might actually be pushing it a bit too far, adding private file support in here, and we might have to take a couple of steps back, in that regard.
Comment #105
eelkeblokSorry for the unstructured mess. Having read #2879046: Compability with private files and this issue again, I don't think it is a great idea to lob private file support in with this feature right now. I believe there is nothing in the way of authentication in the private file code in this feature at the moment, which is good in the sense of not opening up any security holes, but is bad in the way that the feature will be very hit or miss, and will probably be the source of many support requests; AFAICT, there needs to be anonymous access to the files in question, which is far from a given in the case of private files, obviously. Correct me if I'm wrong, of course.
Either way, it seems a better idea to me to introduce the stream wrapper feature without private files initially, and treat it as a separate feature request. Of course, we can create a draft MR there to add back the private file code that is in MR 89 for those that are using it right now.
I opened MR 110 that is supposed to be 89 but without the private files, but there were merge issues which I am still in the process of resolving. The same is true for 89, but given the above I don't think I will be putting in the work to make that mergeable, for now.
Comment #106
eelkeblokSetting to needs review real quick to get some more eyes on this, although I expect there is some more work to be done. I updated the MR by merging in 4.0.x and solving a bunch of merge conflicts. There were actually some tests in this branch that have since been made obsolete by tests added to the main branch, so I removed the duplicate testing code. I also had Claude add some tests for the stream wrapper functionality. One thing I realised just now is that there is actually *some* support for private files in the 4.0.x branch currently, that this branch is actually removing. I am not sure whether that is a good idea, although I think the current private file support is actually troublesome and incomplete. Removing it might go a bit too far, though. Input welcome.
Comment #108
eelkeblokUndid some of the more reckless AI stuff.