Problem/Motivation

CNAME behavior changed between versions 8.x-3.0-alpha16 and 8.x-3.0-alpha17.
Before 8.x-3.0-alpha17 it was possible to use a root-relative path as the CNAME value, so a folder could be added as a proxy pass to the S3 bucket (for example, to avoid CORS issues).
This is no longer possible.

Steps to reproduce

https://git.drupalcode.org/project/s3fs/-/blob/8.x-3.0-alpha16/src/Strea...

vs.

https://git.drupalcode.org/project/s3fs/-/blob/8.x-3.0-alpha17/src/Strea...

Proposed resolution

Add an additional configuration value for the root-relative path, or restore the old behavior.

Issue fork s3fs-3278421

Comments

kaytr0n created an issue. See original summary.

cmlara’s picture

This was done back in #3194619: Allow PRESIGN url's to work with CNAME, VersionID and custom_GET_args, with some follow-up in #3197648: Undefined offset: 1 in __construct to fix some bugs.

A CNAME and domain name have a very specific meaning in the networking world: they are supposed to be a hostname only. We do slightly deviate from that by allowing a port to be specified, for backwards compatibility with dev labs that had been used in the past (real examples were seen in the support queue).

It is hard to determine why the ability to use "root relative" paths was ever explicitly added. It can be traced back to 7.x-2.x, around the same time some CORS issues were being fought, but no issue or justification for the change was logged; it actually appears it may have been code that should have been reverted after another fix went in the same day. Additionally, the module documentation was never updated to indicate this was intended usage; on the contrary, the documentation updates have all indicated this setting is supposed to refer to hosts.

Since you mention CORS, is that the actual issue you're encountering? If so, can you just set a CORS policy on your bucket to resolve it? Is there some other scenario for which this feature is necessary?

kaytr0n’s picture

My use case is a bit special, because I use Drupal as a headless CMS in a microservice environment.
In this environment, the Istio service mesh is configured so that only the CMS has access to the S3 bucket. However, I think a similar setup makes sense in an intranet as well.
But I agree with your statement about the CNAME (nevertheless, I would say that changing the undocumented behaviour is a BC break, but never mind ;-) ).
Maybe a new alternative configuration should be introduced, for example "use_proxy_pass" and "proxy_pass_root", which implements the desired root-relative behaviour?
This would give an administrator the possibility to configure either a CNAME or a proxy pass.

cmlara’s picture

"I would say that changing the undocumented behavior is a BC break" This is certainly a valid point. Now that the 8.x-3.x branch is in Beta I would certainly think two or three times before pulling the code, and would likely (I hope) lean towards leaving a feature in rather than removing it. When the module was in Alpha I was much less apprehensive about major rewrites, and your unique deployment fell through the cracks when I made changes and pulled out code that didn't appear to make sense.

"Maybe a new alternative configuration should be introduced, for example "use_proxy_pass" and "proxy_pass_root", which implements the desired root-relative behaviour?" I respect that you were using it and that its removal is now a burden for you; however, looking through years of issues, this is the first report I've seen that this was actually utilized, which makes it hard to justify adding it back without more requests.

If this is a feature you really need, I believe you could extend \Drupal\s3fs\StreamWrapper\S3fsStream to replace getExternalUrl(), and override the declaration of the stream_wrapper.s3fs service (the public and private wrappers may also be necessary, depending on your deployment) to use your class. This shouldn't be a significant amount of code. You would actually end up removing most of getExternalUrl() and leaving yourself with a few lines (similar to the core PublicStream class) that return your site's hostname with a subfolder instead of all the complex logic.
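A minimal sketch of that suggestion, assuming a custom module here called "mymodule" that proxies a site-local /s3 folder to the bucket; the module, class, and folder names are all illustrative, not part of s3fs:

```php
<?php

// File: src/StreamWrapper/ProxyPassS3fsStream.php (path is illustrative).
namespace Drupal\mymodule\StreamWrapper;

use Drupal\Component\Utility\UrlHelper;
use Drupal\Core\StreamWrapper\StreamWrapperManager;
use Drupal\s3fs\StreamWrapper\S3fsStream;

/**
 * Serves s3:// objects through a root-relative proxied folder (/s3 here).
 */
class ProxyPassS3fsStream extends S3fsStream {

  public function getExternalUrl() {
    // Return a root-relative URL that the reverse proxy maps to the bucket,
    // instead of the CNAME/bucket URL the parent class would build.
    $target = StreamWrapperManager::getTarget($this->getUri());
    return '/s3/' . UrlHelper::encodePath(str_replace('\\', '/', $target));
  }

}

// File: src/MymoduleServiceProvider.php (name derived from the module name).
namespace Drupal\mymodule;

use Drupal\Core\DependencyInjection\ContainerBuilder;
use Drupal\Core\DependencyInjection\ServiceProviderBase;
use Drupal\mymodule\StreamWrapper\ProxyPassS3fsStream;

/**
 * Points the stream_wrapper.s3fs service at the subclass above.
 */
class MymoduleServiceProvider extends ServiceProviderBase {

  public function alter(ContainerBuilder $container) {
    if ($container->hasDefinition('stream_wrapper.s3fs')) {
      $container->getDefinition('stream_wrapper.s3fs')
        ->setClass(ProxyPassS3fsStream::class);
    }
  }

}
```

The same class swap would be needed for the public/private wrapper services if those schemes are taken over.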

Alternatively, I might suggest (if your architecture allows it) that your pods' web servers also listen on a second domain (s3.your.domain) and always proxy those requests back to your S3 storage. This would allow using the CNAME feature as-is, without change.
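As a rough sketch of that second-domain approach (the hostnames and the internal bucket endpoint below are placeholders, not anything your deployment actually uses):

```nginx
# Hypothetical nginx server block: the pod's web server answers on a second
# hostname and proxies everything to the (internal-only) S3 endpoint, so the
# s3fs CNAME setting can simply point at s3.example.org.
server {
    listen 443 ssl;
    server_name s3.example.org;

    location / {
        proxy_pass https://my-bucket.internal-s3.example;
        # Present the hostname the bucket service expects.
        proxy_set_header Host my-bucket.internal-s3.example;
    }
}
```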

kaytr0n’s picture

Thank you for your answer. Unfortunately, I don't think that using a fixed subdomain as the CNAME solves my problem, as I would then run into problems with the mesh-internal and external routing/DNS resolution.

Would the proxy_pass feature have a chance of being integrated into the module's source code, even if I seem to be the only one who has this problem?

cmlara’s picture

Would the proxy_pass feature have a chance of being integrated into the module's source code, even if I seem to be the only one who has this problem?

At the moment I don't believe it would. I am concerned that rarely used code increases the long-term maintenance burden, and I believe this should have an alternative solution using existing configuration options (a number of alternatives of varying complexity are listed below).

In addition, I believe such a feature would be incompatible with any presigning-based features, which include URLs presigned for access control and URLs that need to be presigned to trigger handling by AWS (forced save-as is a built-in one; contrib could introduce more through hook_s3fs_url_settings_alter()). This means we would also need to implement configuration validation to prevent mixing these features, along with some method to alert should a hook attempt to use them. This would add to the long-term maintenance burden.

In fairness, presigning also does not work with a CNAME through a proxy (since the hostname is part of the signed request); however, it does work with buckets hosted similarly to AWS, where the CNAME links directly to the bucket service.

Regarding mesh resolution:
I'll admit I do not know your environment or Istio. I would think, however, that whatever routing points public www.example.org to your existing service pods would also allow you to route s3.example.org to your pods.

Put another way, you are likely already doing everything required; you just need to add configuration for an extra domain.
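Without knowing the actual mesh setup, the extra-domain routing might look something like this Istio VirtualService sketch; every name in it (hostname, gateway, destination service, port) is a placeholder:

```yaml
# Hypothetical Istio route: send the second hostname to the in-mesh bucket
# service, alongside whatever already routes www.example.org to the pods.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: s3-cname
spec:
  hosts:
    - s3.example.org
  gateways:
    - public-gateway        # the gateway already serving www.example.org
  http:
    - route:
        - destination:
            host: s3-storage   # the mesh-internal S3 bucket service
            port:
              number: 9000
```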

Alternative options:
I can think of other ways s3fs could be deployed that would reach your goal, however they would all likely require significant restructuring of your existing architecture. I do not mention these to suggest they are an easy alternative for you now that you already have deployed content, but rather to provide suggestions for anyone who may find this issue in the future and be considering options.

Private:// based storage:
Since the bucket's servers are intended to be unreachable from the public web, this is close to the private:// takeover, which relies on Drupal reading the files from the bucket and serving them to the user rather than proxying to the bucket. It is less performant, as Drupal has to bootstrap for reads instead of letting a dedicated proxy/web server stream the content.

CNAME + Folder Prefix:
If this were a new install, I could suggest setting the CNAME to your domain name and using a 'folder prefix' in s3fs. All objects would then generate as http://www.example.org/folder_prefix/path/to/object, which puts the request in the same location a 'proxy pass' feature would, though it does so by moving the location of the objects in the bucket.

CNAME + Targeted paths to pass to bucket:
Similar to a folder prefix: if all your content is under a specific path such as /files/, or a directory structure like '2022-*-*/', you could still set the CNAME to your current domain name and just route those paths through to your bucket pods.

This also works well for public:// takeover when not using s3://, as public:// always has a prefix (defaulting to s3fs-public) that can be captured and used for routing. This option does not actually require a significant structural rewrite of the environment, but it does require being able to identify the paths that should be routed to the bucket.
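Capturing that prefix could be as small as one proxy rule; the internal endpoint below is a placeholder, and the prefix assumes the s3fs-public default:

```nginx
# Hypothetical sketch: with the CNAME set to the site's own hostname, only
# the public:// takeover prefix is passed through to the bucket service.
location /s3fs-public/ {
    proxy_pass https://my-bucket.internal-s3.example/s3fs-public/;
    proxy_set_header Host my-bucket.internal-s3.example;
}
```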

jonathanshaw’s picture

I'm pretty ignorant of network and DNS stuff, but I think I've hit a similar issue. I'm reporting it here in case it helps.

We use Flexify running on an Azure VM as an S3 API proxy for Azure Blob Storage. Flexify seems to require presigned URLs to be signed using the IP/hostname that Flexify recognises itself as having, i.e. the IP of the VM or its hostname.

However, in order to allow HTTPS access (necessary to prevent the user's browser from issuing insecure-content warnings), we need a proxy like Azure Front Door as the easiest way to get SSL for the VM.

Therefore we need the S3 URL to be presigned using the VM IP/hostname that Flexify expects, but the actual hostname of the external presigned URL needs to be that of the Front Door.

Specifying the VM's HTTP IP as the custom host in the s3fs settings works fine, although it obviously bypasses Front Door and is therefore not HTTPS. Specifying the Front Door hostname as the CNAME in the s3fs settings looks like it should work, and it does generate correct-looking URLs; but when they redirect to the Flexify VM IP, they get a Flexify 403 error suggesting the signature is invalid.

What does work is to not use the s3fs CNAME setting, and instead decorate the s3fs stream wrapper, overriding the getExternalUrl() method to simply swap the Front Door hostname into the external URL generated by the parent implementation.
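The override described above might be sketched as follows; the class name, namespace, and Front Door hostname are illustrative, and the service swap would be wired up the same way as any stream-wrapper class override:

```php
<?php

namespace Drupal\mymodule\StreamWrapper;

use Drupal\s3fs\StreamWrapper\S3fsStream;

/**
 * Swaps the Front Door hostname into URLs presigned for the Flexify VM.
 */
class FrontDoorS3fsStream extends S3fsStream {

  public function getExternalUrl() {
    // Let the parent presign against the Flexify VM host (the "custom host"
    // setting), then replace only the scheme and hostname so the signature
    // in the query string remains the one Flexify expects.
    $url = parent::getExternalUrl();
    return preg_replace('~^https?://[^/]+~', 'https://example.azurefd.net', $url);
  }

}
```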

cmlara’s picture

Status: Active » Closed (won't fix)

Closing this feature request as won't fix. There hasn't been significant demand for this feature from the community, and it has technical concerns associated with it.

The 4.x version's ability to use Drupal delivery for every scheme may mitigate some of the OP's original request.