Problem/Motivation
We want to be able to create file entities by uploading files.
This feature has been discussed, and argued over in the Core serialization and REST issue queues: #1927648: Allow creation of file entities from binary data via REST requests, but hasn't come to a consensus. The latest approach being explored is for a completely separate endpoint for binary data upload, which doesn't seem like it particularly fits JSON API's philosophy.
Having some basic functionality to create and update File entities for the majority of simple cases would be a very useful thing to have in the JSON API module.
I can see two straightforward approaches for accomplishing this:
- When POSTing (or PATCHing?) a file entity, permit a "data" attribute, which contains the Base64-encoded contents of the file to be created (or modified).
Obviously, this approach has limitations: There is overhead in request size due to the encoding. Also, you won't be uploading gigabyte-sized video files this way, as that will cause post_max_size or out-of-memory failures.
However, this approach has significant advantages: It works 100% inside the JSON API standard--no use of multi-part requests or binary data mixed into the request body. It's straightforward and probably understandable by most developers. It probably wouldn't require much change to existing serialization code, just add a bit for the Base64 decoding. And, it will work great for many use cases, such as image uploads. Documentation could be used to clarify this method's limitations.
Developers who need support for larger files could still write custom code, perhaps create a separate upload API, etc.
- Use an API endpoint that operates in parallel to the JSON API spec, and accepts a "multipart" HTTP Content Type. This can be made the be JSON API-like for consistency. See http://discuss.jsonapi.org/t/file-uploading-with-multipart/71/5
This might be a bit more difficult to implement, and require some experimentation to get right.
The advantage of this approach is that binary data could be sent as-is, and even broken up to allow uploads of any size.
Proposed resolution
Research the file upload support in REST core and leverage it as much as possible. Then provide the needed supporting code to make it JSON API compliant (maybe check http://discuss.jsonapi.org/t/file-uploading-with-multipart/71/5).
TBD
Remaining tasks
Come up with an implementation strategy and make it happen.
Comment | File | Size | Author |
---|---|---|---|
#48 | interdiff.txt | 3.56 KB | Spleshka |
#48 | jsonapi-support-for-uploading-files-2785345-48.patch | 8.92 KB | Spleshka |
| |||
#38 | jsonapi-add-test-dependency-2785345-38.patch | 273 bytes | Spleshka |
|
Comments
Comment #2
e0ipsoComment #3
hampercm CreditAttribution: hampercm at Acquia commentedCombined with info from duplicate issue #2868787: Support POSTing and PATCHing of file entities.
@e0ipso Feel free to change / holler at me if you don't like ;)
Comment #4
e0ipso@hampercm I have conflicting feelings about multiple operation write requests (meaning uploading a file, creating 3 tags, and posting an article). You can see some of the reasoning why the JSON API community can't reach a consensus in that front: #2760403: [FEATURE] Multiple operations in one request
Do you know the https://www.drupal.org/project/subrequests module? That is the current solution to multi-operation write requests. I have not tried it yet with the binary file upload endpoint. If you do, please post back to this issue or to the documentation.
That being said, I'm not opposed at all to the idea of supporting base64 encoded files (to the extent that the server / consumer can deal with the payloads). It's not a priority at the moment, but it will get in if someone provides a patch for it.
Comment #5
skyredwang#4 is consistent with the current recommendation from JSONAPI community: https://github.com/json-api/json-api/issues/246#issuecomment-282192434
Comment #6
e0ipsoI'd like to move forward with both options in parallel.
1. base64 encoded files
This will be very convenient, however there may be some potential issues with the different combination of permissions. I would like to call the special property something like
"encodedData"
instead of"data"
(to avoid confusion).I assume we will need to create a
FileNormalizer
that takes precedence for files and inherits fromEntityNormalizer
. That class will look for the special property to save the binary and reference the URI in the file entity.2. Multi-part upload endpoint
I am OK if this does not follow the JSON API spec in the response. That will make it easy to have this happen in core instead (unless we already have that ability, in that case we're done).
Comment #7
SpleshkaHappy with the suggested approach. My understanding is that we're leaving the 2nd option for core contributors to provide more comprehensive solution, which though doesn't comply with jsonapi standard because of additional http requests to a seperate endpoint. Having this said, I'm going to work on the implementation of option #1 for json api.
Comment #8
hampercm CreditAttribution: hampercm at Acquia commentedYes, I'm thinking that waiting on Core's implementation of file uploads probably makes more sense than implementing #2 in the JSON API module. Existing JSON API libraries and clients won't be able to utilize a custom method without adding implementation-specific code, which misses much of the value of using JSON API in the first place.
Comment #9
e0ipso@hampercm exactly, that was why I wanted to have both options. Will the solution described in #6 for 1 be consumer friendly enough?
@Spleshka sounds good. The scope of this issue will just be solution 1.
Comment #10
SpleshkaGreat, thanks for the update guys.
I've got one thing to clarify before we get started: am I right saying that currently there is no case / example in JSON API which would support creation of content entity together with a new content entity as a relationship? Initially I was thinking about something like this:
However, according to the JSON API specification it looks like we have to create a new file entity first, and then post a new node entity with relationship to the recently created file. The source code confirmed my concern, see this - it is currently required for a related object to have uuid specified (which is correct, according to the specification).
So my Qs are:
1. Am I missing anything obvious? Is there a way to create a content entity with relationship to the not yet existing entity?
2. I had another idea about usage of "meta" property of the related object instead of "data", which actually might eventually work out. However my worry here is if it a valid approach at all.
Comment #11
e0ipsoLet's start with the two calls approach first, please.
To combine 2 calls or more into one you can use https://www.drupal.org/project/subrequests.
Comment #12
e0ipsoIt just occurred to me, did you check https://www.drupal.org/project/better_normalizers? Maybe that fixes this problem or we can use it as inspiration.
Comment #13
SpleshkaSpeaking about https://www.drupal.org/project/subrequests - I'm not sure if this will be possible to use this module to combine two requests: the second request has to use the value returned from the first request (file uuid). I've briefly checked the module and haven't found this feature there. Going to check Better Normalizers module to see if it solves the issue.
Btw, I might be raising more high level issue now, but have you thought about saving of object in the relationships in the same POST / PATCH request for the parent entity? It looks like JSON API specification doesn't strictly rule this. This would make forms creation / integration on the decoupled frontend much more pleasant task. Avoiding multiple requests would result in a much better dev experience. Even having great initialive like https://www.drupal.org/project/subrequests doesn't seem to resolve all issues with related entities.
Comment #14
e0ipsoSubrequests should totally be able to do that.
Comment #15
SpleshkaVery nice, thanks! Wondering how I've missed that. Anyway, I think it's a part of larger conversation and I've created a separate issue for this #2874751: [DISCUSSION] POST of new entities from relationship alongside with parent entity.
Comment #16
SpleshkaI took a look at Better Normalizers. Here are my thoughts:
Comment #17
SpleshkaAfter having really interesting discussion above, the final plan is:
1. Stick to #6 as a high level concept.
2. Use Better Normalizers as an inspiration.
3. Files should be uploaded to the separate JSON API endpoint
/jsonapi/file/file
. Keep Subrequests in mind as a solution for multiple operations per single http request.Discussion is still opened regarding:
1. Shall we mark files as temporary until it's linked to a content entity?
2. Shall we provide an option to output file entity as base64 data?
Comment #18
SpleshkaMateu or other guys, do you have any strong opinions about the discussion topics in #17?
Comment #19
e0ipsoIt depends. I think it only makes sense when the binary is created via the
base64
encodedPOST
request. In the future, when core allows uploading files with multipart uploads, that will be handled by them.I would defer that to core.
No option. For now I think we don't need to output
base64
data. However that may change in the future if people keep asking about it. Paving the cow path they call it now :-)@Spleshka let me know if I missed anything.
Comment #20
SpleshkaMateu, not sure I undertand:
What is the other suggested endpoint to upload a file using JSON API? I thought that we will use
/jsonapi/file/file
endpoint for this feature.Comment #21
e0ipsoJust a note https://www.drupal.org/node/2682341#comment-11965256 is describing that
file_entity
has extra support for creating files. Maybe worth exploring how they do it too.Additionally, in #6 I advocated for
"encodedData"
. However I see now that other projects use"data"
. Given that I think that"data"
is better.Comment #22
e0ipsoThat resource is correct. In general we should worry more about the new (de)normalizer class that inherits from
EntityNormalizer
, so the resource path is transparent here.Comment #23
SpleshkaRight, now I see that we're on the same page. Finally it's time to let the code magic happen :)
Comment #24
SpleshkaI've been working on this issue recently and eventually got it working locally. However, during the development I've found several issues which made me think that this feature can't be added to JSON API:
1. File upload permission. Drupal Core doesn't not define permissions to operate with File entity. Therefore, it's not allowed to create new file entities. What I mean here is that this code always returns FALSE:
I don't think it should be JSON API's job to add the File entity permissions and to implement hook_file_create_access() to make file uploads work.
2. Security issue. Drupal Core can validate a file only if it's attached to the File Field. Thefore there is no pure file upload validation. It exposes the door for malicious file uploads. We could have some basic file upload validation implemented, but unfortunately Drupal Core doesn't contain list of safe file extensions / mime types to validate against. I don't think we should reinvent the wheel and put into JSON API opinionated list of supported file extensions, mime types and max file upload limit.
Actually, the second issue makes me think that Better Normalizers exposes a security vulnerability, as there is no file upload permission check, as well as no file validation. However, it's a separate story.
Proposed Solution
Both mentioned issues are solved in File Entity module. Therefore my proposed solution would be to create a new contrib which would use File Entity as a depencendy, and then add (de)normalizer to get the file uploads working properly. In the first version it will support only "api_json" format, but it should be very easy to extend it to support other formats.
Comment #25
e0ipsoI'd like to have this in the JSON API module. We could have a check on the presence of File Entity on both
supportsNormalization
andsupportsDenormalization
. And then add documentation on how to deal with files.Comment #26
SpleshkaMateu, could you please clarify: you mean you'd like to have this as a submodule of JSON API which ships together with the main module, or get it implemented straight into the main module? In the first case we could use power of File Entity module, while in the second case we will have to define additional permissions / configs just for file upload (and potentially override them when File Entity module is enabled, to keep the consistency across all files management).
Personally I'd really prefer the first option, because the second looks more like a workaround. However, you're the lead of this project and I trust your judgment.
Comment #27
e0ipsoStraight into the main module.
I'd like to only have upload support when the File Entity module is enabled. If you need upload support, then you need to enable the File Entity. I prefer to have symbiosis than over-abstraction. If for any reason you cannot enable File Entity but still need file upload, then you can write your own custom module.
In general, the approach of submodules works good for most scenarios. However this module is more of the style enable it and you're done instead of configure to meet your exact needs via config an submodules.
Thoughts?
Comment #28
SpleshkaOkay, this approach is a good trade-off I think. However, 2 potential issues kick in:
1. Testability. I'm not sure that Drupal.org test bot can support third-party contrib modules if they are not part of module's dependencies (needs additional info).
2. I've heard (tell me it was not just a dream) that JSON API might eventually end up in Drupal core. So I assume it can't contain references to third-party contrib modules.
Have I missed anything?
Comment #29
e0ipsoDid you check if #2831274: Bring Media entity module to core as Media module will give you what you need? I think that patch has both the permissions and validations. It could be an alternative to depending on File Entity.
Comment #30
Spleshka#2831274: Bring Media entity module to core as Media module This is interesting. I took a look at the resulting patch and think that in the future it can be a good replacement for the File Entity integration. I'm assuming that for now it's okay to rely on File Entity, but in the future in might be replaced by Media when both get in core.
The question regarding testing is still opened. Will update the issue as soon as I find out the answer.
Comment #31
SpleshkaFound resolution for the first issue in #28 regarding testability. If we add test_dependencies to the module's
*.info.yml
file then Drupal.org's test bot will download necessary dependencies. However, there's a small issue with this, see this article: if we add test_dependencies and tests into the same patch - it will not work. But I'm sure we will work around this.The main thing now is that we know the approach to take, no more blockers. Happy to move on with the development.
Comment #32
SpleshkaAttaching the first patch version just to get the balls rolling. File uploads already work, though we need to test it additionally.
My next step will be to cover new functionality with tests. Critics and feedback for the given patch are welcome.
Comment #34
SpleshkaLooks like check of enabled File Entity during containers registration was not the best idea. Fixed the issue, re-submitting the new patch.
Comment #35
SpleshkaReplaced usage of
trait_exists()
, as it seems to be unstable.Comment #36
SpleshkaNotes on usage:
1. Install File Entity module.
2. Configure permission to upload new files.
3. Send
POST
request to/jsonapi/file/document
(instead ofdocument
you can use any other file entity bundle):Comment #37
SpleshkaGot inspired by examples in Drupal core and enhanced registration of new
serializer.normalizer.file.jsonapi
service on the fly depending on File Entity availability. The main logic remained untouched.Comment #38
SpleshkaGetting to tests coverage. Mateu, due to the issue described in #31 with test dependencies, do you mind committing the attached patch ahead of the main one? Otherwise all my upcoming patches with tests will be failing.
Comment #39
e0ipsoThis looks pretty good. Thanks!
Some feedback:
I don't think you need to get this complex. Just declare your normalizer normally, then in
supportsNormalization
andsupportsDenormalization
check if the module is active.Change to FileInterface::class
Will this take into account multibyte characters?
We want to inject this service to the normalizer.
We want to inject the
currentUser
service.Again, let's avoid
\Drupal::service
. Use dependency injection instead. This comment also applies elsewhere.Comment #40
SpleshkaThanks for the feedback, @e0ipso.
1. In the
FileEntityNormalizer
class we useUploadValidatorsTrait
which comes with File Entity module. So without the module the class throws error, because the trait is not found. Alternative solution here is to copy-paste the necessary trait's method toFileEntityNormalizer
- then we will be able to define the service in.yml
file and check File Entity availability insupportsDenormalization()
. However, I'm not sure that this is the best approach.2. Done.
3. Yes, see Unicode::strlen()
4. Fixed
5. Fixed
6. Fixed
Comment #42
SpleshkaThe further progress is blocked by #2877589: Fix failing test in the dev branch, because tests fail.
Comment #43
SpleshkaThe blocker is cleared now, submitting the patch for re-testing.
Comment #44
SpleshkaAll tests pass now, which is great. Our next step should be getting this patch into the module. Otherwise, as I've already mentioned, all tests relying on File Entity will be failing, because d.org test bot download dependencies before patch is applied, and not after.
@e0ipso, let me know if you have any questions or concerns.
Comment #45
hampercm CreditAttribution: hampercm at Acquia commentedThanks for your hard work on this! I've been focusing most of my time on other work, and lost track of this. It was great to come back and see how much progress there has been!
Some review notes/questions based on #40:
If the POST specifies a MIME type, should this take precedence over the guessed MIME type? How reliable is the guesser?
I can see this as a requirement for security reasons, but I think we should add a "hook" of some sort here to, for example, allow setting the destination path based on the JSON API request's "uri" attribute. Maybe a
getDestination()
method that can be overridden in a subclass?This code seems familiar. Did you pull it in from elsewhere? Is it possible to reuse the code from its original module/class/location rather than duplicating it here?
Comment #46
hampercm CreditAttribution: hampercm at Acquia commentedIn response to #44, #40, #26, #27: This may be an indication that this functionality really does belong in a submodule which has a dependency on the File Entity module. Doing so would be consistent with how other Drupal contrib modules have generally been organized.
I definitely like the idea of keeping JSON API as much of a "one-click install" as possible, but this may be a case where sticking to that ideal could lead to more confusion, rather than less.
Comment #47
e0ipsoThanks for your review in #45 @hampercm! Setting the issue to needs work.
I think it's fine to assume that as new contribs get turned on, you can gain new features. Enabling file field will also have the ability to let you filter by the image alt, we're just adding a new benefit to that. Ideally we wouldn't need an extra core/contrib module to unlock that feature, but I doubt that after all these years those permissions will come to happen in the file module.
Comment #48
SpleshkaThanks @hampercm for the review. Here's my feedback:
1. Very good question. I took a look inside of FileEntity module, and it looks like file mimetype gets set explicitely (see FileEntity::preSave(). So there is no option to send mimetype from the frontend app. Having this said, there is no point in setting mimetype at jsonapi end as well. I've removed it.
2. Right now you can set filename to something like "subfolder/filename.jpg" and it will put the file to the correct subfolder. I believe it covers most of the use cases (definitely worth adding to the docs). If developers will want something more advanced - I advocate for a separate issue dedicated to the specific use case.
3. Great memory :) It was taken from file_save_upload. Unfortunately, there is no chance we could reuse some of that code, because it's just a huge function, which contains only several relevant partials.
Comment #49
e0ipsoCan we create a follow up ticket to cover:
I'll review the update later today.
Comment #50
hampercm CreditAttribution: hampercm at Acquia commentedRe #48:
Good to know there is a work-around without requiring custom code, but that approach feels a bit awkward: it is inconsistent with how the JSON API operates when GETing a resource. I'd prefer using the
uri
attribute when POSTing files, and deriving the filename from that value, if that is possible.Thanks for the details :) I figured it was probably something like that.
Comment #51
SpleshkaYup, done #2878200: Provide documentation for files upload.
Fair enough. It provides better DX for frontend devs. However, I've got two concerns:
1. How fair is to require from frontend developers to understand the file streams of Drupal? This
public://
and other stuff?2. Will it open any potential security issues, if frontend apps will decide how to store files?
Comment #52
clemens.tolboom#38 is not committed. Should it?
Comment #53
SpleshkaYes, it should go into the module ahead of the main patch.
Comment #54
Wim LeersGreat work here!
However, test coverage is definitely very much needed for this to land.
@Spleshka: have you looked at #1927648: Allow creation of file entities from binary data via REST requests at all?
Comment #55
Wim LeersComment #56
Spleshka@Wim, yes, I did. It's a bit different case, because the referenced issue uploads file to the field, which has better core support than a pure file upload. However, it has several takeaways which potentially applicable to this story, so I assume it makes sense to bring them to the table:
1. If specified destination is not writable - the file upload fails, while the patch in #48 tries to make the destination writable first, if possible. The first approach seems to be safer, while the latter - more developer-friendly. Not sure if there is any security issue in letting frontend app to decide which folders to create within the
public://
steam. Please let me know if there are any.2. The filename is taken from the
Content-Disposition
header. I doubt this is applicable to jsonapi, I just found it curious.3. The default filename is set to
uniqid()
until explicit filename is specified.Personally I can't imagine a good use case where random filename with unspecified mimetype would be beneficial. However, I'm opened for other opinions.
4. File is being written to the temporary folder first (broken by 8192 bytes pieces), then moved to the destination. I'd really appreciate if anyone can explain the rational behind this. I feel it might be potentially beneficial for jsonapi's case as well.
Comment #57
Wim LeersThis sounds dangerous…
This is still being debated. Also: a big part of the reason is simply that the request body is in
application/octet-stream
format, to allow the file to be uploaded efficiently. Therefore there can't be JSON in the request body. Hence a request header is used.That is just WIP code, you can see the
@todo
just below.Note that the core patch allows uploads of an arbitrary size, without violating the PHP memory limit.
Please post your remarks as a review on #1927648: Allow creation of file entities from binary data via REST requests rather than observing differences here. Let's collaborate! :)
Comment #58
e0ipsoWe talked about this in the API-First meeting yesterday and concluded that the best way to move forward it to land the core issue and have uniformity in here.
https://youtu.be/aeDjN8VCus0?t=23m20s
I'm tempted to postpone this, but I want to get @Wim Leers and @Spleshka's opinions on postponing this until core has resolved this.
Comment #59
SpleshkaHey guys, thanks for the feedback.
My favorite answer : yes and no. I usually become paranoic every single time I need to let the code to change something in the file system. However, it seems to be the only viable way to let frontend app to decide the folders structure. In the related core issue #1927648: Allow creation of file entities from binary data via REST requests it's possible to set the file upload folder in field settings and it will be created and prepared by Drupal. II have already mentioned this issue in #16. Unfortunately, the direct file upload doesn't have this option available. We don't want to turn the root
public://
folder into the files hell. If there's a good security reason to disable this feature, then we might want to hardcode the uploaded folder topublic://jsonapi_files
or provide a small configuration option for a list of allowed folders to use.Oh yes, I'm all up for this! I'm just getting back into active contribution after several years of offline stuff, so still getting up to speed. I've decided so far to avoid participiation in too many issues at the same time: I'd better work on 5-10 issues per month with guaranteed delivery than being everywhere and nowhere at the same time :) As most of other contributors, I do this outside of working hours, so have to be very careful about the time usage, just to make sure it brings the max benefit into the community.
Looks like I missed a good party yesterday! My 2 coins (or 50 cents) regarding the file upload issue:
1. I'm not sure if there's a good reason to postpone the progress. What would I do is to make the patch look good for everyone, get it in, and then keep monitoring the related issue for possible beneficial takeaways. Then we can simply enhance the feature through follow-ups. I would not expect any dramatical changes in the API, because the core issue works around file field uploads, which will never be the case for JSON API.
2. I want to be 100% clear that I'm not against the idea of putting the patch into a submodule for JSON API, or JSON API Extras, or even as a standalone contrib. The goal here is always to get the feature working, and I'm happy with either option we all find the most beneficial.
Comment #60
e0ipsoMaybe that is the solution. We can have a separate contrib project where we host this code. It should apply cleanly, since you are not making use of internal APIs. That extra contrib
jsonapi_files
(?) can depend on File Entity now and release a new major version to depend on Media Entity in the future (if needed).If we ever discover that the solution in core needs to be ported/adapted to
jsonapi
, then we can do so and deprecatejsonapi_files
.Does that sound like a good compromise?
Thanks so much for helping out here, of all the available contrib <3
Comment #61
SpleshkaIt does, happy with this option. If no objections, I'll create the contrib within the next day or two. Possible names:
I did mention it in Twitter and going to repeat myself: you deserve a good help with maintenance of all your headless contribs. JSON API is not just yet another module, it's the whole development philosophy.
Comment #62
SpleshkaAs promised, got the module published: meet JSON API File. I've made a small enhancement to it: according to #50, the
uri
field became mandatory instead offilename
, which makes more sense to me. All other issues could be created against the new module.Thanks for all your input guys, file uploads is finally available for JSON API!
P.S. Gave co-maintainer perms to @e0ipso, because I trust him enough to manage submodules for JSON API :)