Closed (fixed)
Project:
JSON:API
Version:
8.x-2.x-dev
Component:
Code
Priority:
Major
Category:
Bug report
Assigned:
Unassigned
Reporter:
Created:
16 Feb 2017 at 02:25 UTC
Updated:
11 Sep 2018 at 19:59 UTC
Jump to comment: Most recent, Most recent file
Comments
Comment #2
e0ipsoThis is a great observation. This is my 2c
What do you think?
Comment #3
hampercm commented#2 sounds reasonable.
Comment #4
clemens.tolboomHaving a list of entities of which some are 403s should be filtered out and no information about their 403 state should be returned as that is information disclosure.
I have 2 articles of which one is unpublished.
Requesting a single entity resulting in a 403 is ok as one knows the UUID somehow.
Comment #5
clemens.tolboomIMHO the collection never should contain 403 items.
We are now required to add filters to prevent 403 items like
Comment #6
wim leersI agree that #2 is the right approach.
#3:
That is not true. See how core's REST module handles this: #2808233: REST 403 responses don't tell the user *why* access is not granted: requires deep Drupal understanding to figure out.
Comment #7
wim leersRelated: #2930231: JSON API 403 errors don't tell the user *why* access is not granted: requires deep Drupal understanding to figure out.
Comment #8
dawehnerI'm not 100% sure whether using a 403/404 is the best idea.
Just imagine the following case: You are a developer not aware of Drupal at all. You are accessing a URL, you get a 200 back. Your code has some condition to handle the 404 case, but it should not happen, given you handle authentication properly. One day someone deletes a bunch of content from the site and suddenly your app shows some error message.
In other words I think there is a semantic difference between a 404/403 of a collection resource and a 404/403 of individual bits embedded in that resource. A 403/404 is a client error, but well, in this case it was certainly not an error the client did, it is simply a state change of the server.
Comment #9
e0ipso@dawehner isn't that the same that happens with a front-end that requests node/2 after it got deleted or unpublished? But we accept that requesting node/2 after deletion/unpublication will yield 404/403, right?
Comment #10
dawehner@e0ipso
Interesting point/question!
Well, if you think about it, if you know that /node/2 is deleted, you no longer have to request it. It would not be true for collections.
Maybe the right approach would be to look out for public APIs (github, some API best practises etc.).
OT:
The guzzlehttp library throws an exception by default when you have a 404/403 result.
Comment #11
gabesulliceThe decided approach in #2943176: Spec Compliance: when error object 'id' key exists, it contains the individual resource URI, instead of "a unique identifier for this particular occurrence of the problem" was to include only a resource identifier for the forbidden resources, which makes this issue outdated. As there will be values in the data regardless of every entity being inaccessible. Therefore, a 200 is acceptable.
Comment #12
wim leersPer #2943176-34: Spec Compliance: when error object 'id' key exists, it contains the individual resource URI, instead of "a unique identifier for this particular occurrence of the problem", #2943176 actually hasn't reached a conclusion yet. Reopening.
Comment #13
gabesulliceI'm gonna quickly try to summarize some background for my thinking:
All of this leads me to believe:
The status code for a "collection resource" should be entirely distinct from any or all of the entities represented at that resource.
Unless there is an access control mechanism that is explicitly a prerequisite for accessing any list of entities of a given type that we can leverage, then it must be assumed that any consumer must have access to any collection resource.
It is completely consistent with REST to return a 200 response to an empty collection, even when all those entities have been elided for access reasons.
The proposed resolution that "[we should return] a 403 Forbidden response instead, without any meta errors" would therefore be inconsistent with the REST idea that resources and representations are distinct.
The original post touches on our problem when it states: "...and all of the entities of that type are not accessible." Unfortunately, we simply cannot know when that is true for "all of the entities" of a type.
Let's refocus on the counterintuitive piece: "a successful HTTP status (200) and a "meta" section filled with an error for each [denied] entity"
Also, let's remember that the meta errors themselves have
"status": 403piece which is itself counterintuitive because status codes are a response-level concept.I think the ideal solution is to:
meta.errorssection of document (removing entities for access reasons is neither a client nor a server error)Comment #14
wim leersI like #13! Well-reasoned. Thank you for taking the time to lay out your rationale 👍❤️
Assigning to @e0ipso for feedback, since he's the one who designed the current approach. He likely has more insight into and opinions about this than I do.
Comment #15
e0ipsoDitto.
Only one consideration, let's move these changes to 2.x.
Comment #16
gabesulliceWoot! 🤘
Updated the IS.
Comment #17
gabesulliceComment #18
gabesulliceComment #19
wim leers➕
Wonderful to reach consensus :) Well done, @gabesullice! 👏
Comment #20
wim leersCan the consensus here help us get to consensus on #2943176: Spec Compliance: when error object 'id' key exists, it contains the individual resource URI, instead of "a unique identifier for this particular occurrence of the problem"?
Comment #21
wim leersIdea: whenever some content is inaccessible, we should send a 206 response? https://httpstatuses.com/206
Comment #22
e0ipsoI'm not sure 206 is meant for this particular case, but for range requests.
Comment #23
wim leersGood point. It's just that is an accurate description here. But I think you're right; it's only intended for range requests.
Comment #24
wim leersComment #25
gabesullice@Wim Leers assigned this to @e0ipso in #14 for his feedback, @e0ipso gave that feedback in #15. Therefore it no longer needs to be assigned to @e0ipso.
Assigning to myself!
Comment #26
gabesulliceFirst, let's see if any exceptions besides access related ones actually bubble into
meta.errorsso we know if that needs to be taken into consideration.We can identify those because all access errors have an
idmember that looks like:/$resource_type_name/$resource_id.Comment #27
gabesulliceInteresting! That's good to know :) I've been working on this locally, but don't have something to POST yet (🤣). In the meantime, I think this is a major issue for 2.x.
Comment #28
wim leersCan you briefly explain what's so interesting/what this means exactly?
Comment #29
gabesulliceThis is the only place in the code that
meta.errorsare added/surfaced in the document. Meaning that if I handle the all inaccessible entities at a higher level—say by adding disallowed access results as an argument toJsonApiDocumentTopLevel(this is just an off-the-cuff idea)—then all the merging and tracking ofmeta.errorscan just disappear... because the only possible errors were access errors!Comment #30
wim leersVery interesting indeed :)
Comment #31
gabesulliceI think that this will be easier to accomplish after this is fixed.
Comment #32
wim leers#2985426: Spec compliance: `related` routes should return 200, not 403, if field access is allowed but the related resources are forbidden landed!
Comment #33
wim leersI believe #2949807: Spec Compliance: Error responses are missing the `jsonapi` top-level member should land first, and may make things simpler here.
Comment #34
wim leers#2949807: Spec Compliance: Error responses are missing the `jsonapi` top-level member also landed! 🚀
Comment #35
wim leersThis is now the last remaining !
Comment #36
gabesulliceSince this is the last remaining blocker for 2.x and we are very enthusiastic about releasing a beta ASAP, I recommend that we be pragmatic like we are being w/ revisions. That is, like revisions, let's review the HTTP API aspect of this patch with scrutiny and then stabilize this after.
With that said, this patch does exactly what I would want from the HTTP API perspective and matches points 2+3 of the proposed resolution we came to consensus on after #13. I still need to validate that collections only ever return 200 and then update the rest of the test expectations for it if we are agreed on this approach.
Comment #37
gabesulliceHere is an example of a response in this format.
Comment #38
e0ipsosomehow I thought it would be a JSON Pointer: `item#0` ➡️ `#/data/0`, `#/included/7`, …
Comment #40
gabesulliceHuh, that never occurred to me. Is that the case with error pointers today? They just point at
/datadon't they? IOW, they don't track position do they?I chose the
item#Nbecause: (a) the spec generally treats links object keys as link-relation types (although it's not an officially in the spec) and (b) the spec disallows lists of links.So I went with
item+ an incrementing integer with no specific regard for document position. See https://tools.ietf.org/html/rfc6573#section-2.1.Comment #41
wim leersWRT JSON Pointer
I think it is the case today. For example:
\Drupal\jsonapi\Controller\EntityResource::createIndividual()\Drupal\jsonapi\Controller\EntityResource::checkPatchFieldAccess()We just don't have any examples of us pointing to a numerically keyed item in a data structure ("arrays"), we only happen to have string keyed items ("objects").
You even had something like that in #2943176-21: Spec Compliance: when error object 'id' key exists, it contains the individual resource URI, instead of "a unique identifier for this particular occurrence of the problem" :)
WRT link keys
Hm, that's tricky :(
General
How could you ever generate an actually accurate/usable index or position? The omitted resources are … omitted, and therefore the subsequent resources take the omitted resource's place. So if there's 3 resources, they'd have indexes/positions 0, 1 and 2. Imagine the one in position 1 is omitted because it's not accessible. Then saying
/data/1or#item1is omitted is rather futile, because the resource in position 2 has received position 1.IOW: I'm not sure I understand either approach. It feels like I'm missing something? 🤓
Comment #42
gabesullicemeta.errorsobjects never have any other pointer than/data. Even when they are for included resources. Remember,meta.errorsdo not appear in error documents and they only appear for access denied cases in collection/related routes.My interpretation of the spec has always led me to believe that pointers are for identifying the location of an error in a request document. I.e. for field validation errors. Not for identifying where something came from in a request document. I believe that latter case is a concept unique to our implementation. It's not without merit, it's just not the current case and it's not something standard.
I agree that position would be incredibly difficult to do and we're not doing it today. Perhaps we can leave that feature on the table and use #2993654: Provide JSON Pointer as a link on a resource to ease matching with included items to point to the link object when an include has been removed. This wouldn't work for removed primary data items though.
Comment #43
wim leersThat's what I was getting at in my "General" section in #41!
That's why I was confused to see
item#0in your #37 screenshot.Upon re-reading, I see that I missed . That makes more sense. However, I can see how a passerby would interpret
item#3as meaning "the fourth item".I expected something like:
But you're right that the JSON API spec seems to imply that links should be keyed by their (IANA) link relation type. There isn't a clear one. So let's just make them have unique keys:
Actually, we may want to list the reason; that may be relevant or even required in some use cases:
and actually, having looked at https://www.iana.org/assignments/link-relations/link-relations.xhtml, I … end up finding the
itemlink relation type, so we can make that slightly better:… of course at this point I've reached essentially the same solution as #36/#37. But I think not having sequential numerical IDs is essential here. Something like what I proposed makes it far clearer that the
FOOinitem#FOOis not a meaningful number, but just something to generate a unique ID.Comment #44
wim leersIOW: I thought I disagreed with you, and started to explain why, and in doing so I ended up agreeing… 😅 😭
I wonder if reading the #37 screenshot also made @e0ipso jump to conclusions like it did for me!?
Comment #45
e0ipsoAh, yes. Thanks for the extra context. At some point we talked about having the unaccessible resource objects without any fields (just type and id) populated in place. In that scenario it would make sense to have a pointer. I had a mix of realities, I think we are good to go with the current approach.
Let me increment the counter of the times where we moan for the poor hypermedia controls of JSON API. http://gtramontina.com/h-factors
Comment #46
wim leersThanks for pointing to http://gtramontina.com/h-factors/, I hadn't seen that yet!
Same here. And me thinking "Okay I'll explain what I mean", trying to do so in #43, and then concluding that Gabe Was Right™ was … funny and humbling 😊😃
In the interest of making this go faster, I took @gabesullice's #36 and pushed it further. I did implement it with
item#node--article/bb3c7dfe-16d0-433b-a6f9-cdc83b8bd354, i.e.item#<resource type name>/<ID>, to avoid the potential for misinterpretation that @e0ipso and I both made, independently! 😇Comment #48
gabesulliceThis is awesome ^ Thanks for sharing it :) Hypermedia is a sad, sad story in JSON API indeed. I'm very excited for profiles to land so we can fix it :)
Consensus! 🎉
Purely aesthetic things:
#looked good when it preceded a numbering. Now I like that the whole thing looks more like an identifier.Next up, fixing tests.
Comment #49
gabesulliceWhoops, forgot a string.
Comment #50
gabesullicePosting some slow going progress.
Comment #52
gabesulliceSome more. The essence of most of these changes is that some errors don't have an
idand many don't have asource.pointer. This makes it hard to generate any kind of link. For that, I added aselflink onto all errors. This patch and the previous is mostly finding and updating that as necessary.Comment #54
gabesulliceCleaning up a little and hopefully making this a bit greener.
I'm tempted to make this
viainstead ofself. Anyone have thoughts about it?Comment #56
gabesulliceMade it
via. I like this because it avoid some confusion with a resource objects'selflink, which generally refers to the route where that resource can be retrieved. For example, a resource object in a collection does not have aselflink for the route it was retrieved at, it has the route where it can specifically be found. But an error shouldn't really be "found" anywhere. In essence, it was just "delivered via" its parent page.Comment #58
gabesulliceHm, not sure what happened there.
Comment #59
gabesulliceComment #61
gabesulliceThis should clear the last functional test failures 🤞
Comment #63
gabesulliceThis should be green.
We still need to clean up all the test code that says it should be deleted in this issue and clean up new CS violations.
Comment #64
gabesulliceOkay, this cleans up all the test code that was supposed to be removed by this issue and also by #2943176: Spec Compliance: when error object 'id' key exists, it contains the individual resource URI, instead of "a unique identifier for this particular occurrence of the problem" which this issue subsumed.
I made another change to the link keys that will need a +1 from @e0ipso and @Wim Leers. Instead of the resource type and UUID, I just MD5 the index.
Why? A few reasons. First, because too many errors don't have an ID at all! In that case, I tried using the pointer, but the pointer was often wrong too.
Most importantly though... I don't want these keys to represent some kind of API, they're just a way to make unique keys. Clients will inevitably try to deconstruct the key for some purpose. That won't be good if JSON API 1.1 or some profile tries to improve the linking situation. IOW, if we put some language in JSON API about this, we won't be up a creek having to break BC and we won't be sending a mixed signal.
A resource identifier is in the
hrefeverywhere that one applies already and that's what's supposed to be used for parsing.Now moving on to CS violations.
Comment #66
gabesulliceEasy fixes.
Comment #67
wim leersI … was hoping that the patch would remain close to the size that it was in #46… it went from 5K to 40K. Alas.
Most of it is due to
I'm very concerned that we're making responses more complex and making such a big set of changes only to accommodate our tests. Furthermore, the actual logic changes that we're making do only concern "meta errors" that do have an ID set.
So I dug in and tried to find another way. It's taking longer/is harder than expected/hoped. Stay tuned. (Posted this already so you guys know what the hell I'm doing!)
Comment #68
gabesulliceIn IRC, @e0ipso suggested we use a true random value, not a hash of the index. This does that.
Comment #70
gabesulliceI don't think I agree that we're making responses more complex.
We're removing the ID from all errors so that they're not on some but not others. In it's place we're adding a consistent and meaningful URL to every error in its place.
The tests are simplified because I was able to remove numerous chunks of code that existed only to work around the inconsistency of an ID. There are many changes to it because I had to add the
$via_linkeverywhere, but the flip side of that is that every error is consistent.At each stage, I asked myself if I was happy with the API for errors this was setting up for the 2.x and I think it's reasonable. The only thing I would change that is not already done here is that I would also remove the
codekey from errors.Can you point to what you think is adding complexity that wasn't already there?
I can certainly think of how this could be simplified as we generate errors. But my intention here, as I said in #36 was to make the HTTP API better and then work on the code to generate that API better, later.
Comment #71
wim leerset cetera
Comment #72
gabesulliceI already addressed those:
I think that's adding assertions about consistency. Not increasing complexity in any meaningful way.
Comment #73
gabesulliceFixed the sorting that caused the previous errors.
Comment #75
wim leersBut do we truly need that
vialink?D'oh, of course we do … we need it to that the omissions can point to the omitted resource, which is the part of the issue title. Sorry for not realizing that sooner.
I tried to address my concerns by stepping through the patch's changes with a debugger. I noticed that it also worked correctly for
::testGetIndividual()'sdoTestIncluded(), which ends up ingetExpectedIncludedResourceResponse(). And there too, I saw that the 403 response for an inaccessible relationship also resulted in a correctviavalue. But then I didn't see anymeta.omittedappear in the final response. I then realized that's also not happening today. So then why do we need thevialink on 403 relationship responses?(I specifically dug in
doTestIncluded()because that's the only test coverage that is failing in #46! If it's not showing up in the end result of a response with includes, then we can also just make it not deal with meta errors, and tests would also pass… right?)But I'm not sure yet we need it for a 400 response because a query parameter violated the JSON API spec?
IOW: AFAICT we only need this
vialink forcollectionresponses, but we're adding it both to all normalized HTTP exceptions ("errors" in JSON API context), and we're adding it to all test coverage.I'm still not fully convinced. I feel like I'm missing something obvious?
Comment #76
gabesulliceCS violations and making the
$via_linka non-optional parameter in a more logical position in a couple test methods.I'm not ignoring you @Wim Leers. I'll respond in a second, I'm just optimistic that I can convince you, I just want to get testbot to run this now.
Comment #78
gabesulliceExactly. But please don't apologize!
That's correct. I think what you've spotted is an inconsistency that comes out of this issue: #2956084: Impossible to raise an error when an `include` is requested for an inaccessible relationship field.. Correct me if that's a wrong assumption.
Honestly, we don't need a
vialink in errors under the on error responses anywhere. But I do think its nice to have a consistent API where all objects under either thedataorerrorskey either have aselforvialink.An interesting benefit that one might get from this is for tools like Rollbar which monitor client-side errors. With this, you have an easy way to see what URL was the source of the problem. Without it, error documents themselves contain no
selflink, so you'd have to either "remember" which URL you sent the request to in your log message or pull fromwindow.location(but that often won't be helpful because apps rarely share show the same URL in the browser that the client is making).In essence, it makes the error self-contained. The error object itself can answer "where did I get this 403 from?"
I think a query parameter violating the spec is actually a great reason to put it there! Error objects have
source.parameterto inform the client which parameter caused the violation and thevialink can show you that parameter in context (perhaps your URL generator messed something up). With it there, you have everything you need when you doconsole.log(errorObject); you don't need to cross-reference it with the network tab.Comment #79
gabesulliceApparently, manipulating arrays by reference is hard.
Comment #80
wim leersOf course! I should've thought of that. This is why it's important to fix inconsistencies, because they make maintenance harder, even for somebody who's been working in a codebase for many months straight :D
Another good point.
And again a good point.
Consider me convinced!
Nit: I think using the repository is overkill.
Doesn't this mean that
['meta']['errors']effectively cannot ever exist anywhere anymore? If so, why not s/meta.errors/meta.omitted/ everywhere?Correct me if I'm wrong: this @todo is about exactly that?
I think it'd already help if this accepted not just
string, but alsoUrl, so that this->setAbsolute()->toString()stuff is done in the helper, rather than being repeated dozens of times.Nice use of
next().This will of course need a CR. And actually, I think having that CR will make reviewing and committing this simpler :)
Use
static::$resourceTypeName. See\Drupal\Tests\jsonapi\Functional\ResourceTestBase::testGetIndividual()for an example.👏
👏
👏
Nit: this is dead code, can be removed 🤘
Comment #81
gabesullice1. I agree that it is not pretty 😞. But we need to do it or else this will cause a fatal exception if JSON API Extras has overridden the type name. It'll cause the route name to not be found.
2. This is there because I wanted to get the tests right, release a 2.x-beta ASAP, and fix the _actual_ code ASAP. Better to get a beta out now than let the perfect be the enemy of the good, so to speak.
3. OHHHH, you mean I should see the forest through the trees? Excellent remark. Fixed.
4. Thank you!
5. Will do.
6. Done.
7-9. :)
10. Done!
Comment #83
gabesulliceGah, that's was silly.
Comment #84
wim leers#81
AFAICT you just missed a single spot:
static::$resourceTypeNameWould you mind already creating a follow-up for this? I want to make sure we don't forget.
Comment #85
gabesulliceDone.
Comment #86
wim leers👌
Comment #89
wim leersI asked for this change, but it was wrong, because this method (
\Drupal\Tests\jsonapi\Functional\ResourceResponseTestTrait::getExpectedIncludedResourceResponse()does:i.e. it digs deeper into the include path, following it down to the next level, until there is no next level. After following the first level relationship, using
static::$resourceTypeNameis therefore using the wrong resource type name!Fixing on commit.
Given that @e0ipso has given his +1 multiple times in the issue and actively contributed to changes in the patch in chat and in this issue, going ahead and committing this!
Comment #90
wim leersThis still needs a change record though!
Comment #91
gabesulliceI made many.
Writing them made me feel very good about what was accomplished :)
Comment #92
wim leersThat's exactly why I asked you to write them, because you did the bulk of the work here. I also find that writing CRs is therapeutic — it gives closure and a sense of accomplishment!
Comment #93
wim leers5 days after we finished this, somebody filed an issue to complain about how it used to work: #2995522: Omit meta errors from output 🙏👌😁