Problem/Motivation
This core bug blocks this JSON API contrib module issue: #2831137: Remove the need for ?_format=api_json: assume the use of the 'api_json' format for routes managed by JSON API.
Observation 1
Currently, the fallback for \Drupal\Core\StackMiddleware\NegotiationMiddleware::getContentType()
(i.e. if no ?_format=
querystring is specified) is this:
// Do HTML last so that it always wins.
return 'html';
And \Drupal\Core\StackMiddleware\NegotiationMiddleware::handle()
does this:
// Determine the request format using the negotiator.
if ($requested_format = $this->getContentType($request)) {
$request->setRequestFormat($requested_format);
}
The end result is that $requested_format
is never empty. It always is set to 'html'
.
Observation 2
Now let's look at \Symfony\Component\HttpFoundation\Request::getRequestFormat()
, which is what the router (and controllers) use to determine the request format
/**
* Gets the request format.
*
* Here is the process to determine the format:
*
* * format defined by the user (with setRequestFormat())
* * _format request parameter
* * $default
*
* @param string $default The default format
*
* @return string The request format
*/
public function getRequestFormat($default = 'html')
{
if (null === $this->format) {
$this->format = $this->get('_format', $default);
}
return $this->format;
}
Result
The result is that Request::getRequestFormat()
's $default
parameter is broken in Drupal, because NegotiationMiddleware
always sets it to 'html'
. Even when there is no _format
request parameter.
In other words: NegotiationMiddleware
breaks the Symfony Request
API.
Proposed resolution
Only let NegotiationMiddleware
call Request::setFormat()
when there actually is a _format
request parameter.
Remaining tasks
TBD
User interface changes
None.
API changes
None.
Data model changes
None.
Comment | File | Size | Author |
---|---|---|---|
#88 | 2854543-88.patch | 15.82 KB | Wim Leers |
#88 | interdiff.txt | 1.32 KB | Wim Leers |
#85 | 2854543-85.patch | 15.64 KB | Wim Leers |
#85 | interdiff.txt | 1.97 KB | Wim Leers |
#72 | 2854543-72-PASS--51_plus_test_plus_67.patch | 15.55 KB | Wim Leers |
Comments
Comment #2
Wim LeersComment #3
Wim LeersI marked this a major bug, because this breaks one of the foundations of Symfony that we're building upon.
Comment #4
Wim LeersClarifying title.
Comment #5
Wim LeersComment #6
dawehnerShould we rename the method now?
What about using
assertNull
though.Comment #7
Wim LeersDone.
Also, the best proof that this doesn't break anything is the fact that this passed all of core's tests. That means 100% of requests worked fine with this change. I think that's pretty convincing.
Comment #8
dawehnerThank you.
Comment #9
neclimdulI'm not sure its true that the test suite passes since we changed it. I'm pretty sure from what I remember of that original issue that was explicitly the desired behavior. We didn't want anything to fall back on Content-type because we didn't trust reverse proxies to correctly handle the Varies header.
Comment #10
alexpottShouldn't we add a test with code similar to what the contrib module is trying to do. So that we can be sure it works? At the moment the only tests changed here are unit tests which seem pretty tied to our assumptions about the way the code works.
Also I think @neclimdul's comment needs addressing. However it's going to be hard to reason about because this behaviour from the routing merge...
Comment #11
Wim LeersYeah, commit
223049a02c9e8b07e3755f05c6bcdab2094acf39
is A) tiny, B) not associated with an issue so probably was a git merge, C) has zero test coverage.I'm pretty sure @neclimdul is wrong about this one. The thing is that the test coverage that I'm modifying is the content negotiation test coverage. But when you use Symfony, a big part of the content negotiation irrevocably lies in
$request->getRequestFormat()
:i.e. its handling of defaults is absolutely essential. Which is why there's no need for the content negotiation system to explicitly set HTML, because that's already the default. Which is why #7 passes tests.
With this change, we're not at all relying on anything in content negotiation, nor anything
Content-Type
- orVary
-related. We're relying onRequest::getRequestFormat()
.Is this a flawed, confusing, hard-to-debug design? Probably. But we adopted the Symfony HttpFoundation component, with its pros and cons, its strengths and its flaws. The current code is breaking
Request::getRequestFormat()
. And that's the problem this is fixing.Working on the test coverage.
Comment #12
dawehnerDon't we have that already?
\Drupal\KernelTests\Core\Routing\ContentNegotiationRoutingTest::testFullNegotiation
does that alreadyComment #13
Wim LeersI worked on test coverage, as I promised. But it wasn't the 15-20 minute task I thought it was. I spent hours debugging.
…
Turns out that because
\Drupal\page_cache\StackMiddleware\PageCache::getCacheId()
also calls$request->getRequestFormat()
, you always get'html'
back, unless you're authenticated and hence that code is not executed.And because of the logic in that method, whatever
getRequestFormat()
returns the first time, it will also return in the future. This makes sense, because otherwise it'd be hellishly unpredictable.So… we could say the flaw is that
PageCache
usesRequest::getRequestFormat()
, but that is the Symfony API. I'm afraid that the only possible conclusion is that the Symfony design is completely flawed here. It simply cannot reliably support the "default format for a given route" concept. It's possible that they consider this an edge case. But this is a foundational component. Such a component should support many use cases.And regardless of whether the fault lies with Symfony's API design or Drupal's use/integration of/with it, there is one thing here that's undeniable: I was convinced this is correct. @dawehner RTBC'd this and in #12 even argues that we have sufficient test coverage.
If both @dawehner and I are both unable to reason about the abstractions, the layers, and what will work and what will not, then we've got a clearly flawed system. :( And we're both far more experienced than an average Drupal developer in these areas.
So, I am forced to give up on this issue. And frankly, on fixing the routing system. The best we can do with this routing system, is fix as many symptoms as possible. But there's no way to fix the root cause of this. It's a flaw inherent in its design.
Comment #15
Wim LeersEh, this should have been
'format:json'
.But you can see in the failed test the error that you get instead. So the test is slightly wrong, but it still proves the point I made in #13.
Comment #16
dawehnerHere is a way to solve this issue. We override the request object and let it behave like we want it to behave, aka. not storing the default value, but still have the same fallback. I actually believe we could fix that upstream, but here is a patch to prove the point.
Comment #17
dawehnerLet's see what the testbot says
Comment #19
klausiNever directly use ->get() on the request object, since that method behaves differently depending on Symfony major version. Use ->query->get() instead.
Interesting. So everywhere Request::createFromGlobals() is called we need to make sure the factory is set.
It looks like we can't put this in a more central place like DrupalKernel. We could put it into the constructor, not sure if that is a good idea.
Or maybe we should introduce a \Drupal::createRequestFromGlobals() method that makes sure the factory is set. Then we could replace all our Request::createFromGlobals() calls.
Comment #20
dawehnerLOL, well, that is a copy from upstream code.
Note: Here is the corresponding PR: https://github.com/symfony/symfony/pull/21805
yeah i thought about that for a second, and wasn't sure this would be a good idea at all.
Good idea. Well, let's see whether we can just get the symfony PR in, as this solves the problem as well.
Comment #21
Wim LeersThe upstream PR landed. See https://github.com/symfony/symfony/pull/21805. It's in Symfony 2.7.25 and 2.8.18. Drupal core is currently on 2.8.16.
Comment #22
Wim Leers#16: are you sure this is correct? I think the PR just results in
PageCache
defaulting tohtml
, and then our controller defaulting tojson
. But that meansPageCache
will have cached a JSON response with a HTML cache ID. And it means that\Drupal\Core\Cache\Context\RequestFormatCacheContext
will lie/be broken in a similar way.So unless I'm mistaken, the PR that was merged actually causes more problems than it solves. :(
EDIT: To be clear: I'd love to be wrong here.
Comment #24
Wim LeersBump. Stumbled back on this thanks to #2852587-20: Allow REST resource config entities to specify a default format.
Comment #25
Wim LeersComment #27
Wim LeersSee the CR at https://www.drupal.org/node/2954953 and the discussion in #2937942: REST views: regression in 8.5.x: view with display using 'csv' format now returns a 406 response, need to add ?_format=csv to URL to make it work. Fixing this would ease some of the update pain to 8.5!
Comment #28
mpdonadioOk, this should backout the changes now that https://github.com/symfony/symfony/pull/21805 is in.
DefaultFormatTest and NegotiationMiddlewareTest pass locally. Let's do a full run to see what else happens.
Comment #30
Wim LeersLooks like 100% of the failures are due to the weird "403 when it should be a 406" bug, for which we have #2805279: Routing system + authentication system + format-specific routes (e.g. those in rest.module) = frustrating, unhelpful 403 responses instead of 406 responses to fix that.
This patch is apparently making that problem worse … but also consistent. So in a way, this is actually better. I think we should also look into #2805279, and hopefully solve both at the same time here. But first, let's try to get this to green.
Comment #31
Wim LeersThis introduces new behavior in core: when a route supports a single format, that format will be picked automatically. So we need to expand our test coverage.
Comment #33
Wim LeersA problem that still exists is that
\Drupal\page_cache\StackMiddleware\PageCache::getCacheId()
callsRequest::getRequestFormat()
to look up a potential cache hit. And that defaults tohtml
. But I guess that's only a cosmetic problem: the cache ID incache_page
is not exactly the most important thing. What matters most, is that things behave as expected.Comment #34
Wim LeersGreat, all failures in #28 were fixed by #30!
But in #30 we now face a new set of failures, later in the test coverage: all
basic_auth
tests are failing because what is expected to be a 403 is now a 401.Comment #36
Wim Leers#31 has new failures, because thanks to the extra test coverage it adds, Page Cache & Dynamic Page Cache are being warmed already. Easily remedied.
Comment #38
Wim Leers#36 should have the same remaining failures as #30. Those are the
basic_auth
failures. Because whenbasic_auth
is required and you try to access it anonymously, you don't get a 403 as the patches above are asserting, but a 401!We could remedy this by moving the test coverage that #31 + #36 added to happen after we are authenticated. Then the 403s will be 403s always, and not 401s just when using
basic_auth
.Comment #39
Wim LeersFix more fails.
Comment #42
Wim LeersComment #44
Wim LeersThis should be green then!
Comment #45
Wim LeersFix CS violation.
Comment #46
gabesulliceNice! A few questions and a nit :)
This is the significant change that will fix the root cause of this issue.
Small suggestion: why not
$formats = NULL
then$format = is_null($format) ? [static::$format, 'foobar'] : $formats;
? Then for 'no format' just pass an empty list[]
.Good note. Am I missing how a single, non-html format is tested?
Nit: missing newline
Comment #47
Wim LeersprovisionEntityResource()
has meant "provision a REST resource for the currently tested entity type for the currently tested format". Requiring an explicit list of formats to be passed would change that.Comment #48
Wim LeersAlthough this of course is proving that this is not quite working yet. At least not for this edge case.
I guess that's what @gabesullice meant.
(This is an edge case because the controller itself is using
$request->getRequestFormat()
, which is not something that'd typically happen.)Easy fix: in
\Drupal\Core\Routing\RequestFormatRouteFilter
, we must persist the format that we picked.Comment #50
gabesulliceFull disclosure: that's not what I noticed. I did indeed overlook that the routes were different even though the controller was the same, so they indeed had only one format :)
Should we keep the incorrect format call and assert 406 here too?
Comment #51
Wim LeersThis change caused problems for those cases where there is a canonical URL, i.e. where there is a HTML response at that same URL.
Comment #52
Wim LeersThere was no incorrect
?_format=…
query string? And if we specify?_format
, then we're no longer testing the default format.Comment #53
gabesulliceNever mind me, I was being dense.
This looks good to me now :)
Comment #54
codesmithI applied the patch in #51. I updated my View -> Rest export display to have "json" be the only "Accepted request formats" in Format -> Serializer -> Settings. Now I no longer need to have the
_format=json
query argument.Thanks!
Comment #55
pturner1989 CreditAttribution: pturner1989 commentedI haven't worked out why yet but this patch isn't working for the Relaxed module which should return JSON based on CouchDB protocol.
/relaxed - Works
/relaxed/live - Works
/relaxed/live/* (/relaxed/live/_changes) - Client Error
All are configured the same with only json accepted so not sure why this is happening. Trying to work through it now.
Comment #56
polynya CreditAttribution: polynya at Nomensa commentedI have been trying to get patch #51 working with Relaxed as well. One issue is that RequestFormatRouteFilter->getDefaultFormat() always returns 'html' if there are multiple routes in the collection.
I have created a new patch which checks if all the routes require the same format. This works for the RemoteCheck plugins in Relaxed - check for Relaxed Remote errors on the status report page. Deploy still fails because rest.relaxed.doc.GET accepts json and mixed formats.
Comment #57
polynya CreditAttribution: polynya at Nomensa commentedSorry, please try this patch not #56.
Comment #58
polynya CreditAttribution: polynya at Nomensa commentedI have updated my patch (based on #51) so that RequestFormatRouteFilter->filter() keeps routes if they support only one format and the request does not specify the format.
Comment #60
Wim Leers#54: thanks for testing!
#55: Do you have news?
#56 + #57 + #58: please post an interdiff in the future. I've created interdiffs for your patches now. Without those interdiffs, it's impossible to see what you changed.
Also, this patch was RTBC. You say it didn't work for you. Ideally, you'd have added a new test that fails, which would prove that it didn't work in some case. Then once we'd fix that bug, we'd also have test coverage preventing this from ever regressing again! Of course, I know that writing that test may be not super easy.
Comment #61
Wim Leers#56:
Well yes, that's the very purpose here.
Oh, interesting! That is theoretically also okay, yes.
But I have to say I don't quite understand that if all those routes match the same path, already support the same method (because
method_filter
has a higher priority thanrequest_format_route_filter
), already accept the same input (becausecontent_type_header_matcher
also has a higher priority), and per your explanation they also support the same format! Looking at\Drupal\relaxed\Plugin\rest\resource\ResourceBase
, I don't quite understand how this is possible. To understand this, I installed the Relaxed module, but I still don't understand:Pinged dixon_ for insight. EDIT: the ping: https://twitter.com/wimleers/status/981477876104036352
Comment #62
dawehnerIt would be nice to document in line, why we need this additional logic.
Is there a reason we don't just use
foreach
which doesn't require you to use the iteration directly?Let's use a strict comparison here :)
Comment #63
polynya CreditAttribution: polynya at Nomensa commentedThanks for the feedback. I'll try to provide more details later and to look at the large set of test failures for patch #58.
Also jeqq is working on a fix for Relaxed - #2955317-9: The module needs fixes after changes in Drupal 8.5.x core
Comment #64
Wim LeersThanks, left a comment at #2955317-12: The module needs fixes after changes in Drupal 8.5.x core.
Comment #65
bendev CreditAttribution: bendev at WebstanZ commentedThank you for your work.
I tested #58 successfully.
Same remark as in #54. I needed to configure Json as the only "Accepted request formats" in Format -> Serializer -> Settings
Now I no longer need to have the _format=json query argument.
Comment #66
dawehnerSome interesting observations while comparing what happens in 8.4.x and 8.5.x in relaxed:
\Drupal\multiversion\RouteProvider
, but it turns outthat doesn't matter
\Drupal\relaxed\Plugin\rest\resource\ResourceBase::routes
,which means that assumptions might no longer be true
\Drupal\rest\Routing\ResourceRoutes::getRoutesForResourceConfig
sets the format,which was changed in
b4a02309bf739ae16ceedda3ff3a181d9a3a9b82
. Basically one problem I see is::routes
used to be overridable, withb4a02309bf739ae16ceedda3ff3a181d9a3a9b82
morelogic is living outside.
Summary: I think we should move the relaxed usecase into its own issue, because it is actually a rest module problem,
not a generic routing problem.
Comment #67
polynya CreditAttribution: polynya at Nomensa commentedI have updated patch #57 to remove the iterator and to add some comments.
Comment #68
dgtlmoon CreditAttribution: dgtlmoon commentedHi all, sorry to bump the thread - I'm doing some follow up work on the Drupal 8 REST documentation, according to my current
8.6.x-dev
it seems to me that the "_format=
" argument is still ALWAYS required - can someone vouch if that is the case?As it stands, both the config (
config/sync/rest.resource.foobarinfo.yml
, "default_format: json
") and the comment block in my resource handler dont' make any use of default_formatcurl -H "Accept: application/json" http://172.17.0.1/api/foobar/666
will give me a 406however
curl -H "Accept: application/json" http://172.17.0.1/api/foobar/666?_format=json"
will work.According to the restui configuration, I have only the "JSON" format enabled.
Again, sorry to muddy the waters on this thread.
note: patch from #58 seemed to drop the requirement for _format=, however when more than 1 format are enabled it does not let me choose the response via the "Accept: application/(json|xml|hal_json)" header
Comment #69
mpdonadio#68, there are several related issues to _format being required now at the top of the page. Those related issues may turn up a few more. Unfortunately, this is a complex situation. The related issues tackes various aspects: this one is about inbound routing, there is another about paths for view displays, etc.
The background behind all of this is probably best summarized in #2481453: Implement query parameter based content negotiation as alternative to extensions and #2364011: [meta] External caches mix up response formats on URLs where content negotiation is in use, along with the CR in [#2501221]. The TL;DR is that the Accept header is horribly broken across the interwebs and not a reliable way to do content negotiation in a generic manner in core.
Comment #70
dgtlmoon CreditAttribution: dgtlmoon commented@mpdonadio ok great, yeah I'm not too worried about the the actual state of the issue, just to know if this is worth updating the documentation that
_format=json
type argument should always be added, so that sounds correct to me then, thanks for the reply!Comment #71
denisomerovic CreditAttribution: denisomerovic commentedPatch from comment #51 worked for me.
Comment #72
Wim LeersDrupalCon Nashville happened, and then I was called in to help with critical customer problems. So, alas, we're now a month since we had an RTBC patch in #53… 😞
Based on the assessment @dawehner posted in #66, patches #57+#58+#60+#67 actually belong in a separate issue.
#51 still applies cleanly. But #67 makes the pretty confusing patches #57/#58/#60 simple again: it only updates
\Drupal\Core\Routing\RequestFormatRouteFilter::getDefaultFormat()
, which is much, much easier to understand again. Theinterdiff-51-67.txt
interdiff is also super simple. Great! But it still doesn't have the necessary test coverage additions. I asked for that in #60, nearly 3 weeks ago. I even spent a long time digging into therelaxed
module to try to get to a point where I had multiple route matches, but couldn't.So … let's add test coverage, then this can be RTBC again!
Comment #73
gabesulliceThis looks like a nice and reasonable improvement over the original RTBC patch. I agree that the more specific REST issues need to be addressed in a separate issue Re: REST module.
This then looks like it can be RTBC again (assuming tests pass).
Comment #75
Wim LeersYay, #51+test coverage patch is failing, #67+test coverage patch is passing! 🎉
Comment #76
jeqqI started to review and test the patch with Relaxed, it's still not working with it, will add details tomorrow when I finish the testing as today I was limited in time.
Thank you @Wim Leers for the work on this!
Comment #77
albertski CreditAttribution: albertski at Xeno Media, Inc. commentedTested patch from #72 (..test_plus_67..) on a custom POST Resource I created and I was able to post without putting _format=json. Thanks!
Comment #78
jeqqAfter testing with Relaxed - some of our routes work without
_format
, but most of them not.Are working routes with one required format and only if there is one route in the collection. Also it will work with more than one route in the collection only if they all have one and the same required format.
Which are not working: routes with multiple format requirements e.g.
json|mixed
and multiple routes in the collection with different format requirements (this is howRequestFormatRouteFilter::getDefaultFormat() works
).One idea might be to use a default format for the routes: in the route
defaults
to have_format
that can be only one format (not multiple like the_format
from routerequirements
). In this case we can check inRequestFormatRouteFilter::getDefaultFormat()
, after we check for the required format in the route, for the default format of the route. Having this we have more chances to return other format thathtml
when there is not_format
specified in the query and not end with a406
.This still won't fix all the issues we have with Relaxed and the CouchDB replicator, that because the replicator is relying on the
Accept
header when getting the response in a specific format, one example would be therest.relaxed.doc.GET
route that hasjson|mixed
format requirement. The CouchDB replicator expects to get the format it needs by sending the specific format in theAccept
header ('multipart/mixed' or 'application/json').@Wim Leers I remember you said that we can't rely on using the
Content-Type
header, but how aboutAccept
? Can we use it to determine the format and return the response it depending on this header (when it's set)?Comment #79
gabesullice@jeqq, I think it was agreed in #66, #72 & #73 that the Relaxed issues need to be addressed in a separate follow-up issue.
Relaxed appears to have very specific requirements which will prevent this fix from landing and "un-breaking" the more standard sites that were broken by this bug fix.
I think the separate problem that needs to be addressed for Relaxed is: "how to do media type negotiation without the
_format
query parameter." That is beyond the scope of this issue, however.Comment #80
Wim LeersIt makes sense they can't work: it's impossible to determine a reliable default for them. This is the first time this is being mentioned though.
We have an issue for exactly that: #2852587: Allow REST resource config entities to specify a default format. Please help push that issue forward :)
No stable release of Drupal 8 has ever supported this.
I never said that; I actually said you can't rely on the
Accept
header, and that's exactly the thing that D8 has never supported. I also provided the relevant link: https://www.drupal.org/node/2501221. You can implement it for just the RELAXed module routes, assuming they're on paths not shared with no other controllers, and assuming there are never anonymous, non-HTTPS requests/responses. See\Drupal\accept_header_routing_test\Routing\AcceptHeaderMatcher
. Or, you can do something like what the JSON API module did: a hardcoded path prefix which can be checked very early in request processing, which can then be used to set the appropriate format: #2831137: Remove the need for ?_format=api_json: assume the use of the 'api_json' format for routes managed by JSON API.Comment #81
jeqqI agree with #79.
Wll take a look.
I wanted to check that but couldn't find the link. :)
Will also check the solutions Wim proposed. Thanks!
Comment #82
Wim LeersCool :) Thanks for being so very diligent, @jeqq! And I'm sorry for being a bit impatient — it's hard to walk the line/find the balance between helping dozens, likely hundreds, possibly thousands of sites using the core
rest
module ASAP and also covering the needs of the contribrelaxed
module. Thanks for being so understanding, and for helping us arrive at a better solution!Comment #83
larowlanCouple of minor observations
is there a chance that RouteCollection could be empty here? i.e should we be checking that [0] exists before we call methods on it? (defensive programming)
should me make an explicit protected method for this?
Would be more obvious what is going on? Could be public - I'm guessing we have the same logic in multiple places
any reason why we didn't use
foreach
here? Would aid readability.Comment #84
catchBumping back to CNR for those points.
Comment #85
Wim Leers\Symfony\Cmf\Component\Routing\NestedMatcher\NestedMatcher::matchRequest()
throwsResourceNotFoundException
if the route collection is empty.format.*'\|'
in*.php
files.For points 2 + 3, the easiest solution was to almost completely refactor @polynya's code, but retain the same fundamental logic. An
array_reduce()
to get a list of all formats of all routes in the collection, then turn it into a set, if the set contains only one format, then we can use that as the default. Sounds simple enough hopefully? :) Much smaller too.Comment #87
dawehnerThe new code looks quite a bit like you started to learn functional programming :)
I think you better use
reset()
here. array_filter doesn't reset keys, seeComment #88
Wim LeersHah, I just was working on an updated patch, and noticed the same :) Did I mention I very much miss a
Set
data structure in PHP? :(Is this meant as "yay FP" or "you're obviously getting started with FP"? I'm fine with either/both, just curious now :D I've been doing this style of logic for this kind of problem for a while now.
Comment #89
dawehnerI'm sitting in a corner crying everyday that the JS Set doesn't implement union and diffs, but yeah at least it implements the Set semantics :)
I made the experience that many problems solved in functional style has less ugs than solved in imperative way. You can decide for hay or nay :)
Comment #90
vivify CreditAttribution: vivify commentedEdit: Nevermind. I assumed latest patch 88 contained all previous code.
Patch only worked on the preview for me. I preview with contextual filters, go to the path provided in the preview. Page is a 406, but shows json in the preview.
Comment #91
alphex CreditAttribution: alphex commentedApplying patch #88 with composer is generating bad files.
creates the directory b/core in the .../core directory.
Comment #92
larowlan@alphex that is a reported issue in the latest version of composer-patches
Comment #93
larowlanAdding review credits
Comment #97
larowlanFixed on commit
Committed as 41dfad3 and pushed to 8.6.x.
Cherry-picked as 3f955a3 and pushed to 8.5.x.
Comment #99
b.livonnen CreditAttribution: b.livonnen commentedNot working for a pure headless implementation (8.5.5)
I had to put 'json' as the default format
Comment #100
Wim Leers@b.livonnen: Your "pure headless" routes are probably not specifying a
_format: json
route requirement.