Problem/Motivation
Requests have their query parameters overridden by RouteProvider::getRouteCollectionForRequest() (here) when a route collection is matched in the cache. This is problematic if the cache entry was saved during a subrequest which alters the original request, specifically DefaultExceptionHtmlSubscriber, which sets two internal query parameters (_exception_statuscode and destination) for the error route.
This is problematic in so far as it's not intuitive behaviour (though this dates back to some early pre-8.0 work; see #2480811: Cache incoming path processing and route matching) but more importantly that the query parameters of later requests are overridden with those in the cache. For a form submission, this would mean the destination is overridden and incorrect, but in the case of a jsonapi path, it also means you get:
Drupal\\Core\\Http\\Exception\\CacheableBadRequestHttpException: The following query parameters violate the JSON:API spec: 'destination', '_exception_statuscode'. in Drupal\\jsonapi\\EventSubscriber\\JsonApiRequestValidator->validateQueryParams()
For instance, a common pattern in OAuth/JWT workflows is that a client will make a request with an existing access token; if it's unsuccessful (e.g., returns 401 - see #2840205: Error messages/codes should be more helpful & match spec.) then the client will attempt to obtain a new access token with a refresh token grant, and retry the original request. In this case, the retried request will fail with the above exception against a jsonapi endpoint as the now-authorized request will have the non-JSON:API spec compliant query string added.
Steps to reproduce
- As an unauthenticated user, request a json:api resource requiring authentication (e.g., a user.)
- Receive a 401 access denied error (expected.)
- Authenticate.
- Request the same resource; the site will error, complaining of improper destination and _exception_statuscode query parameters.
- Clear cache.
- Request the resource, again; on a clean cache, the request will process successfully.
Proposed resolution
The query string recovery from cache has mostly to do with the private files controller; see notes below. However, this issue is sensitive to alterations to the Request object prior to the cache ID generation for the collection cache, earlier even than incoming path processing.
Current thoughts are to not cache the match if we're doing a subrequest (though we currently lack that context as that's information stored in the kernel) or set the format for json:api requests sooner, so as to keep the html exception handler from firing to begin with. Or, something else?
Adjust the route collection cache cid to include the query parameters stored on the Request object after early processing.
Remaining tasks
Subsystem maintainer/core committer review.
User interface changes
None
API changes
None.
Data model changes
None.
Release notes snippet
Not necessary?
| Comment | File | Size | Author |
|---|
Issue fork drupal-3085360
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
bradjones1Maybe (?) related to #2958660: 4xx Recursion issue on non-existent html routes
Comment #3
bradjones1It's also possible we can remove the query parameters being set into the cache value, as we won't be using them any longer when loading. But I suppose it's possible someone, somewhere could be depending on that info?
Comment #4
bradjones1Comment #5
bradjones1Comment #7
bradjones1Looks like this broke the right kinds of tests, about 7 distinct failures with most of them related to return codes. I'll dig deeper.
Anyone with an insight into the origin of this behavior, please pipe in?
Comment #8
bradjones1Ah - ok. So the explanation of "why" is on line 173, a comment before storing query parameters:
Incoming path processors may also set query parameters.This makes sense in light of the test failures; they have to do with private files which (because of a Symfony routing constraint) put the requested file in a query parameter inside ofDrupal\system\PathProcessor\PathProcessorFiles.(As an aside, I wonder if the above query-string handling for private files is still necessary, from a read of the Symfony docs and the code. But another issue for another day.)
Fair enough, so what now? A little more explanation of how this error occurs in the context of json:api; unlike REST module or pretty much any other content-type negotiation in core, json:api uses Symfony's HTTP negotiation middleware and an early format setter to set the request format to
api_json. The json:api spec says clients should pass a corresponding Accept header, but it turns out after registeringapplication/vnd.api+jsonas an available format, the request format gets set inEarlyFormatSetteraccording to the route being accessed. Pretty clever (if it did take a while to figure out.)The docblock on the early format setter actually mentions wanting to run before others so those other filters' exceptions can be returned in json:api format, but this is all too late for an exception/subrequest run during authentication. This request gets negotiated as html (since it's format-setting happens a few lines down in the Router), and so
DefaultExceptionHtmlSubscriberdoes its thing, including the subrequest with troublesome query parameters added and cached, per the OP.Whew!
Comment #9
bradjones1So I think the options here are either: A) actually use the Accept header for json:api requests to set the format in the content negotiation middleware, so we get it sooner (but it's still optional in the sense it gets set - again - per note above when the route is matched) or B) only use the route collection cache on master requests. I suspect option B is less disruptive in the sense that it doesn't wade back in to the content negotiation morass that has consumed a lot of time over the years.
Any other ideas?
Comment #10
wim leersWoah, interesting that you found the need to modify core's routing system for this! Will try to look into this soonish.
Comment #11
bradjones1Comment #12
bradjones1Proposed fix in #3028501: Enable header-based proactive content negotiation with optimizations and opt-outs available..
Comment #13
bradjones1Re-opening this as the proposed solutions elsewhere would not address this in all circumstances.
Comment #14
bradjones1New approach: set an attribute on the HTTP exception subscriber's sub-request to indicate it should not be set to the cache. Add this const to the Route Provider class. Similar pattern to how the request sanitizer sets an attribute on the request. We can't target sub-requests overall because the Route Provider doesn't have access to the HTTP kernel's context, so we use the request itself.
Comment #15
bradjones1Comment #16
bradjones1Removing an unused import, but doesn't change functionality.
Comment #17
bradjones1For real this time.
Comment #18
bradjones1Added a test, and a test-only patch for validation.
Comment #20
bradjones1Cool, ready for review. I think it's worth reiterating that an alternative/addition would to change the API and or adjust the kernel to make sub-requests not set the route cache as a broader rule, but I'm not sure if that's too much.
Comment #21
bradjones1More rubber-ducking on alternatives; I don't think the kernel is an option for moving this out of a flag on the request, in so far as in my testing the reason why the cache gets poisoned is because other modules (e.g., metatag) may do routing against the request, for instance to create Urls, and this in particular uses the access-unaware router which then calls for the route collection matches. So I think we're left with something like the above patch.
Another option of course is to stop adding these query parameters on the request, but that would be a BC-breaking change to expected behaviour.
Comment #23
bradjones1Talked with Tim Plunkett at DrupalCon Amsterdam; he thinks that it would be appropriate to change the cid generation to include not [just] the query string from the original request but the query parameters including those set programmatically (which seems to be the main issue of poisoning) which would be more elegant. Needs a demonstration via a failing test (though not sure how to handle that with only core components, but we'll see) and I'm removing the jsonapi tag since it's not really a jsonapi issue.
Comment #24
bradjones1New approach - the enclosed patch adds to the route collection cache CID the route object's query parameters, in addition to the query string. Thanks @timplunkett for the suggestion!
I don't think this will significantly change the total size of the cache, though it does lengthen the CID for any path which originally contains a query string. I am open to suggestions on a better way to achieve the intent here, which is to capture changes during incoming path processing. Most of the changes in the patch are for new test coverage.
Does this need an update hook to clear the route collection cache?
Comment #25
bradjones1Oops - had included test coverage from another patch as well. Cleaned up here.
Comment #26
bradjones1Ah - actually the route provider API in play changed a bit in #3007669: Add publishing status to path aliases which doesn't change the problem space at all but does make it slightly more straightforward to accomplish our CID adjustment. I think this also underscores how the route collection cache is in play in other parts of Drupal.
The patches in #25 are valid against 8.8.x (I'll re-queue- but these should apply to 8.9.x, due to the now-resolved merge conflict with the changes in the above issue.
Comment #29
bradjones1Query parameter bag values can be non-scalar, so adjusting for that in the cid value.
Comment #30
bradjones1Comment #31
bradjones1Comment #34
bradjones1Adding missing parenthetical in cid construction.
Comment #36
bradjones1Updated existing kernel test to new cid pattern, and expand test coverage to multivalue query params that are converted into arrays.
Comment #38
bradjones1Added an array_filter() around implode in
::getQueryParametersCacheIdPart().Comment #41
bradjones1Comment #42
bradjones1Comment #43
bradjones1Needs a re-roll apparently.
Comment #44
bradjones1This is also 9.1.x at this point.
Comment #45
avpadernoComment #46
ravi.shankar commentedHere I have added re-roll of patch #41.
Comment #47
bradjones1Thanks, Ravi - did you find this issue because you ran into this issue, or just helping with re-rolls? I'd like to get feedback from others who have been bitten by this bug, to help demonstrate its importance. Thanks!
Comment #48
ravi.shankar commentedHi @bradjones1,
I just added a re-roll of the patch.
Comment #49
matt_paz commentedI'd encountered the same issue. While I wasn't using subrequests in my case, I did encounter the same outcome.
The patch from #41 when applied to Drupal 8.8.2 seems to have helped! Will advise if anything new crops up on this front.
Thank you @bradjones1 for the deep dive on this -- diagnosing it and proposing a solution.
Comment #50
bradjones1Great, thanks. FWIW I do not believe subrequests is a prerequisite for reproducing this... though I'm sure it's one way to do so.
Comment #51
bradjones1This does not need a re-roll but it does have some new-ish test failures, probably related to Drupal 9 changes/removals.
Comment #52
bradjones1Indeed, this is due to a change in Symfony 4: https://github.com/symfony/http-foundation/commit/aa64c6c805c45a824ae9a6...
So this would need a slight change (basically apply #41) to be backported to Drupal 8.
Comment #53
bradjones1Also maybe exposes a small hole in test coverage relating to square-bracket query strings, since there weren't any in that suite before. I don't think it changes much of anything for core other than cache IDs, though.
Comment #55
bradjones1Comment #56
tipit commentedHi @bradjones1,
I think I'm fighting with the same issue in this issue.
To test the patch or even trying to debug the issue, can you provide the steps how to reproduce the error?
I have no idea how to proceed and this really seems to be a complex issue.
Comment #57
tipit commentedAnd if it's even the same issue, we can close it as a duplicate.
Comment #58
tipit commentedI closed my issue because it tested the patch from #41 of course in production and it seems to fix the issue in 8.8.5 core. No idea why, but no errors appeared in the log during couple of days now.
Comment #59
bradjones1@TipiT - There is a test in this patch to reproduce. Glad it helped.
Comment #60
nigelcunningham commentedHi all.
I have reproduced this issue via an Ajax call, on a site using my config_entity_revisions module, which calls \Drupal::service('router')->matchRequest(\Drupal::request(), TRUE) from a form alter hook, triggering the issue. (Line 42 of webform_revisions.module, if it's any help).
Regards,
Nigel
Comment #61
bradjones1Nigel - are you saying this patch fixed that issue, or that it needs more work?
Comment #62
nigelcunningham commentedI'm saying it fixed the issue, thanks.
Comment #63
nigelcunningham commentedComment #64
bradjones1Needs fresh tests against 9.1.x and a likely re-roll
Comment #65
bradjones1Re-roll for 9.1.x.
Comment #67
ridhimaabrol24 commentedTrying to fix testcases in #65.
Comment #69
rajandro commentedAdding fix for the coding standards violation.
Regarding test case failure => It looks like the test case is fine but the fix seems to need to check for the query parameter.
Comment #71
josephdpurcell commentedI'm on Drupal core 8.9.6 and have applied patch #41 and am still seeing the error described in the description, specifically a dynamic page cached 400 response from /jsonapi/user/{entity} with message "CacheableBadRequestHttpException: The following query parameters violate the JSON:API spec: 'destination'". I am unable to reliably reproduce the issue.
Comment #72
josephdpurcell commentedI think I've identified steps to reproduce:
1. Make an unauthorized GET request to the user endpoint /jsonapi/user/{entity}
2. Notice the response is a 401 with Content-Type text/html
3. Make an authorized GET request to the exact same user endpoint /jsonapi/user/{entity}
4. Notice the response is a 400 with Content-Type application/json
Comment #73
josephdpurcell commentedI've also identified this: if I get the request to bypass lines 138-142 in docroot/core/modules/dynamic_page_cache/src/EventSubscriber/DynamicPageCacheSubscriber.php for one request, the next inbound request will render correctly. I assume this is simply because the dynamic page cache gets rebuilt on the request that bypasses it.
This leads me to conclude the request that gets the 401 response (steps 1 and 2 in my previous comment) is building a dynamic page cache value behind-the-scenes that gets returned as a 400 when a valid request comes in (steps 3 and 4 in my previous comment).
I'm assuming what I'm describing here still aligns with the intention of this ticket.
Comment #74
josephdpurcell commentedI've isolated the cache to "route:local.example.com:en:/jsonapi/user/123456ff-123f-123f-123f-123f123f123f:". Whatever is generating this route cache is doing so in a way that is adding "destination", here is an example cache value:
a:3:{s:4:"path";s:50:"/jsonapi/user/123456ff-123f-123f-123f-123f123f123f";s:5:"query";a:2:{s:11:"destination";s:1:"/";s:21:"_exception_statuscode";i:401;}s:6:"routes"....morestuffAnd, I've isolated that issue as well. That route cache gets populated when metatag module is installed. I noticed this by the fact that if I put a return statement in
metatag_preprocess_htmlandmetatag_page_attachmentsthe problem goes away. To be absolutely certain, I did an additional test of disabling the module entirely and noticed the problem went away.Here's how I'm testing this locally:
1. Clear out relevant caches
2. Make a GET request to /jsonapi/user/123456ff-123f-123f-123f-123f123f123f with an invalid authorization, e.g. "Authorization: Bearer 123"
3. Start a step debug session and put a breakpoint at line 182 of docroot/core/lib/Drupal/Core/Routing/RouteProvider.php in the getRouteCollectionForRequest() methodwhere it populates the route cache and make sure $query_parameters is empty.
Alternatively, you can simply query the route cache table to see if you get a result that has "destination" and "_exception_statuscode" specified as query params:
Important to note that I have simple oauth module installed.
I can't outright disable metatag module. That module is important to site functionality. So, my next step here would be to figure out the following: Why does an invalid authorization header to /jsonapi/user/{entity} generate an incorrect route cache entry in cache_data when metatag is installed, but when it is not installed no route cache entry is generated at all? The route cache entry is only generated on a successful request.
Comment #75
josephdpurcell commentedI confirmed that by just enabling metatag the issue does not occur. However, it does occur if I enable metatag AND add the metatag field to the basic page content type. The basic page content type is the node type that renders the 401 result. Maybe the path I need to go down here is to ensure /jsonapi/ routes do not return Content-Type text/html on a 401, i.e. don't render a node?
Comment #76
josephdpurcell commentedOk, I have more insight.
The issue described in this ticket seems to be reproducible any time a request is made to /jsonapi/user/{entity} that results in an exception being thrown in the SimpleOauthAuthenticationProvider here: https://git.drupalcode.org/project/simple_oauth/-/blob/8.x-4.x/src/Authe.... And, metatag must be enabled with a field on the node that gets rendered on a 401.
When an exception happens, the DefaultExceptionHtmlSubscriber from core will add "destination" and "_exception_statuscode" to the sub-request it makes, see https://git.drupalcode.org/project/drupal/-/blob/8.9.x/core/lib/Drupal/C....
It's this sub-request that seems to be generating the invalid cache. Specifically, the logic hits core's RouteProvider here https://git.drupalcode.org/project/drupal/-/blob/8.9.x/core/lib/Drupal/C.... Which, sees the "destination" and "_exception_statuscode" in the query parameters and stores them in the cached route definition in the "cache_data" table.
Then, on a subsequent request that is valid, the route definition loads from cache with the "destination" and "_exception_statuscode" query parameters which then gets run through the JSON API validation and declared as a violation here: https://git.drupalcode.org/project/drupal/-/blob/8.9.x/core/modules/json....
I'm not sure where to go from here. Is the issue I'm seeing the same root cause described by the ticket? Is the issue actually with metatag module? Is it with JSON API module? Is it with route caching? This all seems highly interconnected to me; I can't quite pinpoint what I would declare as the exact issue that needs fixed.
Side note: When testing, its important to make sure dynamic page cache and the route cache are both cleared. If you just clear the route cache, but still have a dynamic page cache, you won't reproduce the error.
Comment #77
josephdpurcell commentedI've tested and confirmed that this hook will avoid the issue:
This avoids the route cache being poisoned with the destination and _exception_statuscode query parameters from a 401 request. I don't like this workaround, so I'd like to keep digging.
I see the token metatag is generating is for
[current-page:url]. This line at 761 in tokens.tokens.inc gets called:A question I keep coming back to is why JSON API doesn't handle the exception? I see that stepping through core/modules/jsonapi/src/EventSubscriber/DefaultExceptionSubscriber.php the isJsonApiExceptionEvent() returns false. So, maybe the root issue is how simple_oauth is throwing exceptions on JSON API routes? I see core/modules/jsonapi/src/Routing/Routes.php::routes() will set the JSON_API_ROUTE_FLAG_KEY flag and set _format to api_json, and these are missing from the request. Maybe simple_oauth is doing something to prevent this from happening?
Comment #78
josephdpurcell commentedI have found what, at this point, I would consider the root problem: Authentication runs before route filters. This means that Simple OAuth can throw an exception before the inbound request gets labelled as JSON API format. This, then, means the \Drupal\Core\EventSubscriber\HttpExceptionSubscriberBase `onException` subscriber will catch the exception and render an HTML response.
I discovered this by observing that when the request gets to SimpleOauthAuthenticationProvider, the request "_format" attribute is not set. This means, $request->getRequestFormat() will return 'html'.
I'm thinking a solution to this problem could be focused on one of three areas in the code, each would result in a different change and philosophy about this problem:
1 - The issue is with \Drupal\jsonapi\Routing\EarlyFormatSetter. Determining the request as a "JSON API" request or not should not happen in a route filter. It should happen sooner in the request stack.
2 - The issue is with \Drupal\jsonapi\EventSubscriber\DefaultExceptionSubscriber. It should not rely on determination of whether the request is a "JSON API" request solely based on whether the inbound request has the format of "api_json" set.
3 - The issue is not in how we're determining the request format. The issue is that JSON API needs its own authentication exception handler that catches anything on a JSON API route. Assumedly, this exception handler would know that route filters had not been run, therefore it would need to do its own determination as to whether the incoming request is JSON API or not.
Options 1 and 2 would require some sincere thought about the architecture of request handling specific to authentication and route filters.
Option 3 would be straightforward to implement, either in core or a separate module like JSON API extras.
Since I need a quick solution here, I'm going to focus on 3 and submit a proof-of-concept patch for that just in case someone else comes along and encounters this issue.
Comment #79
crasx commentedI worked on the above investigation with Joe
Comment #80
josephdpurcell commentedYes, to @crasx' point, it was talking through with him that we figured out that requests get flagged as JSON API in a route filter, and that route filters run after authorization. MANY many thanks, @crasx!
And, I have a working solution! Here it goes:
mymodule.services.yml
src/EventSubscriber/JsonApiExceptionSubscriber.php
(see attached file for exact example code)
I'm not so sure I've got the logic here written in the best way, but it is working, so I'll consider this an OK proof-of-concept.
I've said LOTS of words on this ticket. I hope they are helpful. Does anyone else know enough about the original issue described in this ticket to confirm if everything I've been writing about aligns with this ticket? And, does anyone agree with the diagnosis or have a thought on a permanent solution?
Comment #81
nickmans commentedThanks @josephdpurcell for all your hard work on this issue. I managed to get your suggestion from #80 to work, which is a great help for me.
I changed a few things, because the GetResponseForExceptionEvent is deprecated:
Comment #82
bradjones1@nickmans a patch would be more helpful in discovering the changes you made? Thanks.
Comment #83
josephdpurcell commentedVERY IMPORTANT detail: the $router->matchRequest($request) call may throw an exception. Why is this important? It may turn all would-be-404 pages into 500 errors.
I would love to see a proper fix for this so that I'm not using a patch. I'll try to circle back on this if I can, but for now wanted to note that about the matchRequest call causing 500s.
Comment #84
josephdpurcell commentedI haven't tested this patch, but I think this represents the concept of what we're thinking here.
Comment #85
bradjones1@josephdpurcell - Thanks for chiming in... I'm a bit confused by your comment, though... are you saying that the patch in #84 is a replacement for this, or needs to be done in addition, or?
A little extra context would help me understand where you're coming from here.
Comment #86
josephdpurcell commentedClarity! Sorry, I definitely didn't explain.
Ok, the comments from #80 and #81 have "code samples" of how you can work around the issue described by this ticket in a custom module.
The patch from #84 replaces the need for those code samples by fixing the problem in the JSON API module directly. So, if you are using the code samples, you don't need the patch, and vice versa. And, if you have both applied, it shouldn't cause an error or anything, it would just be redundant.
At this point, perhaps the proper state of this ticket is Needs Review? I'll set it to that. I will point out that there is no test written at this point. I did end up testing this locally and will be using this in my project.
Comment #87
josephdpurcell commentedWhoops! I made the patch in the wrong place. Ok, this patch is for Drupal core 9.1.0.
Comment #88
josephdpurcell commentedThe patch from #87 is quite different than the patch from #41-69. I think what #41-69 was addressing was different than what #87 is fixing. Do these need to be merged together into 1 patch that addresses both?
Comment #90
bradjones1If you believe it's a separate issue, open a separate issue and link them as related. In particular, it would be helpful to (and basically no patch will be committed to core without) include a test that demonstrates the failing condition which the change fixes.
Comment #91
bradjones1Comment #92
josephdpurcell commentedI agree this needs a test. I'll put back to Needs work and add needs test tag.
There is something about calling $router->matchRequest($request); here that is causing the existing tests to fail. I think we can work around that by duplicating what \Drupal\jsonapi\Routing\EarlyFormatSetter is doing (which is what is required to determine whether the inbound request is JSON API or not). I don't like this approach because it results in duplicate code, but it works, and hopefully helps us triangulate a better approach.
Comment #93
josephdpurcell commentedOk, I *think* I have a working test. In order to test, I created an auth provider that throws an auth exception. Without the fix in place that we're talking about, the test should error saying that the returned content-type is "text/html; charset=UTF-8", when it should be "application/vnd.api+json". And, this aligns with what we're seeing: the DefaultExceptionSubscriber isn't saying the exception should be JSON when it should!
I've created two patches, one with just the test, the other the test and the patch. So, we should see the test-only patch fail, and the test+code patch succeed.
Comment #94
bradjones1I apologize but I'm still not following here - your patches are not derivatives of the patches worked through #69 and are specific to the jsonapi integration in core; I sincerely feel this is better worked out through a separate issue. If they end up being merged at some point, let's do that, but I think there are two separate threads going here since comment #70 and it makes reviewing a little more difficult. I still don't see anything in the latest patches that indicates this issue is limited only to jsonapi.
I haven't been able to read through all the recent history in depth but I suspect that your fix may address this problem in so far as it applies in your jsonapi use case, but doesn't address it in the routing system at a broader level.
Thoughts?
Comment #95
bradjones1Comment #96
josephdpurcell commentedAgreed! And, I'm happy to take any advice and guidance! Ultimately, I'm just hoping to get the issue I'm observing resolved in Drupal core.
In my previous comments I had tried to identify the root cause, and I think I've boiled it down to:
A) The fact that route filters do not run before authentication, which causes JSON API's EarlyFormatSetter to not be run, which causes the inbound request format to not be set when there is an exception during authentication.
B) The fact that JSON API's DefaultExceptionSubscriber assumes EarlyFormatSetter had already been run.
C) Some other generalized problem in the routing system I haven't identified yet.
I agree to let this ticket be for discussing A and C (any other generalized root problem in the routing system), and split B out into its own ticket since the solution is JSON API specific. I think that makes sense to me!
I'll work to pull out comments and patches from #70 onward to a new ticket. This ticket should then roll back to comment #69 for the most recent / relevant conversation. I'm so sorry to have cluttered this ticket with JSON API specific things!
Comment #97
josephdpurcell commentedOk, I moved the issue I was describing over to #3187763: Use api_json format on exception responses for api_json resources. It's worth noting that the pre-#70 patches and the patch I'm attaching now does not also solve #3187763.
Looking back through the pre-#70 comment patches, I see this:
* #65 has the latest reroll, which is 3085360-route-provider-cache-query-string-65.patch
* #67 fixes the patch from #65, specifically the need for the return type declaration on setUp, this is 3085360-67.patch
* #69 fixes a code style issue from #65/#65, this is 3085360-69.patch
It took a few iterations over the tests to get them passing locally. I will point out that it looks like the cache ID has the query string duplicated. For example, instead of [query_parameters]=foo=bar it has [query_parameters]=foo=bar:foo=bar. I didn't know if there was a special reason for that so I left it.
Comment #101
bradjones1Comment #103
benjifisherI am setting the status to NW for an issue summary update. I will add the tag for that. While I am at it, I will remove the tag for a reroll, since that seems to be out of date.
I like this line from the Proposed resolution section of the current summary:
but I am not sure what to make of the rest of it.
Based on the line I quoted, and a quick look at the non-test portion of the patch, I think this issue is about cache IDs that are not specific enough, so we are fetching the wrong data from the cache.
Currently, when you create a new issue, the template includes a "Steps to reproduce" subsection, and the instructions include
From reading the current issue description, I cannot tell what the steps are to reproduce. In principle, I could read through the test portion of the MR and figure it out. But I am much more likely to review this issue if you do that work for me. I think the same is true of other potential reviewers. In principle, the steps for this issue should be pretty simple: make one request, then a second one, and get the wrong results.
If you can give two scenarios (form request and JSON:API endpoint), that is even better.
If the background information is complicated, then add additional subsections to the Problem/Motivation section.
I often quip that writing documentation is easy: provided that the only person reading it will be the person who wrote it. The same idea applies to bug reports.
Opinions differ on what belongs in this section. Some people routinely add "Patch, Review, Commit" as a bulleted list when creating a new issue. My opinion is that this is not helpful: it duplicates the information in the issue status. I typically leave this section blank for a new ticket, and update it as work progresses to show what needs to be done to get the issue ready for review.
Also, subsystem maintainers are not the only ones who can review issues. The "C" in RTBC stands for Community. Anyone who feels competent to review an issue (and who has not contributed to it) can promote an issue to RTBC. And my parenthetical caveat is not an absolute rule.
Comment #106
tim.plunkett@bradjones1 can you (or a core committer!) edit the MR to be based on 9.3.x? Thanks!
Comment #107
bradjones1@tim.plunkett, anything for you, I'm honored just to have you in my little issue!
Comment #108
bradjones1Rebased MRs.
Comment #109
bradjones1Comment #110
bradjones1Looks like mostly deprecation errors; I'll address.
Comment #111
bradjones1The test only branch will have no changes (since there was no changes here to the tests) so this is back to needs review.
Comment #112
bradjones1Comment #113
bradjones1Rebased the main and test-only MRs for 9.4.x.
Comment #114
bradjones1Only 65 tests run? Hmmm...
Comment #115
longwaveDrupalCI is failing to upgrade PHPUnit, so none of those tests are running:
Comment #116
avpadernoWould testing on PHP 7.3 avoid those errors?
Comment #117
longwaveNot really sure that's the problem, it seems DrupalCI thinks it is installing against 9.3.x, but the branch is set to 9.4.x?
Root composer.json requires drupal/core 9.3.x.dev, found drupal/core[9.3.x.dev] but the package is fixed to 9.4.x-devComment #118
bradjones1The retests are working now. Shrug emoji.
Comment #119
bradjones1Well, the test-only branch tests correctly but the main branch does not. Asked for help in Drupal Slack from the infra team.
Comment #120
avpadernoAt least, with PHP 7.3, the run test are 28971.
Comment #121
bradjones1Adding explicit reproduction steps.
Comment #122
kristen polThanks for the steps to reproduce! Tagging for testing.
Next steps:
1. Code review.
2. Manual testing.
Comment #123
kristen polPatch applies cleanly to 9.3 and 9.4 and with offsets for 10. It still needs testing per the steps to reproduce in the issue summary.
Comment #124
bradjones1Is the requirement for manual testing coming from anything in particular unique to this issue? There is broadened test coverage in the MR which effectively does the same as manual testing would? Just trying to reduce roadblocks.
Comment #125
longwaveAgree this doesn't need manual testing, the automated test coverage is comprehensive and covers it better than a manual tester would.
I reviewed the MR and there's a number of bits of feedback that need addressing.
Comment #126
bradjones1Thanks for the clarity and the review; doesn't seem to be anything major so I'll try to revisit this week and address.
Comment #127
kristen polThanks @longwave and @bradjones1. I'm just in the habit of tagging for testing if a bug can be tested but I defer to @longwave on this one :)
Comment #128
bradjones1There are two MRs, one is test-only and the other includes the changes and the test. FYI. Not entirely sure if "test-only" is still a thing, but I haven't been updating the test-only MR unless it no longer applies. So we'll need to update the "main one," which is MR 183.
Comment #130
andregp commentedI tried to address @longwave comments on the MR.
Comment #133
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It either no longer applies to Drupal core, or fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
Apart from a re-roll or rebase, this issue may need more work to address feedback in the issue or MR comments. To progress an issue, incorporate this feedback as part of the process of updating the issue. This helps other contributors to know what is outstanding.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #136
bbralaCommit was because of this discussion: https://drupal.slack.com/archives/C5A70F7D1/p1694441795848429
With this patch there were issues with symfony 6. Last commit was to fix that.
Comment #137
bradjones1Comment #138
joseph.olstadGreat work on this one, it makes sense from a high level. Functionality appears to be covered by tests.
One more thing that the core maintainers might like to see is a tests only run, I was about to do this but have been summoned elsewhere.
Comment #139
bradjones1There is a test-only MR that is a bit out of date but the previous experience with it at least demonstrates what the issue is. I can put in the work to update it if need be, however this subsystem is very low-change so hopefully it's clear enough from the issue history and prior runs.
Comment #140
joseph.olstad@bradjones1 , it looks like @longwave comments have been resolved.
Comment #141
bbralaI've rebased the test only issue, cause i kinda want to see a fresh run of that as part of the review. Asked for a retarget in @contribute.
Comment #142
bbralaThere is still a few open comments on the main MR unfortunately, also awaiting retarget of test branch so we can have a recent test only run. (although that is almost automated on gitlab ci ;))
Comment #145
bradjones1Rebased and updated the cache ID creation logic to ensure it's normalized recursively with any nested arrays.
Raised this PR against Symfony upstream to do so out of the box in the Request object, but I won't hold my breath.
Optimistically marking this NR again.
Comment #146
bradjones1Comment #147
bbralaAl issues are addressed. Code still looks good. IS is still correct. Changerecord is not needed. All seems ready for commit.
Comment #148
catchSomehow I haven't seen this issue before.
My first thought was 'why do we allow incoming path processors to set query parameters?' and then I re-read the original issue that added the caching, realised it was opened by me, and it all came back.
So the history of this bug is something like this:
1. The router was rewritten on top of Symfony very early in the Drupal 8 cycle, it was railroaded through and almost designed from the start to erode performance, scalability and security (see #1793520-7: Add access control mechanism for new router system and elsewhere for some background). A lot of things did not work with the initial commit, and only got found out about in some cases years later when we were still trying to release Drupal 8 (like e.g. the entire menu links system which didn't really start getting fixed again until about 2014).
2. More than year later, the system files controller got converted #1987712: Convert file_download() to a new style controller.
3. PathProcessorFiles got added to deal with the fact that symfony doesn't like
arbitrary/path/parts/when/looking/up/potential/routes4. We then added caching on top of of this, which had to take into account what PathProcessorFiles does to the query string in #2480811: Cache incoming path processing and route matching
5. I think this is the first time any of 2-4 has been revisited since it was added.
As the issue summary says this is all a bit problematic.
I think my question is:
Could
PathProcessFilesset a request attribute instead of a query string, which we then retrieve inFileDownloadController? Then we'd need to save the request attribute(s) in the cache item instead of the query string, but because that's not overwriting information from the request it should be fine. For that specific use case we could almost add a generic 'original request path' request attribute that preserves what came in before path processors run.If we completely stop adding the query parameters to the cache item, that might break some other code somewhere that's doing something similar to PathProcessorFiles, but could we possibly only do that if the query parameters differ after incoming path processing has run, and trigger a deprecation error if they do? Then we could stop supporting that behaviour altogether in 12.x then.
This is potentially all for a follow-up rather than here, but it feels like if the system files path processor had used a request attribute instead of a query string in the first place, we'd never have had this problem.
Comment #149
bradjones1Thanks for the review, @catch. I agree a request attribute would likely fix this - though the BC layer would have to look something like this MR anyway if we want it to actually work with caching. And given D10 is LTS, I think it's still worth "fixing" even if it contains a long-lived deprecation that gets cleaned up in D12. I agree it could be a second issue and worth cleaning up.
TL;DR, I think we still need this bugfix regardless if we decide to change the underlying bug, since it's entirely possible people are using query strings internally as cheapo attributes.
Comment #150
catchOpened #3450279: Use attributes instead of query string in PathProcessorFiles and stop replacing query string in RouteProvider::getRouteCollectionForRequest() for the follow-up. Haven't done a line by line review here so leaving this at RTBC.
Comment #151
kingdutchI just want to chime in here that the proposed
PathProcessoris exactly what we implemented in Open Social where I found the problems withqueryon cacheable requests (which is why I assume the private file system didn't suffer) and moved toattributesinstead.However, one issue we're seeing is that there is a difference between the initial uncached request and secondary requests that utilise the page cache. Specifically we're seeing that for the first request the attribute that's set is properly run through
Drupal\Core\PathProcessor\PathProcessorDecodebut for subsequent requests that doesn't happen. This causes the attribute to be inconsistently decoded or encoded which makes downstream code brittle and can cause errors.I'm not sure if that's fixed with the MR proposed here (I have to look at the code) but at least wanted to share that issue because maybe someone already knows how this MR might affect that behaviour.
EDIT: Looking at the code I don't really expect this MR to change anything for the above. I don't know whether there would be a blocker for converting the private file system's path processor.
EDIT2: I think it will become a blocker for Private File system.
Using query to transfer the path:
The filepath query item is generated from the decoded URL which has
%20replaced with "". This query item is cached. The path is also cached and does not contain the original filepath since that's not part of the route match (it was captured in the processor as excess and moved into the query).On the next request the path is matched and the path (without filepath) is restored along with the URL decoded query which contained the filepath.
Using the attributes to transfer the path:
The filepath attribute item is generated from the decoded URL which has
%20replaced with "". This attribute is _not_ cached. The path is cached but does not contain the original filepath since it's not part of the route match.One the next request the path is matched and the path (without filepath) is restored. The URL decoded query is not present because the file processors haven't run and the attribute is not part of the cache. The attribute will no longer be provided to the controller. To work around this it'll have to reprocess the original URL. This was something I didn't understand before. This uses the raw URL which hasn't passed through any path processors so it must decode the URL itself.
Comment #152
kingdutchThis was shortly discussed on Slack where Bard added
And catch confirmed
So in that sense the private file system will likely need attribute caching in #3450279: Use attributes instead of query string in PathProcessorFiles and stop replacing query string in RouteProvider::getRouteCollectionForRequest() but that can likely happen as part of that issue and I don't see a reason to block this issue itself. We don't currently see a reason why attribute caching wouldn't work.
Happy to leave this RTBC (also after looking at the code) :D
Comment #157
catchReviewed this again. I think we should try to get the follow-ups done so that we can stop caching by query string again, but what's here is good, and the test coverage should ensure that any changes we make in the follow-ups don't break things unexpectedly.
Did my best with issue credit here but it's a very long issue and ctrl-f isn't working to find people's usernames properly for some reason.
Committed/pushed to 11.x and cherry-picked back through to 10.3.x, thanks!
Comment #158
bradjones1Thanks!
One quick clarification though:
We've always cached by query string... the change here is that we are caching based on both the original query string and any changes after processing. Honestly I think this would have to remain even if we change the original reason for this being a bug. For instance, I am setting some "default" query parameters in a path processor on my current project, which I think is a legitimate use of this pattern. (Instead of core stashing away some metadata derived from the query string.) So there's additional discussion to be had about pulling support for this altogether. But yes, follow-ups definitely appropriate. Thanks for closing this out! Bolstering my reputation as a necromancer.
Comment #159
marcofernandes commentedHi, I'm trying to apply a patch based on MR183 to 9.5.x but it's failing at TestRunnerKernel.php
Maybe related to the failed test on this commit?: https://git.drupalcode.org/project/drupal/-/merge_requests/183/diffs?com...
Here's the rejected hunk:
Comment #160
bradjones1@marcofernandes Commenting on a closed issue is unlikely to get you a response. Also, D9 is unsupported, so the general recommendation will be to upgrade first.
Comment #161
marcofernandes commented@bradjones1 Yes, I understand that 😉 but my comment was meant as a heads-up for people who are still in the 9.x and comes across this issue. The solution was just remove that rejected hunk.