The 'url' cache context is defined as requestStack->getCurrentRequest()->getUri(). However, this results in all 404 pages sharing the same cache context of https://example.org/system/404 rather than using the request URL.

The solution is to use requestStack->getMasterRequest()->getUri() to provide the originally requested URL for the cache context.

It's not unusual for sites to use different cached content (or redirect) for different 404 pages, so it would be useful to provide a finer-grained URL cache context.

We could modify the built-in core 'uri' cache context to use the original request URL, or, assuming the current implementation is desired, we could add a new cache context.

Comments

mfb created an issue. See original summary.

cilefen’s picture

Could you restate the question? You know 'url' is a cache context. But I am not sure exactly what you are asking about the url cache context, especially in relation to the 403 and 404 pages.

mfb’s picture

as far as I can tell, if an anonymous user goes to https://example.org/admin or any other "403" URL, the 'url' cache context will be https://example.org/system/403

and as far as I can tell, if an anonymous user goes to https://example.org/nothing-to-see-here or any other "404" URL, the 'url' cache context will be https://example.org/system/404

This means that many URLs share the same 'url' cache context.

cilefen’s picture

I think it means that all 404's and 403's have the same respective url cache context because a 404 or 403 is what is returned. Why wouldn't they? But they are special cases.

Picture a render array that creates a block. If the block output may be different on different URLs you would set the 'url' cache context.

You are observing that in the case of 404's and 403's the url cache context varies by response URL and you would like a way to vary the cache based on the request URL. Is that accurate?

cilefen’s picture

So, would node/1 which is path-aliased to my/page have the same url cache context is the question.

mfb’s picture

That's correct. I was able to write a CacheContext that does this, so that could be the solution if I'm the only developer who needs it.

But arguably it would be supported by core, as it's probably not unusual to offer different content, redirect, or whatever depending on the specific 403 or 404 URL requested?

cilefen’s picture

The bottom of the API page explains how to add new cache contexts, so you could offer a patch.

That said, assuming this is the right class, it looks like the UrlCacheContext uses the request URI:

public function getContext() {
  return $this->requestStack->getCurrentRequest()->getUri();
}
cilefen’s picture

I was able to write a CacheContext that does this

Oh...

mfb’s picture

when you getUri() for the current request, if it's a 403 or 404 request you don't actually get the request URL, you get system/403 or system/404.

To get the actual request URL I had to handle it as an unrouted URL, which you can see here: http://cgit.drupalcode.org/securelogin/tree/src/SecureLoginManager.php?h... - granted there might be a better way :)

cilefen’s picture

There could be other contrib modules that may need it. Maybe post a patch, call the context url.unrouted or something similar.

mfb’s picture

Category: Support request » Feature request

ok changing this to feature request.

For the time being, I put some working code in Secure Login module - http://cgit.drupalcode.org/securelogin/tree/src/SecureLoginCacheContext.... - but, if other modules are also caching based on the request URL then it doesn't make sense to live here :)

cilefen’s picture

I am tagging this "Needs issue summary update" and "Needs title update". After the title and summary are updated with the template and they explain the plan, those should be removed.

mfb’s picture

Title: how to add request URL as cache context? » Provide a request URL cache context
Issue summary: View changes
Issue tags: -Needs issue summary update, -Needs title update
fabianx’s picture

Assigned: Unassigned » wim leers
Category: Feature request » Bug report

I think it is a bug, the cache contexts always are used from the master request - or rather we have not really defined how to deal with sub requests ...

Assigning to Wim for feedback.

wim leers’s picture

Title: Provide a request URL cache context » 4xx handling using subrequests: no longer able to vary by URL
Component: cache system » request processing system
Assigned: wim leers » Crell
Priority: Normal » Major

or rather we have not really defined how to deal with sub requests ...

This is definitely true.


But I'm not yet fully convinced this is a problem. Let's walk through the problem.

  • /foo and /bar are 404s.
  • /foo and /bar are internally redirected to the system.404 route: a subrequest is made to that route and its response is returned. (See \Drupal\Core\EventSubscriber\DefaultExceptionHtmlSubscriber::on404().)
  • The concern is then (copied from IS): It's not unusual for sites to use different cached content (or redirect) for different 404 pages, so it would be useful to provide a finer-grained URL cache context.
  • Cache contexts are tied to the (request) context. Hence when processing a subrequest, they're tied to the subrequest. That makes sense.
  • Even though it makes sense conceptually, here it causes a problem. Because we're only using a subrequest as a means of executing a particular controller. But that implies we have a different request context when generating the response than the main request context. And that is problematic here, because we want to execute a certain controller in the master request context.
  • So… it seems like this is an incorrect use of subrequests, then?

Assigning to Crell for feedback.

dawehner’s picture

And that is problematic here, because we want to execute a certain controller in the master request context.

Can you explain why you think this is the case? The subrequest controller should be executed in the context of the subrequest.

wim leers’s picture

Can you explain why you think this is the case? The subrequest controller should be executed in the context of the subrequest.

I agree with The subrequest controller should be executed in the context of the subrequest.! But that is actually what the OP is claiming to be wrong.

We have a 404. Which means no route + controller is defined for this URL. But perhaps you want customize the 404 page based on the URL — a common case is performing a search, to hopefully guide the user to content the user is looking for. But that means that we're actually trying to generate a response for the master request. Which means that the response we're generating should actually be executed in the master request context, because it wants to react to the master request!
Hence using a subrequest is wrong in this case, because a response generated in a subrequest by definition cannot vary by the master request, and we want the 404 response to be able to vary by the master request context.
That's why I concluded So… it seems like this is an incorrect use of subrequests, then?

(I know I'm not explaining it great; I'm not finding the right words for this one.)

Crell’s picture

We discussed this a bit during the today's WSCCI meeting. We struggled a bit to come up with a concrete example that we could write tests around to verify this issue and a fix for it. What we came up with (suggested by neclimdul) was different types of 403. Eg, admin pages get a generic 403, unpublished nodes get a "this is archived" message, and some published nodes get a paywall "please sign in and give us money" message.

In that sort of case, I wouldn't say the subrequest is the issue. Rather, what you want then is 3 different 403 routes (which a custom module could do) and a custom exception listener that, based on some information, sent a subrequest to one of those 3 routes. Those routes then SHOULD be operating in the context of the subrequest, shouldn't they?

Is there a better concrete example here we can chew on?

wim leers’s picture

IRC transcript covering #18 and describing why #18's summary/conclusion are not wrong, but missing the point:

17:27:20 <Crell> I need to mentally walk through a for-reals case where the problem manifests.
17:27:52 <Crell> Like, which actual context is going to be different in the subrequest that is thus Wrong(tm)?
17:28:34 <dawehner> for example the current route context could be different
17:28:55 <Crell> Isn't that rather the case by definition?
17:29:22 <Crell> The 404 has no route context. That's why it's a 404. :-)
17:29:57 <Crell> EclipseGc: Do you have a PM-based example here?
17:30:12 <dawehner> Crell: well, the underlying route under the 404 might have, right?
17:30:33 <Crell> Have what?
17:31:04 <neclimdul> Crell: that would be true if NotFoundHttpException didn't break the assumption
17:31:23 <Crell> ?
17:32:04 <neclimdul> node/123, hook_something_or_other() , throw new NotFoundHttpException(); <-- has route context
17:32:21 — neclimdul is personally more interested in 403 though
17:32:49 <Crell> I'd probably slap whoever threw that exception, but… :-)
17:32:52 <EclipseGc> Crell: yeah, these examples are hard
17:32:55 <Crell> OK, so a 403-based example.
17:33:03 <Crell> EclipseGc: Yes, but necessary to write failing tests. ;-)
17:33:09 <EclipseGc> lol
17:33:30 <EclipseGc> so, I mean, custom landing page at /test
17:33:49 <neclimdul> Crell: we have different 403 for archived content, paywall denied content, and generic no access(like admin page)
17:33:57 <EclipseGc> page manager has 2 variants, 1 that succeeds for admin roles, and a 403 for everyone else
17:34:03 <EclipseGc> is this an example worth nuancing?
17:34:12 — Crell zeros in on neclimdul's.
17:34:18 <Crell> OK, that's more concrete.
17:34:26 timplunkettAFK → timplunkett
17:34:49 — neclimdul will be in trouble if that doesn't work
17:34:53 <Crell> So an unpublished node gives 403a, a paywall deny gives 403b, and non-nodes give 403c.
17:35:19 <Crell> (Not to be confused with retirement accounts.)
17:35:43 <EclipseGc> yeah, in page_manager I’d like to see 40x errors able to return custom panels for any time page_manager figures out it’s sending 40x
17:36:12 <Crell> So then… what you would really want there is a different route used for different classes of 403, no?
17:36:59 <EclipseGc> yup
17:37:08 <EclipseGc> that’s why page_manager generates route per variant right now
17:37:16 <Crell> Yay!
17:37:16 — Druplicon throws confetti
17:37:28 <EclipseGc> Crell: which btw, cmf needs a serious upgrade
17:37:29 <EclipseGc> but w/e
17:37:32 <dawehner> Crell: which shows btw. a lot of horrible problems in CMF
17:37:39 <Crell> So for neclimdul's case, what you'd want to do isn't propagate master request context to the subrequest, it's send the subrequest to a different place.
17:37:40 <EclipseGc> yup
17:38:17 <EclipseGc> though, we’re doing it via route filtering, so it’s all on the same path, just a different route
17:38:25 <EclipseGc> which gives us granular control over everything
17:38:27 <EclipseGc> all the time
17:38:28 <Crell> Which means adding your own exception listener and forking to a different subrequest based on whatever rules.  Which you can do now, as long as you either are given or can access sufficient information on which to branch.
17:38:54 <EclipseGc> Crell: in our case, even the 403 would be a custom variant
17:39:20 <EclipseGc> Crell: so while today we send an http exception, I think it more likely we’ll send a real response that’s just got the 403 code
17:39:39 <Crell> In that case, this problem doesn't exist for PM.
17:40:19 <Crell> Although, we're talking about routes here, not uris, and the original request was talking about uris.
17:40:23 <Crell> Original ticket, rather.
17:44:19 <Crell> So… would that answer be sufficient?  Is that information available (excluding PM) in an exception listener?
17:50:50 <Crell> neclimdul, EclipseGc, dawehner: ?
17:51:44 <neclimdul> i've got know idea what's going on actually and was just giving an example based on the title :-D
17:53:36 <Crell> :-P
17:57:35 <EclipseGc> Crell: I am on the same page as neclimdul except I have “no” idea
17:57:43 <EclipseGc> ;-)
17:57:56 <Crell> Right then.
17:58:05 <EclipseGc> dawehner: ?
17:58:06 — Crell will respond on the issue accordingly then.
18:12:44 <dawehner> i respond meh
18:26:20 <WimLeers> Crell: with your own exception subscriber you can indeed achieve anything
18:27:01 <WimLeers> Crell: dawehner neclimdul: EclipseGc But your discussion was about something other than the reported problem.
18:27:15 <neclimdul> faiiiiil
18:27:16 <neclimdul> sorry
18:27:32 <WimLeers> Crell: dawehner neclimdul: EclipseGc You were talking about pointing to different controllers (let's call it subroutes :P) based on the URL of the master request
18:27:35 <WimLeers> That's one thing
18:27:39 <WimLeers> The problem here is another
18:27:52 <EclipseGc> WimLeers: I’d love to pick your brain on something we’re trying to do in services. If you have a few minutes at some point
18:28:10 <WimLeers> What if whichever controller ends up being called wants to return DIFFERENT CONTENT based on the 404/403 URL?
18:28:16 <WimLeers> i.e. it's a different layer.
18:28:32 <WimLeers> You were talking about selecting a subroute based on the master request URL
18:28:44 <WimLeers> Of course that can render different things then.
18:28:46 <neclimdul> i think that's what I was describing...
18:28:54 <WimLeers> This issue is about rendering different things for the SAME subroute
18:29:07 <WimLeers> i.e. the ability for the subroute to react to the master request
18:29:10 <neclimdul> well, yeah eclipse said PM handles it with diferent routes
18:29:12 <WimLeers> … which makes no sense of course
18:29:25 <WimLeers> … which is why I'm saying the 4xx handling in D8 core should NOT use subrequests
18:29:32 <WimLeers> because it then loses the ability to render differently
18:29:35 <WimLeers> Consider blocks.
18:29:48 <WimLeers> Consider a block only shown for 4xx responses
18:30:01 <WimLeers> Consider that that block only wants to be displayed if the URL contains "bunny"
18:30:10 <WimLeers> *That* is where it breaks in HEAD.
18:30:28 <neclimdul> WimLeers: right, i think that's what i was descibing. I might want to say "upgrade you membership. call us here: 123-456-7890" or "This content has been archived. Find something new here: /search"
18:30:35 <WimLeers> neclimdul: okay, maybe you meant that, but the overall discussion was definitely about selecting a subroute rather than rendering differently
18:31:00 <WimLeers> neclimdul: well, it depends, you *could* achieve that by selecting a different subroute
18:31:21 <WimLeers> neclimdul: but the point is that *some* rendered things on 4xx responses may want to react to something in the master request
18:31:45 <WimLeers> which makes sense… because we're rendering a response for something that doesn't have an officially assigned controller/response
18:31:55 <WimLeers> which, again, is why subrequests make no sense

catch’s picture

Issue tags: +rc target triage

Tagging this for rc target triage.

I personally think we should try to figure this out during RC rather than after if we can, but hard to make a final call without seeing what a patch is like.

dawehner’s picture

We could modify the built-in core 'uri' cache context to use the original request URL, or, assuming the current implementation is desired, we could add a new cache context.

I'm really not convinced that this is the right idea. For a custom module I has the HTTP kernel to get data from other pages in Drupal. If we would change the cache context here, this would no longer work.

wim leers’s picture

Assigned: Crell » wim leers

Working on this.

wim leers’s picture

Assigned: wim leers » Unassigned
Status: Active » Needs review
StatusFileSize
new2.99 KB
new17.6 KB

Having started looking into this, I think I've already found the explanation for why this uses a subrequest in the first place.

The answer can be found here: http://symfony.com/doc/current/components/http_kernel/introduction.html#....

The Symfony request/response pipeline goes through these events:

REQUEST
CONTROLLER
VIEW
RESPONSE

But when an exception occurs, the EXCEPTION event is triggered… and at that point, whatever the exception controller returns, only the RESPONSE event is still invoked, NOT the VIEW event.

And… Drupal needs the VIEW event to transform a render array into a Response.

Consequently, it is impossible to not use the HTTP kernel to perform a new subrequest.


But! There is still hope:

  1. The above got me searching for "nested master request" and the likes. Apparently the master request is defined as "request zero on the request stack", so executing a new master request in the KernelEvents::EXCEPTION event handler for the original master request cannot possibly be without flaws.
  2. Then I was thinking "surely somebody in Symfony must be dealing with this too, right?" That's how I ended up at http://symfony.com/doc/current/cookbook/controller/error_pages.html#cust..., which pointed to \Symfony\Component\HttpKernel\EventListener\ExceptionListener as being the default KernelEvents::EXCEPTION event handler. That class does something mighty interesting: it duplicates the original request and then just overwrites the _controller attribute… which is pretty much exactly what we want: the original request, but with "enforced routing", to point to the route that contains the controller that is able to render a Response (or a value object that the KernelEvents::VIEW subscriber will turn into a Response). Note that this default exception event handler requires a hardcoded controller, i.e. it completely avoids routing!

So, there you have the answers we are looking for in this issue:

  1. Using a subrequest is not actually the problem.
  2. Using routing in the subrequest, to locate the controller that will generate a response is the problem. Because it causes the response to be generated in the request context of that "subrouting"/"subrequest routing", not of the original routing/"master routing"/"master request routing".
  3. So, the crux of the problem here is: we use routing in a place where we shouldn't.

Therefore the solution is simple:

  1. Keep 99% of the logic.
  2. Don't perform routing in the subrequest.
  3. Instead, manually look up the route for this exception, and surgically insert it into a clone of the original request object, then let the HTTP kernel handle that request as a subrequest, and the response we get back will be actually executed in the expected request context.

I've added test coverage that renders a 404 page containing the current request's URL. That is what should match the original request's URL.

Two patches:

  1. The actual patch, which contains the fix plus test coverage. Note how we actually simplify things! The only actual change is in DefaultExceptionHtmlSubscriber::makeSubRequest().
  2. FAIL patch, which contains just the tests hunks of the other patch, plus two minor additions: installing the url_alias table because that's necessary for the test to not throw an exception, and a debug($response->getContent()) to dump the response we get back in HEAD. This clearly shows HEAD does not use the original request's URL, but the 404 page's route's URL.

P.S.: I have NFI what the destination query arg and _exception_statuscode query arg are for, but I figured I'd keep that completely unchanged, since changing that is out of scope here.

Status: Needs review » Needs work

The last submitted patch, 23: exception-response-correct-context-2595695-23.patch, failed testing.

wim leers’s picture

Status: Needs work » Needs review

Despite not having a passing test (note that the tests specific to this functionality are green, I of course didn't run the entire test suite locally), this still needs reviews.

Status: Needs review » Needs work

The last submitted patch, 23: exception-response-correct-context-2595695-23.patch, failed testing.

wim leers’s picture

Oh, FFS. Testbot is a bit too pro-active nowadays :/

catch’s picture

Looks great to me, tidies things up a lot.

fabianx’s picture

Looks great to me, too. Much clearer what is happening.

I was thinking if this would affect dynamic_page_cache in a way, but was not sure. I think it would still work correctly, just doing two cache_sets in quick succession.

In theory for an anonymous page there is a race condition, but it depends on the route cache context if this is a problem or not, which warrants testing.

The theoretical (aka same CIDs for dynamic_page_cache) race condition is like:

Browser A:

/not_found => Exception => subrequest => store_dynamic_page_cache

Browser B:

/not_found => dynamic_page_cache_hit => deliver response from subrequest without the additional headers 

Browser A:

=> store in page_cache => deliver real response

Browser B:

=> store subrequest response in page_cache

Not sure this can happen, but just pointing out ...

wim leers’s picture

Great, I'll work on fixing the test failures then. I'll also work on an answer to #31 — perhaps test coverage for that would be valuable? Patch welcome :)

Crell’s picture

Good catch, Wim! We actually were using ExceptionController originally, but dropped it due to excessive complexity. (We had TWO layers of subrequest going on as a result, which was pointless.) Adding back in that piece of logic sounds reasonable.

One thing we ran into with subrequests often was that attributes got lost when cloning. Do we need to clone over the attributes from the parent request before overriding with an extra route result?

+++ b/core/lib/Drupal/Core/EventSubscriber/CustomPageExceptionHtmlSubscriber.php
@@ -28,30 +29,22 @@ class CustomPageExceptionHtmlSubscriber extends DefaultExceptionHtmlSubscriber {
+   * @param \Symfony\Component\Routing\Matcher\UrlMatcherInterface $access_unaware_router
+   *   A router implementation which does not check access.

Naming it this way seems excessive. It's a url matcher. That's all we need to know. Implicit additional requirements communicated by the variable name seems like a bad idea.

catch’s picture

Issue tags: -rc target triage +rc target

Discussed briefly with @effulgentsia and triaged as rc target.

This is soft blocking other issues. Also conceptually subrequests have been a point of contention in 8.x (at one point we were going to use them for menu link access or maybe did for a while) so cleaning up the spots we do use them clarifies things.

Also not sure about this change in a patch release. It's possibly fine but happier to get it in rc if we can.

wim leers’s picture

Status: Needs work » Needs review
StatusFileSize
new18.41 KB
new897 bytes

This should fix some of the failures.

Status: Needs review » Needs work

The last submitted patch, 35: exception-response-correct-context-2595695-35.patch, failed testing.

wim leers’s picture

Status: Needs work » Needs review
StatusFileSize
new18.44 KB
new1012 bytes

I can't reproduce any of the failures locally. The only thing I could think of is that testbot installs Drupal in a subdirectory. So I asked @penyaskito to try to reproduce it, since he had a subdirectory install available. That allowed him to reproduce it.

So, something is wrong in this patch wrt subdirectory handling. I think this fixes it.

Status: Needs review » Needs work

The last submitted patch, 37: exception-response-correct-context-2595695-37.patch, failed testing.

wim leers’s picture

Status: Needs work » Needs review
StatusFileSize
new19.5 KB
new2.32 KB

Whew. That was tricky to track down. Also: I am now even more frightened of Drupal 8's routing system than I already was. Instead of just confusing/messy Drupal code, we now have confusing/messy Drupal code and confusing/messy Symfony code.

So, here's what happens (in both HEAD and the patch):

  • DefaultExceptionHtmlSubscriber does Url::fromRoute('system.404')->toString(), which returns the path (note that path == relative URL) to that route, relative to base URL
  • CustomPageExceptionHtmlSubscriber just returns a valid path, i.e. the path configured at /admin/config/system/site-information

In HEAD, we create a request object and perform full-blown routing. Hence the inconsistency mentioned above is resolved in the routing system

In the patch, we don't do full-blown routing anymore; we reuse the existing request by cloning the master request and essentially overwriting the _controller attribute. Because in Drupal, we allow paths to indicate a controller to use for a certain 4xx response (path -> route -> controller). So, the solution is quite simple: make DefaultExceptionHtmlSubscriber behave the same as CustomPageExceptionHtmlSubscriber: don't turn it into a URL, just return the route's path. That avoids it being prefixed with the subdirectory in the first place.

(This also reverts #37's interdiff.)

Status: Needs review » Needs work

The last submitted patch, 39: exception-response-correct-context-2595695-39.patch, failed testing.

wim leers’s picture

Status: Needs work » Needs review
StatusFileSize
new20.4 KB
new1.34 KB

Still 4 failures. Most failures are due to an endless recursion problem. So let's fix that first.

The root cause for this one turns out to be the fact that Drupal uses routing to determine which controller to use for an exception. Let me repeat two things I wrote in #23:

Note that this default [Symfony] exception event handler requires a hardcoded controller, i.e. it completely avoids routing!

So, the crux of the problem here is: we use routing in a place where we shouldn't.

This time, the problem is that we rely on events within our routing; to some extent, we are doing custom and even hierarchical routing: we allow a route to not have a _controller, but a _form. We use an event subscriber to map _form to _controller: Symfony requires a route to eventually point to a _controller.
The actual problem is then that this event subscriber runs only at one time: after routing. And the router performs routing unless:

        if ($request->attributes->has('_controller')) {
            // routing is already done
            return;
        }

(from \Symfony\Component\HttpKernel\EventListener\RouterListener::onKernelRequest())

However, as of this patch, it's perfectly possible that we have performed routing, but all we have is _form, not _controller. I think that strictly speaking, we are therefore violating Symfony's API's. Because note that for Drupal, having _form means that routing has happened, but as far as Symfony's concerned, only _controller will do.

So we have three possible solutions:

  1. Automatically transform _form to _controller upon parsing Drupal's *.routing.yml files: i.e. compile-time transformation (this would also yield a small performance boost, because it'd allow us to remove an event subscriber!).
  2. Override Symfony's router listener to to not only check _controller, but also _form (i.e. override the cited code above).
  3. Also execute the event subscriber that maps _form to _controller before the router listener, so that when the cited code above runs, the _controller corresponding to _form is already set.

Option 3 is the least disruptive, so that's what I went with.

Status: Needs review » Needs work

The last submitted patch, 41: exception-response-correct-context-2595695-41.patch, failed testing.

fabianx’s picture

#41: I think the cleanest version is indeed to compile time transform _form and others into _controller. In fact I had thought we had been doing that already.

Worth a follow-up for sure, even if we only get to that by 8.Y.0 or 9.x.

wim leers’s picture

Status: Needs work » Needs review
StatusFileSize
new21.35 KB
new1 KB

The two remaining failures also fell in two categories:

  1. A plain simple bugfix, that was masked by a bug in HEAD: a test was specifying a path without a leading slash in configuration, even though HEAD prevents this from happening.
  2. A login form on a 4xx response not redirecting the user to the original location: this is where the destination query argument comes in. Hence this is an answer to my P.S. in #23, where I wrote:

    P.S.: I have NFI what the destination query arg and _exception_statuscode query arg are for, but I figured I'd keep that completely unchanged, since changing that is out of scope here.

This fixes the first of those two failures.

wim leers’s picture

StatusFileSize
new21.54 KB
new1.47 KB

And this fixes the second of those two failures.

wim leers’s picture

I don't think any change records need to be updated here.

This is definitely a bugfix.

wim leers’s picture

#33 Very glad to hear you're +1! :)

Do we need to clone over the attributes from the parent request before overriding with an extra route result?

I double-checked with a debugger, and cloning the request object also clones the attributes. And the code also confirms this:

    public function __clone()
    {
        $this->query = clone $this->query;
        $this->request = clone $this->request;
        $this->attributes = clone $this->attributes;
        $this->cookies = clone $this->cookies;
        $this->files = clone $this->files;
        $this->server = clone $this->server;
        $this->headers = clone $this->headers;
    }
Naming it this way seems excessive.

I copy/pasted the naming and comments from \Drupal\Core\Path\PathValidator. I'm not sure it's such a bad idea to keep calling it "access-unaware router" etc. Because we are doing routing. And we don't want to do access checking, because these are just exception responses, and we inherit the access result from the original master request.


#34:

Discussed briefly with @effulgentsia and triaged as rc target.

It's possibly fine but happier to get it in rc if we can.

Yay!


#43: Done: #2611164: Remove ContentControllerSubscriber: automatically transform _form to _controller during compile time.


AFAICT this is ready for final review.

The last submitted patch, 44: exception-response-correct-context-2595695-44.patch, failed testing.

Crell’s picture

Hm. OK, that's the difference between clone() and Request::duplicate(). I'm not entirely clear on why both still exist. If it's no problem to use clone(), I'm cool with that.

One thing I'm a bit concerned about: Why do we need to pre-check for _form if the use case is for when we're explicitly setting _controller? Why does that matter?

Also note that there are a bunch of other non-_controller properties that we use, such as the _entity_* properties. Do those need to be accounted for?

wim leers’s picture

RE: your concern: can you quote the code you are referring to? I don't understand what you mean, and I want to avoid answering beside the question :)

RE: others: Those may need to be updated also, yes. But:

I think that strictly speaking, we are therefore violating Symfony's API's.

… given that, I'm not so sure. Know I am no routing system expert, so I mean that literally: I am not sure what is the right way, I am only sure that what is in head is wrong :)

Crell’s picture

The latest patch runs the check for "if _form is set, add a _controller" twice, once pre-routing, once post-routing. However, if I understand it the point here is to fire a subrequest that has _controller set already, thus skipping routing and thus still having the parent request's route context. So... why does _form matter? :-)

Regarding "violating Symfony's API", I don't believe we are. The Router component itself doesn't promise to return a _controller, just that it returns whatever the route attributes are. The RouterListener then merges that onto the Request attributes.

The HttpKernel has an expectation that by the time both the request and controller events have fired, it can hand the Request to the ControllerResolver and get back a callable. It does not require that the ControllerResolver use _controller to determine that callable. I believe some Symfony use cases do even weirder things than we are in terms of how _controller gets there, such as FOSRestBundle.

The particular ControllerResolver we're using just dumbly looks for _controller, as does the one that Symfony fullstack uses. That is not, however, a requirement of any ControllerResolver implementation. The API requirement of our ControllerResolver is simply "there's a _controller attribute by the time you get to me". An alternate implementation might internalize all of ControllerResolverInterface could do its own handling of _controller/_form/_content/_whatever, in fact. My very initial proof of concept for Drupal, when I was making the case for using HttpKernel, did so, if I recall correctly.

Symfony itself doesn't impose hard requirements on the properties, just a few conventions that we've adopted. We could actually reorganize a lot of it were we so inclined, without breaking the "Symfony API".

wim leers’s picture

However, if I understand it the point here is to fire a subrequest that has _controller set already, thus skipping routing and thus still having the parent request's route context. So... why does _form matter? :-)

Because

  1. "doing routing" means "given a path, figure out which route name to use, and given that route name, add its route defaults as attributes to the request".
  2. Symfony's router listener only considers routing to already be done if the _controller request attribute is already set
  3. routes pointing to forms have _form, not _controller

1+2+3 = we need to transform _form to _controller in order to actually avoid routing.

I've already explained all of this in more detail in #41 ;)

Regarding "violating Symfony's API", I don't believe we are. The Router component itself doesn't promise to return a _controller, just that it returns whatever the route attributes are.

Well, there are only two options: either we are violating Symfony's API, or Symfony is violating its own API. Again quoting onKernelRequest:

        if ($request->attributes->has('_controller')) {
            // routing is already done
            return;
        }        
The HttpKernel has an expectation that by the time both the request and controller events have fired, it can hand the Request to the ControllerResolver and get back a callable. […] The particular ControllerResolver we're using just dumbly looks for _controller […]

The ControllerResolver is not even in the picture here: we don't even get to that point because Symfony will insist doing all routing again if _controller is not set, which means that it will again conclude this is a 404, so it again tries to render the 404 response, but it will again do routing, again conclude this is a 404, et cetera et cetera.

Symfony itself doesn't impose hard requirements on the properties […]

Perhaps it does not on theory, but it does in practice. We could override Symfony's RouterListener and whitelist _form, _entity_form, et cetera. But then contrib modules adding more such keys will have to whitelist theirs too. IOW: that's not really workable.


The solution in this patch is workable. The solution in this patch requires minimal changes. I'd also like to fix things in an ideal way, and not in a minimal way. But an rc target is not the place for that. Let's do that in #2611164: Remove ContentControllerSubscriber: automatically transform _form to _controller during compile time.

Crell’s picture

Wim: Wait, are we talking about a _form-using route that is the 404, or the one that is the configured 404 handler?

wim leers’s picture

The latter.

Crell’s picture

Ahhh... Now it makes more sense.

But that means that we definitely need to account for non-_form cases as well, since I expect an entity to be used as a 404 page more often than a form. That is, we need to make sure to handle arbitrary non-_controller alternatives.

tim.plunkett’s picture

wim leers’s picture

StatusFileSize
new23.02 KB
new3.39 KB

That already works. :) Those are route enhancers. And route enhancers are already run by the access-unaware router that we manually invoke. It's just _form that's the special duck here.

This expands the test coverage to prove that both _form and _entity_form already work. (And if _entity_form works, then so do _entity_view and entity_list.)

Crell’s picture

Well let's address that canard, then: #2613034: Use a route enhancer to handle _form, like everything else. Worked for me in quick spot checks, but we'll see what testbot says. If that works, that cleans up _form and the rest of this patch looks good to me. The oddity of _form was my only issue.

effulgentsia’s picture

wim leers’s picture

wim leers’s picture

Status: Needs review » Reviewed & tested by the community

@catch in #30:

Looks great to me, tidies things up a lot.

@Fabianx in #31:

Looks great to me, too. Much clearer what is happening.

@Crell in #58:

If that works, that cleans up _form and the rest of this patch looks good to me. The oddity of _form was my only issue.

Given #2613034: Use a route enhancer to handle _form, like everything else landed, I'm going to go out on a limb and consider #30 + #31 + #58 = RTBC.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 57: exception-response-correct-context-2595695-56.patch, failed testing.

wim leers’s picture

Status: Needs work » Needs review
StatusFileSize
new22.45 KB

Fail because patch doesn't apply anymore because conflict with #2603788: HtmlResponseSubscriber does not call HtmlResponseAttachmentsProcessor on subrequests.

Rebased, let's see if this works.

I'm afraid this will have missed the RC window, but AFAICT should be fine to fix in a point release, since this is a pure bugfix.

wim leers’s picture

Status: Needs review » Reviewed & tested by the community

Back to RTBC per #61.

dawehner’s picture

  1. +++ b/core/lib/Drupal/Core/EventSubscriber/DefaultExceptionHtmlSubscriber.php
    @@ -120,54 +128,46 @@ protected function makeSubrequest(GetResponseForExceptionEvent $event, $url, $st
    +      $sub_request = clone $request;
    +      $sub_request->attributes->add($this->accessUnawareRouter->match($url));
    

    Could/should we use $request->duplicate() ?

  2. +++ b/core/lib/Drupal/Core/EventSubscriber/DefaultExceptionHtmlSubscriber.php
    @@ -120,54 +128,46 @@ protected function makeSubrequest(GetResponseForExceptionEvent $event, $url, $st
    +      $parameters = ($sub_request->getMethod() === 'GET') ? $sub_request->query : $sub_request->request;
    

    nitpick: you can use isMethod()

  3. +++ b/core/lib/Drupal/Core/EventSubscriber/DefaultExceptionHtmlSubscriber.php
    @@ -120,54 +128,46 @@ protected function makeSubrequest(GetResponseForExceptionEvent $event, $url, $st
    -        // Persist the 'exception' attribute to the subrequest.
    -        $sub_request->attributes->set('exception', $request->attributes->get('exception'));
    

    Dropping that reads like an API change for me?

  4. +++ b/core/lib/Drupal/Core/EventSubscriber/DefaultExceptionHtmlSubscriber.php
    @@ -120,54 +128,46 @@ protected function makeSubrequest(GetResponseForExceptionEvent $event, $url, $st
    +      // If an error happened in the subrequest we can't do much else. Instead,
    ...
    +      // exception and handle it normally.
    +      $error = Error::decodeException($e);
    +      $this->logger->log($error['severity_level'], '%type: @message in %function (line %line of %file).', $error);
    

    Just an out of scope thought: Actually we could pass it along to our other exception handling?

  5. +++ b/core/modules/system/src/Tests/Form/RedirectTest.php
    @@ -26,7 +26,7 @@ class RedirectTest extends WebTestBase {
    -  function testRedirect() {
    +  function atestRedirect() {
    

    Let's not drop test coverage

wim leers’s picture

Status: Reviewed & tested by the community » Needs review
StatusFileSize
new21.84 KB
new1.73 KB

#65:

  1. No, because duplicate() doesn't actually duplicate; it resets just about everything. It's a very confusing method. Look at the source and weep.
  2. Cool :)
  3. We don't drop it. Using clone $request instead of $request->duplicate() means that we just don't need to do this anymore.
  4. Hrm, maybe… I'd think this was done for a good reason. I don't fully grok that code.
  5. LOL, oops.
dawehner’s picture

Status: Needs review » Reviewed & tested by the community

No, because duplicate() doesn't actually duplicate; it resets just about everything. It's a very confusing method. Look at the source and weep.

OH right, its a different kind of duplicating than we need indeed.

We don't drop it. Using clone $request instead of $request->duplicate() means that we just don't need to do this anymore.

Oh gotcha, thank you!

Thank you wim

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 66: exception-response-correct-context-2595695-66.patch, failed testing.

wim leers’s picture

Status: Needs work » Reviewed & tested by the community

Random fail in NodeTypeTranslationTest.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 66: exception-response-correct-context-2595695-66.patch, failed testing.

wim leers’s picture

Status: Needs work » Reviewed & tested by the community

Testbot problem.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 66: exception-response-correct-context-2595695-66.patch, failed testing.

wim leers’s picture

Status: Needs work » Reviewed & tested by the community

And back to RTBC again now that testbot seems to have gotten over its hangover.

alexpott’s picture

Status: Reviewed & tested by the community » Needs work

I think we also need to fix drupalSettings here too. If you create a user without access to the admin page and visit it the drupalSettings are wrong.

<script type="application/json" data-drupal-selector="drupal-settings-json">{"path":{"baseUrl":"\/","scriptPath":null,"pathPrefix":"","currentPath":"system\/403","currentPathIsAdmin":false,"isFront":false,"currentLanguage":"en","currentQuery":{"_exception_statuscode":403,"destination":"\/admin"}},"pluralDelimiter":"\u0003","ajaxTrustedUrl":{"\/search\/node":true},"user":{"uid":"2","permissionsHash":"ed011ac77bd12c096da236a9762d75bbe23d5a5ef09539388da1de53794138ca"}}</script>

I would argue the currentPath, currentPathIsAdmin and currentQuery are both wrong here.

wim leers’s picture

Agreed that that should work correctly too. And if the current path is wrong, I'm not surprised everything else is wrong also.

I'll work on this after 8.0.0, unless somebody beats me to it.

wim leers’s picture

alexpott’s picture

Status: Needs work » Needs review
StatusFileSize
new1.24 KB
new3.09 KB
new24.37 KB

Here's a fix for #74

wim leers’s picture

Looks splendid!

The last submitted patch, 77: 2595695.drupalsettings-test-only.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 77: 2595695-75.patch, failed testing.

alexpott’s picture

Status: Needs work » Needs review
StatusFileSize
new1.02 KB
new1.69 KB
new24.13 KB

Ah the currentPath should not have the base path in there - there is a separate thing in drupalSettings for that.

The last submitted patch, 81: 2595695.81.drupalsettings-test-only.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 81: 2595695-81.patch, failed testing.

alexpott’s picture

Status: Needs work » Needs review
StatusFileSize
new902 bytes
new24.18 KB

Fixed test.

dawehner’s picture

So how do we deal with the potential API break which is that you could have changed the path of those routes, but now we hardcode the paths after the patch.

wim leers’s picture

Status: Needs review » Reviewed & tested by the community

Still passes, back to RTBC per #67.

catch’s picture

Version: 8.0.x-dev » 8.1.x-dev
Status: Reviewed & tested by the community » Fixed

@dawehner so that's a potential API break, but I can't really think of a use case to do it.

However between that and the service constructor changes here, putting this into 8.1.x only.

Committed/pushed to 8.1.x, thanks!

  • catch committed 6a09a57 on 8.1.x
    Issue #2595695 by Wim Leers, alexpott, mfb, Crell, cilefen, dawehner:...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

mr.baileys’s picture

This change caused a regression, crosslinking #2678674: Access bypass for custom error pages