Problem/Motivation
The problem stems from the fact that when using page_cache.module we check the response policy at two points. The first is in FinishResponseSubscriber and the second is later in the request in the PageCache middleware.
The problem with this setup comes from the fact that FinishResponseSubscriber assumes it is basically the last thing to run and changes response headers related to caching. If however one of your response policies is looking at one of those headers when it runs later in the middle ware it can return a different result.
Example:
Response policy to not cache responses with "Vary: Accept" headers. In the first check we deny caching. In the second check the Vary header has been removed and we allow caching, putting the non-cacheable result into the cache.
Proposed resolution
Remaining tasks
Figure out a solution?
User interface changes
API changes
Comment | File | Size | Author |
---|---|---|---|
#7 | 2478915.patch | 5.23 KB | neclimdul |
Comments
Comment #1
Wim LeersComment #2
Wim LeersThis is very closely related to #2476407: Use CacheableResponseInterface to determine which responses should be cached.
Comment #3
znerol CreditAttribution: znerol commentedIf you set a
Vary
header, then you expect that the response is cached somewhere - otherwise you wouldn't set aVary
header. So, generically speaking such a rule does not make any sense at all.I gather that the intention of introducing such a rule is to prevent that a response is cached in the internal cache if content negotiation is used on a given path. Because this is a limitation of the internal cache, I think it is acceptable to hard-code that condition there. Note that we already have a hard-coded check for
BinaryFileResponse
andStreamedResponse
for the same reason.Comment #4
neclimdulI was under the impression that we where blocking caching on "Vary: Accept" paths per https://www.drupal.org/node/2364011#comment-9681765. In which case we would want to exclude caching but still send a Vary header. If not we should discuss there.
Regardless, I think a more general solution should be adopted instead of another special case. If ResponsePolicy fails in FinishResponse, the only way PageCache should cache the response is if something explicitly sets the response cacheable between it and PageCache. Consider that FinishResponseSubscriber set's "Cache-control: no-cache" yet PageCache still caches the page. Maybe that's the real bug here.
Comment #5
znerol CreditAttribution: znerol commented@Crell is wrong about that. We need to add
Vary: Accept
to responses in order to make caching work with reverse proxies when content negotiation is in use on REST routes.Trying to use content negotiation for Ajax/JavaScript stuff is just not worth it because browsers are broken. Browser callbacks and REST are different things, mixing them will cause problems all over the place.If the
ResponsePolicy
denies caching inFinishResponse
, then it also will deny caching in thePageCache
middleware unless someone broke it.The
PageCache
stores pages if the response policy allows it. TheFinishResponseSubscriber
addsCache-Control: public
if the response policy allows it and if there is amax-age
configured, it addsCache-Control: no-cache
ifmax-age
is not configured. So if the response policy allows caching butmax-age
is configured to0
, thenPageCache
stores the response (including theCache-Control: no-cache
header) and that behavior is totally fine.Edit: second part of first paragraph was an early morning fail.
Comment #6
Wim LeersComment #7
neclimdulSomeone broke it then? Not really trying to point blame but that assumption is currently not true.
The failure of the attached patch should show this. The response policy will return DENY the first call because there is a vary header and null the second time since there are no vary headers. Despite FinishResponse denying caching, PageCache still caches the request because the policy couldn't DENY the second time because the state of the system is different, the response headers have been changed.
The way I see it we have 2 options.
1) Don't mange cache headers in FinishResponseSubscriber
2) We change something so that FinishResponseSubscriber::setResponseNotCacheable() forces responses not to be cached by the PageCache middleware.
Comment #8
neclimdulto testbot to show failure.
Comment #10
znerol CreditAttribution: znerol commentedJust do not do that for the reasons stated above. That rule does not make any sense at all.
Update:
The
PageCache
middleware stores whole responses (including cache headers). Therefore we somehow need to set cache headers before the page cache middleware handles the response.Like pointed out above, there are legit use-cases where pages are cached in the internal cache even though caching is denied for external caches (browsers, proxies).
Comment #11
Wim LeersWe lost track of this, but I think we should look at this again now that we're approaching release.
Comment #12
neclimdulYeah, I never knew how to respond to znerol's comments. The point of the issue was response policies are used to determine what should and shouldn't be cached so they should have access to unmodified headers. But he seemed to disagree with the premise so I don't have a path forward.
Originally this was to support real content negotiation support in core but that's taken a different direction so I don't know what interest is outside me still wanting to implement that in contrib at some point.
Comment #13
Wim LeersI don't think you're the only one who wants
Accept
header negotiation support in D8 (contrib).Comment #16
Wim LeersI'm closing this issue. Feel free to reopen, but then please include a failing test that demonstrates the problem.