Proposed commit message:
Issue #2476407 by borisson_, hussainweb, znerol, Fabianx, Wim Leers, dawehner, Crell, Berdir: Using CacheableResponseInterface to better determine which responses should be cached/cacheable
Problem/Motivation
If you do in your controller:
return new Response('Hi');
this will be cached by Drupal, but you did not specify cacheable metadata.
That is a security risk (information disclosure) and can lead to cache (invalidation, variation) problems as Symfony Controller responses are usually uncached by default.
Proposed resolution
- Only cache responses implementing CacheableResponseInterface
Remaining tasks
- Commit
API changes
- None, but normal Responses will no longer be cached by page_cache or external proxies.
- They need to set Cache-Control manually - as usually in Symfony applications.
Beta phase evaluation
Issue category | Bug because behavior is changed from normal and expected Symfony usage and there is already contrib code in the wild running into this bug. |
---|---|
Issue priority | Critical because of security (information disclosure) and caching potentially uncacheable things. |
Prioritized changes | The main goal of this issue is security and fixing cacheability bugs. |
Disruption | Disruptive for contrib and custom modules as normal Responses are no longer cached. |
Original report
Problem/Motivation
Quoting Fabianx in #2463009-12: Introduce CacheableResponseInterface: consolidate ways of setting X-Drupal-Cache-Tags/Contexts headers:
[…]
We use a check oninstanceof CacheableResponseInterface
for _any_ cache related data we set inFinishResponseSubscriber
as final headers on the response.That means that if you do:
return new Response('Hi');
you are yourself responsible for everything, but the default
Response
is uncacheable (Cache-Control: private, no-cache, no-store, etc.
) so this is not a security issue.This essentially makes Drupal's cache, etc. system opt-in instead of mandatory, which should be fine and was a goal of the WSCCI initiative in the first place.
[…]
Proposed resolution
So, talking concretely, this issue wants to do what Fabianx said in #2463009-27: Introduce CacheableResponseInterface: consolidate ways of setting X-Drupal-Cache-Tags/Contexts headers:
a) Only do any Cache-Control headers if it is a cacheable response, this truly lets someone opt out of core's caching architecture, etc.
b) Have page cache honor Cache-Control header OR check for cacheable response as well.
Remaining tasks
Do it.
User interface changes
None.
API changes
TBD
Comment | File | Size | Author |
---|---|---|---|
#123 | 2476407-123.patch | 12.77 KB | hussainweb |
#123 | interdiff-120-123.txt | 960 bytes | hussainweb |
#120 | using-2476407-120-test.patch | 12.73 KB | hussainweb |
Comments
Comment #1
Fabianx CreditAttribution: Fabianx commentedBlocked on a decision on:
- #2472321: Move rest.module routes to /api
Blocked on implementation:
- #2463009: Introduce CacheableResponseInterface: consolidate ways of setting X-Drupal-Cache-Tags/Contexts headers
Therefore postponing for now.
Comment #2
Wim LeersComment #3
webchickAs a reminder, a critical issue is one that:
* Renders [the] system unusable and has no workaround.
* Causes loss of data.
* Exposes security vulnerabilities.
Downgrading until some justification for critical is added to the IS.
Comment #4
webchickActually, I see that justification is provided, but that's only in the event we're doing /api and so far I count 3 framework maintainers against that.
Comment #5
Wim Leers#2463009: Introduce CacheableResponseInterface: consolidate ways of setting X-Drupal-Cache-Tags/Contexts headers landed.
Comment #6
Fabianx CreditAttribution: Fabianx commentedI am marking this active as major (not critical task).
The other issue showed that even without the move of /api it is very easy to shoot yourself in the foot right now.
And if Core does not get it right, we cannot expect contrib/ to do so ...
i.e. as a Symfony developer I would never get the idea that if I do:
that it then suddenly is cached in the page cache or external cache. As we only do it for anon its less of a problem (and hence not critical), but the DX of getting opted-in to all the cache headers when what I have might not be cacheable at all, is questionable at best and wrong at worst ...
Comment #7
dawehnerI totally agree with @fabianx here, if you use raw response objects, you did not wanted the result to be cached, unless you set your headers for yourself.
The code producing the response object is the only one which can figure out, how long something is cacheable.
Comment #8
Wim Leers+1
Comment #9
anavarreComment #10
Wim LeersAFAICT this is not blocked on anything.
Comment #11
dawehnerHere is a list of raw response objects:
Comment #12
Crell CreditAttribution: Crell at Palantir.net commentedSo what's the actionable here? dawehner's comments in #11 suggest there's maybe 2-3 direct Response usages right now we should cache, and the rest probably leave as is. Shall we spin up dedicated issues for those and make this a meta?
Comment #13
Fabianx CreditAttribution: Fabianx as a volunteer commented#12: I think the only thing left here is write a patch. I don't think we need dedicated metas as the conversion should be very straightforward, but I am also not opposed to a meta ...
Comment #14
borisson_So, theoretically, the attached patch should fix (a)?
Comment #15
borisson_@dawehner mentioned in irc that we should be using cacheable responses for rest as well, so attached is a patch that does exactly that.
Comment #16
Wim LeersLet's instead do this
instanceof
check as the first check in the expression for$is_cacheable
. Allows us to avoid evaluating the request & response policies. Less things to compute :)Comment #18
Fabianx CreditAttribution: Fabianx as a volunteer commentedLooks great, I assume we need some tests though.
Comment #20
borisson_Fixes the test failures in #14, #15.
Optimized the if statement as suggested by @Wim Leers in #16.
Added a test as requested in #18 by @Fabianx.
Comment #22
Fabianx CreditAttribution: Fabianx as a volunteer commentedTest looks great, now just need to get it green.
Comment #23
borisson_The second call to /system-test/respond-reponse is a cache-hit, while it's not supposed to be one. I have no idea why that happens though.
Locally that is the only remaining failure. I can't reproduce the other test failures.
Comment #25
Fabianx CreditAttribution: Fabianx as a volunteer commented#23: That is likely page cache interfering, which uses its own logic.
Maybe we need to try to add the logic there, too?
And we should have one test scenario with page_cache disabled at least.
Comment #26
borisson_I disabled page cache module and ran the tests again. Locally, this doesn't seem to work though. I can't figure out why these test don't work though.
Comment #28
borisson_The
testCacheableResponseResponses
in\Drupal\page_cache\Tests\PageCacheTest
now passes locally.I'm assuming that there'll be a lot of new failures though, so lets see where exactly.
Comment #30
Fabianx CreditAttribution: Fabianx as a volunteer commentedI don't think that is right.
A non-cacheable response should never enter the page cache in the first place.
I think the simplest actually is to use a ResponsePolicy, which then works for page cache and the main subscriber out of the box.
Comment #31
borisson_I removed that snippet of code from the PageCache module and put it in a ResponsePolicy as suggested in #30. This should break at least the same amount of tests as before.
Comment #33
borisson_The current patch adds a CacheableJsonResponse. This also changes the
DenyNonCacheableResponse.php
to not doinstanceof CacheableResponse
it now doesinstanceof CacheableResponseInterface
.This way, the newly added CacheableJsonResponse works, as well as the original CacheableResponse and HtmlResponse.
I've also slightly changed the
PageCacheTest
so that passes now as well, this should reduce the amount of errors significantly.Comment #34
borisson_Comment #36
yched CreditAttribution: yched commentedFinishResponseSubscriber::onRespond() :
Isn't that redundant with the addition of the DenyNonCacheableResponse policy ?
Comment #37
borisson_@yched, you're right, this is redundant now.
Comment #38
borisson_Removing needs test tag, because there's a test added in the patch.
Comment #39
BerdirFor the record, this is an API change as existing responses will no longer be cached. So at least needs a change record for that...
Comment #40
Crell CreditAttribution: Crell at Palantir.net commentedThat is a hilarious service name. Almost as good as block_block_block...
Nit: One line docblock.
My only other issue with #37 is that it's a deny, which feels like an opt-out type implementation rather than an opt-in. Is it just a quirk of the way the cache policy system works (which I don't fully grok) that it's done that way?
Comment #41
Crell CreditAttribution: Crell at Palantir.net commentedTo clarify why I'm concerned in #40, if someone has a Response object of their own, and they set cache headers on it themselves, we should not be overriding that. The presence of CacheableResponseInterface should be the opt-IN to us touching cache headers at all. The default for a Response is to have no cache. If a controller doesn't opt-in, if I read this code correctly, we'd forcibly disable caching on it. Rather, we want to do nothing at all to caching and assume the controller dev knows what he's doing cache-wise.
Comment #42
dawehnerDoes it make sense to talk about page cache inside core itself?
Comment #43
borisson_#40.1: Changed the service name to:
page_cache_deny_non_cacheable
.#40.2: Fixed the docblock.
#42: At the same place in services.yml the following services were already defined: page_cache_request_policy, page_cache_response_policy, page_cache_no_cache_routes, page_cache_no_server_error. If this is not good practice, I can open a followup to move these to the page cache module.
Regarding @Crell's comment in #40 & #41:
I honestly don't have a correct answer for this, I followed the suggestion from @Fabianx in #30. I'm going to try to answer it, as far as I understand this.
Cacheability doens't change if you return a renderable array from a controller, that'll be the most used use-case when creating new controllers in custom code, I think.
When you're actually doing something like:
return new Reponse('Test', 200);
there isn't any cache metadata and thus refusing to cache that controller is a good solution, imho. If you're already setting the correct headers for caching (X-Drupal-Cache, X-Drupal-Cache-Tags and X-Drupal-Cache-Contexts) I'm assuming you're pretty well known with the entire caching layer and so returning aCacheableResponse
instead of theResponse
shouldn't be that hard to figure out.Comment #44
Wim Leers+1 to #41 — great comment, Crell :)
Related: #2540684: Rename Drupal\Core\PageCache namespace.
Comment #45
borisson_I think I correctly understood what @Fabianx explained about this in irc, so here's an updated patch.
Patch removes the DenyNonCacheableResponse response policy and only adds cacheability headers in the FinishReponseSubscriber when the response is a cacheable response. Didn't change any of the tests but everything should still pass.
Comment #46
Fabianx CreditAttribution: Fabianx as a volunteer commented( X-POST with #45 )
Discussed in IRC; I indeed forgot that DENY would remove any cacheable headers set.
- Just add an early return when its not a CacheableResponseInterface response
- Leave page cache 100% alone in here as the following should still work:
Unrelated:
In a follow-up we could discuss to set a FrozenCacheableMetadata at this point to ensure immutability.
Comment #47
Fabianx CreditAttribution: Fabianx as a volunteer commentedNo, this is not correct and has the same problems.
We want to early return above the other CacheableResponnseInterface check.
e.g.
Comment #48
borisson_Implements the early return mentioned in #47.
Comment #50
Fabianx CreditAttribution: Fabianx as a volunteer commentedThat part was reverted, that is why the patch does not apply anymore.
Looks great!
Probably need to extend the test coverage a little for the 'leave responses that set cacheable data' alone part and ensure that standard responses by default are not cacheable.
Comment #51
borisson_My local install wasn't up to date with the latest commits. This interdiff is vs #45.
Comment #53
borisson_@Fabianx figured out that when there's no "Expires" header, everything breaks. So this makes sure that we always have one.
Comment #55
Fabianx CreditAttribution: Fabianx as a volunteer commentedWe can and should even call $this->makeResponseUncacheable()
or what it is called.
That is perfectly valid - as the default is uncacheable, so if no one has changed the Cache-Control value we mark it uncacheable.
Comment #56
borisson_#55
There already is
\Drupal\Core\EventSubscriber\FinishResponseSubscriber::setResponseNotCacheable
so I don't think that's a good idea. I can call out to that method instead though.Comment #57
Fabianx CreditAttribution: Fabianx as a volunteer commented#56: Yes, that is what I meant.
Comment #58
borisson_So, this fixes #55
Comment #60
Fabianx CreditAttribution: Fabianx as a volunteer commentedNice, likely either the image tests are mistaken or they need to return a CacheableResponse instead.
Comment #61
borisson_This fixes the failures in
ImageStylesPathAndUrlTest
. I can't get the other 2 tests to work though.Comment #63
Wim LeersGiven the changes in #53, we should get @znerol's sign-off.
Comment #64
borisson_Comment #65
znerol CreditAttribution: znerol commentedI agree with the logic here. The negation in combination with the early return makes that a bit hard to read. I probably would try to write this in positive logic, i.e. (
if ($response instanceof CacheableResponseInterface) { /* manipulate headers } else { /* leave them alone unless cache-control is not set at all */ }
. YMMV.That does not seem to be correct. When images are served with the
private
stream wrapper, then I expect thatCache-Control
should containprivate
notpublic
. We want to allow browsers to store the file, but not intermediate proxies.I also suspect that this change might be wrong. instead I think that
ImageStyleDownloadController::deliver()
should be altered and those headers should be added to the list of default headers before they are passed to theBinaryFileResponse
I think. And why addCache-Control: no-cache
here?Comment #66
znerol CreditAttribution: znerol commentedRe #46
What exactly do you expect to happen in that case in:
Comment #67
borisson_#65
if (!($response instanceof CacheableResponseInterface)) { ... }
reads kind of funky. I think this is a better solution over adding extra indentation in the following code blocks. So I haven't changed this yet. I think the documentation is sufficiently clear.ImageStyleDownloadController::deliver()
I added the cache control header to get tests to pass.
#66: If I understand it correctly, the finish response subscriber adds the headers it should add to every response: X-UA-Compatible, Content-language, X-Content-Type-Options and X-Frame-Options. After adding those headers it should stop executing (because it's not an instance of
CacheableResponseInterface
and the cache-control header is added. I'm not enterily sure what the page cache middleware should do, but because this is not aCacheableResponseInterface
I'd assume it shouldn't be triggered.The attached patch reintroduces the test failures found in #58. Working on getting those fixed.
Comment #69
borisson_Attached patch reduces the amount of failures from 6 to 4.
Comment #71
borisson_I spent an afternoon looking at the remaining 4 testfailures.
Attached interdiff is what I think could possible be a resolution to the remaining failures but it doesn't make a difference so the attched patch is still the one from #69. I'm going to need help to figure this out.
Comment #73
Wim LeersFrom Testbot, failures in:
These seem to be a random test failure, @dawehner also saw them in #2544932-60: Twig should not rely on loading php from shared file system.
So, ignore those, focus on the others.
Comment #74
borisson_@Wim Leers is right, the failures in
ConfigFormOverrideTest
+ContainerRebuildWebTest
are not related to this patch.I can't figure out how to fix the failures in the other 3 tests.
Comment #75
borisson_I was on holidays for a week and spent a couple of hours with this patch. The attached interdiff is the only piece of code that actually is useful. Attached patch is still failing with the same 4 failures.
Like I mentioned in #71, I'll need help to figure out how to resolve those failures.
Comment #77
znerol CreditAttribution: znerol commentedLooking at this now.
Comment #78
znerol CreditAttribution: znerol commentedSo, the BinaryFileResponse constructor has an optional argument
$public = TRUE
. If that is set, then theCache-Control
directivepublic
will be set - regardless of the contents of the$headers
parameter.Comment #80
znerol CreditAttribution: znerol commentedIn HEAD we seem to deliver private image styles with
Cache-Control: no-cache
, but everything which is delivered through theFileDownloadController
(even if it is an image attached to a node), getsCache-Control: private
.Also the change to
ShortcutSetsTest
seem bogus.Comment #82
borisson_The remaining failure in
PageCacheTest
is introduced because dynamic page cache doesn't vary by query parameters by default.The attached patch fixes this by clearing caches between each call to
system-test/set-header
.Comment #87
borisson_The last patch I uploaded, #82 didn't integrate all the changes @znerol did in #80. Attached patch does.
There is still one remaining failure in
PageCacheTest
on this line:$this->assertEqual($this->drupalGetHeader('Foo'), 'bar', 'Custom header was sent.');
. We could also clear caches (Cache::invalidateTags(['rendered']);
) but then the comment preceding this would be wrong:// Check that authenticated users bypass the cache.
.Comment #88
Fabianx CreditAttribution: Fabianx as a volunteer commentedComment #91
borisson_The problem in
PageCacheTest
was that the response generated inSystemTestController::setHeader
didn't set the correct cacheability metadata, it now does and this fixes the remaining failures.Comment #92
borisson_Comment #93
Wim Leerss/we're trusting/assume/
s/from/in/
The inner parentheses in
if (!($a instanceof X))
are not necessary.Can we get a comment for this?
Missing trailing period.
s/normal Response object/plain Symfony Response object/
s/cache/cached/
This seems wrong; we turn them into cacheable responses whereas they weren't before, without doing the work necessary to ensure they have the necessary cacheability metadata (specifically cache tags). This means that these responses would now be cached by the (Dynamic) Page Cache, unlike in HEAD, but they would continue to serve the same content forever because they'd never be invalidated.
These have hardcoded responses so indeed are perfectly cacheable.
Comment #94
borisson_Fixes all the remarks made in #93, reverted the code in .8
Comment #95
Wim LeersThis looks great now, thanks!
Nit: s/expires/'Expires'/
Nit: s/the finish response subscriber/FinishResponseSubscriber/
Both can be fixed on commit or in a follow-up.
Comment #96
Wim LeersFrom chat, sounds like @znerol still has some concerns, I thought he already was the one who set this direction/approach. We need his +1 for this.
Comment #97
borisson_Talked with @znerol about this, we're happier now.
Comment #98
BerdirIsn't this an API change?
For example, I'm pretty sure that this means that redirect responses are no longer cached, and there is no redirect response class in core that is cacheable then? That will break redirect.module's response caching, for example?
No change except the array() format switch...
Same here, the only real difference seems to be the additional argument? That makes it a lot harder to review...
Missing docblocks.
Comment #99
borisson_Comment #100
borisson_Attached patch fixes most of @Berdir's problems in #98. Not sure about #98.1.
Comment #101
znerol CreditAttribution: znerol commentedThanks @Berdir for the review. Re #98.1, while I do consider this as an API change, it for sure has an impact on existing code/deployments. From manual testing I can confirm that indeed the
Cache-Control
header for redirect responses will prevent caching in reverse proxies with the patch. That was not the case before.Not sure what to do about that. Maybe our various
RedirectResponse
classes should be subclasses ofCacheableResponse
?Also I am not quite sure about the scope of this issue. What exactly is the motivation for all this?
Comment #102
znerol CreditAttribution: znerol commentedFiled #2573923: Introduce a CacheableRedirectResponse and use it where appropriate
Comment #103
Fabianx CreditAttribution: Fabianx as a volunteer commented#101: The motivation is exactly like happened in that issue.
A redirect might be cached, even though we don't know how to invalidate it, etc.
So if someone uses a normal Symfony response then they are on their own to use cache headers, etc.
This is to both:
a) avoid bugs (caching something wrongly)
b) allow people to do with Symfony responses whatever they want - more freedom.
Comment #104
Wim LeersComment #106
Wim LeersPIFRfail.
Comment #108
borisson_Patch no longer applied, reroll attached.
Comment #109
Wim LeersStraight reroll, already was RTBC, back to RTBC.
Comment #110
Fabianx CreditAttribution: Fabianx as a volunteer commentedThis looks like rc deadline material to me. The reason for rc deadline is because this is a potential behavior change, but a good one.
Already now I have seen code in the contrib wild west that returned a normal Response without cacheable metadata.
However now this will be cached both internally and externally for the set cache time, but there is no cacheable metadata with the usual problems of cache expiration, etc.
So this is really important to get in.
Comment #111
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedAfter long back and forth decided to make this critical.
I would have made this critical after rc deadline anyway (if it was not in, yet), so it is more honest to make it critical now.
Rewritten the IS and added beta evaluation to make the point way more clear.
CR next.
Proposed commit message:
Comment #112
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedComment #113
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedAdded a CR: https://www.drupal.org/node/2579995
Comment #114
dawehnerthere are http/1.0 only proxies out there?
Does someone mind adding a comment about why we need FALSE here?
Comment #115
hussainweb#114.2:
I didn't want to derail this issue by posting a patch to test things here. I was not sure why we need FALSE and not
$scheme !== 'private'
like in the other case. I did that in #2580147-3: Testing issue - Please ignore and the tests still pass. I have added a test to see if this can be made to fail. I am waiting for those tests to complete at the moment.Comment #116
borisson_@dawehner, @hussainweb: regarding #114.2, @znerol explained this in #78.
Not sure about #114.1
Comment #117
hussainweb@borisson_: I saw that, but I couldn't find anything on why
$scheme
is not used here. Did I miss something?Comment #118
dawehnerWell, so can we add this to the patch?
Comment #119
hussainwebHere it is. I am not entirely convinced by the explanation (I mean the original explanation of setting
$public
toFALSE
). What if something else changes the header down the line? This will cause theFinishResponseSubscriber
to not touch the Cache-Control header again.Comment #120
hussainwebI found a case where just passing
FALSE
to$public
fails. I have been trying out this in a testing issue and here are the final patches. To see the test which fails, see interdiff-119-120-test.txt. The interdiff with the fix is in interdiff-119-120.txt. The corresponding patches are attached as well. The-test.patch
should fail in\Drupal\image\Tests\ImageStylesPathAndUrlTest
.EDIT: Fixed typo.
Comment #121
dawehnerThank you @hussainweb!
I think we need to adapt the comment now
Comment #123
hussainwebExpected failures and passes. :)
Here is the change as per #121.
Comment #124
dawehnerAlright, well I still think we could improve the comment to explain why we don't want to change the headers but meh.
Comment #125
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedRTBC + 1
Just a question:
Should we use $schema === 'public' for $public == TRUE or is private our only private scheme we use in Core?
Overall I think our file system works in the way that anyone regardless of scheme should be able to set headers with Cache-Control to public or private.
e.g. it is perfectly fine to change access to a public resource to have a Cache-Control of private.
Given that this is a pre-existing bug and an edge-case only crazy people like me need however, I don't think we need to solve that here.
But in the future it should be the right fix to just add $public = FALSE, which means in essence:
"Do not make this file explicitly public, but do nothing and leave the headers as is."
Overall that flag is quite badly named and should have never been used in the first place.
It might be also that the need for that flag was during one of the conversions, where we removed our explicit setting of Cache-Control: public, in which case the patch here is completely correct.
This was more an analysis of the changes and does not affect RTBC status.
New proposed commit message:
Comment #126
catchUpdating commit credit per #125.
Comment #129
effulgentsia CreditAttribution: effulgentsia at Acquia commentedThis patch makes a lot of sense, and better to break this before RC than after, so pushed to 8.0.x and published the CR.
Two requests:
This seems like it's roughly (but not exactly) duplicating code that also exists elsewhere. Not sure if there's a way to refactor this method to have a bit less duplication, but if there is, please open a followup for that.
Comment #130
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedThanks, updated the CR:
https://www.drupal.org/node/2579995/revisions/view/8958033/8958223