Problem/Motivation
The function drupal_page_is_cacheable()
currently has too many responsibilities. Especially troublesome is the fact that it incorporates the default logic to determine whether a page is cacheable but also allows a caller to override the result. When the cacheability of a page is enforced (allowed or denied) multiple times during a page request, the last caller wins. In some situations this can be dangerous. For example when the page cache was disabled by drupal_set_message()
but later in the request enabled again by third party code.
Also it is currently only possible to enforce recording of a page in the cache. There is no way of extending the logic used to check whether an attempt to retrieve and deliver a page from the cache should be made. This may lead to rather ugly hacks, e.g. _toolbar_initialize_page_cache()
.
Proposed resolution
Replace drupal_page_is_cacheable()
with a set of services forming a flexible and extensible page cache policy. Those services are page_cache_request_policy, page_cache_response_policy, and page_cache_kill_switch. Sites implement their policies by implementing a "policy chain". The default request policy chain is \Drupal\Core\PageCache\DefaultRequestPolicy, and its behavior is unchanged in this patch. It consists of RequestPolicy\NoSessionOpen() and RequestPolicy\CommandLineOrUnsafeMethod() which disable page cache when unsafe method, CLI, or a session cookie are found.
The framework makes it possible in a clean fashion to cache some pages even for authenticated users. See AllowToolbarPath
.
Remaining tasks
User interface changes
API changes
Modules can disable page cache during the response by: \Drupal::service('page_cache_kill_switch')->trigger();
. This used to be drupal_page_is_cacheable(FALSE)
Comment | File | Size | Author |
---|---|---|---|
#62 | 2263981-replace-drupal-page-is-cacheable-g2-62.diff | 52.46 KB | znerol |
#60 | interdiff.txt | 1.28 KB | znerol |
#60 | 2263981-replace-drupal-page-is-cacheable-g2-60.diff | 52.52 KB | znerol |
#56 | interdiff.txt | 849 bytes | znerol |
#56 | 2263981-replace-drupal-page-is-cacheable-g2-56.diff | 52.51 KB | znerol |
Comments
Comment #1
znerol CreditAttribution: znerol commentedComment #2
Wim LeersSorry it took so long to do a review!
s/deliver/delivery/
Please refer to a specific issue.
Wouldn't
$enable
be more clear? Or, alternatively, rename_drupal_bootstrap_page_cache()
.First, it says "policy rule".
Then, the typehint also says "policy rule".
But the variable name says "policy".
"Policy" implies that there may be multiple rules, that together form a policy.
Hence this is quite confusing. I do see how e.g.
DefaultRequestPolicy
implementsPolicyRuleInterface
, accepts multiple policy rules and then applies them all — i.e. forming an "aggregate policy rule". However, in the grand scheme of things this is only minor. But if you have an idea to make it clearer, that's of course welcome.This reads as "is cachable if the request policy is applied and the response policy is applied".
Hence I think
apply()
is a misnomer. What aboutisCachable()
,accepts()
, or something else that indicates whether the policy does not forbid the cacheability?At first sight, this seems to imply that if the
Cache-Control
header is customized, that the response will not be marked as cacheable.But I don't think that's the case. I think what this essentially is saying is "if the response should be cacheable, and nothing has marked it as cacheable yet, then mark it as cacheable according to the configuration".
If that's correct, this could use a comment explaining exactly that.
Super nitpicky: Either both should have "The", or neither.
And it's not really the "default request/response policy", but the "default page caching request/response policy".
The namespace implies that for the class name, so that's fine, but the docs should "expand" to the full meaning.
s/Construct/Constructs/
Always 3rd person singular.
These should be mentioned in the docblock: a brief explanation of the default request policy.
Indicates whether this is a CLI request.
This is the mismatch I mentioned earlier. The
@return
documentation is clear and logical, but the method name doesn't match that, IMO.We indeed apply policy rules to a request or a response, but IMHO
isCacheable()
would be much more appropriate thanapply()
.Comment #3
znerol CreditAttribution: znerol commentedThanks for the review. Addressed everything except 3, 4 and 10. I need some more time for those.
Comment #4
Wim LeersInterdiff looks great. Hopefully we can get this to RTBC quickly now :) Looking forward to the fix for 3/4/10.
Comment #6
znerol CreditAttribution: znerol commentedRelocate
IsToolbarPath.php
to the proper PSR-4 directory.Comment #7
znerol CreditAttribution: znerol commentedRegarding #2.10:
I feel that it is the other way around. The
@return
documentation in fact does not reflect the implementation. Currently theRequestPolicy::apply()
works like this:FALSE
if any of the deny-rules returnsTRUE
TRUE
if any of the allow-rules returnsTRUE
FALSE
So the meaning of the
RequestRuleInterface::apply()
is inverted when used as adeny
-rule. This helps preventing double-negation. For exampleKillSwitch::apply()
returnsTRUE
if the kill-switch was triggered,CommandLineOrUnsafeMethod::apply()
returnsTRUE
when in command-line mode,IsBinaryFileResponse::apply()
returnsTRUE
if the given response is aBinaryFileResponse
.On the other hand
IsToolbarPath::apply()
returnsTRUE
if the request is directed to/toolbar/subtrees/{hash}
. TheIsToolbarPath
is used in the allow-chain of theRequestPolicy
.Perhaps we should figure out whether this is the proper approach before trying to fix the comment or the API.
Comment #8
msonnabaum CreditAttribution: msonnabaum commentedI need to look at the rest of the patch more closely, but it seems like this is introducing behavior that didnt exist before. Can we please keep this patch scoped to a refactor and leave the rest for followups?
For example, it appears that #6 is denying on binary files, which is very bad because they need to be sending cacheable headers.
Comment #9
znerol CreditAttribution: znerol commented@msonnabaum Regarding stream / binary file response, please take a look at #2218463: Private file and image style downloads throw 500 error when page cache is active.
Comment #10
znerol CreditAttribution: znerol commentedReroll after the kernel bootstrap patch. As a result it is possible to kill the workaround which was necessary to disable retrieval of cached pages during manual rebuild in
drupal_rebuild()
. Also this removes the response policy rules for binary file / stream responses.Comment #11
znerol CreditAttribution: znerol commentedTrying a new approach modeled after the access-checker we have in core already. This should address #10 regarding the naming and behavior of
apply()
by replacing it with acheck()
method which is inline with the other policy-implementations we have already in HEAD (Symfony and Drupal).Comment #12
BerdirThis looks great, but we should probably do some profiling to see how much it affects performance. Obviously it will be slower, but we will be able to deal with cases like toolbar in a much nicer way so that it will be possible to return a cached response there much faster. But we want to know how much we lose for that :)
Comment #13
Wim LeersLike Berdir, I think this looks great!
I only have one naming concern., Other than that I think this is good to go, but could use more reviews and of course the profiling that Berdir asked for.
"compound" vs. "chain". Let's always call it either? I think "compound" makes most sense, but I haven't got a strong preference. (The fact that it's evaluated as a chain is an implementation detail.)
:)
Comment #14
moshe weitzman CreditAttribution: moshe weitzman commentedAnyone available to move this ahead? @znerol?
Comment #15
Anonymous (not verified) CreditAttribution: Anonymous commentedapply patch fails all over the place, i'll try to get a reroll working today.
Comment #16
Anonymous (not verified) CreditAttribution: Anonymous commentedhere's a reroll.
had to manually put the hunks for the 5 files above back in, hopefully i didn't break too much.
Comment #17
Anonymous (not verified) CreditAttribution: Anonymous commentedComment #19
Anonymous (not verified) CreditAttribution: Anonymous commentedthere was a test that still used drupal_page_is_cacheable(), interdiff:
Comment #20
Anonymous (not verified) CreditAttribution: Anonymous commentedi did some profiling, and this patch adds 68 function calls to the page cache hit code path.
unsurprisingly, a bunch of the added overhead is in loading more class code and calling some implementation methods.
i didn't get a consistent enough wall-time across runs to say much about the difference there, but i'm satisfied that the overhead this patch introduces is not 'too much'.
Comment #21
Wim LeersAwesome, thanks!
That only leaves this remark of mine from #13:
What do you think about this, beejeebus and moshe? If you think it's unimportant, feel free to ignore it and RTBC this :)
Comment #22
moshe weitzman CreditAttribution: moshe weitzman commentedI'm fine with either word. If someone rerolls it, they can pick :). Its such a nit - lets go to RTBC.
Comment #23
Crell CreditAttribution: Crell commentedOR is a code smell. These shouldn't be tucked into one rule class.
Disabling caching if the method isn't safe really ought to go in Symfony itself. Can we submit an upstream patch for Response::prepare() so that this isn't necessary?
This isn't a service. It's a stateful global wrapped up into a class. We need a better way to handle this because stateful global services are how we destroy the loose coupling work we've been trying to do.
Can we centralize all of these constants? There's no need to repeat them in 3 different interfaces.
Comment #24
Crell CreditAttribution: Crell commentedComment #25
znerol CreditAttribution: znerol commentedThere would also be the option of renaming the class into something like
[...]\RequestPolicyRule\FundamentalRule
. However, I prefer the current precise name.Response::prepare()
is called just before the response is sent to the browser and after it is stored in / retrieved from the cache.Response::isCacheable()
is no option either, because there we do not have access to the request (to test the HTTP method). In factRequest::isMethodSafe()
is called from within symfonyHttpCache::handle()
but the symfony code never checks whether it is running in the CLI. We need a home for both checks as long as we do not useHttpCache
and even if we manage to implement it, we still need the CLI-check. Unless of course we manage to not wrap the kernel intoHttpCache
when running from the CLI.drupal_set_message()
anddrupal_rebuild()
, or alternatively those methods get eliminated and we have other ways to check whether a message was rendered on a page or not.static::DENY
anymore.Comment #26
Crell CreditAttribution: Crell commented3. There's another issue that is replacing drupal_set_message() with a service. It's not quite in but worth looking into to see it helps here: #2278383: Create an injectible service for drupal_set_message()
4. What we did for access control was to move the constants to an interface that all of the other objects/interfaces can implement. Since it's just constants, no methods, it means the static:: constants become available immediately as soon as the interface is implemented. That should work fine here, too, as the various other interfaces can just extend that interface and all is well.
Comment #27
Wim LeersSo #23.1 and #23.2 have been answered by znerol in #25.
#23.3 was answered and #26.3 shows that improving this further is indeed out of scope — I don't think any of us want to block this issue on #2278383: Create an injectible service for drupal_set_message(), which hasn't been making progress lately.
So I think that just leaves point 4. I agree with #26's general reasoning, but
AccessInterface
is about access checking, not page caching. So I don't think this would make a lot of sense. Especially becauseAccessInterface::ALLOW
andAccessInterface::KILL
would be meaningless. Finally, we don't want to make the page cache system dependent on the access system: they're orthogonal.Therefore: back to RTBC.
Comment #28
Crell CreditAttribution: Crell commentedI'm not suggesting we use the same interface as Access is using. Just that the cache system use its own "constants interface" rather than repeating things. It would have on it the same constants that are defined multiple times here.
Comment #29
alexpottWe need a change record and issue summary update.
A bit more detail would be great. Thanks!
Comment #30
moshe weitzman CreditAttribution: moshe weitzman commentedComment #31
moshe weitzman CreditAttribution: moshe weitzman commentedI expanded the Issue Summary and drafted a CR
Comment #32
Crell CreditAttribution: Crell commentedIf nothing else #28 is still outstanding.
Comment #33
znerol CreditAttribution: znerol commentedRe #28/#32 I'd like to point out that the
check()
method of request policy rules (those implementing theRequestPolicyInterface
) may return any ofallow
ordeny
while the one of a response policy rule only may return adeny
action.Comment #34
Wim Leers#30 and #31 addressed #29.
And #33 addressed #32.
Comment #36
znerol CreditAttribution: znerol commentedThis is a reroll of #11. Instead of triggering the kill-switch as suggested in #19, the proper way to disable the page cache for the image-style download controller is to exclude it using a request-rule. This has the advantage that sites using an advanced cache system can swap out that rule if necessary.
Comment #37
znerol CreditAttribution: znerol commentedUpdated the change record.
Comment #39
znerol CreditAttribution: znerol commentedFix
NodePreviewController
.Comment #40
BerdirAfter having similar problems in the globalredirect 8.x port, are you sure this is working correctly with language prefixes? AFAIK, the language prefix will be included in getPathInfo(). Especially because this probably happens before language negotiation?
Comment #41
znerol CreditAttribution: znerol commented@Berdir: Right, this is bad. What did you do about that over at Global Redirect?
Comment #42
znerol CreditAttribution: znerol commentedLet's switch to a response-policy for the image and node modules: Prevent that the response is saved to the page cache based on the route name.
The toolbar policy cannot be converted to a response policy, because it needs to be run before attempting to retrieve a page from the cache. Therefore let's just extend the check such that it also works when there is exactly one path-component before the toolbar-subtree path.
Comment #43
moshe weitzman CreditAttribution: moshe weitzman commentedInterdiff looks good, and we are green. RTBC.
Comment #44
chx CreditAttribution: chx commentedif ($cache_enabled && $request_policy->check($request) == RequestPolicyInterface::ALLOW)
erm nope.
if ($cache_enabled && $request_policy->check($request) === RequestPolicyInterface::ALLOW)
better.
why does this even exist? And why does it get $request? Very weird. Just inline it.
Comment #45
znerol CreditAttribution: znerol commentedDoh, you're certainly right. There are more, e.g. in
ChainRequestPolicy
.I was about to say because of the unit-tests, then I realized that they are missing. I'm sure I've written some, they probably live in some other branch or in a stash...
Comment #46
znerol CreditAttribution: znerol commentedUse identity instead of equality operator when checking return-values, add unit-tests.
Comment #47
chx CreditAttribution: chx commentedMuch better.
Comment #48
znerol CreditAttribution: znerol commentedAs suggested by @chx in IRC, make
ChainRequestPolicy
andChainResponsePolicy
throw an exception if garbage is returned from a rule.Comment #49
chx CreditAttribution: chx commentedI like robust APIs. It is not unreasonable that someone will return an unwanted TRUE or FALSE instead of the constants. This very nicely covers that. Thanks, znerol.
Comment #50
znerol CreditAttribution: znerol commentedUhm, turns out that I've put my unit-tests into the wrong place. Inside modules they need to be in
tests/src/Unit
instead ofsrc/Tests
.Also changed the name of the new result variable in
ChainRequestPolicy::check()
from$overall_result
to$final_result
.Comment #51
alexpottMissing @param
Missing @param
Missing @params
{@inheritdoc}
This class does not exist
chaching? :)
{@inheritdoc}
This class does not exist.
This could (perhaps should) be simplified by using the RouteMatch service. Plus the example in the CR is wrong because of this - it does not take account of the language prefix issue either.
This class not exist.
imediately?
{@inheritdoc}
Comment #52
znerol CreditAttribution: znerol commentedThanks for the review @alexpott, fixed the documentation strings. Regarding #51.9, the request-policy-checks run way before the router. I've updated the change notice to better highlight that fact.
Comment #53
znerol CreditAttribution: znerol commentedDiscussed the
AllowToolbarPath::check()
method with @alexpott and @Berdir in IRC. Let's use a regular expression matching the end of pathinfo in order to cater for arbitrary path prefixes. Also added a note to theRequestPolicyInterface
which highlights the peculiarities that implementors need to consider.Comment #54
moshe weitzman CreditAttribution: moshe weitzman commentedThx - back to RTBC
Comment #55
alexpottI think we should add another test - something like.
Once that is in I'm going to assign to @catch for rtbc review since I think this is an important and really good patch and therefore the more eyes the better.
Comment #56
znerol CreditAttribution: znerol commentedDone.
Comment #57
alexpottGreat - I think this is good to go - assigning to catch for rtbc review.
Comment #58
catchShould this line clarify "if any of the rules evaluated to static::ALLOW and none have evaluated to static::DENY"?
I'd probably change this to check DENY first and early return - makes it extra clear that that overrides everything else, right now there is no note in the documentation, and only the break; enforces it. Also similar methods elsewhere in the patch are doing early return.
Comment #59
catchEr, failed with dreditor last comment, meant here...
Comment #60
znerol CreditAttribution: znerol commentedOk, done.
Comment #62
znerol CreditAttribution: znerol commentedReroll after #2338571: Remove SessionManager::startLazy(), no changes.
Comment #63
chx CreditAttribution: chx commentedRound #2.
Comment #64
alexpottI'm happy that @catch review this in #58 and left it rtbc.
Committed fb6c562 and pushed to 8.0.x. Thanks!
Comment #66
Wim LeersWoot! Congrats, znerol!
According to #2257695: [meta] Modernize the page cache, #1597696: Consider whether HttpCache offers any significant benefit over the existing page cache is now the next step.