Problem/Motivation

The new router supports (or will support when it's all in place) different responses for the same uri based on accept headers. This is also the current design for the serialization work. However Drupal's page cache only caches by path, not accept headers.

Ideally we want to be able to do full caching in reverse proxies and/or the internal page cache regardless of the response (as you currently can for RSS albeit at a different path). We also don't want to serve incorrect responses if HTML gets in the page cache and it bypasses the routing system.

Proposed resolution

Throughout the comment chain below, we have identified the following proposed solutions:

@crell recommends in comment #3
Should be solved by using the mime type in the cache key, inside HttpCache (or Varnish if you're using that).

@beejeebus recommends in comment #5
...we have to normalise the Accept header before it gets to the backend, and include it in the cache key.

@greg.1.anderson in comment #16 (improvement idea)
The cache key should contain a hash of the data type that was stored in the cache, not the set of all types that were requested. If someone used wget to request just "text/html", it would not make sense to generate a different cache key than the request shown for the browsers listed in #13.

This is hard because we haven't bootstrapped modules at the time the cache key is generated and checked.

If a module needs different "granularity" of normalization (e.g. to return a different type for "application/xhtml+xml" than for "text/html"), then it would need to somehow specify in some way that xhtml is significantly different than html, and this would have to happen on a path-specific basis.

@larowlan in comment #42
Provided a patch that still fails testing, but seems to get close to a solution.

@effulgentsia in comment #44
The reason [the patch in #42] fails is that formats are currently registered in KernelEvents::REQUEST subscribers, but currently, page cache is pre-kernel. Also mentioned in #1984766-3: Start relying on Request/Response objects for cache handling

Remaining tasks

1. getting the patch in #42 to work correctly and pass testing
2. from comment #9: "we start hosting a sane sample config on Drupal.org (if we don't already) and we can include well-documented normalization routines there"
3. from comment #9: "Make page caching work with accept header-based routing"

User interface changes

n/a

API changes

drupal_page_get_cache() has changed to take a Request object, the $check_only param was removed.

the Request object is required to build a page cache cid from drupal_page_cache_get_cid().

nothing in core used the $check_only flag.

Original report by catch

Additional info from original report, not included in summary:

Someone is bound to point out that there is #1597696: Consider whether HttpCache offers any significant benefit over the existing page cache in the queue, but I see no sign of HttpCache supporting this either, i.e.

<?php
return $this->keyCache[$request] = 'md'.sha1($request->getUri());
?>

Opening this as critical, however it might be postponed on this feature actually being implemented.

I'm not sure which issue it is that actually does this, there is #1505080: [META] Content Negotiation for Accept Headers but this felt a bit off-topic in that issue. However if that's the one we could also just make sure that any patch that introduces this comes with tests to ensure this particular feature still works.

Files: 
CommentFileSizeAuthor
#79 1855260-79.patch7.71 KBbeejeebus
PASSED: [[SimpleTest]]: [MySQL] 58,021 pass(es).
[ View ]
#73 1855260-73.patch7.68 KBbeejeebus
PASSED: [[SimpleTest]]: [MySQL] 57,920 pass(es).
[ View ]
#68 1855260-68.patch7.01 KBbeejeebus
PASSED: [[SimpleTest]]: [MySQL] 57,992 pass(es).
[ View ]
#63 1855260-63.patch6.71 KBbeejeebus
PASSED: [[SimpleTest]]: [MySQL] 57,644 pass(es).
[ View ]
#61 1855260-61-test-only.patch2.92 KBbeejeebus
FAILED: [[SimpleTest]]: [MySQL] 57,944 pass(es), 1 fail(s), and 0 exception(s).
[ View ]
#61 1855260-61.patch6.62 KBbeejeebus
PASSED: [[SimpleTest]]: [MySQL] 57,841 pass(es).
[ View ]
#55 1855260-55.patch3.7 KBbeejeebus
PASSED: [[SimpleTest]]: [MySQL] 57,893 pass(es).
[ View ]
#42 accept-caching-1855260.test-only2.patch3.17 KBlarowlan
FAILED: [[SimpleTest]]: [MySQL] 55,612 pass(es), 2 fail(s), and 0 exception(s).
[ View ]
#42 accept-caching-1855260.42.patch3.17 KBlarowlan
FAILED: [[SimpleTest]]: [MySQL] 55,799 pass(es), 2 fail(s), and 0 exception(s).
[ View ]
#41 accept-caching-1855260.test-only.patch3.14 KBlarowlan
FAILED: [[SimpleTest]]: [MySQL] 55,885 pass(es), 1 fail(s), and 0 exception(s).
[ View ]
#41 accept-caching-1855260.41.patch4.73 KBlarowlan
FAILED: [[SimpleTest]]: [MySQL] 55,829 pass(es), 1 fail(s), and 0 exception(s).
[ View ]
#22 1855260-accept_headers_page_cache-22.patch1.59 KBmsonnabaum
FAILED: [[SimpleTest]]: [MySQL] 56,293 pass(es), 59 fail(s), and 3 exception(s).
[ View ]
#21 1855260-accept_headers_page_cache-21.patch877 bytesmsonnabaum
FAILED: [[SimpleTest]]: [MySQL] 56,430 pass(es), 71 fail(s), and 3 exception(s).
[ View ]
#18 1855260-accept_headers_page_cache-18.patch817 bytesmsonnabaum
FAILED: [[SimpleTest]]: [MySQL] 56,066 pass(es), 71 fail(s), and 3 exception(s).
[ View ]
#12 1855260-accept_headers_page_cache-12.patch825 bytesmsonnabaum
FAILED: [[SimpleTest]]: [MySQL] 55,543 pass(es), 14 fail(s), and 0 exception(s).
[ View ]
#11 1855260-accept_headers_page_cache-11.patch811 bytesmsonnabaum
FAILED: [[SimpleTest]]: [MySQL] 55,556 pass(es), 21 fail(s), and 28 exception(s).
[ View ]

Comments

catch’s picture

Hmm #1834288: RESTfully request an entity with JSON-LD serialized response went in recently, but there's no mention of page caching in that issue, so this might be bug instead.

No time to test at the moment but will do so another time if no-one beats me to it.

catch’s picture

Very old, similar-ish bug, except in that case skipping the page cache was OK as a fix: #339958: Cached pages returned in wrong language when browser language used.

With reverse proxies, I'm not sure how this can work now.

We're going to be normalizing the accept header in PHP, which means what the reverse proxy sees for accept, vs. what the routing system sees for accept is going to be different no?

Crell’s picture

I agree that this should be solved by using the mime type in the cache key, inside HttpCache (or Varnish if you're using that). Whether we do that now or wait for the HttpCache pipeline to be completed, I'm not sure. I'm inclined to wait to reduce the overall amount of work that needs to be done.

catch’s picture

Also I have zero idea how this is supposed to work with reverse proxies. If the accept header normalization is done in PHP, then short of implementing identical logic in varnish or nginx what do you do?

beejeebus’s picture

spoke to crell about this a bit.

as far as the http cache layer is concerned, the routing stuff means we may, or may not, get a different response from the backend based on the Accept header. so i think that means we have to normalise the Accept header before it gets to the backend, and include it in the cache key. make sense?

(we do something like this at $dayjob with the User-Agent and Accept-Encoding headers - normalise in nginx, so we can use them in the nginx cache without having a bazillion different keys when we're only varying on 'gzip/not gzip' and 'mobile/desktop'.)

chx’s picture

I recommend not worrying about nginx :) Just because PHP is able to do Accept massaging doesn't mean we can't implement it in nginx and a) add the massaged variable to the proxy_cache_key b) pass the massaged version on with proxy_set_header. That really is of no concern. (Heck we can farm it out to a PHP upstream to do the Accept massaging only if that's what we want; after all nginx can do everything; but that's not speedy.)

catch’s picture

Yeah I was thinking about this and it'd probably fine. The main thing is either having upstream override the header, or making sure that headers are normalized reliably in both PHP and upstream, but it should be doable either way. I'm mildly concerned about people running varnish who don't set this up then file bug reports but that just means a change notice somewhere to point them to.

Crell’s picture

You already have to have a Drupal-customized Varnish configuration to use it. Most people I know borrow the one from 4 Kitchens. It's probably time we start hosting a sane sample config on Drupal.org (if we don't already) and we can include well-documented normalization routines there.

YesCT’s picture

The issue could use updating by someone who understands the discussion.
Especially identifying remaining tasks.
It sounds like from #8 that one of the tasks should be

we start hosting a sane sample config on Drupal.org (if we don't already) and we can include well-documented normalization routines there

Are there other concrete tasks identified?

Also, is this a [no patch] issue?

If the proposed solution is the issue title: "Make sure page caching works with accept header-based routing" is that to be done along the way, or at a certain point in the release cycle? Via manual testing (would need "steps to test" to help folks do that)?

Maybe a title like "Make page caching work with accept header-based routing" is better and a patch would do that...

larowlan’s picture

msonnabaum’s picture

Status:Active» Needs review
StatusFileSize
new811 bytes
FAILED: [[SimpleTest]]: [MySQL] 55,556 pass(es), 21 fail(s), and 28 exception(s).
[ View ]

Perhaps I'm missing something, but does this patch not solve the issue? Seems pretty simple to me.

msonnabaum’s picture

StatusFileSize
new825 bytes
FAILED: [[SimpleTest]]: [MySQL] 55,543 pass(es), 14 fail(s), and 0 exception(s).
[ View ]

j/k

msonnabaum’s picture

Also, in my local testing, chrome, safari, firefox, iOS safari, and chrome on iOS all send these for Accept:

[
"text/html",
"application/xhtml+xml",
"application/xml",
"*/*"
]

So I'm not sure how much normalization actually has to be done.

effulgentsia’s picture

Issue tags:+D8 cacheability

tagging

Status:Needs review» Needs work

The last submitted patch, 1855260-accept_headers_page_cache-12.patch, failed testing.

greg.1.anderson’s picture

The cache key should contain a hash of the data type that was stored in the cache, not the set of all types that were requested. If someone used wget to request just "text/html", it would not make sense to generate a different cache key than the request shown for the browsers listed in #13.

This is hard because we haven't bootstrapped modules at the time the cache key is generated and checked. c.f. #5, above: seems the best thing to do is to normalize the accept headers at the time the cache key is generated, and make sure that the modules get the normalized accept headers, not the full accept headers. If a module needs different "granularity" of normalization (e.g. to return a different type for "application/xhtml+xml" than for "text/html"), then it would need to somehow specify in some way that xhtml is significantly different than html, and this would have to happen on a path-specific basis.

larowlan’s picture

Issue tags:+Needs tests

Tagging

msonnabaum’s picture

StatusFileSize
new817 bytes
FAILED: [[SimpleTest]]: [MySQL] 56,066 pass(es), 71 fail(s), and 3 exception(s).
[ View ]

Here's a better approach. Didnt see this method in the symfony 2.0 docs I was looking at.

Still broken because it doesn't handle setting yet.

msonnabaum’s picture

Actually, not better, because getRequestFormat just returns html unless you set it…

greg.1.anderson’s picture

Actually, #18 is moving in the right direction. We just need to run something like this first:

foreach (Drupal::request()->createFromGlobals()->getAcceptableContentTypes() as $type) {
  if (/* MAGIC HERE */) {
    Drupal::request()->createFromGlobals()->setRequestFormat($type);
  }
}

Drupal can later use getRequestFormat to determine what data type to build and return, and it will be consistent with what was done at cache time. All we need is the magic part. If we just used vget to fetch a table that could be uses on a per-path basis (maybe using a regex to match paths?), we could iterate through the available path / type mappings and return TRUE if the type and path both match. The variable could be rebuilt by a function that modules could hook to insert additional path/type mappings.

By default, the table would contain just "all paths" == "text/html". If nothing is set in the look, then getRequestFormat already defaults to "html". (Maybe we need to map "text/html" to "html", or set the default to "text/html" before running the loop.)

msonnabaum’s picture

StatusFileSize
new877 bytes
FAILED: [[SimpleTest]]: [MySQL] 56,430 pass(es), 71 fail(s), and 3 exception(s).
[ View ]

Here's what I was going for previously.

msonnabaum’s picture

Status:Needs work» Needs review
StatusFileSize
new1.59 KB
FAILED: [[SimpleTest]]: [MySQL] 56,293 pass(es), 59 fail(s), and 3 exception(s).
[ View ]

And here's the setting bit.

greg.1.anderson’s picture

Status:Needs review» Needs work

Nifty. Seems it would be good to make getContentType call setRequestFormat. Of course, getContentType also needs to work for types other than html, but it is a good start.

msonnabaum’s picture

Yes, I agree, I think setRequestFormat is the right method to call, but I figured that should be changed in a larger content negotiation issue.

effulgentsia’s picture

I think #12 would be an acceptable solution to the (Drupal) part of this issue that's prioritized as critical, especially given #13.

I think #16 would be a nice improvement, maybe even worthy of a "major" priority. To do that, we have a 'content_negotiation' service registered in core.services.yml. As long as when the drupal_page_get_cache() is running (or if/when HttpCache replaces it), we have access to the DIC in order to get the content_negotiation service, and are able to get a $request object to pass to it, then we can call Drupal::service('content_negotiation')->getContentType($request) to get the cache key we want. Per the issue summary, our implementation of that service will be improved in #1505080: [META] Content Negotiation for Accept Headers.

and this would have to happen on a path-specific basis

Once #1505080: [META] Content Negotiation for Accept Headers is in, we could in theory do that by collecting all routes that match the path, taking a union of their allowed formats, and passing that union to bestMatch(). However, that would require running a bunch of routing code, even on cache hits. I think it would be better to instead get an ordered array of possible matches, among all the formats supported by the site, and cache on that without regard to path-specific restrictions. I think the extra bit of cache duplication from that is lower cost than a much slower cache key generation process.

The main thing is either having upstream override the header, or making sure that headers are normalized reliably in both PHP and upstream, but it should be doable either way.

Yeah, per #8, this is relatively straightforward to do for proxies that can be configured (Varnish, nginx, etc.). The annoying elephant in the room is Akamai, which, according to http://www.rimmkaufman.com/blog/vary-user-agent/30112012/, will flat out refuse to cache any page that uses Vary: Accept (or any other header other than Accept-Encoding). My current thought on that though is to leave that to contrib to solve. Such a contrib module would need to implement whatever the D8 equivalents of hook_url_inbound_alter() and hook_url_outbound_alter() will end up being (plus we'll need to expose a client-side equivalent hook in the Drupal.url() JS function) to implement whatever strategy it wants for adding format-specific information to the URL. Brainstorming some names for such a module: http_legacy, unrest, retro_urls :)

effulgentsia’s picture

#25 was an xpost. Looks like #22 already connects to the content_negotiation service. Cool.

msonnabaum’s picture

Akamai is already incompatible with drupal because they ignore Vary: Cookie. I dont think this is worth considering.

msonnabaum’s picture

I should clarify, they are incompatible by default, but you can do whatever you need with a custom configuration from them as I recall.

msonnabaum’s picture

Status:Needs work» Needs review

Status:Needs review» Needs work

The last submitted patch, 1855260-accept_headers_page_cache-22.patch, failed testing.

Crell’s picture

+++ b/core/includes/bootstrap.inc
@@ -1126,6 +1125,15 @@ function drupal_page_get_cache($check_only = FALSE) {
+function drupal_page_cache_key() {
+  global $base_root;
+  $key = array($base_root);
+  $request = Drupal::request()->createFromGlobals();
+  $key[] = Drupal::service('content_negotiation')->getContentType($request);
+  $key[] = request_uri();
+  return md5(implode(',', $key));
+}

Doesn't this imply that the key we use is "html" or "json_hal", not the actual mime type? Which, in turn, means an inconsistency with how a reverse proxy will cache it?

That doesn't seem wise.

Also, we shouldn't be using request_uri(). That should really be deprecated. I also don't understand why you're recreating the request object here rather than just using Drupal::request().

msonnabaum’s picture

Using Drupal::service('content_negotiation')->getContentType() is essentially the normalization we were looking for. I'm using the same logic here to key the cache that we are to determine what response to send. I don't see anything wrong there.

I recreated the request because the headers weren't available unless I called createFromGlobals(). No idea why.

Removing request_uri is out of scope for this issue.

Crell’s picture

I mean the thing you get back from request_uri() can be gotten off of the request object.

If the headers are not available, that's a bug that should be investigated, not hacked around.

msonnabaum’s picture

catch’s picture

What about Accept headers sent by RSS feed readers and crawlers? Those are a lot less consistent than browsers usually. I wasn't able to find a handy list on the internet but it looks to me like both Drupal 6 and Drupal 7 aggregator module doesn't set any accept header when requesting feeds at all, for example.

beejeebus’s picture

i don't think that is a caching problem, but a routing one.

as far as i understand it, if a client asking for RSS doesn't set the right Accept headers, D8 will either serve them html instead of rss, or will return a 404.

catch’s picture

Well if ten different RSS readers visit the same feed URL with ten different sets of headers, how many unique cache IDs will they generate.

That's also a fun routing issue though if turns out like that.

greg.1.anderson’s picture

#22 does not suffer from the problem described in #37 (although #12 did). The existing function getContentType() used in #22 is very similar to the proposal in #20, except it is currently just a stub function that handles only html. As mentioned in #24, that will be addressed in a separate issue -- but I don' think there is an existing issue for it yet.

effulgentsia’s picture

except it is currently just a stub function that handles only html

It currently handles html, drupal_ajax, ajax, and iframeupload. Once #1959574: Remove the deprecated Drupal 7 Ajax API is in, I'm not entirely sure what the future of ajax and iframeupload will be. #1944472: Add generic content handler for returning dialogs adds drupal_dialog and drupal_modal. In HEAD, we also have a HalSubscriber that adds 'hal_json'. Other request subscribers can add more. But maybe KernelEvents::REQUEST is the wrong place to add these if they're needed by the page caching layer?

As mentioned in #24, that will be addressed in a separate issue -- but I don' think there is an existing issue for it yet.

Isn't that #1505080: [META] Content Negotiation for Accept Headers, or are you referring to some other work that's needed beyond the scope of that issue?

Doesn't this imply that the key we use is "html" or "json_hal", not the actual mime type? Which, in turn, means an inconsistency with how a reverse proxy will cache it?

Why is that a problem? So Drupal's page cache can benefit from a more efficient cache key, while a reverse proxy would use a less efficient cache key, unless someone configures it with some normalization logic based on knowledge of what Drupal modules are in use. Why is it necessary for Drupal and the reverse proxy to have identical normalization?

Well if ten different RSS readers visit the same feed URL with ten different sets of headers, how many unique cache IDs will they generate.

In the Drupal page cache, with #22, 1. After #1505080: [META] Content Negotiation for Accept Headers, possibly several, depending on how we want to implement fallbacks, but likely not many. In the reverse proxy, depends on what normalization logic it's configured with.

greg.1.anderson’s picture

Agree with #39; thanks for the corrections.

larowlan’s picture

Status:Needs work» Needs review
Issue tags:-Needs tests
StatusFileSize
new4.73 KB
FAILED: [[SimpleTest]]: [MySQL] 55,829 pass(es), 1 fail(s), and 0 exception(s).
[ View ]
new3.14 KB
FAILED: [[SimpleTest]]: [MySQL] 55,885 pass(es), 1 fail(s), and 0 exception(s).
[ View ]

Here's some tests now #1944472: Add generic content handler for returning dialogs is in - interdiff is the test-only patch.
Note that patch from 22 does not make the test pass so def needs work.

larowlan’s picture

StatusFileSize
new3.17 KB
FAILED: [[SimpleTest]]: [MySQL] 55,799 pass(es), 2 fail(s), and 0 exception(s).
[ View ]
new3.17 KB
FAILED: [[SimpleTest]]: [MySQL] 55,612 pass(es), 2 fail(s), and 0 exception(s).
[ View ]

Fixes the debug in the test and add missing blank line at end of yml file.
Interdiff against #22 is test only file.
Again will still fail but at least we have a test now.

Status:Needs review» Needs work

The last submitted patch, accept-caching-1855260.42.patch, failed testing.

effulgentsia’s picture

Thanks for the test! The reason it fails is that formats are currently registered in KernelEvents::REQUEST subscribers, but currently, page cache is pre-kernel. I mentioned that in #1984766-3: Change notice: Start relying on Request/Response objects for cache handling as well. I'm not yet clear on whether changes related to that should be discussed/done here, in that issue, or in some new issue.

catch’s picture

Category:task» bug
jaredsmith’s picture

This issue could use an updated issue summary, based on the templates at http://drupal.org/issue-summaries.

rainbreaw’s picture

@chuchunaku and @rainbreaw are working on summarizing this at the d8 sprint at drupalcon

rainbreaw’s picture

@chuchunaku and @rainbreaw have finished/posted our effort at a summary

rainbreaw’s picture

updating tag to remove "needs issue summary"

catch’s picture

catch’s picture

Title:Make sure page caching works with accept header-based routing» Page caching broken by accept header-based routing
chx’s picture

Just remove the router and be done.

catch’s picture

It's as much a REST module issue (the only user of this so far) as it is the router, see the Drupal 7 SA linked earlier.

larowlan’s picture

the dialog controller uses accept based routing too.

beejeebus’s picture

Status:Needs work» Needs review
StatusFileSize
new3.7 KB
PASSED: [[SimpleTest]]: [MySQL] 57,893 pass(es).
[ View ]

reviving this.

Damien Tournoud’s picture

Given the current state of "Content negotiation" in core, #55 is probably enough.

beejeebus’s picture

yeah, i am just dumbfounded at our use of accept headers with the ajax dialog controllers, but we can leave that ugly alone and still have the basics working.

klausi’s picture

Status:Needs review» Needs work
Issue tags:+Needs tests

I was hoping for a cleaner page caching solution in D8, but it seems that this is the best we can do right now.

This should have a test case that shows that we have actually fixed something.

dawehner’s picture

@@ -951,6 +951,23 @@ function variable_del($name) {
+  $cid_bits = array(
+    $request->getUri(),
+    Drupal::service('content_negotiation')->getContentType($request),
+  );
+  return md5(implode(':', $cid_bits));

I am wondering whether there should be also the query parameters be involved

Damien Tournoud’s picture

@dawehner: getUri() returns the absolute URL of the request, including scheme, host, possible base path, path info, and query string.

beejeebus’s picture

Status:Needs work» Needs review
Issue tags:-Needs tests
StatusFileSize
new6.62 KB
PASSED: [[SimpleTest]]: [MySQL] 57,841 pass(es).
[ View ]
new2.92 KB
FAILED: [[SimpleTest]]: [MySQL] 57,944 pass(es), 1 fail(s), and 0 exception(s).
[ View ]

added a test.

msonnabaum’s picture

Missing a @param doc for drupal_page_get_cache, but as soon as that's fixed, I think this is RTBC.

beejeebus’s picture

StatusFileSize
new6.71 KB
PASSED: [[SimpleTest]]: [MySQL] 57,644 pass(es).
[ View ]

added missing @param...

msonnabaum’s picture

Status:Needs review» Reviewed & tested by the community
Damien Tournoud’s picture

This is as smart as our content negotiation, so +1 from me too.

Damien Tournoud’s picture

Issue summary:View changes

Issue summary from @chuchunaku and @rainbreaw at drupalcon sprint

catch’s picture

Status:Reviewed & tested by the community» Needs work
+++ b/core/includes/bootstrap.incundefined
@@ -951,6 +951,23 @@ function variable_del($name) {
+function drupal_page_cache_get_cid(Request $request) {
+  $cid_bits = array(
+    $request->getUri(),
+    Drupal::service('content_negotiation')->getContentType($request),
+  );
+  return md5(implode(':', $cid_bits));

Other places in core we use $cid_parts, this is only internal so doesn't matter but it stuck out.

I'm still disgusted that we have a rule not to use md5() in core, but this breaks that horrible rule...

+++ b/core/modules/system/lib/Drupal/system/Tests/Bootstrap/PageCacheTest.phpundefined
@@ -41,6 +41,29 @@ function setUp() {
+    $this->drupalGet($accept_header_cache_uri);
+    $this->assertEqual($this->drupalGetHeader('X-Drupal-Cache'), 'MISS', 'HTML page was not yet cached.');
+    $this->drupalGet($accept_header_cache_uri);
+    $this->assertEqual($this->drupalGetHeader('X-Drupal-Cache'), 'HIT', 'HTML page was cached.');
+
+    $this->drupalGet($accept_header_cache_uri, array(), $json_accept_header);
+    $this->assertEqual($this->drupalGetHeader('X-Drupal-Cache'), 'MISS', 'Json response was not yet cached.');
+    $this->drupalGet($accept_header_cache_uri, array(), $json_accept_header);
+    $this->assertEqual($this->drupalGetHeader('X-Drupal-Cache'), 'HIT', 'Json response was cached.');

This is encouraging but could we also check that the content is correct?

It'd also be good to add a test case for variations on accept header, i.e. chrome sends today:

text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

I found this old dodgy header from Safari, is that a cache hit or a new cache entry?

Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5

That and even worse examples on http://www.gethifi.com/blog/browser-rest-http-accept-headers

Crell’s picture

We really ought to be honoring the Vary header properly rather than just hard coding the Accept header. Our routing system currently supports routing by a whole lot of things. Our cache system should as well.

beejeebus’s picture

Status:Needs work» Needs review
StatusFileSize
new7.01 KB
PASSED: [[SimpleTest]]: [MySQL] 57,992 pass(es).
[ View ]

#66 - i've added everything except the tests on accept header. this new code doesn't do anything smart with accept headers, it asks for a normalised value from our existing content negotiation code. if we need more tests for the content negotiation bits, can we do it in another issue?

#67 - i have no helpful response to that, so i'll just pretend it doesn't exist.

Status:Needs review» Needs work
Issue tags:-Security Advisory follow-up, -D8 cacheability

The last submitted patch, 1855260-68.patch, failed testing.

beejeebus’s picture

Status:Needs work» Needs review
Issue tags:+Security Advisory follow-up, +D8 cacheability

#68: 1855260-68.patch queued for re-testing.

alexpott’s picture

If we going to invoke the no md5() in core rule then we have to do something about Symfony, Guzzle, Twig, Zend Feed, Assetic, PHPUnit. Just saying :)

Crell’s picture

Status:Needs review» Needs work

alexpott: All the more reason we should move those out of the repository and pull them in through Composer properly. ;-)

  1. @@ -958,23 +975,15 @@ function variable_del($name) {
    +    $cache = cache('page')->get(drupal_page_cache_get_cid($request));

    As long as we're touching this line, can we switch it to use \Drupal::cache() instead? One less function call, one more autoloadable class call. This is a good thing.

    (I'm tempted to ask to make all of this functionality a classed object, actually...)

  2. @@ -1,11 +1,17 @@
    function system_test_menu() {
    +  $items['system-test/page-cache/accept-header'] = array(
    +    'page callback' => 'system_test_accept_header',
    +    'access callback' => TRUE,
    +    'type' => MENU_CALLBACK,
    +  );

    MENU_CALLBACK menu items no longer exist. They should be routes only. Please don't add more things we just need to remove later. :-)

  3. @@ -280,3 +286,16 @@ function system_test_authorize_init_page($page_title) {
    +/**
    + * Page callback: serves different responses based on the HTTP Accept header.
    + */
    +function system_test_accept_header() {

    As above, this should be a proper controller. And in that case it can take the Request as an injected value rather than using \Drupal.

At least the way this is structured now, changing how the cid is calculated to be more robust shouldn't touch anything else. So I'm OK with this for now, with the comments above.

beejeebus’s picture

Status:Needs work» Needs review
StatusFileSize
new7.68 KB
PASSED: [[SimpleTest]]: [MySQL] 57,920 pass(es).
[ View ]

updated patch as per review in #72.

Crell’s picture

@@ -0,0 +1,7 @@
+system_test.page_cache_accept_header:
+  pattern: '/system-test/page-cache/accept-header'
+  defaults:
+    _controller: '\Drupal\system_test\Controller\PageCacheAcceptHeaderController::content'
+  requirements:
+    _access: 'TRUE'
+

Thinking about it... This should really be 2 separate controller methods, and 2 separate routes. That is what we want to encourage people to do anyway, so it's a more accurate integration test.

Is it feasible to move drupal_page_cache_get_cid(), drupal_page_set_cache(), and drupal_page_get_cache() to a classed object, specified via Settings? That would allow the code to be autoloaded and allow sites to swap out the cid logic themselves for whatever else they decide it should be.

beejeebus’s picture

i really don't want to do the two separate routes/controllers thing. we are not testing the routing here - that's an implementation detail that the cache system does not care about. we are testing 'the same URI returns different content based on Accept', not 'our routing system can use different controllers based on Accept'. this is much like me pushing back on catch asking for a bunch of different Accept headers, because that would be testing the content negotiation code, not the cache system.

also, i agree with wanting to allow the cid generation to be alterable/pluggable, but was hoping to do that in a follow up. this late in the cycle, small things are easier to land...

update: i created #2062463: allow page cache cid to be alterable for the cid generation bit.

msonnabaum’s picture

I agree about routing. It's a test, nothing matters outside of the code under test.

And yes, let's please do the conversion to an object/service in a followup since that's pure refactoring. Let's just get this in.

Crell’s picture

Status:Needs review» Reviewed & tested by the community

OK, incremental it is.

alexpott’s picture

Status:Reviewed & tested by the community» Needs work
+++ b/core/includes/bootstrap.incundefined
@@ -958,23 +975,15 @@ function variable_del($name) {
     if ($cache !== FALSE) {
       $cache_hit = TRUE;
     }

This can be removed form drupal_page_get_cache() since you've removed the check only functionality.

Otherwise looks good to me and manual testing confirms it's working as expected.

beejeebus’s picture

Status:Needs work» Reviewed & tested by the community
StatusFileSize
new7.71 KB
PASSED: [[SimpleTest]]: [MySQL] 58,021 pass(es).
[ View ]

nice catch, only change is to update for #78.

Damien Tournoud’s picture

Looks good to me too.

catch’s picture

Status:Reviewed & tested by the community» Fixed

This still feels incredibly fragile overall, but the patch itself is fine now.

Committed/pushed to 8.x.

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary:View changes

added api change.

also, whiskey is nice.