One of the big intended benefits from the Kernel work is the ability to use Http caching logic directly rather than implementing our own caching logic. The Kernel component includes a PHP-space implementation of an Http reverse proxy cache. We want to use that for all kernel-called page segments; that is, the entire page itself plus all of the sub-requests. The easiest way to do that is to move the kernel from the 'kernel' DI entry to some other entry, and then make the 'kernel' entry an instance of the HttpCache object, configured with a Store class of our own that saves stuff to the Drupal cache system.

For now, we should let the HttpCache use whatever logic it uses. We can refine it later after the pieces are in place.

Of course, we'll also want a way to disable that cache object when a site is running behind a real proxy cache like Varnish, which is going to be way faster anyway.

This probably needs to wait for #1595146: Load the HttpKernel from the DI Container

Files: 
CommentFileSizeAuthor
#90 1597696-use-symfony-http-cache.patch17.77 KBznerol
FAILED: [[SimpleTest]]: [PHP 5.4 MySQL] 77,294 pass(es), 14 fail(s), and 2 exception(s).
[ View ]
#70 drupal-httpcache-7269676-70.patch17.68 KBiamEAP
FAILED: [[SimpleTest]]: [MySQL] 57,613 pass(es), 61 fail(s), and 10 exception(s).
[ View ]
#64 1597696-httpcache-2.patch17.67 KBCrell
FAILED: [[SimpleTest]]: [MySQL] 54,163 pass(es), 45 fail(s), and 0 exception(s).
[ View ]
#63 1597696-httpcache.patch41.47 KBCrell
FAILED: [[SimpleTest]]: [MySQL] 53,935 pass(es), 46 fail(s), and 0 exception(s).
[ View ]
#44 1597696-44-http-cache.patch9.17 KBbeejeebus
FAILED: [[SimpleTest]]: [MySQL] 40,346 pass(es), 48 fail(s), and 1 exception(s).
[ View ]
#42 1597696-42-http-cache.patch8.91 KBbeejeebus
FAILED: [[SimpleTest]]: [MySQL] 23,717 pass(es), 14,357 fail(s), and 1,878 exception(s).
[ View ]
#39 1597696-39-http-cache.patch8.96 KBbeejeebus
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed to enable simpletest module.
[ View ]
#37 drupal_httpcache-37.patch8.73 KBmsonnabaum
FAILED: [[SimpleTest]]: [MySQL] 40,274 pass(es), 55 fail(s), and 0 exception(s).
[ View ]
#35 drupal_httpcache-35.patch8.01 KBmsonnabaum
FAILED: [[SimpleTest]]: [MySQL] 40,289 pass(es), 50 fail(s), and 0 exception(s).
[ View ]
#31 drupal_httpcache-31.patch6.1 KBmsonnabaum
FAILED: [[SimpleTest]]: [MySQL] 40,611 pass(es), 98 fail(s), and 5 exception(s).
[ View ]
#29 drupal_httpcache-29.patch6.08 KBmsonnabaum
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal_httpcache-29.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#23 drupal_httpcache.patch13.72 KBR.Muilwijk
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal_httpcache_1.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#22 drupal_httpcache.patch12.82 KBR.Muilwijk
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal_httpcache_0.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#8 drupal_httpcache.patch12.7 KBmsonnabaum
FAILED: [[SimpleTest]]: [MySQL] 39,577 pass(es), 170 fail(s), and 0 exception(s).
[ View ]

Comments

Crell’s picture

Issue tags:+WSCCI, +symfony, +kernel-followup

Tagging.

catch’s picture

This needs benchmarks, we may well want to do it even if there's a regression, but let's not have another nasty surprise like #1064212: Page caching performance has regressed by 30-40% in D7 (which was due to a very tiny, innocent patch).

I'm also interested to see how this could tie in with cache tag support. We don't have proper content-based cache tags attached to cache items yet, but there's proof of concept code that integrates with cache tags with varnish already, so if varnish can handle it, then HttpCache should be able to as well.

Crell’s picture

Agreed entirely with #2. I have no idea how cache tag support could tie in here, frankly, but it should be investigated. :-)

That said, there are key differences between the way Drupal has traditionally cached things and the way Symfony assumes you'll cache things.

In Drupal, we assume "cache forever, but flush often". That emphasizes data freshness over performance or cache efficency, and results in less useful caches.

Symfony, from what I've seen, encourages "cache briefly, don't bother flushing." That emphasizes cache efficiency over data freshness, at the expense of some rendered information being briefly stale.

I don't believe we should switch over to "cache briefly, don't flush" wholesale; however, we should consider where that is a more effective strategy; if the code is simplier and more stable and more reliable if we cache for 5 seconds and accept that "meh, this view could be stale for 5 seconds, whatever", is that OK? Where is that OK?

I don't know, but we should be asking that.

catch’s picture

is that OK? Where is that OK?

I think that's a site-specific question. You can currently set Drupal up to take either of these approaches via contrib and/or configuration, and we should be careful not to close either of them off to much (for example if you don't want content to clear on cache tags, it's 5 minutes to write a cache backend that ignores them).

Crell’s picture

Well, then we need to make it code-free to adjust your caching strategy, at least within reason. That won't be a simple task. :-)

Either way, I think the first implementation should be kept as simple as possible and just restructure the code to use/not use HttpCache wrapped around the kernel.

Niklas Fiekas’s picture

By cache tags you mean a unique identifier for the content, so that if the tags are equal, then the content does not need to be regenerated? HTTP (and Symfony2) supports the validation approach of caching via ETags.

beejeebus’s picture

re #6, cache tags as in the cache tags support in Drupal's cache API - http://drupalcode.org/project/drupal.git/blob/refs/heads/8.x:/core/lib/D...

msonnabaum’s picture

StatusFileSize
new12.7 KB
FAILED: [[SimpleTest]]: [MySQL] 39,577 pass(es), 170 fail(s), and 0 exception(s).
[ View ]

Here's a rough POC that I've been working on.

This should replace our default page caching with the file-based HttpCache. We'll obviously need to make additional storage backends, but I'm not against changing our default MySQL backend for flat files.

The biggest problem I see thus far is that the earliest we can serve a page from cache is after DRUPAL_BOOTSTRAP_CODE. This is much later than I'd like, but as long as we know there's work to be done to instantiate the HttpKernel earlier, this shouldnt be a blocker.

Crell’s picture

Status:Active» Needs review

bot snack?

Status:Needs review» Needs work

The last submitted patch, drupal_httpcache.patch, failed testing.

Crell’s picture

+++ b/core/lib/Drupal/Core/HttpCache.php
@@ -0,0 +1,14 @@
+use Symfony\Component\HttpKernel\HttpCache\HttpCache as SymfonyHttpCache;
+
+class HttpCache extends SymfonyHttpCache {
+  protected function getOptions() {
+    $config = config('system.performance');
+    return array(
+      'debug'                  => TRUE
+    );
+  }
+}

As soon as the bundles patch goes in, which I believe puts the config system into the DIC, we should start injecting it and eliminating the use of the config() wrapper function.

I especially like how this is 2/3 removing code rather than adding it. :-) Is that because the existing Symfony code does what we were doing, or because you just didn't get to that part yet?

msonnabaum’s picture

As soon as the bundles patch goes in, which I believe puts the config system into the DIC, we should start injecting it and eliminating the use of the config() wrapper function.

Agreed.

I especially like how this is 2/3 removing code rather than adding it. :-) Is that because the existing Symfony code does what we were doing, or because you just didn't get to that part yet?

A little of both. Its hard to follow at the moment because we left in a bunch of code that calls header() directly after the initial kernel patch. In this case it was easier to just rip that code out and get the cache working as we'd expect it to. A lot of what is removed is handling around 304s, which I think we're getting already. It's possible that we'll need to use HttpCache's validation to do some additional 304 checking.

Crell’s picture

Gotcha. I'm OK with a temporary regression in the elegance of our 304 handling while we suss out what if anything we need to do atop Symfony's existing support for that.

The original kernel patch had a target of "keep tests passing", so there's definitely some needlessly redundant header() calls lying about still. I think those should all be eliminated entirely.

moshe weitzman’s picture

Bundles patch is in. Hoping someone can get to this soon.

R.Muilwijk’s picture

The patch is not what I would expect. The HttpCache is added at the end just before the kernel handle's the request. This is after the complete bootstrap. Shouldn't the cache be utilized in the DRUPAL_BOOTSTRAP_PAGE_CACHE phase. Is this even possible with the Kernel + HttpCache at that early stage?

R.Muilwijk’s picture

Another thing we have to think about is the File Store because we can't assume we can use /tmp. How can we retrieve the file store location without getting the configuration from db?

pounard’s picture

@#15 I guess that full bootstrap is slowly being freed from its fat while services are being ported into the DIC. I hope that in stable 8.x bootstrap will only be a matter of a few millisec.

pounard’s picture

@#16 I think that this kind of early needed configuration variables should be set in settings.php (pretty much like advanced cache backends configuration).

R.Muilwijk’s picture

What would be the proper solution to be able to override HttpCache and Store in for examply your settings.php?

pounard’s picture

I don't think I can answer that much, I'm, just like you, waiting to see a working patch to know :)

R.Muilwijk’s picture

@#17 so should we just aim for removing the DRUPAL_BOOTSTRAP_PAGE_CACHE completely?

@#18 Then settings.php should be changed to have the configuration in it and it should be set during install. Also what to do with the UI where you can set the temporary directory?

@#20 I'll do something similar like the cache_backends:

$class = isset($cache_backends[$bin]) ? $cache_backends[$bin] : $cache_backends['cache'];
$cache_objects[$bin] = new $class($bin);
R.Muilwijk’s picture

StatusFileSize
new12.82 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal_httpcache_0.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

My latest version of the patch. Lets meet up at the sprint.

R.Muilwijk’s picture

StatusFileSize
new13.72 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal_httpcache_1.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Forgot to add the HttpCache.

R.Muilwijk’s picture

So these are the things I think we need to sort out:

  1. Is FinishResponseSubscriber.php the correct place to set these headers with latest wscii branch?
  2. How are we going to deal with stores. We can go for a few directions
    • a Drupal Store which will just use the current Drupal cache backend and therefore the modules like memcached, apc, filestore etc.
    • Use the default Symfony (File) Store. If people want to use a different store they can get the Components from Symfony
    • Create a DocTrine Cache Store which is able to leverage the caches of Doctrine currently in core.
  3. Whats going to be the default store?
  4. At the moment the cache is checked early in the bootstrap just after loading the settings file (DRUPAL_BOOTSTRAP_PAGE_CACHE) . At that point we do not have access to the kernel so a suggestion would be to just remove that fase (Though we would have to move the drupal_block_denied()). We can also remove the whole cache_backends part which was introduced for early page cache. One thing that's important would be that the bootstrap is just setting up but not actually booting the database connection and other intensive parts when the kernel is just going to serve the cache Response.
  5. We did some changes to index.php. Are those ok?
  6. The function drupal_page_is_cacheable() is something to set the state whether or not the page is cacheable and does some extra checks. For example in session.inc when it's an authenticated user a FALSE would be saved so the page isn't cached. The function however doesn't have access to kernel. Also the checks like it's a GET or HEAD request should be handled by the Response / HttpCache. I think one approach would be override the Response isCacheable method though I'm not sure how we can handle a total page state, HttpKernel?
  7. When we are using symfony or doctrine store we should remove the cache_page cruft.
  8. Make drupal_flush_all_caches work with HttpCache
  9. How are we going to deal with the seperate caching systems. Contrib owners are used to do cache_clear_all(''). But after HttpCache there will be two different caceh clear approaches. Is this something we really want?
  10. There is some If-Modified-Since logic in the old page cache which I do not understand fully yet. I don't think the HttpCache will deal with this for us.
  11. In Drupal 7 we have cache_page which is stored as CACHE_PERMANENT unless you utilize the minimum page cache variable. The HttpCache however assumes a Ttl and does isFresh calls. There for we need a default_value (max_age) the page will be cached. We can set this on for example 5 minutes and allow the minimum page cache variable to change this. However it should be cool if you could change this for different types of Responses for example blocks, content areas, views on frontpage etc.

UPDATE: added 11

iamEAP’s picture

On 11: D7 stores cache_page (and cache_block) entries as CACHE_TEMPORARY, even with minimum cache lifetime.

If HttpCache assumes a ttl, it sounds like we may be scrapping cache minimum for cache maximum.

aspilicious’s picture

Status:Needs work» Needs review

Triggering bot

+    return array(
+      'debug'                  => TRUE

Formatting issue

Status:Needs review» Needs work

The last submitted patch, drupal_httpcache.patch, failed testing.

cosmicdreams’s picture

+++ b/core/lib/Drupal/Core/HttpCache.phpundefined
@@ -0,0 +1,29 @@
+    $config = config('system.performance');
+    return array(
+      'debug'                  => TRUE

Why define $config here if we're not going to do anything with it?

+++ b/index.phpundefined
@@ -13,7 +13,9 @@
use Drupal\Core\DrupalKernel;
use Symfony\Component\HttpFoundation\Request;
-
+use Drupal\Core\HttpCache;
+use Symfony\Component\HttpKernel\HttpCache\Store;

Would be nice if we could group all included Symfony libraries and all Drupal libraries, Symfony first.

+++ b/index.phpundefined
@@ -30,6 +32,11 @@ drupal_bootstrap(DRUPAL_BOOTSTRAP_CODE);
+if (config('system.performance')->get('cache') && drupal_page_is_cacheable()) {
+  $kernel = new HttpCache($kernel, new Store("/tmp/meuk"));

Another reason not to define $config within Drupal\Core\HttpCache if we're not going to use it. We're loading it again here.

msonnabaum’s picture

Status:Needs work» Needs review
StatusFileSize
new6.08 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal_httpcache-29.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Thanks to everyone who has recently reviewed this patch. Although many points brought up here are valid, I'd like to postpone most of them until we get the basics working with tests passing. For example, I'm purposefully not dealing with alternative stores atm. Just need to get the default working first. Also, being called so late in the bootstrap is a problem, but perhaps not this patch's problem. Let's revisit once it works.

Some changes in #1698108: Update Drupal's dependencies broke this patch pretty badly, so here's a quick reroll which I believe only takes those changes into account. Actual caching of pages seems to work correctly, but the tests that aren't passing are legit. We still need to figure out how and where to handle the logic around "page_cache_invoke_hooks".

Status:Needs review» Needs work

The last submitted patch, drupal_httpcache-29.patch, failed testing.

msonnabaum’s picture

StatusFileSize
new6.1 KB
FAILED: [[SimpleTest]]: [MySQL] 40,611 pass(es), 98 fail(s), and 5 exception(s).
[ View ]

Let's try that again.

geerlingguy’s picture

Status:Needs work» Needs review

Status:Needs review» Needs work

The last submitted patch, drupal_httpcache-31.patch, failed testing.

cosmicdreams’s picture

if we intend to completely replace our caching mechanisms with the HttpCache, it might be beneficial to rip that out now and see what breaks. There seems to be a lot of issues involving the caching of user information. It's unclear to me what's the cause. With guidance I can try to reproduce the tests and focus on part of this patch.

msonnabaum’s picture

StatusFileSize
new8.01 KB
FAILED: [[SimpleTest]]: [MySQL] 40,289 pass(es), 50 fail(s), and 0 exception(s).
[ View ]

Found quite a few problems caused by my subclassing HttpCache from HttpKernel instead of from the framework bundle which the docs show. That said, there's parts of the framework bundle version that we don't want, so I just pulled in the relavant bits into our HttpCache class. Attached is a new patch that fixes some of this.

Unfortunately, I don't think we can fix the bootstrap hook issues with the way things are now. We need to move all of our bootstrap, with maybe DRUPAL_BOOTSTRAP_CONFIGURATION as an exception, inside the kernel. It's very awkward that we're mixing them right now and too much work is already done by the time we even instantiate the kernel.

This would allow us to fix the hook issues and it would also eliminate the performance regression I mentioned earlier, so it's worth doing asap. I'm considering this issue blocked until that is done.

catch’s picture

if we intend to completely replace our caching mechanisms with the HttpCache, it might be beneficial to rip that out now and see what breaks. There seems to be a lot of issues involving the caching of user information.

We can eventually replace all our HTML caching with HttpCache (i.e. page/block/drupal_render() in core), but there's plenty else that's not replaceable, and Symfony has plenty of its own caching (just renamed to compiling so it can pretend it's not).

@msonnabaum I'd consider having only one bootstrap path a critical task for Drupal 8 - it's OK if we have a small Drupal bootstrap before the kernel, but it's not really OK to have these two bootstraps mixed in with each other wrapping around.

msonnabaum’s picture

StatusFileSize
new8.73 KB
FAILED: [[SimpleTest]]: [MySQL] 40,274 pass(es), 55 fail(s), and 0 exception(s).
[ View ]

I agree with catch that we should move it all in, but it turned out that it was a tiny change to get what I need and also not break anything by leaving DRUPAL_BOOTSTRAP_CONFIGURATION out.

However, this has the rest of the bootstrap in the DrupalKernel, not the HttpCache wrapper. We all talked in IRC and were mostly in favor of removing the "normal" cache setting in favor of "aggressive" only. Supporting bootstrap hooks will be messy, so whether we can do it eventually or not, this is the simplest version start with.

msonnabaum’s picture

Actually, as beejeebus pointed out to me, having the bootstrap in init() doesn't fix anything. It needs to go in boot().

beejeebus’s picture

Status:Needs work» Needs review
StatusFileSize
new8.96 KB
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed to enable simpletest module.
[ View ]

updated patch to add #38.

Status:Needs review» Needs work

The last submitted patch, 1597696-39-http-cache.patch, failed testing.

Crell’s picture

+++ b/core/lib/Drupal/Core/EventSubscriber/FinishResponseSubscriber.php
@@ -83,6 +80,21 @@ class FinishResponseSubscriber implements EventSubscriberInterface {
+    $vary = array('Accept-Encoding');
+    if (!variable_get('omit_vary_cookie', FALSE)) {
+      $vary[] = 'Cookie';
+    }
+    $response->setVary($vary);

On the assumption that this variable_get() will turn into a config object lookup, we should drop a @todo here noting that and that the config object should get injected when we get to that part.

+++ b/core/lib/Drupal/Core/EventSubscriber/FinishResponseSubscriber.php
@@ -83,6 +80,21 @@ class FinishResponseSubscriber implements EventSubscriberInterface {
+    $max_age = (int)config('system.performance')->get('cache.page.max_age');

In fact, that's probably the config object it should be on! :-)

+++ b/core/lib/Drupal/Core/EventSubscriber/FinishResponseSubscriber.php
@@ -83,6 +80,21 @@ class FinishResponseSubscriber implements EventSubscriberInterface {
+    $max_age = !isset($_COOKIE[session_name()]) ? $max_age : 0;

Cookies and sessions should come from the $request. This is where the sessions->symfony issue comes in useful. If we can't do that yet because it's blocked on that patch, at least we should drop a @todo in here.

+++ b/core/lib/Drupal/Core/HttpCache.php
@@ -0,0 +1,73 @@
+      \AD::ffs('HttpCache::__construct');

WAT? That could be why this won't install, perhaps?

+++ b/core/lib/Drupal/Core/HttpCache.php
@@ -0,0 +1,73 @@
+    public function cacheEnabled() {
+      return config('system.performance')->get('cache.page.enabled');
+    }

Hm. I don't know how we can properly inject that config object. Probably it will have to just be hand-rolled in index.php. This far down, I think I'm OK with that.

-1 days to next Drupal core point release.

beejeebus’s picture

Status:Needs work» Needs review
StatusFileSize
new8.91 KB
FAILED: [[SimpleTest]]: [MySQL] 23,717 pass(es), 14,357 fail(s), and 1,878 exception(s).
[ View ]

updated patch addresses #41.

Status:Needs review» Needs work

The last submitted patch, 1597696-42-http-cache.patch, failed testing.

beejeebus’s picture

Status:Needs work» Needs review
StatusFileSize
new9.17 KB
FAILED: [[SimpleTest]]: [MySQL] 40,346 pass(es), 48 fail(s), and 1 exception(s).
[ View ]

spent some fun time* figuring out why 14k tests fail. the main problem seems to be this flow:

Drupal\Core\HttpCache::cacheEnabled --> config --> Drupal\Core\Config\Config::load --> Drupal\Core\Config\DatabaseStorage::read --> Drupal\Core\Config\DatabaseStorage::getConnection --> Drupal\Core\Database\Database::getConnection --> Drupal\Core\Database\Database::openConnection --> Drupal\Core\Database\Driver\mysql\Connection::__construct --> Drupal\Core\Database\Connection::__construct --> Drupal\Core\Database\Connection::setPrefix

this happens before _drupal_bootstrap_database(), so we haven't had a chance to munge global $databases with the test prefix, so all the db queries hit the parent drupal.

i've fixed that with a total hack of just calling drupal_bootstrap(DRUPAL_BOOTSTRAP_DATABASE) in Drupal\Core\HttpCache::cacheEnabled(), which should let us see what the next series of fails are.

fixing the way we handle needing the database early in the request is out of scope for this issue.

* that may have been sarcasm.

Status:Needs review» Needs work

The last submitted patch, 1597696-44-http-cache.patch, failed testing.

beejeebus’s picture

a note for reviewers/testers: as well as turning on the page cache, you need to set a max lifetime.

beejeebus’s picture

ok, spent some time actually reading the Symfony HttpCache docs, and in the very helpful #symfony channel.

i don't think the pattern we've started with here is the right way to go, at all.

the symfony docs and the IRC channel make it clear we should only wrap our app kernel *if there's no http cache in front of drupal*.

so i guess we need to work out the best way to implement this pseudo-code for real:

<?php
drupal_bootstrap
(DRUPAL_BOOTSTRAP_CONFIGURATION);

if (
config('system.performance')->get('use.symfony.http.cache')) {
 
// We're on a shared host or something, so use a slow, non-scaling http cache.
 
$kernel = new HttpCache(new DrupalKernel('prod', FALSE), new Store(variable_get('file_public_path', conf_path() . '/files/http_cache')));
}
else {
 
// We have a real http cache in front of us.
 
$kernel = new DrupalKernel('prod', FALSE);
}
?>
Stof’s picture

Hi,

Just my 2 cents after the discussion with beejeebus on IRC to give the same knownledge to others.

Wrapping the kernel in the HttpCache will indeed be bad when you want to use Varnish, as it would continue to handle the ESI in PHP instead of letting Varnish doing it.

However, it is possible to check very easily if the HttpCache is needed. See https://gist.github.com/3551077

I haven't read the full discussion and I haven't checked the latest status of Drupal, so forgive me if the following advice is useless now.
When using the HttpCache, the instantiation of the DrupalKernel should be as lightweight as possible, as it will be done even when the cache is used (and so the drupal code does not need to be called). The heavy initialization should be done only when booting the kernel (which will occur only on cache miss, when the HttpCache needs to forward the request to the DrupalKernel).
If the instantiation of the kernel is heavy, you will have a bigger performance impact when switching between HttpCache and Varnish (as Varnish will also avoid the instantiation when the cache is used of course).

EDIT: here the code so that you don't need to go to the gist, as your comments support syntax highlighting

<?php
use Drupal\Core\DrupalKernel;
use
Symfony\Component\HttpFoundation\Request;
use
Symfony\Component\HttpKernel\HttpCache\Esi;
use
Symfony\Component\HttpKernel\HttpCache\HttpCache;
use
Symfony\Component\HttpKernel\HttpCache\Store;

// Build a drupal kernel
$kernel = new DrupalKernel()

$request = Request::createFromGlobals()

$esi = new Esi()
if (!
$esi->hasSurrogateEsiCapability($request)) {
   
// There is no reverse proxy supporting ESI in front, so add the Symfony HttpCache
    // this is not executed when you have Varnish in front and properly configured
   
$kernel = new HttpCache($kernel, new Store(), $esi);
}

$response = $kernel->handle($request);
?>
manarth’s picture

The hasSurrogateEsiCapability() function depends on headers from the proxy:

    public function hasSurrogateEsiCapability(Request $request)
    {  
        if (null === $value = $request->headers->get('Surrogate-Capability')) {
            return false;
        }  

        return false !== strpos($value, 'ESI/1.0');
    } 

The ESI module already has a client-side JavaScript implementation for ESI, so there's a use-case where ESIs should be delivered, but there are no HTTP headers to indicate this.

I'd prefer to handle it via config instead.

pounard’s picture

@manarth Could the JS add additional headers instead as part of the downgrade behavior?

Crell’s picture

If we wanted to use in-browser ESI, we should just use hInclude instead. Symfony already includes support for that in the HttpKernel from the FrameworkBundle, that we're already using.

beejeebus’s picture

i've created #1766186: Move test prefix $databases munging earlier in the bootstrap to deal with the test-side global $databases munging.

with that patch, we should be able to call config() right after drupal_bootstrap(DRUPAL_BOOTSTRAP_CONFIGURATION), so i'm going to reroll with that patch included just to keep this moving.

things will shift further as config changes, but at least we can get the 'check config to see if we should wrap' pattern in, and fix the remaining fails.

Crell’s picture

#48 implies that we don't need a conditional. We just detect headers and we're done with it. What's the use case for doing otherwise? (JS-based pseudo-ESI is not one, because we can either add the headers we need or drop that entirely in favor of hInclude.)

beejeebus’s picture

ESI or no ESI, who cares.

'just check http headers' is not going to fly as the sole determinant of 'should we wrap our kernel in HttpCache or not'.

beejeebus’s picture

i'm not working on this anymore, sorry if i've held it up.

msonnabaum’s picture

I completely agree that the "Surrogate-Capability" header is not enough to decide whether or not to use HttpCache. We should use either config or a $conf variable.

One thing that occurred to me while looking at this again however, is that our cache settings don't make much sense now that we've removed CACHE_TEMPORARY. We still have a cache.page.enabled option, yet it does nothing unless you set cache.page.max_age.

I'm thinking we should remove cache.page.enabled and replace it with http.cache.use.internal. If you are using a reverse proxy, all you have to do is set the max_age to something. I started on that but noticed we're checking the existing cache.page variables in some odd places, so I wanted to make sure we had consensus before making that change.

We should also conditionally pass the esi object in \Drupal\Core\HttpCache's constructor since we don't need that running if you don't have the equivalent of block_cache = TRUE. That can probably just be a @todo for now since removing block cache is out of scope for this issue.

Crell’s picture

Mark: So, you're saying request cache (which for now is synonymous with page cache, but later on won't be) is always-enabled, period, and to effectively disable it you just set the page 0 seconds? I'm OK with that. The default value should probably be something somewhat useful, say 5 min, but I'm easy on that part.

Or do I have it backward?

msonnabaum’s picture

I'm saying that the max age setting determines whether pages get cached. If it's at 0, drupal sends out the typical "Expires: Sun, 19 Nov 1978 05:00:00 GMT" that you'd get when page cache is off. If it's > 0, we send out the appropriate cache headers.

The only thing you explicitly enable is the internal page cache, which would mean we wrap the kernel with HttpCache.

Crell’s picture

That sounds fine to me.

catch’s picture

I'd be fine with #58, we'll need to update the help text on the performance settings to reflect the change. Also fwiw completely fine with having the default max_age as 5 minutes (probably only in the standard profile).

Crell’s picture

msonnabaum’s picture

Crell’s picture

Status:Needs work» Needs review
StatusFileSize
new41.47 KB
FAILED: [[SimpleTest]]: [MySQL] 53,935 pass(es), 46 fail(s), and 0 exception(s).
[ View ]

Getting back to this at last...

Here's a new patch, mostly from scratch but with some code copied from Mark's patch above. For those playing along at home there's also a branch in the WSCCI sandbox. Overall, there should be more minus signs than plus signs, which is a good sign.

Changes:
- Removes the old page cache, and the bootstrap phase along with it. Ooo...
- Because there's no more page cache bootstrap phase, and the functions in cache.inc are now tiny, I moved them into bootstrap.inc and eliminated cache.inc. Ooo...
- Wrap the kernel in an HttpCache object, if a setting is set. It defaults to true.
- Sets cache headers that seem to be triggering the cache appropriately. I think. :-) It works in my testing, but could use more.
- Sets the headers conditionally, so if a response object from a controller already has cache-related headers set they don't get overridden. That way individual pages can opt-out of caching, or opt for a longer cache lifetime, or whatever else they feel like doing.

Still todo:
- Lots of testing.
- Figure out what happens to drupal_is_page_cacheable(). That sort of global approach totally won't work anymore. Best suggestion: Eliminate it outright and be done with it. If you want to mess with caching, that's what the response event is for. Fire after FinishResponseSubscriber and do whatever black magic you want.
- We should probably wire clearing the HttpCache page cache into drupal_flush_all_caches(), but I've not done so yet.
- Hook up the ESI support in HttpCache, which I've ignored for the time being.

Why???
- Because we want to have a single cache API all the way through, regardless of where it's getting cached. That cache API is Http. This lets a controller set a response object and headers on it, and they'll be honored regardless of where or how the response is being cached.
- Because we want to eliminate the "anonymous caches forever, authenticated never caches" distinction and move toward "things cache when they should cache, and don't when they shouldn't". This is a step in that direction.

Crell’s picture

StatusFileSize
new17.67 KB
FAILED: [[SimpleTest]]: [MySQL] 54,163 pass(es), 45 fail(s), and 0 exception(s).
[ View ]

Erm. One thing I forgot to mention. This does include the contents of #1945024: Remove subrequests from central controllers and use the controller resolver directly., because I expected it would matter. So far it doesn't.

Attached is a patch that contains just the cache changes, or should. :-)

msonnabaum’s picture

Why???
- Because we want to have a single cache API all the way through, regardless of where it's getting cached. That cache API is Http. This lets a controller set a response object and headers on it, and they'll be honored regardless of where or how the response is being cached.
- Because we want to eliminate the "anonymous caches forever, authenticated never caches" distinction and move toward "things cache when they should cache, and don't when they shouldn't". This is a step in that direction.

None of this has anything to do with wrapping $kernel with HttpCache.

There are some worthwhile changes here, but it really should be handled outside of trying to use HttpCache. This issue should be about making page cache work with the new Request/Response objects.

katbailey’s picture

+++ b/core/includes/bootstrap.incundefined
@@ -2207,6 +2104,11 @@ function drupal_handle_request($test_only = FALSE) {
   $kernel->boot();
   drupal_bootstrap(DRUPAL_BOOTSTRAP_CODE);

+  if (settings()->get('use_http_cache', TRUE)) {
+    $store_path = variable_get('file_public_path', conf_path() . '/files/http_cache');
+    $kernel = new HttpCache($kernel, new Store($store_path));
+  }
+

Pretty sure we don't want the kernel booted for cached pages.

Status:Needs review» Needs work

The last submitted patch, 1597696-httpcache-2.patch, failed testing.

xjm’s picture

Issue tags:+Needs profiling
sdboyer’s picture

looking at the patch in #64, my primary comment is along the lines of katbailey's in #66 - if we're going to wrap the main kernel with the cache kernel, it needs to be done a little earlier. i'm not sure about booting/not booting the kernel, but i am sure that we should be slotting in this caching mechanism as early in bootstrap as possible, at least similar to where the old page caching implementation was. that may have implications on where/how we set the storage location var, but that shouldn't be awful - sys_get_temp_dir() is diiirty, but perhaps adequate for these purposes?

beyond that, i agree with #65 - the rest of this issue is about fixing our interaction with Request/Response elsewhere in the code. fixing the failed tests should be a good start to that.

iamEAP’s picture

Status:Needs work» Needs review
StatusFileSize
new17.68 KB
FAILED: [[SimpleTest]]: [MySQL] 57,613 pass(es), 61 fail(s), and 10 exception(s).
[ View ]

Re-roll of #64.

beejeebus’s picture

are we dropping support for configuring caching from the UI?

<?php
+  if (settings()->get('use_http_cache', TRUE)) {
?>
Crell’s picture

Status:Needs review» Postponed

We already have. This just moves around where things happen. And should probably be postponed on #1984766: Change notice: Start relying on Request/Response objects for cache handling

beejeebus’s picture

ah, that will make things easier then.

i'll file a follow up to remove these values from core/modules/system/config/system.performance.yml:

cache:
  page:
    use_internal: '0'
    max_age: '0'

and take those fields out of the performance form.

msonnabaum’s picture

To clarify, we have NOT removed support for enabling cache from the UI, and I do not think we should.

Crell’s picture

Then what was it we removed from the UI?

katbailey’s picture

msonnabaum’s picture

Yes, my patch only removed the "turn cache on" setting because it was redundant with max age. The setting is now "use internal cache", which you'd only check if you didn't have a reverse proxy cache.

beejeebus’s picture

ok, so, #71 remains. using settings() is unpossible.

we need to use config() to check for 'use internal cache', which means we need a booted kernel.

Crell’s picture

Well that's going to be a rather major problem, since avoiding a booted kernel is the whole point. :-) The toggle for invoking HttpCache needs to be readable before we boot anything or it's not useful.

katbailey’s picture

Yeah, that would make me very sad. It's not clear to me why it would be so terrible to remove support for enabling the internal cache via the UI. Would be good to get some clarification on that.

msonnabaum’s picture

Why are we talking about needing page cache to come before the kernel is booted? We're again trying to solve problems we don't have while we have many other caching related problems to work on.

msonnabaum’s picture

Title:Switch page caching to HttpCache» Consider whether HttpCache offers any significant benefit over the existing page cache

Changing the title to better reflect reality. This is still postponed and dependent on the rest of core using the kernel properly.

giorgio79’s picture

Doctrine/commons is recommended here for caching instead of httpcache. Great read, take a look http://nerdpress.org/2012/07/10/caching-data-in-symfony2/

manarth’s picture

That article refers to using Doctrine/commons for caching data, rather than rendered pages/page fragments, and does acknowledge that for HTTP caching, the httpcache is fine.

catch’s picture

Priority:Major» Normal
Crell’s picture

Version:8.x-dev» 9.x-dev

I think we've determined that HttpCache isn't actually viable until we get rid of Simpletest and rewrite the installer, neither of which are happening in Drupal 8.

Very sad panda. :-(

chx’s picture

Issue summary:View changes

Note that ~/www/d8/core/vendor/symfony/http-kernel/Symfony/Component/HttpKernel/HttpCache/HttpCache.php has an eval() in it which should be considered a blocker.

Wim Leers’s picture

Status:Postponed» Active

#2263981: Introduce a robust and extensible page cache-policy framework just landed.

According to #2257695: [meta] Modernize the page cache, this is now the next step. Not sure though, since this is now marked for 9.x-dev…

Crell’s picture

Version:9.x-dev» 8.0.x-dev

It was marked for 9.x when we didn't think we'd be able to get the bootstrap process sufficiently unravelled before 9.x. Turns out we did (neclimdul++), so it would be technically possible to use HttpCache now. I think it's still worth trying it out now that it's technically possible to do. It should be trivial to write a middleware for it to see what happens.

chx: That eval() does look a little suspect. Since the method isn't well commented I can't tell if the purpose of it is to protect against code injection or to specifically allow it in a special circumstance. I recommend asking upstream in Symfony about it. I would also be open to asking that the method be made protected so we can override it if necessary, but we should ask first.

znerol’s picture

Status:Active» Needs review
StatusFileSize
new17.77 KB
FAILED: [[SimpleTest]]: [PHP 5.4 MySQL] 77,294 pass(es), 14 fail(s), and 2 exception(s).
[ View ]

Moves the pieces into the right place, but currently fails because session_name() is set too late (should be fixed in #2331909: Move DrupalKernel::initializeCookieGlobals() into page cache kernel decorator).

znerol’s picture

Moves the pieces into the right place, but currently fails because session_name() is set too late (should be fixed in #2331909: Move DrupalKernel::initializeCookieGlobals() into page cache kernel decorator).

Status:Needs review» Needs work

The last submitted patch, 90: 1597696-use-symfony-http-cache.patch, failed testing.

Wim Leers’s picture

Should this also remove some overhead and make Drupal 8 faster? I have no idea, I'm curious :)

This and other failures in PageCacheTest suggest that HttpCache doesn't follow the HTTP spec:

Conditional request without If-None-Match returned 200 OK.

… or that Drupal's current page cache doesn't, of course. At least it's a change in behavior that we should assess.

msonnabaum’s picture

Why are we trying to fix something that's not broken again?

Crell’s picture

Because "broken" is not the issue. "Works" does not preclude "could be made better/simpler/less coupled". Also, HttpCache can do its own ESI tag handling whereas our current cache system does not, as far as I am aware.

msonnabaum’s picture

*claps*

That's some masterful trolling there.

chx’s picture

Status:Needs work» Postponed (maintainer needs more info)

We are not adding another eval() into Drupal after years of work removed them. Just say no.

Especially not when

I can't tell if the purpose of it is to protect against code injection or to specifically allow it in a special circumstance.

so postponed until at least we know more about that eval(). This status was made for it.

znerol’s picture

+++ b/core/lib/Drupal/Core/StackMiddleware/PageCache.php
@@ -7,49 +7,113 @@
+    // Disable the file inclusion / evaluation feature of the parent class.
+    $response->headers->remove('X-Body-Eval');
+    $response->headers->remove('X-Body-File');

The eval is indeed ugly, but we can actually disable that part. Probably we'd need to do that in lookup as well.

That said I'm okay with postponing this. Given that the page cache now lives in a stack middleware it is easy enough to swap it out.

Wim Leers’s picture

This is the offending code:

    /**
     * Restores the Response body.
     *
     * @param Request  $request  A Request instance
     * @param Response $response A Response instance
     */
    private function restoreResponseBody(Request $request, Response $response)
    {
        if ($request->isMethod('HEAD') || 304 === $response->getStatusCode()) {
            $response->setContent(null);
            $response->headers->remove('X-Body-Eval');
            $response->headers->remove('X-Body-File');

            return;
        }

        if ($response->headers->has('X-Body-Eval')) {
            ob_start();

            if ($response->headers->has('X-Body-File')) {
                include $response->headers->get('X-Body-File');
            } else {
                eval('; ?>'.$response->getContent().'<?php ;');
            }

            $response->setContent(ob_get_clean());
            $response->headers->remove('X-Body-Eval');
            if (!$response->headers->has('Transfer-Encoding')) {
                $response->headers->set('Content-Length', strlen($response->getContent()));
            }
        } elseif ($response->headers->has('X-Body-File')) {
            $response->setContent(file_get_contents($response->headers->get('X-Body-File')));
        } else {
            return;
        }

        $response->headers->remove('X-Body-File');
    }

As you can tell, wonderfully undocumented. Googling for it only yields results that simply list source code. Zero documentation to be found.

dawehner’s picture

This seems to be a case where the use of HTTP as standardtool even for internal documentation, falls down. On the other hand I don't even understand why this is needed.

Also, HttpCache can do its own ESI tag handling whereas our current cache system does not, as far as I am aware.

Is there any reason why our system could not be expanded to do that?

catch’s picture

Per #99 I looked at blame and this was added by Fabien Potencier in the initial commit of the cache system during 2010, was also unable to find any documentation.

HttpCache "doing it's own ESI tag handling" is a red herring and we should really examine what this means between HttpCache and Drupal's render caching and the various use cases. Have had enough of empty assertions constantly repeated.

Much of Symfony's use of ESI is to avoid having to deal with cache invalidation. They emphasise setting short expires on content listings so that they're 'fresh', then longer expires on the overall page. http://symfony.com/doc/current/book/http_cache.html has quite a long lead in, that boils down to "it's hard to do cache invalidation so we don't really bother but you can try if you want".

By implementing cache tags in Drupal 8, and specifically the bubbling of them from child elements to parents, we've almost entirely removed the need for different TTLs on cached content. All elements of the page, as well as the page itself, can have a very long TTL. When we invalidate, only parents of the cleared content get affected, children and siblings (etc.) can still be fetched from cache. Varnish has support for invalidating cache tags already since we put them in a custom header, so this also works with reverse proxies.

This means that for anonymous users, our cache hit rates should be considerably higher already than if we were using Symfony's short TTL + ESI tag approach - we don't need ESI to fix that particular performance issue, it's already solved with cache tags.

There's still things to clean up, like figuring out Etags #2259489: Use strong validator for ETag and leverage symfony for HTTP revalidation, but again HttpCache doesn't help with that at all.

Apart from different TTLs on content, the actual valid use case for ESI remaining is personalisation.

#post_render_cache is what we settled on for handling placeholders and replacement within HTML. Rather than using this for different TTLs, we only use that in Drupal for personalisation at the moment, because we just don't need different TTLs any more.

Since #post_render_cache replacement in PHP can be bypassed if necessary, it's possible to send the render cache placeholders to the reverse proxy or browser and replace them there instead of PHP (via Redis + memcache, Varnish or Akamai ESI tag with some massaging, or JavaScript). We don't need anything extra for actual ESI support except for ensuring this works and cleaning up various bits of work that are quite far along.

So the remaining question is whether we should add #post_render_cache support to the full page cache. With cache tag support, there is no value to doing this for anonymous users - we already have a viable model for all but the most extreme edge cases (and those can and should be handled by JavaScript usually).

What there could be value in doing though, is implementing 'full page caching' for authenticated users, then using #post_render_cache for personalisation. This means we'd no longer need to render the page template each time, which is one of the main places we could save time in HTML rendering that is not viable to cache for authenticated users. I think Fabianx has plans to do this, but again the benefit is for authenticated users - it'd be making the page cache viable for them too.

Another reason that Symfony's approach to ESI doesn't really help us, is that the recommended way to implement ESI is by doing subrequests inside Twig templates (see the docs at http://symfony.com/doc/current/book/http_cache.html#using-edge-side-incl...). That only works if you can decide beforehand where the dynamic content should be included in your template. I don't think that ESI caching is a decision that should be made in a Twig template.

See Wim, Marco and Fabian's talk here for a lot more detail: https://amsterdam2014.drupal.org/session/render-caching-drupal-7-and-8

Fabianx’s picture

So as Nat said, with using a placeholder based approach where placeholders are better defined than just post_render_cache random functions and everything does our own we can easily do:

- ESI
- AJAX
etc.

The advantage is that we can get data from the cache also directly, which is superior to internal micro ESI requests to assemble a page, which would not even propagate the render chain correctly.

The only advantage in my opinion is that HttpCache allows to save pages to disk, but that is something we can also have easily with the layer based system we have been building.

But we would need to extend it anyway to handle #attached, so in this case we gain not much as we can't use stock anyway, the way Drupal works ...

And for ESI and Varnish caching, etc. to work properly there are other things we need to take care of first, too - for example fix form and route CSRF handling.

It is OSS software, so if this will be better and cool and allow some really nice things that we really need, for all means lets implement it and let the implementation prove why its so more cooler for itself, but just for the sake of using it, I would say this needs to show why it is so much better first.

- So using a persistent page cache on disk? Yes, we can do that.
- Using HttpCache out of the box? We probably do not gain much at this point anymore.

Leaving as postponed for now.

I will create the meta for the authenticated user page cache as we are sooo close to having that working - or at least empowering contrib to do so and do it in 8.1 or such.

znerol’s picture

The next step in #2257695: [meta] Modernize the page cache is #2348679: Move the remaining procedural page cache code to the page cache stack middleware which will pack remaining scattered bits of internal page cache into a swappable middleware.

I will create the meta for the authenticated user page cache as we are sooo close to having that working - or at least empowering contrib to do so and do it in 8.1 or such.

He, the very reason I actually started working on the now superseeded #2177461: Refactor page caching into a service issue was that I realized that a D8 port of Authcache will be much easier when core gets more flexible in this regard.

Crell’s picture

I am sitting with Fabien at Symfony Live NYC right now and asked about the eval(). That's part of the ESI handling in HttpCache. In a nutshell, ESI tags in the cached page are converted to small PHP snippets that, when the cached page is loaded, will call back to the inner Kernel to make the esi request.

See Esi::handleEsiIncludeTag() for the code that creates the PHP.

chx’s picture

So there's some capability to set ESI tags and then ... you resolve them request time by making more Drupal requests? That may be useful for authcache but for normal cache I would rather not sure anything like that...

Fabianx’s picture

We do support something like that via #post_render_cache already now in Core (without eval) and we need to store additional information anyway like cache tags, so as already stated this Component is not useful at the moment.

But thanks for the information!

Crell’s picture

Any in-process ESI emulation is going to have to do something like that one way or another, as you will need to make a new request to get the content to replace. I didn't ask how deeply it supports force clearing via tags, which I agree would be a problem.