Problem/Motivation

The settings page for says "WARNING: This feature is NOT compatible with the Internal Page Cache module or external page caching software like Varnish."

I think we can do better than this.

Proposed resolution

Disable page caching when necessary.

Remaining tasks

User interface changes

API changes

Data model changes

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

alexpott created an issue. See original summary.

alexpott’s picture

Status: Active » Needs review
FileSize
4.73 KB

Here's a patch. I'm not sure if it is 100% necessary for forced logins because they are using an uncacheable redirect but I guess it is better to be safe than sorry.

bkosborne’s picture

Out of curiosity what made you go down this path, since we are already returning non-cacheable redirect responses anytime we redirect for a forced login and gateway login?

alexpott’s picture

@bkosborne well given that the response will always depend on what CAS does or does not do I just think it is safer to disable it. I think you are correct that if the redirect is issued then this is not needed. So it might not be needed for forced redirect but with gateway and the different configurations it seems safer just to disable caching on pages where it is active.

metzlerd’s picture

I don't have a specific use case, but I have an uneasy feeling regarding affecting all pages without a configuration switch to disable it. Using servers like varnish that have the possibility for things like edge site includes and "big pipe" implementations, makes me nervous. I just don't know enough about those technologies to know whether we should be mucking with the cache headers for non cas pages.

What do others think here? SHould we leave it on by default but allow disabling in a miscellaneous settings? Given the response by @alexpott should we wait until we have a real use case in mind before implementing this patch?

I certainly don't want to be the enemy of page caching....

metzlerd’s picture

I should also mention that drupal 8 supports page caching for logged in users, as well as cache settings in the render array. Are we interested in disabling those features? Do we understand the implications of this policy hook here on those settings?

alexpott’s picture

@metzlerd but the point is if you cache pages nothing in CAS's subscriber will fire. At least with the patch attached this is acknowledged so that if a user users the forced login or gateway feature on specific pages they can benefit from page cache on other pages. Atm the instructions basically say these features are incompatible with page cache.

alexpott’s picture

@metzlerd also page_cache module only works for anonymous users - what you might be referring to is the fact that authenticated users can use render cached parts of the page. That is completely unaffected by this.

metzlerd’s picture

I agree with the goal, I'm just left uneasy about the implementation choice and how heavy handed it is.

One of the big moves in the caching api changes was to allow more sophisticated caching methodologies. You can, for example, make all of the page except a specific block be cached. Imagine that this block made some decisions based on path about whether to fire the redirect, and only then decided to do so.

Also there is the point about caching for pages after we've already logged in. I think you are disabling that. In d7 logged in pages were never cached... in D8 I don't think that is true anymore, so this solution is putting in a performance penalty that may not be necessary.

Seems to me the choices moving forward:

1) Figure out a cache friendly way to do the redirect, either with JS support for the decision and an edge side include strategy, or looking at the non-cacheable block strategy.

2.) Proceed on this patch, and come back to remove it later when we figure out a better way to do it.

3). Differ the question till we can make a more informed decision.

metzlerd’s picture

@alexpott: Yes that's what I'm referring to, but I hadn't read your PS before you responded. I guess I need to do some research about page caching and how this policy fits into the decision before I'm comfortable, or hear from someone else that I misunderstand page caching.

yalet’s picture

@metzlerd:

Right now, if you have the gateway feature enabled, it interacts poorly with page caching: if the page requested exists in the cache, it will get returned with no gateway check even if gateway is enabled on that page. What the patch here does is prevent pages from getting cached if gateway ( or forced login ) is configured on that page.

What this means is that if you have gateway configured for all pages, you can't benefit from page caching ( which is different from benefiting from cached portions of a page during page construction ) at all. This differs from the current scenario where you can't benefit from gateway if you have page caching enabled. This scenario only occurs if the site is configured in a way that we call out pretty clearly as being broken.

This patch allows you to benefit from page caching on pages where you weren't going to use gateway or forced login at all. Currently, those pages are collateral damage, as if you want to use gateway mode correctly, you have to turn off all page caching.

@alexpott:
Is there a way to get a test for this?

bkosborne’s picture

I was writing an explanation similar to yalets. I agree, and this is a good patch to implement.

metzlerd’s picture

@yalet, @bkosborne: I think you are all assuming that I don't understand the problem, or what the patch does, which isn't really true. I admit to being a little confused about how the cache policy hook that we're using affects dynamic page caching described at: https://www.drupal.org/documentation/modules/dynamic_page_cache. I was worried about unintentional affects, but I think @alexpott has allayed those concerns.

I continue to believe that there are solutions worth pursuing that will allow compatibility with the page caching module and still retain the benefits of the gateway and forced login solutions. Remember the starting point for this was "we can do better", and I was just reaching for a solutions that didn't make you choose between performance and the gateway option, which I find useful.

When my institution finally makes the jump to Drupal 8, I'll probably be back with patches to that effect.

bkosborne’s picture

I think we can actually cache the redirect responses for forced login. When a specific page has forced auth enabled, all requests can be safely redirected there.

So I updated the patch to:

(1) Only disable page caching for gateway auth instead of gateway auth AND forced auth.
(2) Adjust the forced auth redirect in our subscriber to be a cacheable redirect. It was not previously.

For gateway auth I added a separate issue so we can discuss how to improve that feature to be more compatible with page caching: #2798343: Make the gateway redirect feature more compatible with page caching, but in the mean time it would be nice to get this committed since it improves our interaction with page caching a bit.

Status: Needs review » Needs work

The last submitted patch, 14: 2783235.14.patch, failed testing.

bkosborne’s picture

Previous patch was empty

The last submitted patch, 14: 2783235.14.patch, failed testing.

bkosborne’s picture

Forgot to mention that I removed the unit test we had for the forced login controller. That's because I wrote a functional test for that controller instead, which is much easier to write and maintain and makes a bit more sense for a controller (we don't have to mock the request).

bkosborne’s picture

metzlerd’s picture

I downloaded and started to test this patch and found out that I couldn't test it because the gateway option appears to be broken with my cas server. I want to get back to testing this, but I feel I need to identify why this code doesn't work. Log messages indicate that the cas server cannot find the ticket in the response, but we've checked the urls with fiddler and the ticket appears to be there.

Will do more research on this on monday. I don't know if this is something at our end so I won't mark this as needs work yet.

metzlerd’s picture

I was able to get the gateway option working locally, but am still having troubles getting any pages to be cached even without this patch applied. Every page load even with page caching enabled appears to cause a redirect to the cas server with the gateway option fully enabled EVEN when I'm configured to use the page cache. Which makes one wonder what the patch will accomplish and how we test it ;).

I haven't given up yet, but wanted to know if any others were able to get the gateway function to fail on cached pages WITHOUT this patch applied and did you have to do anything special to make that happen?

metzlerd’s picture

I found out the reason that I wasn't getting cached pages is that I did have the gateway option enabled. I'm once I disabled the gateway option caching resumed. Then when I re-enabled the gateway option once per session, and cleared my cache, then no anonymous pages appear to be cached, regardless as to whether I'm on the path configured for gateway check. This appears to be the case both before and after applying this patch.

I'm not sure why this is but I'd like to be able to demonstrate the problem before we try and "fix it", especially since there are no tests.

bkosborne’s picture

Sorry I missed your 5 day old comment. I think I know what's going on and will comment later on when I have more time.

bkosborne’s picture

Gateway is working properly for me without this patch:

1. Created an article node
2. Configure CAS to trigger gateway for this node only, once per browser session. (make sure you include the preceding slash for the path: /node/1)
3. Make sure render cache for the page your visiting is cleared (I just clear all entries in cache_render table)
4. Visit the article node as an anonymous user and get redirected to CAS server for gateway auth check

I tried with setting it to check "every page load" and it worked as expected as well.

Note that the module is already telling Drupal NOT to cache the redirect response to the CAS server like you've already observed. I think it's possible that the page could still end up cached if our gateway redirect logic did not occur (which is why this patch takes the safe route and just disables cache on these pages entirely), though I'm having trouble coming up with scenarios of how that could happen at the moment except for the scenario of a web crawler hitting the page. The web crawler would not trigger out gateway redirect, and the page response would be cached as a standard page render. Then any subsequent traffic from legit users would never get the gateway auth check until the cache was cleared.

metzlerd’s picture

Perhaps then we need a way to force the issue to prove the fix?

@bkosborne: Do you think we should commit the untestable patch on the off chance that we think it protects the robots problem? Or do we feel like we should postpone this issue? Or do you feel that we should come up with a way to test the robot scenario.

Like I said, long term it would be nice to develop something that allowed caching using dynamic page cache or some other edge side include or ajax technology rather than fully disabling the page cache options, but I don't want to be a barrier here, so I am interested in your thoughts on moving this issue forward?

bkosborne’s picture

I think I could write a patch that proves the effectiveness of this patch. I'll see about doing that soon.

bkosborne’s picture

Ugh, so in my attempt to write a patch for this I discovered that it seems like we cannot spoof the user agent server variable in functional tests. Drupal apparently relies on having a very specific user agent present on web tests and if it's not there tests break.

I wonder if it would be enough for me to provide you steps to reproduce the problem instead?

1. Enable internal page cache
2. Clear all your caches.
3. Create a basic page
4. Configure CAS to perform gateway auth for that specific page only (make sure you use the leading slash in the path /node/1)
5. Open chrome and configure it to spoof the user agent. Not sure if you have a Mac, but on my version you have to open Dev Tools, then click the three little dots on the top right and go to More Tools -> Network Conditions. Then within that tab you can set the user agent to contain the string "Google".
6. In chrome, visit that page for which you enabled gateway auth.
7. Observe that you are NOT redirected.
8. Now, in a separate browser (or disable user agent spoofing in Chrome), visit the same page. You SHOULD be redirected to the CAS server for gateway auth, but you won't be.

I'm desperately looking for a way to spoof user agents in functional tests because this is a very testable feature and it would be great to get it setup.

metzlerd’s picture

Rather than spoofing user agents for the test, which is complicating things, we should find a way to inject configuration into our tests that will override the user-agent detection phase. User agent detection should be unit tested anyway rather than integration tested. It seems to me that some of the new events we're talking about may allow us to override that behavior in the tests.

That being said, I'll run through the paces manually so we can get this committed.

metzlerd’s picture

Status: Needs review » Reviewed & tested by the community

Was able to run this patch through its paces and it works as advertised. @bkosborne: Thanks for the user-agent trick.

  • bkosborne committed 8721cbe on 8.x-1.x
    Issue #2783235 by bkosborne, alexpott: Disable page cache if forced...
bkosborne’s picture

Status: Reviewed & tested by the community » Fixed

Thanks for the review.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.