Looking to use this as an SSO cache bypass for a D7 site using simpleSAMLphp.

Thanks.

Comments

iamEAP’s picture

I have a sandbox project called ua_cache_bypass that was based off of code from this project.

I ported it to D7, so I bet you could take cues from my hook_boot implementation for a D7 version of this. I believe it's much simpler in D7. Check this file:

http://drupalcode.org/sandbox/iamEAP/1418344.git/blob/20754e6859c7908825...

iamEAP’s picture

Version: » 6.x-1.0-rc1

Sandbox project is now a full project. See: http://drupal.org/project/ua_cache_bypass

brian_c’s picture

Well... nuts.

So I've been toying around with a D7 port... and due to differences in D7's bootstrap sequence, this module's technique of re-invoking drupal_bootstrap() causes hook_boot() to be invoked TWICE.

  • In D6, hook_boot() is only invoked once in the bootstrap sequence, in the same place regardless of whether caching is active or not.
  • In D7, hook_boot() is invoked from two separate locations, depending on whether caching is active or not (from _drupal_bootstrap_page_cache() or _drupal_bootstrap_page_header(), respectively). So calling drupal_bootstrap() from inside _drupal_bootstrap_page_cache() results in _drupal_bootstrap_page_header() being reached, causing a second invocation of hook_boot(). Boourns.
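To make the two invocation points concrete, here is a minimal toy model in plain PHP. It is not Drupal core code: the class and its method names are hypothetical stand-ins for the phases named above, and it only assumes that a bypass module's hook_boot() continues the bootstrap from inside the page cache phase.

```php
<?php
// Toy model only: these names are hypothetical stand-ins, not Drupal core.
class ToyBootstrap {
  public $bootCalls = 0;
  private $hijacked = FALSE;

  // Stand-in for the cached-page branch of _drupal_bootstrap_page_cache().
  public function pageCachePhase() {
    $this->invokeBoot();
  }

  // Stand-in for _drupal_bootstrap_page_header().
  public function pageHeaderPhase() {
    $this->invokeBoot();
  }

  // Stand-in for bootstrap_invoke_all('boot').
  private function invokeBoot() {
    $this->bootCalls++;
    $this->bypassModuleBoot();
  }

  // Stand-in for a cache-bypass module's hook_boot(): on the first call it
  // continues the bootstrap itself, which reaches the page header phase.
  private function bypassModuleBoot() {
    if (!$this->hijacked) {
      $this->hijacked = TRUE;
      $this->pageHeaderPhase();
    }
  }
}

$request = new ToyBootstrap();
$request->pageCachePhase();
echo $request->bootCalls, "\n"; // prints 2
```

In this sketch the boot hooks run once from the cache phase and once more from the header phase, which is the doubling described above.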

iamEAP, have you noticed this "doubling up" of hook_boot() in your ua_cache_bypass module?

So I'm a bit bamboozled about what to do here. Two potential options come to mind:

  1. Live with the hook_boot() doubling problem, and warn people about it in the documentation.
  2. Instead of calling drupal_bootstrap() again, execute the remaining bootstrap phases ourselves manually, and re-implement _drupal_bootstrap_page_header() to omit the 2nd hook_boot() invocation.

I imagine running hook_boot() twice could cause all sorts of weird, hard-to-anticipate problems (heck, it even broke my Dynamic Cache Test module, which stores an "original" value for the cache setting... which was then promptly overwritten on the 2nd hook_boot() call).

And the problem with the 2nd approach is that it won't update Drupal's current bootstrap phase properly, so it basically breaks drupal_get_bootstrap_phase(). I'm not sure what the implications of that might be.

So I think a version as clean as the D6 one is impossible in D7 :( Both approaches outlined above have potential side effects.

Perhaps a way of allowing developers to select one method over the other...?

Blech.

brian_c’s picture

So sleeping on this gave me a crazy thought...

The D7 version can come with a dependent, secondary sub-module (call it "Dynamic Cache Double Boot Fix" or something), that has a very LOW weight (so it will execute FIRST, instead of last), which will intercept the second hook_boot() call from _drupal_bootstrap_page_header(), and never return from that... preventing any other hook_boot()'s from executing a second time. DOUBLE BOOTSTRAP HIJACK! Ha!

In theory, should work perfectly with no side effects. Will take a crack at this later.
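The idea can be sketched with a toy dispatcher in plain PHP. Everything here is hypothetical (module names, weights, the dispatcher itself), and one simplification is assumed: a hook returning FALSE stops the pass, standing in for a module that takes over the request and never returns.

```php
<?php
// Toy model only: hypothetical stand-ins, not code from Dynamic Cache or
// Drupal core.
class ToyBootDispatcher {
  public $pass = 0;
  public $otherModuleCalls = 0;

  // Module name => weight; lower weights run first.
  private $weights = array(
    'dynamicCache' => 1000,
    'otherModule' => 0,
    'bootfix' => -1000,
  );

  // Stand-in for bootstrap_invoke_all('boot'). A hook returning FALSE stops
  // the pass (assumed here in place of "never return").
  public function invokeBoot() {
    $this->pass++;
    asort($this->weights);
    foreach (array_keys($this->weights) as $module) {
      if ($this->$module() === FALSE) {
        return;
      }
    }
  }

  // Lowest weight: lets the first pass through and swallows the second.
  private function bootfix() {
    return $this->pass < 2;
  }

  // An ordinary module whose hook_boot() should run exactly once.
  private function otherModule() {
    $this->otherModuleCalls++;
    return TRUE;
  }

  // Highest weight: re-enters the boot hooks, as continuing the bootstrap
  // would in D7.
  private function dynamicCache() {
    $this->invokeBoot();
    return TRUE;
  }
}

$dispatcher = new ToyBootDispatcher();
$dispatcher->invokeBoot();
echo $dispatcher->otherModuleCalls, "\n"; // prints 1
```

Two passes start, but the ordinary module's hook only runs during the first one, because the low-weight module ends the second pass before anything else is reached.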

brian_c’s picture

So the crazy idea worked. :) Initial D7 release is up!

iamEAP, really interested to hear if your ua_cache_bypass module (D7 version) is experiencing duplicate invocations of hook_boot(). If so, maybe we can find a way for ua_cache_bypass to leverage dynamic_cache?

brian_c’s picture

Version: 6.x-1.0-rc1 » 7.x-1.0-rc1
Status: Active » Needs review

iamEAP’s picture

UA Cache Bypass dependency on Dynamic Cache is definitely in the plan.

When I was testing it, I never experienced the duplicate hook_boot invocation, though I was testing on a very simple setup with very few (if any) contrib modules in the boot space. It's possible it's happening and I'm just not aware of it.

I'm in the middle of a D6->D7 upgrade currently, but I'll take a look at this and integrating UACB with it soon.

Thanks for the work and debugging, brian_c.

thedavidmeister’s picture

Ah, looks like I re-invented the wheel a bit with the UA cache bypass, but it was only 3 or 4 hooks for my use case. I'm using browscap, together with the D7 version of dynamic_cache, to detect IE6 users and send them to a "browser unsupported" page in hook_init.

I've got all the page caching options on in my Performance settings and I'm also using the filecache module, so I've got a slightly-more-than-vanilla setup. Anonymous IE6 users are still getting redirected on hook_init with the caching all turned on :)

Thanks heaps, without this module I'd be jumping through hoops to either get drupal_goto working in hook_boot or re-implementing a core function for one tiny feature in my site!

awm’s picture

I looked at the module and I am not sure why I'd use it instead of doing the following:

function MYMODULE_boot() {
  // Some logic...
  drupal_page_is_cacheable(FALSE);
  // And we're done.
}

drupal_page_is_cacheable() is called by drupal_page_get_cache(), which is called by _drupal_bootstrap_page_cache().
If drupal_page_is_cacheable() returns FALSE,
then drupal_page_get_cache() returns no cached page,
which leads to a full bootstrap.

Am I missing something?

brian_c’s picture

awm, your code would indeed prevent the current page request from being cached. But it wouldn't let you switch back and forth between cached/non-cached for the same page request.

If caching is turned on, *and there is a cached version of the page available*, then setting drupal_page_is_cacheable(FALSE); will not have any effect; you would still get the cached page no matter what.

This happens because hook_boot() is not invoked until _drupal_bootstrap_page_cache() has ALREADY determined whether to serve a cached page or not. See the following code from bootstrap.inc:

    $cache = drupal_page_get_cache();
    // If there is a cached page, display it.
    if (is_object($cache)) {
      header('X-Drupal-Cache: HIT');
      // Restore the metadata cached with the page.
      $_GET['q'] = $cache->data['path'];
      drupal_set_title($cache->data['title'], PASS_THROUGH);
      date_default_timezone_set(drupal_get_user_timezone());
      // If the skipping of the bootstrap hooks is not enforced, call
      // hook_boot.
      if (variable_get('page_cache_invoke_hooks', TRUE)) {
        bootstrap_invoke_all('boot');
      }
      drupal_serve_page_from_cache($cache);

Note that drupal_serve_page_from_cache($cache); will always be called, regardless of what happens as a result of bootstrap_invoke_all('boot').

drupal_page_is_cacheable(FALSE); will "work" insofar as it will prevent a cached version of the page from ever being created in the first place. But if you need a cached version of the page for some situations, and a dynamic version of the SAME PAGE for a different situation, you're screwed (without Dynamic Cache).
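The ordering problem can be illustrated with a small toy model in plain PHP. The class below is a hypothetical stand-in for the flow in the bootstrap.inc excerpt, not core code; the only thing it assumes is that the cache lookup happens before any boot hook runs.

```php
<?php
// Toy model only: hypothetical stand-ins for the core functions named above.
class ToyPageCache {
  public $cacheable = TRUE;
  public $stored = NULL;

  // Stand-in for the page cache phase followed by normal page generation.
  public function handleRequest(callable $hookBoot) {
    // The lookup happens before any hook_boot() runs.
    $cached = $this->cacheable ? $this->stored : NULL;
    if ($cached !== NULL) {
      // Too late: the hook runs, but the cached page is served regardless.
      $hookBoot($this);
      return "HIT: $cached";
    }
    $hookBoot($this);
    $page = 'fresh page';
    if ($this->cacheable) {
      $this->stored = $page;
    }
    return "MISS: $page";
  }
}

// A hook_boot() stand-in that opts out of caching, and one that does nothing.
$optOut = function (ToyPageCache $cache) { $cache->cacheable = FALSE; };
$noop = function (ToyPageCache $cache) {};

// Empty cache: opting out keeps the page from being stored.
$empty = new ToyPageCache();
$emptyResult = $empty->handleRequest($optOut);

// Warm cache: opting out cannot stop the cached copy from being served.
$warm = new ToyPageCache();
$warm->handleRequest($noop);
$warmResult = $warm->handleRequest($optOut);

echo $emptyResult, "\n"; // prints "MISS: fresh page"
echo $warmResult, "\n";  // prints "HIT: fresh page"
```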

The classic scenario is the "shopping cart problem"... you want to serve cached versions of your pages (to anonymous users) for speed, UNTIL they have started adding items to a "shopping cart", at which point you want to serve a dynamic page (i.e., with "You have X items in your shopping cart").
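A rough sketch of what a hook_boot() for that scenario might look like. Two things here are assumptions rather than facts from this thread: the cookie name is made up, and the sketch assumes Dynamic Cache reacts to the page cache setting being switched off in $GLOBALS['conf'] during hook_boot(). Check the module's README for the actual mechanism before relying on it.

```php
<?php
/**
 * Implements hook_boot() for a hypothetical module named "mymodule".
 */
function mymodule_boot() {
  // Assumption: the cart code sets this cookie once a cart has items.
  // The cookie name is invented for this example.
  if (!empty($_COOKIE['mymodule_cart_items'])) {
    // Assumption: Dynamic Cache picks up this flag and serves a dynamic
    // page instead of the cached copy.
    $GLOBALS['conf']['cache'] = FALSE;
  }
}
```

Visitors without the cookie are left alone, so they keep getting cached pages.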

Hope this clarifies things.

awm’s picture

I actually noticed that too. When the page is already cached, my solution cannot switch caching on or off. Thanks for the prompt reply.
Another thing: why do you have the boot_fix module? Wouldn't it be similar to calling drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL, FALSE)? The second parameter, $new_phase, is documented as: "A boolean, set to FALSE if calling drupal_bootstrap from inside a function called from drupal_bootstrap (recursion)."

Thanks

brian_c’s picture

I'm pretty sure I tried that approach with $new_phase first and it didn't work.

The problem is that D7 invokes hook_boot() from two separate locations (which are never intended to both be run in the same request), one is in DRUPAL_BOOTSTRAP_PAGE_CACHE (as described above, which only runs for cached pages) and one is in DRUPAL_BOOTSTRAP_PAGE_HEADER (which runs on new page generation).

Dynamic Cache causes both of these blocks to be run (invokes a full bootstrap from DRUPAL_BOOTSTRAP_PAGE_CACHE, which core is not expecting). So we need to intercept the hook_boot() call from DRUPAL_BOOTSTRAP_PAGE_HEADER and do nothing (ie, prevent subsequent hook_boot() calls from firing) with another "hijack".

The reason this requires an entirely separate module is because Dynamic Cache needs to run LAST (after all other hook_boot()'s), and the Bootfix module needs to run FIRST (before all other hook_boot()'s), which means they need module weights at extreme opposite ends.

awm’s picture

What could happen if I do not use boot_fix? Drupal bootstrapping twice?

brian_c’s picture

If the bootfix sub-module is not used, hook_boot() is called twice (NOT the whole bootstrap process run twice, just hook_boot() invoked from two different locations).