Rather than wait for cron to trigger, it would be really great if the recache process could be triggered via hook_exit(), so that it happened instantly.

Members fund testing for the Drupal project. Drupal Association Learn more

Comments

DamienMcKenna created an issue. See original summary.

DamienMcKenna’s picture

Status: Active » Needs review
FileSize
2.36 KB

As described. I need to test it a bit.

Vacilando’s picture

Related issues: +#2714211: Rules Integration

Hmm, good one. Thought of it as well, in fact posted some thoughts about immediate recaching in #2714211: Rules Integration / 3.

We could fire a re-caching request immediately as we detect a node change, or at hook_exit().

The problem I see is that at that moment caches *most probably* are not yet expired. Esp. any external cache (Boost, Varnish, Akamai, ...) needs a certain time to receive the purge requests and process them. If we fired our re-caching request so soon, we would probably still hit the old cache.

That said, I am happy to implement recaching on hook_exit() as an option. It probably will work for standard Drupal page caching (database), and maybe for local types of caching, such as Boost. I just doubt it can ever work like that when using Akamai / Varnish / Fastly / ...

What do you think, @DamienMcKenna? Please correct me if I am wrong.

DamienMcKenna’s picture

You bring up an excellent point. The site we want to use it with runs cron every minute, so we're going to start with hook_cron() and look into further optimizations later as needed.

DamienMcKenna’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Probably needs some tests too.

Vacilando’s picture

Does not apply to the current dev; could you do a quick re-roll?

DamienMcKenna’s picture

Status: Needs work » Needs review
FileSize
2.67 KB

Rerolled.

  • Vacilando committed 86d47a1 on 7.x-1.x
    Issue #2828873 by DamienMcKenna: Optionally trigger recaching via...
Vacilando’s picture

Committed to 7.x-1.x-dev. Needs the tests -- and testing before we can close this issue.

Vacilando’s picture

Changed the checkbox that added caching via the exit hook to a radio button, so that the exit hook and cron become alternatives. Also made sure that the cron hook does not get executed if the exit hook is selected. Already committed to 7.x-1.x-dev. Is backward-compatible.

Vacilando’s picture

@DamienMcKenna -- some comments:

  1. I'd say the &drupal_static lines may be moved outside the foreach() in recacher_expire_cache() -- correct?
  2. Using static caching, you guarantee recacher_crawler() gets executed in recacher_exit(). But only once, if I am not mistaken. If variable 'recacher_max_urls' specifies that only a few URLs may be recached at a time, the rest of the expired URLs will never be recached.
  3. About the tests -- could you add what you have in mind? I don't need they have to be exhaustive for now.

If you do any improvements please provide a patch based on the current dev. Thanks!

  • Vacilando committed eeaa7f3 on 7.x-1.x
    #2828873 disabling hook cron code if hook exit option is selected
    
DamienMcKenna’s picture

I'm on vacation until January, I'll try to review the feedback then,

DamienMcKenna’s picture

FileSize
9.46 KB

Yeah, moving the drupal_static make sense.

It might be better to not disable hook_cron() if hook_exit() is being used. At the same time it might be worthwhile adding a lock around it so that there isn't any stampeding. I've added a patch that adds this functionality.

Vacilando’s picture

Thanks, @Damien!

Had to change lock_acquiare() to lock_acquire(), LOL :-)
Committed to 7.x-1.x-dev.

Testing both the scheduled and immediate caching. Please test this also on your side before I roll a new stable release.

Vacilando’s picture

Added a test to see whether there are any modules implementing hook_recacher_role_options(). Without this test, variable $recacher_role_options was being emptied.
See patch. Committed to 7.x-1.x-dev.

  • Vacilando committed 8580cb5 on 7.x-1.x
    Issue #2828873 by DamienMcKenna, Vacilando: Optionally trigger recaching...
Vacilando’s picture

@Damien: I understand why you prefer not to disable hook_cron() if hook_exit() is being used.
What worries me is that after hook_exit() runs, even if it is not successful, it will clear the URLs from the queue, meaning hook_cron() will never get a chance to pick them up.

Therefore I propose to allow URL deletion from the queue only when hook_cron() runs.
hook_exit() may try to re-cache even earlier (if switched on at the config page), but will not delete the URLs it crawled.
The only disadvantage of this approach is that if hook_exit() manages to achieve the recrawl, then hook_cron() will do nothing. On the other hand, it will be a very fast and cheap call hitting the cache.

Patch attached, code committed to 7.x-1.x-dev!

Vacilando’s picture

Seems to work perfectly for me. What hook_exit() does not catch is cached by hook_cron(). Test OK.

Vacilando’s picture

Status: Needs review » Fixed

Seems stable; marking as fixed. If you see any issues, don't hesitate to reopen.

Vacilando’s picture

Now also in 7.x-1.2.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

DamienMcKenna’s picture

(just catching up after two months of email overload) Thanks for all the improvements, @Vacilando!

Vacilando’s picture

Hey this is thanks to your report and a great initial patch DamienMcKenna ✓