Updated: Comment #N

Problem/Motivation

Currently it's possible that extensive use of rules leads to stampeding because of the time it requires to build the full rules cache. Since the rules cache is required very early (hook_init()) even the usage of enhanced caching mechanisms can't cover all cases.
This can lead to serious trouble that can make a site completely unresponsive because all resources are used by the incoming requests to build the rules cache.

Proposed resolution

Introduce a locking to ensure only one request can build the rules cache. In a worst case scenario there will be requests waiting quite some time for an answer or even time-out - but at least the server resource's wont be exceeded.

Remaining tasks

Reviews needed.
Tests?

User interface changes

None.

API changes

No changes, but introduction of rules_event_invocation_enabled() to enable / disable the invocation of rules events.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

das-peter’s picture

Oh, I think there's something wrong with the patch. Update attached.

fago’s picture

Status: Needs review » Needs work

Thanks. This looks quite ready, here a few remarks:

  1. +++ b/rules.module
    @@ -318,22 +318,46 @@ function &rules_get_cache($cid = 'data') {
    +        rules_event_invocation_enabled(FALSE);
    +        watchdog('rules', 'Cache rebuild lock hit: !cid', array('!cid' => $cid), WATCHDOG_WARNING);
    +        rules_event_invocation_enabled(TRUE);
    

    This seems sane to do for all of the rules logging stuff. Maybe, we could add add a helper doing that sort of watchdog logging to the rules logger?

  2. +++ b/rules.module
    @@ -318,22 +318,46 @@ function &rules_get_cache($cid = 'data') {
    +      if (isset($locking)) {
    

    How can this be false?

  3. +++ b/rules.module
    @@ -1634,3 +1658,21 @@ function rules_entity_metadata_wrapper_all_properties_callback(EntityMetadataWra
    +  if (!is_null($enable)) {
    

    Minor: Should use isset() as usually.

das-peter’s picture

Thanks for the feedback, much appreciated!

  1. Added that to rules_log(). Wrapped $logger->log() as well - if I undestand it correctly it could be exchanged and thus use watchdog()
  2. Looks like a refactoring leftover - removed.
  3. Changed to isset()
fago’s picture

oh, you got me wrong on 1), sry for being unspecific there. I was thinking watchdog is called from RulesLog, but it isn't. I don't think one can replace the rules logger right now, so no need to wrap the log() method call also.

I've updated the patch accordingly, please take a look. The rest of the patch looks good to me now, thanks!

das-peter’s picture

Status: Needs review » Reviewed & tested by the community

Thanks for the feedback.
Looks good to me too. So I'd say RTBC :)

fago’s picture

Status: Reviewed & tested by the community » Fixed

Thanks, committed it!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

das-peter’s picture

Status: Closed (fixed) » Reviewed & tested by the community

Looks like the commit newer made it into the repo: http://drupalcode.org/project/rules.git/blob/e1cdbd9d49df21a23f4309811a3...
Thus re-opening.

fago’s picture

Status: Reviewed & tested by the community » Fixed

hm, not sure what I messed up, probably I forgot to push. Committed & pushed it now.

  • Commit 34c3ac6 on 7.x-2.x authored by das-peter, committed by fago:
    Issue #2190553 by das-peter, fago: Add locking to cache rebuild to avoid...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

kitikonti’s picture

I just wanted to ask if it is a normal behaviour if i get this regularly, i would say 3 times per hour.

jelo’s picture

I have the same question as kitikonti which I had posted here:
https://www.drupal.org/node/2406863

I have 10 rules active on my site:
- 7 act on before saving a certain content type (each one for a different content type)
- 2 are notifications when scheduler published or unpublished content
- 1 notifies a user after a successful feed import every week

There is hardly any content being saved right now, scheduler has not fired a single time and the import has not run. This does not sound like "extensive use of rules" to me. When is a cache rebuild triggered? I looks strange to me that I would get these warnings basically every few hours.

Shane Birley’s picture

In my case, I am running Drupal Commerce. And with the site running in maintenance mode, this is triggering every few minutes. In some cases, several times in a minute.

tomrog’s picture

In my case I have ~20% of db_logs containing "Cache rebuild lock hit". Is this normal? I have a pretty big site, ~10k users, Drupal Commerce, Userpoints, loads of nodes.

If this is normal, then maybe this whole preventing shouldn't be logged every single time? Cause, one full page of this log (in one minute) is pretty messy. Please give more info on that topic.

FSheFF’s picture

Since the recent upgrade to Rules 7.x-2.9 we are seeing "Cache rebuild lock hit: event_init" over and over again although we have nothing scheduled. Have rebuilt Rule's cache via Advanced tab several times. Drupal 7.34.

loze’s picture

Since we upgraded to 7x.2.9 are are seeing "Cache rebuild lock hit: rules_event_whitelist" in the logs

mdupont’s picture

margyly’s picture

After updating some code, with Rules 7x.2.9, our database had overrides, which I reverted. I got this message over and over:

WD rules: Cache rebuild lock hit: data [warning]

I tried to clear the cache and got a zillion of these:

WD rules: Cache rebuild lock hit: rules_event_whitelist [warning]

Drupal was hung.

klausi’s picture

#2406863: stampede protection should be flexible ("Cache rebuild lock hit" watchdog message) has now a patch to remove the spamy logging altogether. Please review and test there!

znerol’s picture

This is causing a lock-stampede on one of my sites (i.e. all mysql processes waiting on the lock to be released). Removing the lock-code brought that site back to normal again.

znerol’s picture

Turns out in my case the cache_rules table was somehow damaged (it was visible in mysql but not accessible). What happens if an exception is thrown while a process holds a lock? Is there any situation where we fail to release it?

lmeurs’s picture

ShaunDychko’s picture

Rules threw this error "Cache rebuild lock hit: rules_event_whitelist" three times in the same minute, and failed to execute the actions it was supposed to do.

ShaunDychko’s picture

Status: Closed (fixed) » Needs work
ShaunDychko’s picture

Status: Needs work » Closed (fixed)

Guess my comment belongs in a new issue.

leisurman’s picture

Has this issue been fixed yet? Has anyone tried to downgrade the module version?
This is still a problem
https://www.drupal.org/node/2406863