When inserting a new task, rules_scheduler tries to delete matching tasks by calling db_delete on two, non-indexed keys, which causes a table scan.

When this happens under moderate load on our site, frequently with a simultaneous cron run, lock timeouts occur.

PDOException: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: DELETE FROM {rules_scheduler} 
WHERE (config = :db_condition_placeholder_0) AND (identifier = :db_condition_placeholder_1) ; Array
(
[:db_condition_placeholder_0] => rules_send_invoice_due_emails
[:db_condition_placeholder_1] => invoice-336968-due
)
in rules_scheduler_schedule_task() (line 148 of .../rules/rules_scheduler/rules_scheduler.module).

Patch to follow.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

glennpratt’s picture

Untested patch, I will test this change and report back.

I thought about just using drupal_write_record to insert or update, but that would change behavior somewhat.

glennpratt’s picture

Status: Active » Needs review
pdrake’s picture

There is already a unique key index defined for the config and identifier fields, so I removed the index from the patch in #1.

glennpratt’s picture

Thanks @pdrake. I should have noticed the update hook. Apparently those schema changes in 7202 somehow failed on our systems even though the system schema appeared correct.

Bug in my earlier patch, this one is tested a bit more, though probably not needed if we can verify the UNIQUE KEY fixes the issue.

fago’s picture

Issue summary: View changes
Status: Needs review » Postponed (maintainer needs more info)

Given we have the unique key now, this should be fixed?

bshensky’s picture

Got hit by this bug. We had over 120,000 tasks scheduled.

What makes this worse that it seems at first sight is that a single failed database transaction might induce the "delete scheduled task" function and table DELETE, resulting in a full table scan (with requisite DB deadlocks) that can result in the failure of similar future transactions, creating a downward spiral of DB transaction failure and app performance.

Patch #1 is essential, as it insures creation of an index on config + identifier. Do we presume that config+identifier are necessarily unique (Patch #3)? Or not (Patch #4)? By design, is that index created at install (Patch #1) unique or not?

There's not much to resolve here to get this into release. I'd love to help make it happen.

UPDATE: @glennpratt experience mirrors mine - somehow the rules_scheduler_update_7202() update did not get applied, which adds the (nonunique) index on config + identifier to the table. We make it a point to update modules by the book (er, drush), so I wonder if this points to an upstream bug during module upgrade.

bshensky’s picture

Yup - the module upgrader function, rules_scheduler_update_7202(), will silently fail on unique index creation iff the existing data has duplicates on config + identifier prior to creation.

Patch is coming - stand by.

TR’s picture

@bshensky: Any progress on that patch?

TR’s picture

Status: Postponed (maintainer needs more info) » Needs work

@bshensky?