statistics.module contains the following query in hook_cron:
db_query('DELETE FROM {accesslog} WHERE '. time() .' - timestamp > '. variable_get('statistics_flush_accesslog_timer', 259200));
This looks like a straightforward query: the timestamp column has an index on it, so this linear condition should be optimized away. However, if I do some "explain" queries, I get this:
mysql> explain select uid from accesslog where timestamp > 100000000;
+-----------+------+---------------------+------+---------+------+--------+------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+-----------+------+---------------------+------+---------+------+--------+------------+
| accesslog | ALL | accesslog_timestamp | NULL | NULL | NULL | 221344 | where used |
+-----------+------+---------------------+------+---------+------+--------+------------+
1 row in set (0.02 sec)
mysql> explain select uid from accesslog where 100 - timestamp < 100000100;
+-----------+------+---------------+------+---------+------+--------+------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+-----------+------+---------------+------+---------+------+--------+------------+
| accesslog | ALL | NULL | NULL | NULL | NULL | 221349 | where used |
+-----------+------+---------------+------+---------+------+--------+------------+
1 row in set (0.12 sec)
Even though the two conditions are equivalent, it would seem mysql is not smart enough to recognize this. Instead of using the index, it does a full table scan in the second case.
We can fix this by changing the statistics statement by:
db_query('DELETE FROM {accesslog} WHERE timestamp < %d', time() - variable_get('statistics_flush_accesslog_timer', 259200));
There is a similar problem in watchdog_cron().
I've attached a patch, but I'd like some comments on this by some of our resident db gurus. Is this normal behaviour?
| Comment | File | Size | Author |
|---|---|---|---|
| timestampindex.patch | 1.46 KB | Steven |
Comments
Comment #1
dries commentedCommitted to HEAD and DRUPAL-4-5. Thanks. This should fix the cron timouts at drupal.org.
Comment #2
Steven commentedSetting to fixed, but comments still appreciated. Oh and there's a mistake in the explain queries above: the two conditions are not 100% equivalent, but differ in the literal constants used. Still, this shouldn't matter for the optimizations.
Comment #3
(not verified) commented