statistics.module contains the following query in hook_cron:

db_query('DELETE FROM {accesslog} WHERE '. time() .' - timestamp > '. variable_get('statistics_flush_accesslog_timer', 259200));

This looks like a straightforward query: the timestamp column has an index on it, so this linear condition should be optimized away. However, if I do some "explain" queries, I get this:

mysql> explain select uid from accesslog where timestamp > 100000000;
+-----------+------+---------------------+------+---------+------+--------+------------+
| table     | type | possible_keys       | key  | key_len | ref  | rows   | Extra      |
+-----------+------+---------------------+------+---------+------+--------+------------+
| accesslog | ALL  | accesslog_timestamp | NULL |    NULL | NULL | 221344 | where used |
+-----------+------+---------------------+------+---------+------+--------+------------+
1 row in set (0.02 sec)

mysql> explain select uid from accesslog where 100 - timestamp < 100000100;
+-----------+------+---------------+------+---------+------+--------+------------+
| table     | type | possible_keys | key  | key_len | ref  | rows   | Extra      |
+-----------+------+---------------+------+---------+------+--------+------------+
| accesslog | ALL  | NULL          | NULL |    NULL | NULL | 221349 | where used |
+-----------+------+---------------+------+---------+------+--------+------------+
1 row in set (0.12 sec)

Even though the two conditions are equivalent, it would seem mysql is not smart enough to recognize this. Instead of using the index, it does a full table scan in the second case.

We can fix this by changing the statistics statement by:
db_query('DELETE FROM {accesslog} WHERE timestamp < %d', time() - variable_get('statistics_flush_accesslog_timer', 259200));

There is a similar problem in watchdog_cron().

I've attached a patch, but I'd like some comments on this by some of our resident db gurus. Is this normal behaviour?

CommentFileSizeAuthor
timestampindex.patch1.46 KBSteven

Comments

dries’s picture

Committed to HEAD and DRUPAL-4-5. Thanks. This should fix the cron timouts at drupal.org.

Steven’s picture

Setting to fixed, but comments still appreciated. Oh and there's a mistake in the explain queries above: the two conditions are not 100% equivalent, but differ in the literal constants used. Still, this shouldn't matter for the optimizations.

Anonymous’s picture