I have attached a patch which significantly speeds up the forum system. In addition it includes a shadow copy fix from Bart Jansens.

Forum List Page (9 Forums, 9201 Topics, 24616 Replies) no cache
Before 72 queries 6686 ms
After 55 queries 2263 ms

Forum List Page (9 Forums, 9201 Topics, 24616 Replies) w/ cache
Before 34 queries 2527 ms
After 34 queries 1185

Forum Topic List (4194 Topics, 12117 Replies) no cache
Before 75 queries 4322 ms
After 50 queries 1159 ms

Forum Topic List (4194 Topics, 12117 Replies) w/ cache
Before 72 queries 4060 ms
After 47 queries 1063 ms

All pages had the forum block enabled also.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

moshe weitzman’s picture

what about adding the new fields to the node table, instead of forum? this would benefit the tracker query, and any other node listing pages (like recent recipes, ...).

ccourtne’s picture

Yea that's possible. Anyone else have an opinion on that before I move the code around.

Also I'm going to have to remove the cacheing from this module as the cache is not compatible with node level access.

Bart Jansens’s picture

A few comments:

- avoid using time() or $user in hooks like _insert and _comment. These arent necessarily correct.
- dont forget about $op == 'update' in hook_comment or the counter will be wrong if someone unpublishes a comment. Instead of increasing/decreasing the counter i would just count the number of comments again.
- your change to comment.module uses $edit which isnt set. I attached a patch which adds 'delete' to the comment hook and makes sure the entire comment is always passed to the hook (knowing there once used to be a comment with a certain cid but it has been removed, isnt very helpful).
- i dont understand why you changed theme_forum_topic_list, shadow copies should not be shown as regular topics. The idea is to tell people their topic has been moved.

I agree some things should probably be in the node table (num comments and last comment timestamp), as i mentioned on drupal-devel in my first post :)

ccourtne’s picture

forum_theme_list change is an oversight. I'll add that back in tonight.

Good catches on the comment delete and update, I'll revise tonight.

Doh! You know I read you post but it didn't register because those changes where not in the patch. I just had it on my task list to take a look at the queries in forum modules which had several SQL issues in general. So I grabbed your patch and starting hacking.

Craig

ccourtne’s picture

FileSize
27.21 KB

New patch. This one has moved the comment stat cache fields to the node table so they can be used by other modules. I have also removed the broken caching which was screwing up node level access restrictions. I created some indexes to help with the forum block and also removed a few round trips.

Forum List Page (9 Forums, 9201 Topics, 24616 Replies)
Before 72 queries 6686 ms (First hit no cache)
Before 34 queries 2527 ms (After cache is generated)
Patch 1 55 queries 2263 ms
Patch 2 44 queries 964 ms

Forum Topic List (4194 Topics, 12117 Replies) no cache
Before 75 queries 4322 ms (First hit no cache)
Before 72 queries 4060 ms (After cache is generated)
Patch 1 50 queries 1159 ms
Patch 2 48 queries 700 ms

This patch supersedes the previous ones.

ccourtne’s picture

One more patch, once again it supersedes the previous patches. I have had to add USE INDEX hints to tell the sometimes brain dead mysql optimizer what to do.

Forum List Page (9 Forums, 9201 Topics, 24616 Replies)
Before 72 queries 6686 ms (First hit no cache)
Before 34 queries 2527 ms (After cache is generated)
Patch 1 55 queries 2263 ms
Patch 2 44 queries 964 ms
Patch 3 44 queries 700 ms

Forum Topic List (4194 Topics, 12117 Replies) no cache
Before 75 queries 4322 ms (First hit no cache)
Before 72 queries 4060 ms (After cache is generated)
Patch 1 50 queries 1159 ms
Patch 2 48 queries 700 ms
Patch 3 47 queries 180 ms

ccourtne’s picture

FileSize
28.93 KB

Well helps if I attach the new patch.

ccourtne’s picture

FileSize
28.93 KB

New patch that fixes a typo that removed a space before an INNER statement.

Dries’s picture

I'm interested in this patch but would like to know other people's thoughts on this.

One concern is that I don't like caching user names (but I don't mind caching user IDs).

ccourtne’s picture

It greatly simplifies the code and removes a join out of some performance critical sql statements. Remember that the cached user name may either be a user name or an anonymous name. I'll take a look at what things look like if we eliminate that field.

Dries’s picture

Thanks. (I can't help but think it is strange to cache the name of the last commenter, but not cache the node's author itself.)

Still looking forward to feedback from others.

killes@www.drop.org’s picture

I obviously like the benchmarks that Craig presented to us. What I don't /really/ like about the patch is the use of the node table as a cache table. I'd prefer to have the cached info in a separate table if that doesn't mean we lose the increased performance again.

ccourtne’s picture

I did some testing per killes suggestion of moving these fields into their own table. I created the following table.

CREATE TABLE node_last_comment (
  nid int(10) unsigned NOT NULL auto_increment,
  last_comment_timestamp int(11) NOT NULL default '0',
  last_comment_name varchar(60) NOT NULL default '0',
  last_comment_uid int(10) NOT NULL default '0',
  comment_count int(10) unsigned NOT NULL default '0',  
  PRIMARY KEY  (nid),
  KEY lasttimestamp (last_comment_timestamp)
) TYPE=MyISAM;

This does allows me to remove all changes from node.module and encapsulate them completely in comment.module.

There are a few drawbacks to doing this. It expands the use of nodeapi in the comment module as a new record will have to get created on node insert and an extra delete statement will have to be fired on node delete. In addition it will be another round trip to the database for ever node load call to add this data to the node. The extra join also add about 200ms to the forum page and about 350ms to the forum topic list page. I will not be able to tell the extra impact of the node load calls with out fully implementing this solution.

killes@www.drop.org’s picture

Thanks for implementing this. The node_load stuff will be for free if Dries applies my node object caching patch. We'd only need to remember to clear the node's cache after adding a comment.

ccourtne’s picture

I haven't implemented it yet I just simulated all the queries. I'd like to hear if there are any other requested changes before I do more rework.

As a follow up to Dries request about caching user names. I can add an IF() to the SELECT and eliminate that need it will add about 6ms per forum on the forum list page and about 6ms on the forum topic list pages.

I'll go ahead and start working on implementing these.

Dries’s picture

I think Gerhard's idea is good (if it works). Separating cached data from normal data is considered good practice, IMO. So I for one, am interested to see how this would perform.

ccourtne’s picture

Color me surprised... after making all the changes above it actually got faster. I'm not sure where the speed up is coming from yet since my manual testing of the queries indicated it should be slower.

Forum List Page (9 Forums, 9201 Topics, 24616 Replies)
Before 72 queries 6686 ms (First hit no cache)
Before 34 queries 2527 ms (After cache is generated)
Patch 1 55 queries 2263 ms
Patch 2 44 queries 964 ms
Patch 5 44 queries 835 ms

Forum Topic List (4194 Topics, 12117 Replies) no cache
Before 75 queries 4322 ms (First hit no cache)
Before 72 queries 4060 ms (After cache is generated)
Patch 1 50 queries 1159 ms
Patch 2 48 queries 700 ms
Patch 5 47 queries 600 ms

I'll post patch 5 tomorrow once I do some more testing.

ccourtne’s picture

FileSize
28.58 KB

Here is the new patch.

Changes from last patch
* Last comment details have been moved to node_last_comment
* Added cid of the last comment to the data (0 if no comments)
* last_comment_name will no longer store user name just anonymous names

ccourtne’s picture

FileSize
30.11 KB

New patch fixes counts (last one didn't check status) and updates comment_num_all to take advantage of the new stats. Also renamed the table to node_comment_stats.

Dries’s picture

  • We try to avoid abbreviations so please don't use the word 'stats'. I can probably search and replace the word '_stat' for you.
  • I'd like to know whether ... FROM {node} n USE INDEX(node_status_type) ... validates as ANSI SQL and works with PostgreSQL. If it doesn't, it should be removed.
Dries’s picture

A couple more details:

  • What is up with $sql_count = $sql_count = "SELECT ...?
  • The code comments that go with _comment_update_node_stats are not very clear to me. They are a bit cryptic.
  • Sentences that start with a capital letter should end with some punctuation.
  • Your indentation is not always correct (eg. search for $counts = array();).

Otherwise this patch looks good. I'll benchmark it shortly.

ccourtne’s picture

FileSize
29.13 KB

Ok. Here is a new patch that address the further issues. I have removed the USE INDEX which slows it down. Still a gain but not near as nice. We really should have the ability to do things like this in queries. Maybe something similar to translations where database plugins would take one sql and replace it with another.

Anyways here are the current stats I've rerun the baseline since all the new flags are now ancient. Times is query time/total page time.

Forum List Page (9 Forums, 9201 Topics, 24616 Replies)
Before 72 queries 7529/7961 ms (First hit no cache)
Before 34 queries 2616/2930 ms (After cache is generated)
Patch 7 44 queries 1688/2040 ms

Forum Topic List (4194 Topics, 12117 Replies) no cache
Before 75 queries 5420/5975 ms (First hit no cache)
Before 72 queries 4581/4944 ms (After cache is generated)
Patch 7 47 queries 736/1261 ms

Dries’s picture

The update.php script worked fine on one machine running MySQL 4, but on my other machine running MySQL 3 I get several SQL errors. Here are the errors:

user error: INSERT TABLE 'term_node' isn't allowed in FROM table list
query: INSERT INTO term_node (nid, tid) SELECT f.nid, f.tid FROM forum f LEFT JOIN term_node r ON f.nid = r.nid AND f.tid = r.tid WHERE f.shadow = 1 AND r.tid IS NULL in /mnt/redhat/foo/cvs/web/drupal.org/cvs/includes/database.mysql.inc on line 125.
user error: You have an error in your SQL syntax near 'n, forum_conv_temp t, comments c SET n.comment_count = t.comment_count, n.last_c' at line 1
query: UPDATE node_comment_statistics n, forum_conv_temp t, comments c SET n.comment_count = t.comment_count, n.last_comment_timestamp = c.timestamp, n.last_comment_name = c.name, n.last_comment_uid = c.uid, n.cid = t.cid WHERE t.cid = c.cid AND n.nid = t.nid in /mnt/redhat/foo/cvs/web/drupal.org/cvs/includes/database.mysql.inc on line 125.
INSERT INTO {term_node} (nid, tid) SELECT f.nid, f.tid FROM {forum} f LEFT JOIN {term_node} r ON f.nid = r.nid AND f.tid = r.tid WHERE f.shadow = 1 AND r.tid IS NULL 
FAILED
Dries’s picture

I installed the drupal.org database on my MySQL 4 system and benchmarked the latest version of this patch:

                                             before        after
                                           ---------     ---------
index.php with forum block           :     2228,9 ms      611,9 ms  (3.64 times faster)
index.php?q=forum with forum block   :     2553,9 ms     1470,8 ms  (1.73 times faster)
index.php?q=forum/1 with forum block :     3734,7 ms      774,0 ms  (4.83 times faster)

Truly excellent job. Now, if only we could fix the MySQL 4 dependency. ;)

ccourtne’s picture

FileSize
30.06 KB

Ok.. found in documentation that you can bracket USE INDEX in /*! */ to make other RDBMS tread it as a comment but MYSQL will still execute it. That should speed up the /forum screen quite a bit.

In addition I have fixed the MySQL 4 dependencies in the updates.inc script (I think I don't have a 3.23 install anywhere).

ccourtne’s picture

FileSize
29.85 KB

Ok.. hopefully the last fix. This one fixes the shadow updates in updates.inc pointed out by Bart Jansens.

ccourtne’s picture

FileSize
29.85 KB

Well here it is again... sans typo *sigh*

Dries’s picture

What happens with the cache when a username is changed/updated?

ccourtne’s picture

Per your request we don't cache usernames anymore. We do cache the "anonymous" name given at comment submit time which should get refreshed if the comment gets updated.

Dries’s picture

Now, after the upgrade of my MySQL 3 system, I get errors on the forum pages:

user error: You have an error in your SQL syntax near 'INNER JOIN users cu ON l.last_comment_uid = cu.uid ON n.nid = l.nid  INNER JOIN ' at line 1
query: SELECT n.nid, f.tid, n.title, n.sticky, u.name, u.uid, n.created AS timestamp, n.comment AS comment_mode, l.last_comment_timestamp, IF(l.last_comment_uid, cu.name, l.last_comment_name) as last_comment_name, l.last_comment_uid, l.comment_count AS num_comments FROM node n INNER JOIN node_comment_statistics l INNER JOIN users cu ON l.last_comment_uid = cu.uid ON n.nid = l.nid  INNER JOIN term_node r ON n.nid = r.nid AND r.tid = '1' INNER JOIN users u ON n.uid = u.uid INNER JOIN forum f ON n.nid = f.nid WHERE n.status = 1 AND '1' ORDER BY n.sticky DESC, l.last_comment_timestamp DESC LIMIT 0, 25
user error: You have an error in your SQL syntax near 'INNER JOIN users cu ON l.last_comment_uid = cu.uid ON n.nid = l.nid INNER JOIN t' at line 1
query: SELECT n.nid, l.last_comment_timestamp, IF(l.last_comment_uid, cu.name, l.last_comment_name) as last_comment_name, l.last_comment_uid FROM node n INNER JOIN node_comment_statistics l INNER JOIN users cu ON l.last_comment_uid = cu.uid ON n.nid = l.nid INNER JOIN term_node r ON n.nid = r.nid AND r.tid = 25  WHERE n.status = 1 AND n.type = 'forum' AND '1' ORDER BY l.last_comment_timestamp DESC LIMIT 0, 1 in /mnt/redhat/foo/cvs/web/drupal.org/cvs/includes/database.mysql.inc on line 125.

Maybe because of the 'double ON'-clause?

ccourtne’s picture

FileSize
28.54 KB

I reverted those queries to where joins which should work fine on MySQL 3.x

Dries’s picture

Committed to HEAD. Thanks for the great work and expertise, Craig. Onto the next SQL problem? ;-)

Dries’s picture

Anonymous’s picture