I have attached a patch which significantly speeds up the forum system. In addition it includes a shadow copy fix from Bart Jansens.
Forum List Page (9 Forums, 9201 Topics, 24616 Replies) no cache
Before 72 queries 6686 ms
After 55 queries 2263 ms
Forum List Page (9 Forums, 9201 Topics, 24616 Replies) w/ cache
Before 34 queries 2527 ms
After 34 queries 1185
Forum Topic List (4194 Topics, 12117 Replies) no cache
Before 75 queries 4322 ms
After 50 queries 1159 ms
Forum Topic List (4194 Topics, 12117 Replies) w/ cache
Before 72 queries 4060 ms
After 47 queries 1063 ms
All pages had the forum block enabled also.
Comment | File | Size | Author |
---|---|---|---|
#31 | forumspeedup11.patch | 28.54 KB | ccourtne |
#27 | forumspeedup10.patch | 29.85 KB | ccourtne |
#26 | forumspeedup9.patch | 29.85 KB | ccourtne |
#25 | forumspeedup8.patch | 30.06 KB | ccourtne |
#22 | forumspeedup7.patch | 29.13 KB | ccourtne |
Comments
Comment #1
moshe weitzman CreditAttribution: moshe weitzman commentedwhat about adding the new fields to the node table, instead of forum? this would benefit the tracker query, and any other node listing pages (like recent recipes, ...).
Comment #2
ccourtne CreditAttribution: ccourtne commentedYea that's possible. Anyone else have an opinion on that before I move the code around.
Also I'm going to have to remove the cacheing from this module as the cache is not compatible with node level access.
Comment #3
Bart Jansens CreditAttribution: Bart Jansens commentedA few comments:
- avoid using time() or $user in hooks like _insert and _comment. These arent necessarily correct.
- dont forget about $op == 'update' in hook_comment or the counter will be wrong if someone unpublishes a comment. Instead of increasing/decreasing the counter i would just count the number of comments again.
- your change to comment.module uses $edit which isnt set. I attached a patch which adds 'delete' to the comment hook and makes sure the entire comment is always passed to the hook (knowing there once used to be a comment with a certain cid but it has been removed, isnt very helpful).
- i dont understand why you changed theme_forum_topic_list, shadow copies should not be shown as regular topics. The idea is to tell people their topic has been moved.
I agree some things should probably be in the node table (num comments and last comment timestamp), as i mentioned on drupal-devel in my first post :)
Comment #4
ccourtne CreditAttribution: ccourtne commentedforum_theme_list change is an oversight. I'll add that back in tonight.
Good catches on the comment delete and update, I'll revise tonight.
Doh! You know I read you post but it didn't register because those changes where not in the patch. I just had it on my task list to take a look at the queries in forum modules which had several SQL issues in general. So I grabbed your patch and starting hacking.
Craig
Comment #5
ccourtne CreditAttribution: ccourtne commentedNew patch. This one has moved the comment stat cache fields to the node table so they can be used by other modules. I have also removed the broken caching which was screwing up node level access restrictions. I created some indexes to help with the forum block and also removed a few round trips.
Forum List Page (9 Forums, 9201 Topics, 24616 Replies)
Before 72 queries 6686 ms (First hit no cache)
Before 34 queries 2527 ms (After cache is generated)
Patch 1 55 queries 2263 ms
Patch 2 44 queries 964 ms
Forum Topic List (4194 Topics, 12117 Replies) no cache
Before 75 queries 4322 ms (First hit no cache)
Before 72 queries 4060 ms (After cache is generated)
Patch 1 50 queries 1159 ms
Patch 2 48 queries 700 ms
This patch supersedes the previous ones.
Comment #6
ccourtne CreditAttribution: ccourtne commentedOne more patch, once again it supersedes the previous patches. I have had to add USE INDEX hints to tell the sometimes brain dead mysql optimizer what to do.
Forum List Page (9 Forums, 9201 Topics, 24616 Replies)
Before 72 queries 6686 ms (First hit no cache)
Before 34 queries 2527 ms (After cache is generated)
Patch 1 55 queries 2263 ms
Patch 2 44 queries 964 ms
Patch 3 44 queries 700 ms
Forum Topic List (4194 Topics, 12117 Replies) no cache
Before 75 queries 4322 ms (First hit no cache)
Before 72 queries 4060 ms (After cache is generated)
Patch 1 50 queries 1159 ms
Patch 2 48 queries 700 ms
Patch 3 47 queries 180 ms
Comment #7
ccourtne CreditAttribution: ccourtne commentedWell helps if I attach the new patch.
Comment #8
ccourtne CreditAttribution: ccourtne commentedNew patch that fixes a typo that removed a space before an INNER statement.
Comment #9
Dries CreditAttribution: Dries commentedI'm interested in this patch but would like to know other people's thoughts on this.
One concern is that I don't like caching user names (but I don't mind caching user IDs).
Comment #10
ccourtne CreditAttribution: ccourtne commentedIt greatly simplifies the code and removes a join out of some performance critical sql statements. Remember that the cached user name may either be a user name or an anonymous name. I'll take a look at what things look like if we eliminate that field.
Comment #11
Dries CreditAttribution: Dries commentedThanks. (I can't help but think it is strange to cache the name of the last commenter, but not cache the node's author itself.)
Still looking forward to feedback from others.
Comment #12
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedI obviously like the benchmarks that Craig presented to us. What I don't /really/ like about the patch is the use of the node table as a cache table. I'd prefer to have the cached info in a separate table if that doesn't mean we lose the increased performance again.
Comment #13
ccourtne CreditAttribution: ccourtne commentedI did some testing per killes suggestion of moving these fields into their own table. I created the following table.
This does allows me to remove all changes from node.module and encapsulate them completely in comment.module.
There are a few drawbacks to doing this. It expands the use of nodeapi in the comment module as a new record will have to get created on node insert and an extra delete statement will have to be fired on node delete. In addition it will be another round trip to the database for ever node load call to add this data to the node. The extra join also add about 200ms to the forum page and about 350ms to the forum topic list page. I will not be able to tell the extra impact of the node load calls with out fully implementing this solution.
Comment #14
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedThanks for implementing this. The node_load stuff will be for free if Dries applies my node object caching patch. We'd only need to remember to clear the node's cache after adding a comment.
Comment #15
ccourtne CreditAttribution: ccourtne commentedI haven't implemented it yet I just simulated all the queries. I'd like to hear if there are any other requested changes before I do more rework.
As a follow up to Dries request about caching user names. I can add an IF() to the SELECT and eliminate that need it will add about 6ms per forum on the forum list page and about 6ms on the forum topic list pages.
I'll go ahead and start working on implementing these.
Comment #16
Dries CreditAttribution: Dries commentedI think Gerhard's idea is good (if it works). Separating cached data from normal data is considered good practice, IMO. So I for one, am interested to see how this would perform.
Comment #17
ccourtne CreditAttribution: ccourtne commentedColor me surprised... after making all the changes above it actually got faster. I'm not sure where the speed up is coming from yet since my manual testing of the queries indicated it should be slower.
Forum List Page (9 Forums, 9201 Topics, 24616 Replies)
Before 72 queries 6686 ms (First hit no cache)
Before 34 queries 2527 ms (After cache is generated)
Patch 1 55 queries 2263 ms
Patch 2 44 queries 964 ms
Patch 5 44 queries 835 ms
Forum Topic List (4194 Topics, 12117 Replies) no cache
Before 75 queries 4322 ms (First hit no cache)
Before 72 queries 4060 ms (After cache is generated)
Patch 1 50 queries 1159 ms
Patch 2 48 queries 700 ms
Patch 5 47 queries 600 ms
I'll post patch 5 tomorrow once I do some more testing.
Comment #18
ccourtne CreditAttribution: ccourtne commentedHere is the new patch.
Changes from last patch
* Last comment details have been moved to node_last_comment
* Added cid of the last comment to the data (0 if no comments)
* last_comment_name will no longer store user name just anonymous names
Comment #19
ccourtne CreditAttribution: ccourtne commentedNew patch fixes counts (last one didn't check status) and updates comment_num_all to take advantage of the new stats. Also renamed the table to node_comment_stats.
Comment #20
Dries CreditAttribution: Dries commented... FROM {node} n USE INDEX(node_status_type) ...
validates as ANSI SQL and works with PostgreSQL. If it doesn't, it should be removed.Comment #21
Dries CreditAttribution: Dries commentedA couple more details:
$sql_count = $sql_count = "SELECT ...
?_comment_update_node_stats
are not very clear to me. They are a bit cryptic.$counts = array();
).Otherwise this patch looks good. I'll benchmark it shortly.
Comment #22
ccourtne CreditAttribution: ccourtne commentedOk. Here is a new patch that address the further issues. I have removed the USE INDEX which slows it down. Still a gain but not near as nice. We really should have the ability to do things like this in queries. Maybe something similar to translations where database plugins would take one sql and replace it with another.
Anyways here are the current stats I've rerun the baseline since all the new flags are now ancient. Times is query time/total page time.
Forum List Page (9 Forums, 9201 Topics, 24616 Replies)
Before 72 queries 7529/7961 ms (First hit no cache)
Before 34 queries 2616/2930 ms (After cache is generated)
Patch 7 44 queries 1688/2040 ms
Forum Topic List (4194 Topics, 12117 Replies) no cache
Before 75 queries 5420/5975 ms (First hit no cache)
Before 72 queries 4581/4944 ms (After cache is generated)
Patch 7 47 queries 736/1261 ms
Comment #23
Dries CreditAttribution: Dries commentedThe update.php script worked fine on one machine running MySQL 4, but on my other machine running MySQL 3 I get several SQL errors. Here are the errors:
Comment #24
Dries CreditAttribution: Dries commentedI installed the drupal.org database on my MySQL 4 system and benchmarked the latest version of this patch:
Truly excellent job. Now, if only we could fix the MySQL 4 dependency. ;)
Comment #25
ccourtne CreditAttribution: ccourtne commentedOk.. found in documentation that you can bracket USE INDEX in /*! */ to make other RDBMS tread it as a comment but MYSQL will still execute it. That should speed up the /forum screen quite a bit.
In addition I have fixed the MySQL 4 dependencies in the updates.inc script (I think I don't have a 3.23 install anywhere).
Comment #26
ccourtne CreditAttribution: ccourtne commentedOk.. hopefully the last fix. This one fixes the shadow updates in updates.inc pointed out by Bart Jansens.
Comment #27
ccourtne CreditAttribution: ccourtne commentedWell here it is again... sans typo *sigh*
Comment #28
Dries CreditAttribution: Dries commentedWhat happens with the cache when a username is changed/updated?
Comment #29
ccourtne CreditAttribution: ccourtne commentedPer your request we don't cache usernames anymore. We do cache the "anonymous" name given at comment submit time which should get refreshed if the comment gets updated.
Comment #30
Dries CreditAttribution: Dries commentedNow, after the upgrade of my MySQL 3 system, I get errors on the forum pages:
Maybe because of the 'double ON'-clause?
Comment #31
ccourtne CreditAttribution: ccourtne commentedI reverted those queries to where joins which should work fine on MySQL 3.x
Comment #32
Dries CreditAttribution: Dries commentedCommitted to HEAD. Thanks for the great work and expertise, Craig. Onto the next SQL problem? ;-)
Comment #33
Dries CreditAttribution: Dries commentedComment #34
(not verified) CreditAttribution: commented