Dear Friends,
I am migrating a large PhpBB forum with +500.000 posts to Drupal. I am using PostgreSQL, which has very neat logging, query analysis and statics features. I log any query which executes in more than 30 ms. It seems that some queries are executing to slowly :
2009-03-13 17:23:47 CET LOG: durée : 53.388 ms, instruction : SELECT COUNT(n.nid) FROM (SELECT DISTINCT ON (nid) * FROM node) n INNER JOIN term_node tn ON n.vid = tn.vid AND tn.tid = 31 LEFT JOIN history h ON n.nid = h.nid AND h.uid = 1 WHERE n.status = 1 AND n.created > 1234369419 AND h.nid IS NULL
2009-03-13 17:23:47 CET LOG: durée : 83.108 ms, instruction : SELECT COUNT(n.nid) FROM (SELECT DISTINCT ON (nid) * FROM node) n INNER JOIN term_node tn ON n.vid = tn.vid AND tn.tid = 50 LEFT JOIN history h ON n.nid = h.nid AND h.uid = 1 WHERE n.status = 1 AND n.created > 1234369419 AND h.nid IS NULL
2009-03-13 17:23:47 CET LOG: durée : 52.884 ms, instruction : SELECT COUNT(n.nid) FROM (SELECT DISTINCT ON (nid) * FROM node) n INNER JOIN term_node tn ON n.vid = tn.vid AND tn.tid = 4 LEFT JOIN history h ON n.nid = h.nid AND h.uid = 1 WHERE n.status = 1 AND n.created > 1234369419 AND h.nid IS NULL
2009-03-13 17:23:47 CET LOG: durée : 54.215 ms, instruction : SELECT COUNT(n.nid) FROM (SELECT DISTINCT ON (nid) * FROM node) n INNER JOIN term_node tn ON n.vid = tn.vid AND tn.tid = 28 LEFT JOIN history h ON n.nid = h.nid AND h.uid = 1 WHERE n.status = 1 AND n.created > 1234369419 AND h.nid IS NULL
2009-03-13 17:23:47 CET LOG: durée : 54.393 ms, instruction : SELECT COUNT(n.nid) FROM (SELECT DISTINCT ON (nid) * FROM node) n INNER JOIN term_node tn ON n.vid = tn.vid AND tn.tid = 29 LEFT JOIN history h ON n.nid = h.nid AND h.uid = 1 WHERE n.status = 1 AND n.created > 1234369419 AND h.nid IS NULL
2009-03-13 17:23:47 CET LOG: durée : 54.316 ms, instruction : SELECT COUNT(n.nid) FROM (SELECT DISTINCT ON (nid) * FROM node) n INNER JOIN term_node tn ON n.vid = tn.vid AND tn.tid = 26 LEFT JOIN history h ON n.nid = h.nid AND h.uid = 1 WHERE n.status = 1 AND n.created > 1234369419 AND h.nid IS NULL
2009-03-13 17:23:47 CET LOG: durée : 59.272 ms, instruction : SELECT COUNT(n.nid) FROM (SELECT DISTINCT ON (nid) * FROM node) n INNER JOIN term_node tn ON n.vid = tn.vid AND tn.tid = 25 LEFT JOIN history h ON n.nid = h.nid AND h.uid = 1 WHERE n.status = 1 AND n.created > 1234369419 AND h.nid IS NULL
2009-03-13 17:23:47 CET LOG: durée : 58.622 ms, instruction : SELECT COUNT(n.nid) FROM (SELECT DISTINCT ON (nid) * FROM node) n INNER JOIN term_node tn ON n.vid = tn.vid AND tn.tid = 27 LEFT JOIN history h ON n.nid = h.nid AND h.uid = 1 WHERE n.status = 1 AND n.created > 1234369419 AND h.nid IS NULL
2009-03-13 17:23:47 CET LOG: durée : 54.749 ms, instruction : SELECT COUNT(n.nid) FROM (SELECT DISTINCT ON (nid) * FROM node) n INNER JOIN term_node tn ON n.vid = tn.vid AND tn.tid = 23 LEFT JOIN history h ON n.nid = h.nid AND h.uid = 1 WHERE n.status = 1 AND n.created > 1234369419 AND h.nid IS NULL
Running EXPLAIN ANALYSE:
"Aggregate (cost=4418.38..4418.39 rows=1 width=4) (actual time=74.428..74.429 rows=1 loops=1)"
" -> Nested Loop (cost=12.98..4418.37 rows=1 width=4) (actual time=74.421..74.421 rows=0 loops=1)"
" -> Merge Left Join (cost=12.98..4033.42 rows=56 width=8) (actual time=70.333..71.815 rows=1019 loops=1)"
" Merge Cond: (n.nid = h.nid)"
" Filter: (h.nid IS NULL)"
" -> Subquery Scan n (cost=0.00..4020.11 rows=113 width=8) (actual time=70.258..71.366 rows=1025 loops=1)"
" Filter: ((n.created > 1234369419) AND (n.status = 1))"
" -> Unique (cost=0.00..3006.18 rows=67595 width=96) (actual time=0.019..55.687 rows=67484 loops=1)"
" -> Index Scan using node_pkey on node (cost=0.00..2837.19 rows=67595 width=96) (actual time=0.018..35.562 rows=67484 loops=1)"
" -> Sort (cost=12.98..13.00 rows=10 width=4) (actual time=0.070..0.073 rows=11 loops=1)"
" Sort Key: h.nid"
" Sort Method: quicksort Memory: 25kB"
" -> Bitmap Heap Scan on history h (cost=4.33..12.81 rows=10 width=4) (actual time=0.035..0.039 rows=11 loops=1)"
" Recheck Cond: (uid = 1)"
" -> Bitmap Index Scan on history_pkey (cost=0.00..4.33 rows=10 width=0) (actual time=0.023..0.023 rows=11 loops=1)"
" Index Cond: (uid = 1)"
" -> Index Scan using term_node_vid_idx on term_node tn (cost=0.00..6.86 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=1019)"
" Index Cond: ((tn.vid)::integer = (n.vid)::integer)"
" Filter: ((tn.tid)::integer = 31)"
"Total runtime: 74.522 ms"
Displaying 20 topics demands 1,5 second. If 500 users are connected, any database can die.
This may even be considered like a security issue, as anyone running queries on a forum can kill a database.
(only on large forums). This issue does not exist on small forums.
This seems to be this part of code in unread topics:
function _forum_topics_unread($term, $uid) {
$sql = "SELECT COUNT(n.nid) FROM {node} n INNER JOIN {term_node} tn ON n.vid = tn.vid AND tn.tid = %d LEFT JOIN {history} h ON n.nid = h.nid AND h.uid = %d WHERE n.status = 1 AND n.created > %d AND h.nid IS NULL";
$sql = db_rewrite_sql($sql);
return db_result(db_query($sql, $term, $uid, NODE_NEW_LIMIT));
}
I don't have a solution yet.
Comment | File | Size | Author |
---|---|---|---|
#13 | Screenshot from 2016-06-09 17-01-03.png | 27.02 KB | Elin Yordanov |
Comments
Comment #1
grub3 CreditAttribution: grub3 commentedThe DISTINCT CLAUSE is not needed as nid is the primary key from nodes.
This does not solve this issue.
Comment #2
grub3 CreditAttribution: grub3 commentedA suitable solution would be to hide topic NEW answers on request.
In large forums, there is X topics (> 50.000) and Y (>500.000) posts.
If 1.000 simultaneous users are connecting, it make impossible to drive down queries with COUNT.
There can be virtually an infinity of SQL queries involving history.
It makes possible for attackers to run simultaneous queries.
The only solution would be to use MATERIALIZED views in PostgreSQL.
It is a wizer solution to disable NEW post count on large forums, based on user settings.
Comment #3
grub3 CreditAttribution: grub3 commentedI think we should only display if NEW messages have been added. I looked at PhpBB code and this how they solve this problem. Counting new messages is very dangerous and should not be done. The solution is to look for queries which use "COUNT" and transform them into queries which only test if new messages exist.
I am not *** very sure *** but I think any wize attacker could drive down Drupal forum reading this message.
Comment #4
grub3 CreditAttribution: grub3 commentedPriority: critical?
Comment #5
Damien Tournoud CreditAttribution: Damien Tournoud commentedThis is exactly why we have Materialized views.
Comment #6
catch@Damien, yes it is, but since there's currently no code for materialized view API in CVS, nor a core issue to get it in, we should keep the actual core performance issues it's supposed to solve critical to avoid losing track of them.
I was running a forum on Drupal with tens of thousands of nodes and hundreds of thousands of comments on shared hosting and then a very small VPS for ages - it's not only sites with lots of resources which run up against these queries.
Comment #7
sun.core CreditAttribution: sun.core commentedStill an issue with revised taxonomy index?
Comment #8
catchhmm..
Comment #9
andypostCurrent query was wrong and fixed in #877020: _forum_topics_unread() broken
Comment #10
grub3 CreditAttribution: grub3 commentedTo run an actual EXPLAIN ANALYSE, you need a real database.
The query planner will choose different options depending on table size.
On small tables, the tables will ALWAYS be cached in memory.
I can only suggest to use PostgreSQL + pgAdmin3 and 500.000+ rows for your testings.
You may be interested to read my developer guide here:
Guidelines for SQL:
http://drupal.org/node/559302
The most interesting part if :
Run EXPLAIN on slow queries to fin sequential scans
http://drupal.org/node/559720
Typically, in a Forum, we need to avoid any kind of sequential scan.
Comment #11
Dries CreditAttribution: Dries commentedNo patch yet.
Comment #12
andypost@jmpoure I have no forum with big number of posts to explain
Comment #13
Elin Yordanov CreditAttribution: Elin Yordanov commentedIs that issue a duplicate of #145353: Forums performance improvements?
This is a big performance issue on content heavy sites. On my site, it takes about 340 ms to calculate unread topics of all forums.
See the attached screenshot from XHProf report.