The Drupal Devel module displays the SQL execution time as measured from PHP, where results may be served from cache, not the real execution time on the SQL server.

Example taken from my website (500,000 posts), when opening a forum thread:
SELECT n.nid, n.title, n.sticky, l.comment_count, l.last_comment_timestamp FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r ON n.nid = r.nid AND r.tid = 9 WHERE n.status = 1 ORDER BY n.sticky DESC, l.last_comment_timestamp DESC
371ms

You might think that is already a lot ...

In fact, you should always check execution time in a graphical SQL client like pgAdmin3:
SELECT n.nid, n.title, n.sticky, l.comment_count, l.last_comment_timestamp FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r ON n.nid = r.nid AND r.tid = 9 WHERE n.status = 1 ORDER BY n.sticky DESC, l.last_comment_timestamp DESC
This query actually returns 21,000 rows in 7412 ms.
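
One easy way to measure this on the server, in pgAdmin3 or psql, is to prefix the query with EXPLAIN ANALYZE, which executes it and reports the real runtime and plan:

EXPLAIN ANALYZE
SELECT n.nid, n.title, n.sticky, l.comment_count, l.last_comment_timestamp
FROM node n
INNER JOIN node_comment_statistics l ON n.nid = l.nid
INNER JOIN term_node r ON n.nid = r.nid AND r.tid = 9
WHERE n.status = 1
ORDER BY n.sticky DESC, l.last_comment_timestamp DESC;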

Your cache is probably full, your memory is eaten up, and your CPU time is wasted.

If 100 people connect to the forum, even with caching and a good PostgreSQL server (don't even think of MySQL), it will be down in two minutes. A forum can also be attacked quite easily.

So ... we need to fix this in code. We don't need 21,000 rows for a single thread.
I will propose a fix ASAP.


Comments

grub3’s picture

Status: Closed (won't fix) » Active

The problem comes from this part of code:

function template_preprocess_forum_topic_navigation(&$variables) {
  $output = '';

  // get previous and next topic
  $sql = "SELECT n.nid, n.title, n.sticky, l.comment_count, l.last_comment_timestamp FROM {node} n INNER JOIN {node_comment_statistics} l ON n.nid = l.nid INNER JOIN {term_node} r ON n.nid = r.nid AND r.tid = %d WHERE n.status = 1 ORDER BY n.sticky DESC, ". _forum_get_topic_order_sql(variable_get('forum_order', 1));
  $result = db_query(db_rewrite_sql($sql), isset($variables['node']->tid) ? $variables['node']->tid : 0);

  $stop = $variables['prev'] = $variables['next'] = 0;

  while ($topic = db_fetch_object($result)) {
    if ($stop == 1) {
      $variables['next'] = $topic->nid;
      $variables['next_title'] = check_plain($topic->title);
      $variables['next_url'] = url("node/$topic->nid");
      break;
    }
    if ($topic->nid == $variables['node']->nid) {
      $stop = 1;
    }
    else {
      $variables['prev'] = $topic->nid;
      $variables['prev_title'] = check_plain($topic->title);
      $variables['prev_url'] = url("node/$topic->nid");
    }
  }
}

db_fetch_object($result) should not be used to fetch and analyse 21,000 records one by one.

I cannot fix this PHP function myself, so I leave it to the core developers.

grub3’s picture

Priority: Normal » Critical

Setting to critical, as no serious forum can work with this kind of query.

Damien Tournoud’s picture

Priority: Critical » Normal
Status: Active » Closed (won't fix)

Jean-Michel, this is a friendly warning, please change your behavior. This is getting really annoying.

grub3’s picture

Status: Active » Closed (won't fix)

Sorry, I apologize.
I modified all recent messages to correct them.

This message is part of a tutorial which I am writing here:
Guidelines for writing efficient SQL code

Kind regards,
Jean-Michel

grub3’s picture

Can you modify the title of this message to:
"Drupal query returns 21000 rows, which are computed using PHP"

Why can't it be fixed?
Is this a no-op for you?

Scott Reynolds’s picture

Good news: it's in a theme function and you can override it: http://drupal.org/node/55126
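
For example, a small custom module (the name mymodule below is just a placeholder, and this is only an untested sketch) could drop the core preprocess function from the theme registry so the expensive query never runs, leaving the theme free to build lighter previous/next variables itself:

/**
 * Implementation of hook_theme_registry_alter().
 *
 * Removes the core preprocess function that runs the expensive
 * previous/next query for forum-topic-navigation.tpl.php.
 */
function mymodule_theme_registry_alter(&$theme_registry) {
  if (isset($theme_registry['forum_topic_navigation']['preprocess functions'])) {
    $theme_registry['forum_topic_navigation']['preprocess functions'] = array_values(array_diff(
      $theme_registry['forum_topic_navigation']['preprocess functions'],
      array('template_preprocess_forum_topic_navigation')
    ));
  }
}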

What this query is doing, of course, is finding the next and previous topics. For an outstanding discussion of the problems of finding next and previous: http://2bits.com/articles/previousnext-api-high-performance-api-module-p...

And: http://www.slideshare.net/Eweaver/efficient-pagination-using-mysql

grub3’s picture

Thank you very much, I am relieved.
I will transfer the bug to the theme maintainer.

grub3’s picture

Do you think this could be used:
http://drupal.org/project/prev_next

grub3’s picture

Project: Drupal core » Acquia Marina
Version: 6.x-dev » 6.x-2.x-dev
Component: forum.module » Code

The bug was closed by Damien and handed over to the theme developers by me. I hope that D7 offers such a feature built in. It is not normal to query 21,000 rows just to display Previous/Next links and process them one by one in PHP.

Now I have to download http://drupal.org/project/prev_next and modify Drupal core. I am getting lost.

I have to admit strange things are going on in Drupal, and don't blame me. I am not a troll: I participated in pgAdmin3 development and in the Kdenlive video editor. I optimized several phpBB SQL queries for scalability, and I am quite amazed by the low level of attention paid to this previous/next link, and more generally to database performance. I did my best and spent a couple of days reviewing Drupal problems under PostgreSQL, and wrote two guides: http://drupal.org/node/559302 and http://drupal.org/node/555514.

Previous/next is one of the foundations of the web. Many people concentrate on caching here, and discuss server farms, but is that really useful if a framework cannot scale Prev/Next to 650,000 records? There are SQL design problems to handle quickly. Again, I hope this is fixed in D7.

My website is going live in a few days. If I publish it with the current Drupal internals, it may collapse. Now I am prepared for anything. Of course, I will disable the prev/next link.

Can we discuss what could be done for Prev/Next navigation to get it into the core Drupal 6-dev forum? Or is this impossible by the nature of Drupal coding?

I would like to escalate this directly to Dries Buytaert and plead with him to ask the community to fix this prev/next problem quickly. Several sites may be suffering from heavy load problems. If admins do not have access to the MySQL or PostgreSQL settings, they may not be able to log queries server-side and thus will never discover this simple pagination problem.

This is the problem with using tools like MySQL: you never really know what is going on. People are discussing MySQL caching, but Previous/Next is not fixed in the Drupal forum. I can only encourage your teams to read my two guides and migrate to PostgreSQL. Today, drupal.org became very slow. Why not migrate to PostgreSQL and log all slow query plans? This would allow you to optimize Drupal quickly. This is not a joke, I really believe it. If you are interested, I can help.

Dries, please help, what is your opinion? Can we get scalable Prev/Next links in the Drupal forum? Why not provide a scalable API for Prev/Next within Drupal? Is there a reason not to?

Kind regards,
Jean-Michel

grub3’s picture

Title: Hunt for speed:#1 lesson » Pagination using SQL scripts leads to heavy load
Status: Closed (won't fix) » Active

Finally, I decided to reopen this bug and escalate to Dries directly.

Damien, I still don't understand why you closed this bug.
This is a serious problem in the forum module, and closing the bug will not make it go away.

Dave Reid’s picture

Project: Acquia Marina » Drupal core
Version: 6.x-2.x-dev » 6.x-dev
Component: Code » forum.module

Holy crap jmpoure, this is not how the community works. We don't beg Dries.

grub3’s picture

Thanks, I got confirmation from the dev list that this issue would be studied and probably fixed. Thank you very much Drupalers.

philipnet’s picture

Jean-Michel,

Do you have any suggestions of a better way to write the SQL so that we don't have PHP iterating through potentially tens of thousands of records just to print Previous and Next links?

Can we just select the n.nid we need and then ask the RDBMS for the previous and next nids?
* I think only MySQL supports 'previous_record' and 'next_record' functions?
* Do PostgreSQL and SQLite also offer those functions?

Do we need to bring {node_comment_statistics} in?

A quick Google has revealed this thread on the MySQL site: http://forums.mysql.com/read.php?52,221129
Does that scale better?
Does it work with PostgreSQL and SQLite?

Any other ideas?
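
For instance, could something like this work? Just a sketch with placeholder values; the row-value comparison below is PostgreSQL syntax, so MySQL and SQLite might need the condition spelled out column by column:

-- "Next" topic: the first row that sorts after the current one
-- (:cur_sticky and :cur_timestamp stand for the current topic's values).
SELECT n.nid, n.title
FROM node n
INNER JOIN node_comment_statistics l ON n.nid = l.nid
INNER JOIN term_node r ON n.nid = r.nid AND r.tid = 9
WHERE n.status = 1
  AND (n.sticky, l.last_comment_timestamp) < (:cur_sticky, :cur_timestamp)
ORDER BY n.sticky DESC, l.last_comment_timestamp DESC
LIMIT 1;
-- For the "previous" topic, flip the comparison to > and the ORDER BY to ASC.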

grub3’s picture

Pardon me, but I am very new to the Drupal API, so I might write nonsense.
At the moment I am working flat out to verify SQL scalability by reading the queries server-side.

For previous, current and next record, why not use a SQL combination of LIMIT and OFFSET?

This makes it possible to fetch only three records:

  • Previous (n-1)
  • Current (n)
  • Next (n+1)

SELECT select_list
FROM table_expression
[ ORDER BY ... ]
LIMIT 3 OFFSET n-1

Where n is the position of the record that you want.

http://www.postgresql.org/docs/8.4/interactive/queries-limit.html

Again, I don't know how to handle that in Drupal, but it ***seems*** to make sense to use LIMIT OFFSET.
Usually the database is able to use indexes and access data directly, thus it can be lightning fast with no overhead.

We can always check the detailed plans.

But it seems more reasonable to run 100 queries with LIMIT OFFSET and store the results in cache rather than sending ONE huge SQL query to the database.

LIMIT OFFSET scales to all databases.

Generally, it is better to run a collection of queries returning 1 to 10 records maximum than to run one huge SQL query and store it in cache for later PHP processing. Processing 21,000 records in PHP demands too much PHP memory. This can drive performance down.

Running multiple short SQL queries returning 1 to 10 records scales better.
In the case of LIMIT 3 OFFSET n-1, it is pretty obvious.
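
Take this only as a rough sketch (the function name is made up and I have not tested it): if the current topic's position in the ordered list were known, the fetch itself could go through db_query_range(), which appends the LIMIT/OFFSET clause in a portable way, so only three rows ever reach PHP:

// Sketch only. $tid is the forum term id; $position is the 0-based
// position of the current topic in the ordered list (finding it still
// needs its own small query, e.g. a COUNT(*) over the rows that sort
// before the current topic).
function _forum_prev_next_sketch($tid, $position) {
  $sql = "SELECT n.nid, n.title
          FROM {node} n
          INNER JOIN {node_comment_statistics} l ON n.nid = l.nid
          INNER JOIN {term_node} r ON n.nid = r.nid AND r.tid = %d
          WHERE n.status = 1
          ORDER BY n.sticky DESC, l.last_comment_timestamp DESC";
  // Previous, current and next rows only.
  return db_query_range(db_rewrite_sql($sql), $tid, max($position - 1, 0), 3);
}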

What is your opinion?

grub3’s picture

For example, I ran this arbitrary query, fetching 3 records out of 21,000, in 266 ms:

SELECT n.nid, n.title, n.sticky, l.comment_count, l.last_comment_timestamp 
FROM node n 
INNER JOIN node_comment_statistics l ON n.nid = l.nid 
INNER JOIN term_node r ON n.nid = r.nid AND r.tid = 9 
WHERE n.status = 1 
ORDER BY n.sticky DESC, l.last_comment_timestamp DESC 
LIMIT 3 OFFSET 14420

This seems very slow.

Let us add an index:

CREATE INDEX node_sticky_idx
ON node (sticky)

Now, the query plan is:

"Limit  (cost=7523.73..7523.74 rows=3 width=57) (actual time=167.907..167.908 rows=3 loops=1)"
"  ->  Sort  (cost=7487.68..7540.42 rows=21096 width=57) (actual time=164.617..166.831 rows=14423 loops=1)"
"        Sort Key: n.sticky, l.last_comment_timestamp"
"        Sort Method:  quicksort  Memory: 3335kB"
"        ->  Hash Join  (cost=4264.89..5972.50 rows=21096 width=57) (actual time=73.043..113.917 rows=21081 loops=1)"
"              Hash Cond: ((l.nid)::integer = n.nid)"
"              ->  Seq Scan on node_comment_statistics l  (cost=0.00..1074.86 rows=67486 width=12) (actual time=0.004..7.980 rows=67486 loops=1)"
"              ->  Hash  (cost=4001.19..4001.19 rows=21096 width=53) (actual time=73.021..73.021 rows=21081 loops=1)"
"                    ->  Hash Join  (cost=1292.15..4001.19 rows=21096 width=53) (actual time=12.306..65.201 rows=21081 loops=1)"
"                          Hash Cond: (n.nid = (r.nid)::integer)"
"                          ->  Seq Scan on node n  (cost=0.00..1907.58 rows=67486 width=49) (actual time=0.003..24.152 rows=67486 loops=1)"
"                                Filter: (status = 1)"
"                          ->  Hash  (cost=1028.45..1028.45 rows=21096 width=4) (actual time=12.272..12.272 rows=21081 loops=1)"
"                                ->  Bitmap Heap Scan on term_node r  (cost=399.75..1028.45 rows=21096 width=4) (actual time=1.939..8.014 rows=21081 loops=1)"
"                                      Recheck Cond: ((tid)::integer = 9)"
"                                      ->  Bitmap Index Scan on term_node_tid_idx  (cost=0.00..394.48 rows=21096 width=0) (actual time=1.881..1.881 rows=21081 loops=1)"
"                                            Index Cond: ((tid)::integer = 9)"
"Total runtime: 168.508 ms"

This query now executes in 168 ms.

Notice that a cast from int_unsigned to int happens.
This should not be the case, and I contacted the PostgreSQL hackers about this problem.
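
For reference, int_unsigned is not a native PostgreSQL type: the Drupal PostgreSQL driver defines it as a domain over integer, something like the statement below (quoted from memory), which is why the planner shows a cast back to plain integer:

CREATE DOMAIN int_unsigned integer CHECK (VALUE >= 0);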

Because of the ORDER BY, we cannot avoid a sequential scan and an in-memory sort.
I believe that this query should run more quickly once the CAST problem is fixed.

PostgreSQL hackers believe that this is a problem with my install.

ngaur’s picture

Thanks for your effort and persistence jmpoure.

johnny guimauve’s picture

Sorry for my English; I also post from France and, like jmpoure, it is not my native language.

I would like to say thank you to jmpoure for all the advice and the bug fixing he does for us Drupal users.

I have to migrate a very big forum to Drupal soon, and Jean-Michel Poure's ability to do this is a great help to the French IT community. Let me tell you, he is a very good help, because he translates all the advice and patches on his forum, which is read by 650,000 people every week, and many IT people here learn from him.

Thanx jm, god bless you

azertya123’s picture

(Attached image: vacances-011.png, 197.41 KB)

Thank you for your effort to become a great developer. Is this your first experience with Drupal? I think you are the best at blurring very beautiful houses.
For example, this one:

Dave Reid’s picture

Well, this was an interesting thread to read via e-mail subscription.

Jody Lynn’s picture

FYI this function was removed from D7: #556136: Remove theme_forum_topic_navigation()

Damien Tournoud’s picture

Status: Active » Closed (won't fix)

Then it's won't fix. This query is a known performance problem, but nothing we can fix in a stable version of Drupal.