We had our first customer running a Drupal site make the front page of digg.com yesterday, and we have some interesting observations. My hope in posting this thread is twofold: 1) to provide some information to users of Drupal on how to better optimize their install, and 2) to provide some feedback to the Drupal development team.
First off, we run a clustered environment where all services are separated onto dedicated resources: the control panel, dns1, dns2, web, email, mysql, etc. are each located on separate servers.
As soon as our client's site (www.medopedia.com) made the home page of digg.com, our server loads skyrocketed. The link to the digg.com reference:
Web Server - Dual Xeon, 4 GB RAM, SCSI Drives, baseline load average (before digg effect) = 0.50
MySQL Server - Dual Xeon, 8 GB RAM, SCSI Drives, baseline load average (before digg effect) = 0.20
As soon as the site was featured on digg.com, the web server load shot over 40 (effectively rendering the service unavailable), while the MySQL server load increased only slightly.
Cache was already enabled on this site. However, adding the following gzip directives to .htaccess reduced the web server load to below 6:
mod_gzip_item_include file \.(html?|txt|css|js|php|pl)$
mod_gzip_item_include handler ^cgi-script$
mod_gzip_item_include mime ^text/.*
mod_gzip_item_exclude mime ^image/.*
mod_gzip_item_exclude rspheader ^Content-Encoding:.*gzip.*
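To give a rough sense of why compressing text responses can help, here is a minimal Python sketch that measures how much gzip shrinks a repetitive HTML page. It is an illustration only: the sample page and the compression_ratio helper are made up for this example and are not taken from the site above, and the sketch says nothing about how much of the load drop came from compression versus the other changes we made.

```python
import gzip


def compression_ratio(payload: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    if not payload:
        raise ValueError("payload must not be empty")
    compressed = gzip.compress(payload, compresslevel=level)
    return len(compressed) / len(payload)


# Hypothetical page body: CMS output tends to repeat the same markup
# many times, which is exactly what gzip compresses well.
sample_page = (
    b"<html><body>"
    + b'<div class="node"><h2>Title</h2><p>Some article text.</p></div>' * 200
    + b"</body></html>"
)

if __name__ == "__main__":
    ratio = compression_ratio(sample_page)
    print(f"original size: {len(sample_page)} bytes")
    print(f"compressed/original ratio: {ratio:.3f}")
```

Real pages are less repetitive than this sample, so the ratio on a live site will be worse, but text such as HTML, CSS and JS generally compresses well. Smaller responses take less time to send to each visitor, which is one plausible reason the web server had fewer connections tied up at once.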
After a couple of outages on this particular web server, we were able to stabilize the load, and the client's site remained up for the rest of its front page listing. However, I am a little concerned about Drupal here, as it does not seem to perform as well as other CMS solutions we regularly use (e.g. Joomla). We have been watching Drupal for quite some time and are anxious to include it in our supported applications list. It's a wonderful solution, it supports pgsql and clients love it. However, we need to be able to handle these sorts of loads without issue in a shared environment in order to take on this application.
Even with the various modifications we made to the web server and the .htaccess changes, a load average of 6 on a server dedicated to web service only is still far too high for this amount of traffic. We regularly see 2-3 times the traffic from a Joomla site, and loads do not reach much above 2, without any specific optimization, caching, etc. My concern is that a presence on digg.com crashed our server until we were able to identify the issue and modify the install so that loads could be sustained, and this just makes us, as providers, look bad.
I would like to see Drupal further improve this product with respect to server resource utilization. I believe the queries can be cleaned up and streamlined to reduce relative server load during high-traffic bursts. Obviously we understand that database-driven applications will inherently produce higher loads than static sites, for example. However, based on our experience with other solutions, I see no reason why Drupal's relative load under higher-than-normal traffic cannot be further optimized.
Your thoughts, comments and any other suggestions for optimizing Drupal in a shared environment are all welcome.