About a week ago I upgraded to 7.x-1.0-rc6. Before that, we had served an audience of approximately 6000 (4000 anonymous and 2000 authenticated) over a 2 hour period without a hitch. Yesterday, however, we were gearing up for our traffic when we noticed the following. About 45 minutes before the start of the show, roughly 300 people had shown up early. I turned off dblog (for performance reasons), and my server response time then went through the roof (from 300 ms to 10,000 ms), where it remained even though the traffic was a tenth of what we normally serve. Only after lots of poking around did we realize it was the httpbl module. The second it was disabled, our server response time went back to 0.3 seconds for authenticated users.

I'm not 100% sure whether it was the module itself or some incompatibility or configuration issue. But we do know that turning it off fixed our performance issues.

My question: what could have caused this issue? Is there anything that is happening in the backend that has some limit to the number of page requests a second or the number of authenticated users?

I'd like to re-enable the module and move forward... but this left us in a bit of a panic, particularly as a result of the timing. And why disabling dblog would have anything to do with it is a bit puzzling (considering there were no errors being written at that time). Again, this didn't happen when we were using versions below 7.x-1.0-rc6 (we've had an event every month since we first enabled this module back in late July... so we've had 3 of these high traffic events without a hitch).

Thoughts? Thanks in advance!

Comments

bryrock

Status: Active » Postponed (maintainer needs more info)

"My question: what could have caused this issue? Is there anything that is happening in the backend that has some limit to the number of page requests a second or the number of authenticated users?"

I don't have enough information to answer your question or to offer any informed speculation. httpbl does not impose any limits on the number of authenticated users you can have.

I have rc6 running on two sites with over 10K authenticated users and am not seeing any problems like the ones you described. I'm about to put it on a site with about 80K authenticated users.

If you're running without dblog, you could try turning off the logging in httpbl (off = errors only).

Other than that, I can only propose the usual: making sure you run your dbupdates, clear cache, etc. Were you running into any of this at the same time as #1844638: metatag_update_7004 goes into an infinite loop?

rickmanelius

Hi bryrock,

Thanks for your follow-up. I hope my post didn't come across as accusatory (if it did, that wasn't my intention). Replying item by item:

  1. I'm glad that httpbl doesn't have any limitations/throttling. And because I had an event the previous month with more traffic and no issues, I figured as much. The only difference (with respect to anything related to httpbl and its configurations) was that I had recently upgraded to the 7.x-1.0-rc6 version.
  2. The only thing I can recall happening at the exact moment of the performance glitch was disabling dblog.
  3. I can confirm (based on the db snapshot that I maintain) that (off = errors only) was set.
  4. All hook_updates were run. I cleared cache several times and even disabled memcache. I did not attempt the metatag update until the morning after this issue.

I'm happy to leave this as postponed or even closed. All I know is that disabling dblog seemed to coincide with the performance snafu, and that disabling httpbl made everything run snappy again (I was clearing cache at various intervals and that didn't help).

So I'm also at a loss... particularly if you are running sites with much larger pools of authenticated users. Regardless, thanks for the follow-up!

bryrock

Status: Postponed (maintainer needs more info) » Closed (cannot reproduce)

Ok, thanks. I'll close this, but feel free to report back if it happens again and if you can gather more info.