Currently all PDOStatements created in DatabaseConnection::prepareQuery() are cached. However, when using the query builder, the placeholder names are never reused (i.e. :db_condition_placeholder_1 is only used once per request), so there is not much point in caching this.

When running Simpletests I noticed that for each testXxx() method call, PHP's memory footprint increases with more than 10 MB. This is mostly cached queries used during the bootstrap process.

This patch prevents caching in DatabaseConnection::prepareQuery() of queries made using the query builder. Instead the PDOStatement is stored in the Query instance so that in can be reused in case the same instance is executed twice. The patch incorporates the fix for #550010: DatabaseConnection::prepareQuery() ignores $cache argument.

With the patch, the memory footprint is now much nicer.

A note about the change in @@ -1207,23 +1219,24 @@ class DatabaseCondition implements Query:
This was actually a bug. foreach was iterating over $condition['value'], even if this was a SelectQuery object. This didn't have any effect, but it does now when Query has two public properties.

Comments

c960657’s picture

StatusFileSize
new14.5 KB

Updated a bit.

Crell’s picture

Actually, it would probably be worth benchmarking to see if the prepared query cache is useful to begin with. It was included originally on the assumption that it would save us CPU time, but I don't believe that's ever been tested on the macro-level. If it doesn't buy us much and breaks on some queries anyway, it may be a better trade off to just remove rather than adding conditional caching as in this patch.

c960657’s picture

StatusFileSize
new6.07 KB

I'm not sure how to do a good benchmark. But here is a patch that removes the caching. Note that neither this patch prepared-statements-2.patch has been tested with PostgreSQL yet.

Crell’s picture

Use the setup from: http://drupal.org/node/282862

With CSS cache on and page cache off, hit some page with lots of SELECT queries repeatedly. (Normally I'd say front page with 50 nodes showing, but with the node_load_multiple() work I don't know if that's still going to be a good test there. You can use devel to find a good page with lots of SELECT queries on it.) Try it with and without each of these patches and report back the results. Requests/second and time per request are the main things we're interested in.

c960657’s picture

See also #496500: Remove global placeholder counter that will make the cache useful even for query builder queries.

Status: Needs review » Needs work

The last submitted patch failed testing.

Crell’s picture

This is not an API change so is code-freeze safe, so I'm not paying it much attention right now. :-) However, we should get back to it at some point and really benchmark our prepared statement handling so that we know we're doing it in the most performant way.

catch’s picture

subscribing.

Crell’s picture

Title: Do not cache prepared statements made with the query builder » Prepared statement caching: Good, Bad, or Ugly?
Category: feature » task

Merging in this issue: #553770: Performance: Support for prepared statements under MySQL

So there's 4 possible configurations we could be testing here:

1) Use MySQL's native prepared statements, cache prepared statements.
2) Use MySQL's native prepared statements, do not cache prepared statements.
3) Emulate prepared statements in PDO, cache prepared statements.
4) Emulate prepared statements in PDO, do not cache prepared statements.

Right now, HEAD does #3.

So let's throw some benchmarking at this. Drupal 7 HEAD, 10 nodes on the front page. 64 queries getting run (damn that node_load_multiple() for being so efficient). According to devel, there's a lot of reused queries, especially menu queries that get run as many as 10 times (vis, the prepared statement gets recycled).

Using this benchmark command:
ab -n200 -c10 http://localhost/~lgarfiel/projects/drupal7/

With APC disabled, MySQL query cache set to cache limit 1 MB, size 16 MB:

Case 1:
Peak memory usage: 15.75 MB
Requests per second:    5.74 [#/sec] (mean)
Time per request:       1742.631 [ms] (mean)
Time per request:       174.263 [ms] (mean, across all concurrent requests)

Case 2:
Peak memory usage: 15.75 MB
Requests per second:    5.65 [#/sec] (mean)
Time per request:       1770.866 [ms] (mean)
Time per request:       177.087 [ms] (mean, across all concurrent requests)

Case 3 (Current HEAD):
Peak memory usage: 15.75 MB
Requests per second:    5.69 [#/sec] (mean)
Time per request:       1758.344 [ms] (mean)
Time per request:       175.834 [ms] (mean, across all concurrent requests)

Case 4:
Peak memory usage: 15.75 MB
Requests per second:    5.69 [#/sec] (mean)
Time per request:       1757.054 [ms] (mean)
Time per request:       175.705 [ms] (mean, across all concurrent requests)

Much to my surprise... using the MySQL prepared statements rather than PDO's prepared statements seems slightly faster, provided we do cache the prepared statements in PHP space. At least for this number of queries there doesn't appear to be a memory hit to doing so. If someone can point me at a more query-expensive read-only page that we can test we can try a more robust test, but it looks right now like there's a slide benefit to just trusting MySQL. I wouldn't have expected that given that, last I heard, MySQL prepared statements bypass the query cache. (Naturally this test only tests MySQL; Postgres and SQLite folks can do their own thing, including caching or not caching prepared statements.)

Can someone confirm that, or offer an alternate explanation?

hass’s picture

Not sure if there is any speed difference between 1 and 3, but I know how good coldfusion caches prepared statements and this is also done across all requests and this cache is only invalidated if the cf server is restarted.

catch’s picture

Crell - could you run those again with -c1 -n1000 - the concurrency might be stressing your machine. I'm hoping there are few pages with > 100 queries in HEAD at this point, so might be worth generating one for testing purposes.

Crell’s picture

I'll see if I can run more tonight.

@hass: I don't know how Cold Fusion is built, but PHP is shared-nothing. The connection terminates at the end of each request and the prepared statements with them.

One other thing that I forgot to mention is that I completely disabled caching for these tests by assigning the fake-cache implementation as the active cache system. Surprisingly I didn't really notice a difference, even in the query count, which has me scratching my head.

catch’s picture

We do issue a very large number of cache_get() queries, it's possible you managed to replace exactly the number of cache_get() with direct queries (menu system does two per menu for example, whereas only one when no cache iirc).

hass’s picture

@Crell: CF save the pre-compiled statements in memory for reusing them. With PHP on IIS (DLL) or FastCGI I would expect some more caching... but I do not know if this is the case. PHP may need to learn from others first :-)

Crell’s picture

PHP is by design shared-nothing. To do otherwise we'd need to mandate APC and cache stuff in APC between requests, and I suspect DB connections and prepared statements couldn't be stored that way anyway.

Crell’s picture

OK, here we go again. For consistency I have NOT updated my HEAD since the test above, so it's the exact same code.

Command: ab -n1000 -c1 http://localhost/~lgarfiel/projects/drupal7/

1) Use MySQL's native prepared statements, cache prepared statements.
Requests per second:    3.24 [#/sec] (mean)
Time per request:       308.513 [ms] (mean)
Time per request:       308.513 [ms] (mean, across all concurrent requests)

2) Use MySQL's native prepared statements, do not cache prepared statements.
Requests per second:    3.24 [#/sec] (mean)
Time per request:       308.232 [ms] (mean)
Time per request:       308.232 [ms] (mean, across all concurrent requests)

3) Emulate prepared statements in PDO, cache prepared statements.
Requests per second:    3.25 [#/sec] (mean)
Time per request:       307.754 [ms] (mean)
Time per request:       307.754 [ms] (mean, across all concurrent requests)

4) Emulate prepared statements in PDO, do not cache prepared statements.
Requests per second:    3.25 [#/sec] (mean)
Time per request:       307.689 [ms] (mean)
Time per request:       307.689 [ms] (mean, across all concurrent requests)

Well hum. That's annoying. Virtually no difference either way?

On a lark, I re-enabled the cache system and ran the 4th test again:

Requests per second:    3.26 [#/sec] (mean)
Time per request:       307.000 [ms] (mean)
Time per request:       307.000 [ms] (mean, across all concurrent requests)

I've got to be doing something wrong here... Can anyone else replicate these results?

damien tournoud’s picture

With page cache enabled, you should get around 300 requests per second. Something is definitely wrong here ;)

Crell’s picture

No, I didn't enable page caching. I swapped out the cache backend for the null implementation in cache-install.inc. I sometimes do that in order to make development easier. Page caching was off for all tests.

Crell’s picture

Priority: Normal » Critical
Status: Needs work » Needs review
StatusFileSize
new976 bytes
new1.06 KB
new1.06 KB
new1.06 KB
new1.06 KB

So it turns out that runtime benchmarks work better when you have APC enabled. Who'da thunk it, eh? (Hat tip to catch.)

OK, attached is the raw data from all four combinations. For those who just want the executive summary:

Command: ab -n500 -c1 ?q=node/1

Number of queries: Around 55 or so.

Cache no, Emulate no: 8.86 requests/second
Cache no, Emulate yes: 10.23 requests/second
Cache yes, Emulate no: 25.84 requests/second
Cache yes, Emulate yes: 10.25 requests/second (Current Drupal HEAD)

Holy performance black hole Batman! I ran the 3rd test above multiple times to be sure. Someone please check my numbers, because they actually run counter to what I've heard before about MySQL performance with prepared statements.

In case we decide to go with that, I've also attached a patch to remove prepared statement emulation in MySQL, which makes core do #3 above instead of #4 above. Arguably, we could also make that configurable via the settings array for MySQL drivers if we wanted to.

Crell’s picture

Status: Needs review » Needs work

On second thought, I just realized what I did wrong. Ignore the tests above while I retweak. Sorry for the false alarm. :-(

sun’s picture

Crell’s picture

StatusFileSize
new1.06 KB
new1.06 KB
new1.06 KB
new1.06 KB

Right then, this time without running tests that generate fatal errors, which are in fact really really fast from ab's point of view... *sigh*

Executive summary:

Cache no, Emulate no: 10.16 requests/second
Cache no, Emulate yes: 10.23 requests/second
Cache yes, Emulate no: 10.16 requests/second
Cache yes, Emulate yes: 10.25 requests/second (Current Drupal HEAD)

So we're back to negligible difference, with of all things current HEAD being marginally the best but within margin of error. Can anyone else confirm *these* results? :-)

I also discovered in the course of testing a serious OMGWTFBBQ. The fatal error in question is an "Exception thrown without stack trace" bug that appears if and only if we explicitly set prepared statement emulation to FALSE (which is the default) and we cache prepared statements. If we don't cache prepared statements, it doesn't die. Erk. So I stuck a print statement into _drupal_check_registry() to see if I could figure out where it's dying. That gave me plenty of debug output, most of it "headers already sent", but then of all things I get a fatal error calling _system_rebuild_theme_data() from theme.inc, out of the maintenance theme pathway. Because that function is part of the core of the theme rebuild process, yet lives in system.module, which is wrong. I tried moving it to theme.inc, which would make sense, but then I get a fatal on system_rebuild-theme_data(). That in turn calls system_get_files_database() and system_update_files_database(), which are not part of the theme system. So while trying to benchmark PDO, I've found a trainwreck of stupidly factored code scattered about system.module (as if we don't have enough of that already) that may or may not be causing PDO to crash. I have no idea how or why.

I'm just confused.

Crell’s picture

Unfortunately MySQL 5.1 is still very not widely deployed, and many versions out there have nastiness of their own. Eg: http://www.palantir.net/blog/beware-mysql-51-my-son

I don't think we can realistically require MySQL 5.1 yet.

sun.core’s picture

Are there any actionable tasks here?

Crell’s picture

As far as the DB layer goes, I don't think so. Turns out that in practice, at least for us, none of this tweaking makes much difference. :-) No change needed to server requirements. Besides, MySQL 5.1 isn't widely deployed enough yet for us to be able to require it AFAIK.

However, I think the brain-dead code organization in system.module warrants its own issue. Whether or not we can do anything about it in Drupal 7 I don't know, but the way that code is structured is completely and totally asinine and is apparently causing insanely mysterious errors.

catch’s picture

Version: 7.x-dev » 8.x-dev

Moving this on then, we should open a new issue for bizarre system module errors.

Crell’s picture

Opened here: #679112: Time for system.module and most of includes to commit seppuku

See you in this issue again in Drupal 8. :-)

webchick’s picture

I don't know that this is worth moving back to D7, or if it's another issue altogether, but one really odd thing I noticed was when I put:

  if ($query->hasTag('node_access')) {
    var_dump($query);
  }

... into a hook_query_alter(), the contents of every. single. query. executed on the page are printed out on the screen in a $query->schema->connection->preparedStatements property. Assuming every single other query is also carrying around the compiled preparedStatements of every single other query, I wonder if it's possible to get RAM usage along with those benchmarks, since the performance side of removing this cache seems negligible?

catch’s picture

Version: 8.x-dev » 7.x-dev

I'm moving this back to D7, the original bug report here is still critical - memory leakage. If the benchmarks make no difference, then we need to check if it fixes the memory issues and commit one of the initial patches.

Crell’s picture

@webchick: That's not a problem. Remember that objects pass by handle. The query object has a reference/handle to the schema object, of which there is only one, which itself has a reference/handle to the connection object, of which there is only one, which internally has an array of prepared statement objects. var_dump() just keeps traversing dereferences. There's still only one copy of that array.

webchick’s picture

Ok, I figured it was something like that, esp. since earlier benchmarks showed no RAM increase.

From purely a DX perspective, though, working with hook_query_alter() and trying to figure out what the heck is in a $query object was absolute hell until I commented out $this->preparedStatements[$query] = $stmt; in database.inc. catch also pointed out that this could be a culprit of our > 32MB of RAM install-time requirement, which might be worth looking into.

But if sucking all of the RAM out of the sky is not an issue, can someone remind me why this is a critical bug for D7?

webchick’s picture

Here are my results with drush during install btw: #281405-118: Drupal 7.x can't be installed with memory_limit=32M Removing that cache saved about 1.5 MB of memory on install. Probably pages like update.php, rebuilding node access permissions, etc. would see similar gains.

catch’s picture

Status: Needs work » Needs review
StatusFileSize
new1.83 KB

See #281405: Drupal 7.x can't be installed with memory_limit=32M around #120 and onwards - we should remove prepared statement caching, since #22 shows no measurable gain from it, and because it shaves 2.5mb-ish from the installer.

Here's a re-roll of webchick's patch from there, with some additional cleanup. Untested.

catch’s picture

StatusFileSize
new3.83 KB

Missed a bit.

Status: Needs review » Needs work

The last submitted patch, no_cache_for_you.patch, failed testing.

Crell’s picture

Status: Needs work » Needs review

Are you sure, bot? Nothing in that test should have anything to do with this cache.

Crell’s picture

#34: no_cache_for_you.patch queued for re-testing.

Crell’s picture

Status: Needs review » Reviewed & tested by the community

Quick, before the bot changes its mind!

webchick’s picture

Title: Prepared statement caching: Good, Bad, or Ugly? » Remove prepared statement caching
Status: Reviewed & tested by the community » Fixed

Great. Committed to HEAD! I'll go cross-post over at the memory suckage issue and see if we can now install in < 32MB.

Status: Fixed » Closed (fixed)
Issue tags: -Performance

Automatically closed -- issue fixed for 2 weeks with no activity.