Currently all PDOStatements created in DatabaseConnection::prepareQuery() are cached. However, when using the query builder, the placeholder names are never reused (i.e. :db_condition_placeholder_1 is only used once per request), so there is not much point in caching this.
When running Simpletests I noticed that for each testXxx() method call, PHP's memory footprint increases with more than 10 MB. This is mostly cached queries used during the bootstrap process.
This patch prevents caching in DatabaseConnection::prepareQuery() of queries made using the query builder. Instead the PDOStatement is stored in the Query instance so that in can be reused in case the same instance is executed twice. The patch incorporates the fix for #550010: DatabaseConnection::prepareQuery() ignores $cache argument.
With the patch, the memory footprint is now much nicer.
A note about the change in @@ -1207,23 +1219,24 @@ class DatabaseCondition implements Query:
This was actually a bug. foreach was iterating over $condition['value'], even if this was a SelectQuery object. This didn't have any effect, but it does now when Query has two public properties.
| Comment | File | Size | Author |
|---|---|---|---|
| #34 | no_cache_for_you.patch | 3.83 KB | catch |
| #33 | no_cache_for_you.patch | 1.83 KB | catch |
| #22 | cache-no-emulate-no.txt | 1.06 KB | Crell |
| #22 | cache-no-emulate-yes.txt | 1.06 KB | Crell |
| #22 | cache-yes-emulate-no.txt | 1.06 KB | Crell |
Comments
Comment #1
c960657 commentedUpdated a bit.
Comment #2
Crell commentedActually, it would probably be worth benchmarking to see if the prepared query cache is useful to begin with. It was included originally on the assumption that it would save us CPU time, but I don't believe that's ever been tested on the macro-level. If it doesn't buy us much and breaks on some queries anyway, it may be a better trade off to just remove rather than adding conditional caching as in this patch.
Comment #3
c960657 commentedI'm not sure how to do a good benchmark. But here is a patch that removes the caching. Note that neither this patch prepared-statements-2.patch has been tested with PostgreSQL yet.
Comment #4
Crell commentedUse the setup from: http://drupal.org/node/282862
With CSS cache on and page cache off, hit some page with lots of SELECT queries repeatedly. (Normally I'd say front page with 50 nodes showing, but with the node_load_multiple() work I don't know if that's still going to be a good test there. You can use devel to find a good page with lots of SELECT queries on it.) Try it with and without each of these patches and report back the results. Requests/second and time per request are the main things we're interested in.
Comment #5
c960657 commentedSee also #496500: Remove global placeholder counter that will make the cache useful even for query builder queries.
Comment #7
Crell commentedThis is not an API change so is code-freeze safe, so I'm not paying it much attention right now. :-) However, we should get back to it at some point and really benchmark our prepared statement handling so that we know we're doing it in the most performant way.
Comment #8
catchsubscribing.
Comment #9
Crell commentedMerging in this issue: #553770: Performance: Support for prepared statements under MySQL
So there's 4 possible configurations we could be testing here:
1) Use MySQL's native prepared statements, cache prepared statements.
2) Use MySQL's native prepared statements, do not cache prepared statements.
3) Emulate prepared statements in PDO, cache prepared statements.
4) Emulate prepared statements in PDO, do not cache prepared statements.
Right now, HEAD does #3.
So let's throw some benchmarking at this. Drupal 7 HEAD, 10 nodes on the front page. 64 queries getting run (damn that node_load_multiple() for being so efficient). According to devel, there's a lot of reused queries, especially menu queries that get run as many as 10 times (vis, the prepared statement gets recycled).
Using this benchmark command:
ab -n200 -c10 http://localhost/~lgarfiel/projects/drupal7/With APC disabled, MySQL query cache set to cache limit 1 MB, size 16 MB:
Much to my surprise... using the MySQL prepared statements rather than PDO's prepared statements seems slightly faster, provided we do cache the prepared statements in PHP space. At least for this number of queries there doesn't appear to be a memory hit to doing so. If someone can point me at a more query-expensive read-only page that we can test we can try a more robust test, but it looks right now like there's a slide benefit to just trusting MySQL. I wouldn't have expected that given that, last I heard, MySQL prepared statements bypass the query cache. (Naturally this test only tests MySQL; Postgres and SQLite folks can do their own thing, including caching or not caching prepared statements.)
Can someone confirm that, or offer an alternate explanation?
Comment #10
hass commentedNot sure if there is any speed difference between 1 and 3, but I know how good coldfusion caches prepared statements and this is also done across all requests and this cache is only invalidated if the cf server is restarted.
Comment #11
catchCrell - could you run those again with -c1 -n1000 - the concurrency might be stressing your machine. I'm hoping there are few pages with > 100 queries in HEAD at this point, so might be worth generating one for testing purposes.
Comment #12
Crell commentedI'll see if I can run more tonight.
@hass: I don't know how Cold Fusion is built, but PHP is shared-nothing. The connection terminates at the end of each request and the prepared statements with them.
One other thing that I forgot to mention is that I completely disabled caching for these tests by assigning the fake-cache implementation as the active cache system. Surprisingly I didn't really notice a difference, even in the query count, which has me scratching my head.
Comment #13
catchWe do issue a very large number of cache_get() queries, it's possible you managed to replace exactly the number of cache_get() with direct queries (menu system does two per menu for example, whereas only one when no cache iirc).
Comment #14
hass commented@Crell: CF save the pre-compiled statements in memory for reusing them. With PHP on IIS (DLL) or FastCGI I would expect some more caching... but I do not know if this is the case. PHP may need to learn from others first :-)
Comment #15
Crell commentedPHP is by design shared-nothing. To do otherwise we'd need to mandate APC and cache stuff in APC between requests, and I suspect DB connections and prepared statements couldn't be stored that way anyway.
Comment #16
Crell commentedOK, here we go again. For consistency I have NOT updated my HEAD since the test above, so it's the exact same code.
Command:
ab -n1000 -c1 http://localhost/~lgarfiel/projects/drupal7/Well hum. That's annoying. Virtually no difference either way?
On a lark, I re-enabled the cache system and ran the 4th test again:
I've got to be doing something wrong here... Can anyone else replicate these results?
Comment #17
damien tournoud commentedWith page cache enabled, you should get around 300 requests per second. Something is definitely wrong here ;)
Comment #18
Crell commentedNo, I didn't enable page caching. I swapped out the cache backend for the null implementation in cache-install.inc. I sometimes do that in order to make development easier. Page caching was off for all tests.
Comment #19
Crell commentedSo it turns out that runtime benchmarks work better when you have APC enabled. Who'da thunk it, eh? (Hat tip to catch.)
OK, attached is the raw data from all four combinations. For those who just want the executive summary:
Command:
ab -n500 -c1 ?q=node/1Number of queries: Around 55 or so.
Cache no, Emulate no: 8.86 requests/second
Cache no, Emulate yes: 10.23 requests/second
Cache yes, Emulate no: 25.84 requests/second
Cache yes, Emulate yes: 10.25 requests/second (Current Drupal HEAD)
Holy performance black hole Batman! I ran the 3rd test above multiple times to be sure. Someone please check my numbers, because they actually run counter to what I've heard before about MySQL performance with prepared statements.
In case we decide to go with that, I've also attached a patch to remove prepared statement emulation in MySQL, which makes core do #3 above instead of #4 above. Arguably, we could also make that configurable via the settings array for MySQL drivers if we wanted to.
Comment #20
Crell commentedOn second thought, I just realized what I did wrong. Ignore the tests above while I retweak. Sorry for the false alarm. :-(
Comment #21
sunDo we need to bump server requirements?
http://de2.php.net/manual/en/pdo.prepare.php#79178
http://dev.mysql.com/doc/refman/5.1/en/query-cache.html
Comment #22
Crell commentedRight then, this time without running tests that generate fatal errors, which are in fact really really fast from ab's point of view... *sigh*
Executive summary:
Cache no, Emulate no: 10.16 requests/second
Cache no, Emulate yes: 10.23 requests/second
Cache yes, Emulate no: 10.16 requests/second
Cache yes, Emulate yes: 10.25 requests/second (Current Drupal HEAD)
So we're back to negligible difference, with of all things current HEAD being marginally the best but within margin of error. Can anyone else confirm *these* results? :-)
I also discovered in the course of testing a serious OMGWTFBBQ. The fatal error in question is an "Exception thrown without stack trace" bug that appears if and only if we explicitly set prepared statement emulation to FALSE (which is the default) and we cache prepared statements. If we don't cache prepared statements, it doesn't die. Erk. So I stuck a print statement into _drupal_check_registry() to see if I could figure out where it's dying. That gave me plenty of debug output, most of it "headers already sent", but then of all things I get a fatal error calling _system_rebuild_theme_data() from theme.inc, out of the maintenance theme pathway. Because that function is part of the core of the theme rebuild process, yet lives in system.module, which is wrong. I tried moving it to theme.inc, which would make sense, but then I get a fatal on system_rebuild-theme_data(). That in turn calls system_get_files_database() and system_update_files_database(), which are not part of the theme system. So while trying to benchmark PDO, I've found a trainwreck of stupidly factored code scattered about system.module (as if we don't have enough of that already) that may or may not be causing PDO to crash. I have no idea how or why.
I'm just confused.
Comment #23
Crell commentedUnfortunately MySQL 5.1 is still very not widely deployed, and many versions out there have nastiness of their own. Eg: http://www.palantir.net/blog/beware-mysql-51-my-son
I don't think we can realistically require MySQL 5.1 yet.
Comment #24
sun.core commentedAre there any actionable tasks here?
Comment #25
Crell commentedAs far as the DB layer goes, I don't think so. Turns out that in practice, at least for us, none of this tweaking makes much difference. :-) No change needed to server requirements. Besides, MySQL 5.1 isn't widely deployed enough yet for us to be able to require it AFAIK.
However, I think the brain-dead code organization in system.module warrants its own issue. Whether or not we can do anything about it in Drupal 7 I don't know, but the way that code is structured is completely and totally asinine and is apparently causing insanely mysterious errors.
Comment #26
catchMoving this on then, we should open a new issue for bizarre system module errors.
Comment #27
Crell commentedOpened here: #679112: Time for system.module and most of includes to commit seppuku
See you in this issue again in Drupal 8. :-)
Comment #28
webchickI don't know that this is worth moving back to D7, or if it's another issue altogether, but one really odd thing I noticed was when I put:
... into a hook_query_alter(), the contents of every. single. query. executed on the page are printed out on the screen in a $query->schema->connection->preparedStatements property. Assuming every single other query is also carrying around the compiled preparedStatements of every single other query, I wonder if it's possible to get RAM usage along with those benchmarks, since the performance side of removing this cache seems negligible?
Comment #29
catchI'm moving this back to D7, the original bug report here is still critical - memory leakage. If the benchmarks make no difference, then we need to check if it fixes the memory issues and commit one of the initial patches.
Comment #30
Crell commented@webchick: That's not a problem. Remember that objects pass by handle. The query object has a reference/handle to the schema object, of which there is only one, which itself has a reference/handle to the connection object, of which there is only one, which internally has an array of prepared statement objects. var_dump() just keeps traversing dereferences. There's still only one copy of that array.
Comment #31
webchickOk, I figured it was something like that, esp. since earlier benchmarks showed no RAM increase.
From purely a DX perspective, though, working with hook_query_alter() and trying to figure out what the heck is in a $query object was absolute hell until I commented out $this->preparedStatements[$query] = $stmt; in database.inc. catch also pointed out that this could be a culprit of our > 32MB of RAM install-time requirement, which might be worth looking into.
But if sucking all of the RAM out of the sky is not an issue, can someone remind me why this is a critical bug for D7?
Comment #32
webchickHere are my results with drush during install btw: #281405-118: Drupal 7.x can't be installed with memory_limit=32M Removing that cache saved about 1.5 MB of memory on install. Probably pages like update.php, rebuilding node access permissions, etc. would see similar gains.
Comment #33
catchSee #281405: Drupal 7.x can't be installed with memory_limit=32M around #120 and onwards - we should remove prepared statement caching, since #22 shows no measurable gain from it, and because it shaves 2.5mb-ish from the installer.
Here's a re-roll of webchick's patch from there, with some additional cleanup. Untested.
Comment #34
catchMissed a bit.
Comment #36
Crell commentedAre you sure, bot? Nothing in that test should have anything to do with this cache.
Comment #37
Crell commented#34: no_cache_for_you.patch queued for re-testing.
Comment #38
Crell commentedQuick, before the bot changes its mind!
Comment #39
webchickGreat. Committed to HEAD! I'll go cross-post over at the memory suckage issue and see if we can now install in < 32MB.