The PDO driver for PostgreSQL is inserting boolean FALSE values in to numeric columns as an empty string which causes PostgreSQL errors, its a bigger battle to make developers actually typecast values correctly - I blame PHP for being so lenient towards it.
Until this bug is fixed in the PDO driver I think this patch needs to be implied to ensure no boolean values are passed as parameters to the PostgreSQL Driver in Drupal.
Thanks to Damian for the speed patch suggestion.
| Comment | File | Size | Author |
|---|---|---|---|
| #14 | 467474-typecasting-gone-wrong.2.patch | 1.8 KB | josh waihi |
| #3 | 467474-typecasting-gone-wrong.patch | 1.75 KB | josh waihi |
| postgres-work-around.patch | 858 bytes | josh waihi |
Comments
Comment #1
Crell commentedCan we see a benchmark on this? We're talking about iterating over every parameter in every query we run; that's a lot of loops, and I don't want to slow down the DB unnecessarily.
Comment #2
josh waihi commented@Crell, I'm open to other suggestions - I've talked with DamZ about this, the only other option I can see is making Drupal developers type cast correctly which will be an ongoing maintenance factor for both core and contrib. That not something, as a driver maintainer, I want to do - this is something that should be dealt with in the driver, and I see it as a temporary solution till it gets fixed in the PDO driver itself.
On a performance side note, I managed to optimize postgres so tests now take 2 1/2 hours instead of 4 hours +.
Comment #3
josh waihi commentedThe same issue occurred in the filter module, so I'm attaching a patch that fixes the query. However, one way around this would be to configure the mysql driver to act like the postgres driver on this one - this would make developers more aware and better programmers who typecast there values correctly?
Comment #4
Crell commentedThere's another issue in the queue somewhere to auto-cast booleans for MySQL, since it breaks there, too. Again, I'm worried about the performance implications so we really need benchmarks to know how badly we're shooting ourselves in the foot.
Comment #5
damien tournoud commentedPart of the answer: on my development VM (a VMWare inside my MacBook), running this loop on an array of 100 strings (in which no replacement needs to be made) consistently takes 55 us. For comparison purposes, running a straightforward prebuilt query with 100 parameters takes about 550 us, so it seems that we are talking of a loss of about 10% here.
Here is the script I used:
Comment #6
damien tournoud commentedComment #7
Crell commentedWait, so you mean this patch has a 10% performance hit?
Comment #8
damien tournoud commentedCrell: on 100 arguments yes.
Here is roughly how this scale:
For 1 argument, the overhead is < 1%, for 10 arguments, it is in the 3-4% range, for 100 arguments it reaches about 10%, and a little bit more then 10% for 1000 arguments.
Comment #9
Crell commentedCan we get a benchmark on a Drupal page load rather than a microbenchmark? That should give us an idea of how much we're actually going to lose here.
Comment #10
damien tournoud commented@Crell: given the microbenchmarks, I don't believe the difference is even measurable on a full page load.
Comment #11
josh waihi commentedFor Crell, I bootstrapped Drupal then ran this:
With out the patch: 80s
With the patch:87s
so thats 0.07 seconds difference for a page load of the modules page.
Comment #12
catchDamz just asked me to benchmark this. I'm not going to.
Let's not hold up something that's completely broken on benchmarks - get it actually working, then get a postgres test slave, then worry about optimization when that's done.
So many patches get in without any benchmarks at all, and slow Drupal down, we know the impact is minimal, that's enough information.
Comment #13
webchickCan someone explain to me why it's a good idea to make pgsql auto-typecast and not mysql/sqlite as well?
Comment #14
josh waihi commentedThe reason we're typecasting in PostgreSQL is because of a bug in the PDO Driver that doesn't handle binding parameters with associative arrays very well. For example:
Notice the second
executecall will work but the third will not. Its the way in which we bind parameters that breaks this driver. If we were to bind parameters numerically and in order, ie, not use associative arrays, then this issue would not be here.I've commented the code as per webchick's request.
Comment #15
snow-2 commentedI'm not convinced it's a problem with pdo_pgsql. It actually specifically has code in there to deal with boolean types (though it changes them to 't' and 'f' and not '1' and '0' which would be better in this case). Here's what I've seen under gdb:
(gdb) p *param
$37 = {paramno = 0, name = 0x10dec80 ":id", namelen = 3, max_value_len = 0, parameter = 0x10deb10, param_type = PDO_PARAM_STR, driver_params = 0x0,
driver_data = 0x0, stmt = 0x10dd398, is_param = 1}
(gdb) p *param->parameter
$38 = {value = {lval = 17684240, dval = 8.7371754568116058e-317, str = {val = 0x10dd710 "", len = 0}, ht = 0x10dd710, obj = {handle = 17684240,
handlers = 0x0}}, refcount = 1, type = 6 '\006', is_ref = 0 '\0'}
This appears to indicate that, for whatever reason, PHP/PDO thinks this is an empty string, not a false boolean. I'm not sure yet why it works with the regular parameters, but that's next.
Comment #16
webchickHm. Well. Given that:
a) Any performance implications (which seem to be pretty minor according to #11) incurred by this patch only affect pgsql (for now)
b) It's an upstream bug that we can't do anything about
c) It enables us to get tests passing for pgsql which can in turn lead to a more robust test framework
I've committed this to HEAD.
I have a feeling we'll be revisiting this before D7 is released. :P