field_sql_storage_field_storage_query() uses COUNT(DISTINCT f1, f2, f3), which is not supported by any current versions of Postgres. Support is apparently in the code base for Postgres 9.0. Current versions only support a single expression in COUNT(DISTINCT ) as does MSSQL and Oracle. The only other engine providing current support for a list of fields in COUNT(DISTINCT ) is DB/2 from version 8.
Patch rebuilds the query into the form SELECT COUNT(*) FROM (SELECT DISTINCT ...) ... since this offers better performance and fewer syntax compatibility problems than using CONCAT and CAST to create a single expression.
Comment | File | Size | Author |
---|---|---|---|
#10 | 706248-count-query-with-distinct.patch | 3.81 KB | Damien Tournoud |
#8 | field_sql_storage_count_2.patch | 1.23 KB | bellHead |
#3 | field_sql_storage_count_1.patch | 1.27 KB | bellHead |
field_sql_storage_count.patch | 1.28 KB | bellHead | |
Comments
Comment #1
bellHead CreditAttribution: bellHead commentedComment #2
yched CreditAttribution: yched commentedWill the patch impact performance on MySQL ? If so, should this go in a Postgres-only conditional code branch ?
Style-wise, the comment needs to use
//
, not/* ... */
Comment #3
bellHead CreditAttribution: bellHead commentedI'm not sure what the expected distribution of distinct values is in this case so I ran two performance tests.
There is a 10% performance penalty on a 55000 record set with 120 distincts and 10% performance gain on a 55000 record set with 30000 distincts. Performance neutral overall I'd say.
Comment marker style changed. The style guide seems to permit both, at what length of comment should the multiline comment style start being used?
Comment #5
bellHead CreditAttribution: bellHead commented#3: field_sql_storage_count_1.patch queued for re-testing.
Comment #6
Josh Waihi CreditAttribution: Josh Waihi commentedAwesome, this makes sense. And the test bot is happy also.
Comment #7
Dries CreditAttribution: Dries commentedThis has some style issues: extra space before ), needs to end with . instead of ).
Quick reroll?
Comment #8
bellHead CreditAttribution: bellHead commentedRerolled, shortened the bit on SQL99 since I couldn't get it to read nicely.
Comment #9
Damien Tournoud CreditAttribution: Damien Tournoud commentedI don't buy that. We should make sure that:
works as intended, ie: returns the count of distinct fields.
Comment #10
Damien Tournoud CreditAttribution: Damien Tournoud commentedThis patch implements proper support of distinct clauses in
SelectQuery::countQuery()
, uses that infield_attach_query()
and adds an additional unit test.Comment #11
Damien Tournoud CreditAttribution: Damien Tournoud commentedAnd yes, SQLite is affected too.
Comment #12
bellHead CreditAttribution: bellHead commentedLooks good. I wasn't aware that DBTNG was intended to support that construct. I have updated the page on the multi-field DISTINCT MySQLism accordingly
Comment #13
yched CreditAttribution: yched commentedRTBC then ?
Comment #14
webchickGreat, thanks folks! The new code is a bit easier to follow, too. AND comes with tests. Yay. :)
Committed to HEAD.