Ok, yet another postgresql issue :)

EntityQuery tests are failing, there are two issues...

- The ORDER BY column must be selected error I've already fixed in another issue for a specific problem. This time, I'm trying to fix it at the source, by overriding orderBy() in SelectQuery_pgsql. However, it currently only works when you pass a field name that contains the table name alias, like this $query->orderBy('node.title'). We could only do it when there are GROUP BY/distinct fields, but on the other hand, it doesn't hurt and we don't know in which order the methods are called...

- The second problem is that the tests use the CONTAINS (which translates to LIKE '%string%' in the "real" query) on integer/bigint column. Adapted them to be similiar to the STARTS_WITH tests, which are correct.

Files: 
CommentFileSizeAuthor
#38 postgresql_orderby-38.patch8.88 KBmradcliffe
PASSED: [[SimpleTest]]: [MySQL] 24,821 pass(es). View
#29 postgresql_orderby.patch8.18 KBBerdir
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch postgresql_orderby.patch. View
#15 fix_entityquery_tests6.patch7.58 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 23,313 pass(es). View
#11 fix_entityquery_tests5.patch6.5 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 23,287 pass(es). View
#9 fix_entityquery_tests4.patch6.45 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 23,287 pass(es). View
#4 fix_entityquery_tests3.patch5.36 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 23,295 pass(es). View
#3 fix_entityquery_tests2.patch5.2 KBBerdir
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch fix_entityquery_tests2.patch. View
fix_entityquery_tests.patch1.98 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 23,235 pass(es). View

Comments

chx’s picture

Why you can't addField('', $field) as it is?

Berdir’s picture

Because its broken :)

It creates invalid aliases because of the point. But I guess it could easily be fixed by running it through escapeTable() (not escapeField because that allow the ".")

Berdir’s picture

FileSize
5.2 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch fix_entityquery_tests2.patch. View

Here is a try...

- New escape function (escapeTable() allows the point too) escapeAlias()
- Use that in __toString()
- use $this->addField(NULL, $field)
- does not add the field if one of the following conditions is true: the field is already added, any table defines all_fields (leads to ambigous alias issues) or if it is an expression (it is considered as an expression if escapeField($field) is different than field)

Berdir’s picture

FileSize
5.36 KB
PASSED: [[SimpleTest]]: [MySQL] 23,295 pass(es). View

The patch actually failed some database tests because I missed a check on the expression alias.

Updated, this time with the --no-prefix option..

Berdir’s picture

Actually, this patch also fixes test failures in Forum, Poll expiration and Image style path tests. Probably more, I'm still re-running the known broken tests...

Berdir’s picture

And it also fixes Statistics admin, Trigger tests and improves Search tests in a way that the actually run and report failures instead of dying :)

So, patch needs review! :)

chx’s picture

So what happens in the expression case? Say, ORDER BY LENGTH('foobar') ? Also escapeField()/Table needs to be DatabaseConnection::escapeField() / DatabaseConnection::escapeTable() . Are you trying to save characters in comments :) ?

tstoeckler’s picture

+++ includes/database/database.inc
@@ -810,6 +810,19 @@ abstract class DatabaseConnection extends PDO {
+   * contrast to escapeField()/Table, this doesn't allow the point.

Sorry to be nitpicky, but in case of a re-roll could you change this to "escapeField() and escapeTable()", I think it's currently a bit hard to read.

Leaving at needs review.
Powered by Dreditor.

Berdir’s picture

FileSize
6.45 KB
PASSED: [[SimpleTest]]: [MySQL] 23,287 pass(es). View

The search tests are easy to fix by moving the orderBy() a few lines down. Now the'yre green..

This also shows the problem of this approach. The order suddenly *is* relevant. You can not call orderBy() to order on a expression before adding the expression (totally explodes). Sounds kinda obvious, but it was possible before and the strange person* who wrote the search extender class did it that way.

* In case you don't know, *I* wrote the original version of the Search Extender class and that part has most probably not been changed ;)

Berdir’s picture

@chx: I think this is not supported on PostgreSQL in the first place :) And if we can get #829464: orderby() should verify direction [DONE] and escape fields in, it will not allow to do that anyway.

@tstoeckler: Will do that on the next re-roll, there will probably be more anyway.

Berdir’s picture

FileSize
6.5 KB
PASSED: [[SimpleTest]]: [MySQL] 23,287 pass(es). View

Re-roll with the comment fixed.

@chx: Correction, ORDER BY expressions work in PostgreSQL and this patch doesn't break because of the escapeField() comparisation hack. An expression with ( and other non-allowed characters $this->connection->escapeField($field) does not equal $field and so we don't add it as one.

Crell’s picture

Status: Needs review » Needs work
+++ includes/database/pgsql/query.inc
@@ -189,4 +189,56 @@ class SelectQuery_pgsql extends SelectQuery {
+    // If there is a table which loads all fields, we can't do this because we
+    // might produce illegal aliases.
+    foreach ($this->tables as $table) {
+      if (!empty($table['all_fields'])) {
+        return $return;
+      }
+    }

Can't do which? Order by? Is the implication that if I do db_select('node')->fields('node')->orderBy('uid')->execute(); there is no possible way for that query to work under PostgreSQL?

If so, that needs much cleaner documentation. MySQL people won't think about that.

+++ includes/database/pgsql/query.inc
@@ -189,4 +189,56 @@ class SelectQuery_pgsql extends SelectQuery {
+    // If the field is actually an expression, it can not be added either.
+    if ($this->connection->escapeField($field) != $field) {
+      return $return;
+    }

I don't understand this either. Don't we check for expressions above? And what we do here is simply return normally? I'm very confused on this part.

Powered by Dreditor.

Berdir’s picture

These checks are *only* to figure out if we want to add the field/alias/something which is *always* add to orderBy should also be automatically added as a field.

The problem with the all_fields is, that if I don't add this check, it would generate the following query:

SELECT node.*, node.uid as uid FROM {node} ORDER BY uid

And that fails because uid is ambigous.

So, this works perfectly fine:
db_select('node')->fields('node')->orderBy('uid')->execute();

However, what doesn't work automatically is the following:

$query = db_select('node', 'n');
$query->join('user', 'u', 'n.uid = u.uid');
$result = $query
  ->distinct() // The ORDER BY Problem only exists when DISTINCT or GROUP BY is used
  ->fields('node')
  ->orderBy('mail')
  ->execute();

Now, all fields of node have been selected, but we don't know if mail belongs to {node} or {user} and because of that, we can not add it automatically. You can easily fix the query by adding fields('user', array('mail'))

Exactly the same thing regarding the expression thing. It is perfectly fine to order by an expression but we can't automatically add that as a field as well. We could try to extend that logic even more and automatically add it as an expression and use the alias or something like that. But I'm not sure if we should go that far.

So, what this patch tries to do is handle as many cases as possible, but it can't handle everything. And it was able to resolve almost all test failures that PostgreSQL currently has, only search.module needed a small change because it called orderBy('alias') before it called ->addExpression('expression', 'alias').

Crell’s picture

Sounds like the inline docs need to be expanded then. We also then need to update the docblock for orderBy to specify that you can only orderBy a field/expression that has already been added.

Berdir’s picture

Status: Needs work » Needs review
FileSize
7.58 KB
PASSED: [[SimpleTest]]: [MySQL] 23,313 pass(es). View

What about this? No functional changes, I only tried to improve/extend the comments.

Josh Waihi’s picture

Status: Needs review » Needs work

I'd prefer that we just throw an error if a develop doesn't correctly add fields to the select list, otherwise, the query will return unexpected fields which could also lead to miss use of the framework.

Rather than fix up developers mistakes, I'd like to throw errors, that way, they might learn something :)

Berdir’s picture

I'm not sure.

Yeah, the SQL standard (I have read that, but not confirmed) specifies that but the problem is that we can not add an error ourself because it is not possible to detect all cases automatically. And just letting PostgreSQL throw the error makes it hard for developers who don't use PostgreSQL to be compatible with it. I guess this could lead to a similiar situation as we have it with the GROUP BY problem in D6.

So imho, the comment in orderBy() + the automatic handling for PostgreSQL would be the best way to go, but I can try to fix the wrong queries of others agree with fiasco. According to my test run, it seems that there are quite a lot of them, because this patch actually fixes quite a lot of currently failing tests.

PS: @DamZ: It seems that SQL Server requires that as well, or did that change in a recent version? See http://weblogs.sqlteam.com/jeffs/archive/2007/12/13/select-distinct-orde...

Josh Waihi’s picture

Its pointless fixing this issue just for PostgreSQL, it will lead to other issues and is different behaviour to other drivers. IMO, we should restrict the other drivers to behave the same way.

Damien Tournoud’s picture

PostgreSQL is right in that ORDER BY technically applies to the result set. As a consequence, SQL-92 mandates that only column aliases from the result set (or numeric position of those columns) be used in ORDER BY clauses.

On the other hand, MySQL and the other database engines (SQLite and SQL Server) choose not to comply with the standard by the letter, because it is often useful to be able to sort by a column that you don't actually want to return.

We could try to make the other implementations more strict, but it turns out that it is not really possible. The problem is that static query analysis cannot identify if the use of ORDER BY is valid in all cases. It is easy to see that this query is invalid:

SELECT nid FROM node ORDER BY title

But it is impossible to know that this one is not, without knowing the list of columns before executing the query:

SELECT n.*,nr.body FROM node LEFT JOIN node_revisions nr ORDER BY timestamp

(timestamp is a column of the node_revisions table)

As a consequence, I recommend we just implement a special case in PostgreSQL. To be complete, we should make sure to override SelectQuery and filter out the columns we added for the ORDER BY but don't actually want in the result set.

Berdir’s picture

"What Damien said". I actually wanted to write the exactly same argument. It is technically impossible to parse and block all wrong combinations and therefore, the only way other than fixing all instances in core and live with the incompatible contrib modules is to handle this in the PostgreSQL driver. It's the same with that, we can't handle *all* cases, but it's not so much a problem there because we can always fix those rare cases. On the other side, if the driver would tell you that you're query is wrong even if it is not, there is really no way around that.

Another problem is that people writing queries might not even be aware that the query will use DISTINCT because that's added by the node_access query_alter hook. Meaning, we dynamically extend/change the queries already.

Crell’s picture

I agree with making at least a best-guess effort to keep a query working if we can. We can't get all cases, but where we're able to divine the developer's intent and compensate for it we should. That has a better chance of getting MySQL developers to write PostgreSQL-compatible queries than just exploding on them.

Berdir’s picture

Status: Needs work » Needs review

Setting back to needs review since everyone except fiasco seems to think that the patch is the way to go and given that, the patch in #15 doesn't need anything additional. fiasco, can you live with this? :)

FYI, the good part of the story is that with the patch from this issue and the changeField() patch, all tests pass on PostgreSQL (again).

Josh Waihi’s picture

Status: Needs review » Reviewed & tested by the community

:( Sad but true

webchick’s picture

Status: Reviewed & tested by the community » Fixed

While I too would rather Drupal conform to specs where they exist, the practical reality seems to be that more database systems behave as MySQL does than adhering to the letter of the spec as PostgreSQL does. So it makes sense for PostgreSQL to do extra work here, as much as it pains me to say it. :\

Committed to HEAD, with some minor English touch-ups on a few of the comments. In addition to closing a critical, I believe this now gets us to 100% tests passing on PostgreSQL! GREAT WORK!

Dries’s picture

Status: Fixed » Needs work
+++ includes/database/database.inc
@@ -810,6 +810,20 @@ abstract class DatabaseConnection extends PDO {
+   * Escapes a alias name string.

Should that be 'an'?

+++ includes/database/database.inc
@@ -810,6 +810,20 @@ abstract class DatabaseConnection extends PDO {
+   * Force all alias names to be strictly alphanumeric-plus-underscore. In
+   * contrast to DatabaseConnection::escapeField() /
+   * DatabaseConnection::escapeTable(), this doesn't allow the point.

I wonder why a point isn't allowed. It never hurts to explain the why, IMO.

+++ includes/database/pgsql/query.inc
@@ -189,4 +189,71 @@ class SelectQuery_pgsql extends SelectQuery {
+  /**
+   * Overrides SelectQuery::orderBy().
+   *
+   * Automatically adds columns that are ordered on as fields.
+   */

Any reason why the make this PostgreSQL specific versus modifying the original? Might be worth pointing out ...

Or what about having the main code throw an exception if this condition isn't met.

I'm not a big fan of making this magically work for one driver while the other drivers are actually doing the wrong thing.

Powered by Dreditor.

webchick’s picture

Oops. Cross-post. :\ I did fix "an". ;)

webchick’s picture

And I've rolled this back for now so we can get consensus on this approach. But I think that the database system maintainers have done a decent job of laying out why strict ANSI SQL adherence isn't practical.

Berdir’s picture

I wonder why a point isn't allowed. It never hurts to explain the why, IMO.

For fields, the point is allowed because it is valid to write table.field. However, the point is not a valid character for an alias: "table.field AS some.thing" results in an error. I can add that as an explanation.

Any reason why the make this PostgreSQL specific versus modifying the original? Might be worth pointing out ...

Or what about having the main code throw an exception if this condition isn't met.

I'm not a big fan of making this magically work for one driver while the other drivers are actually doing the wrong thing.

See #16-21, especially #19. We can not detect all cases where this is needed or not, so we can't throw an exception.

We could move this to the generic driver, but I guess the main reason against it is performance. This check both adds additional processing (loop over all added fields and tables and expressions for every orderBy() call) and the database needs to select more data and transfer it back to us but we know for sure that it is not needed (if it would be, the query would have selected it in the first place). DamZ even suggested going as far as removing that data when fetching the results so it would be transparent. But I'm not sure about that.

Berdir’s picture

Status: Needs work » Needs review
FileSize
8.18 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch postgresql_orderby.patch. View

Ok, Here's my try at improving the documentation.

@webchick: It seems that you didn't rollback the escapeAlias() method in includes/database/database.inc but that's not so important since it's not yet used :)

Josh Waihi’s picture

Maybe for D8 we could have something like a development mode where things like the db layer can imply strict rules like ANSI compliance but can be turned off in production for performance enhancements?

Just a thought.

chx’s picture

Such things add a horrible code weight which probably would need to be dead weight on all Drupal sites... so likely the answer is: no.

Josh Waihi’s picture

or rather 'contrib'.

Crell’s picture

You probably could write a contrib module that does a hook_query_alter on all SelectQuery instances, checks them for extra pedantry, and throws exceptions more readily than core does. Actually that's probably a nice feature for Devel as a toggle.

drifter’s picture

Status: Needs review » Reviewed & tested by the community

This was already RTBC. Dries raised some concerns, Berdir addressed them. I've read the new documentation in #29, and it seems clear to me (even though I didn't know much about the subject :) - so, RTBC?

mradcliffe’s picture

I read through this and the comments help clarify things for people not familiar with PostgreSQL.

+1 to RTBC if it still passes.

mradcliffe’s picture

#29: postgresql_orderby.patch queued for re-testing.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, postgresql_orderby.patch, failed testing.

mradcliffe’s picture

Status: Needs work » Needs review
FileSize
8.88 KB
PASSED: [[SimpleTest]]: [MySQL] 24,821 pass(es). View

One hunk failed. Patch redo, should apply now.

mradcliffe’s picture

Status: Needs review » Reviewed & tested by the community

Okay, Berdir's patch passes again. Setting the status back to RTBC per drifter.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Read over the new docs and they seem clear. I don't forsee us finding another way around this unless we introduce a performance penalty for the 95% case of our users. Furthermore, this patch gets us to 100% tests passing on pgsql (I hope), and the sooner we get that, the sooner we can stop breaking pgsql all the time. Hope the new docs are cool with Dries (if not, your call, of course).

Committed #38 to HEAD. Thanks!!

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.