Pager query tests through an exception in simple test because an SQL query does not specify GROUP BY<code> with <code>HAVING. This is required for PostgreSQL. The issues with in this are:

  1. compiling the query does not take into account that HAVING requires a GROUP BY clause
  2. an Exception cannot be thrown when compiling becase the method used is __toString()
  3. Using the GROUP BY clause returns different results

Where an aspect of SQL syntax requires another part, we should be able to thrown a SQL construction error of some form. In its current form, it isn't good enough because MySQL will not fail on this Syntax.

CommentFileSizeAuthor
#11 pgsql_having_fix.patch1.49 KBBerdir
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Crell’s picture

Is there a way that we can derive what should be present in GROUP BY from the HAVING?

What does the SQL spec actually say the proper behavior is?

Does this get into "babysitting broken code" territory? (This will probably be answered by the first two questions.)

Damien Tournoud’s picture

@Crell: as far as the database layer is concern, broken code that works on MySQL but fails on PostgreSQL is never on the "don't babysit broken code".

chx’s picture

Wait, what, that's news even for me. There is absolutely no GROUP BY in some query but there is HAVING????? That's IMO broken SQL.

Crell’s picture

chx is right that HAVING doesn't really make sense without a GROUP BY. I'm inclined to call that a bug in the calling code if it happens, and we don't account for all possible broken queries that could be generated. (Why MySQL tries to parse it, I don't know.)

Shouldn't we just fix the pager query / pager query tests so that they don't generate invalid SQL in the first place?

Josh Waihi’s picture

Yes we should but if we know that what they're trying to do is wrong then we should throw an Exception IMO, this will prevent this from happening. Naturally, that will mean not using __toString() and calling compile or something instead so we can do that. Or alternatively, have a sanity check before __toString is called.

Crell’s picture

We could sanity check in execute(), but __toString() could technically be called from anywhere. I'd hate to lose __toString(), though. What if we returned FALSE instead?

Josh Waihi’s picture

I think that the db layer has an obligation to tell the user that they're 'doing it wrong' and what it is they are doing wrong. To some level, we get this when an SQL error occurs and an Exception is thrown. Returning anything else other than a string from a function called __toString seems wrong and an incorrect way to use PHP.

If we want to keep __toString, sanity checking in execute seems a good place to put it since if we're missing bits to structure a query, we can't return a useful string from the __toString method. We'll also have to return some message like 'CANNOT COMPILE SQL QUERY TO STRING' when we cannot compile a query.

Though I think I rather an explicit call to a function, i.e print $query->sprint() rather than the current print $query

Damien Tournoud’s picture

I'm with fiasco that we need to move away from __toString(), because of brain-dead PHP limitations:

Fatal error: Method Test::__toString() must not throw an exception in test.php on line 15

I suggest we add two methods to every Query: a ->generateSQL() method, that generates the SQL string from the query, without checking it from a syntax point of view, and a ->verify() method, that checks that everything is in order. In that case, ->execute() will just: call ->verify(), trigger an exception if verify fails, then call ->generateSQL().

Crell’s picture

If we're going to drop the __toString() behavior, we should use toSQL() to mirror both that and the common toXML() approach.

However, I am still not convinced that we need to cover every case. If someone writes an invalid query, it should break. Our obligation is only to make sure that the report of breakage is at least somewhat useful. The actual error happens, however, on execute, which is not a magic method and therefore should throw a useful exception, shouldn't it?

Looking back at this issue from the start, the only reason I can see to change anything here is to provide user-space SQL validation so that we can explicitly break MySQL to match PostgreSQL's (admittedly more standards-compliant) behavior. But user-space SQL validation is not something that we should have any business doing. In this case, I can't recall ever writing a HAVING without a GROUP BY to begin with, as it never made logical sense. If some code is doing that, we should fix it and be done with it. Babysitting SQL specs is not something we should be spending cycles doing.

Josh Waihi’s picture

Issue tags: +PostgreSQL Surge

There is no way we can get away with not babysitting some some SQL specs here. If MySQL was more strict and standards compliant, I wouldn't have half the bugs to fix in the postgreSQL driver. Not providing user space SQL validation will only create more of these problems where the SQL syntax will work on MySQL and not on PostgreSQL or SQLite. I'm not saying it needs to be a full blown parser, I'd imagine only this check will be in there for starters, but a place we can throw exceptions from when we don't have enough information to construct a query string will be really useful.

I'm happy with toSQL() and verify() FTW

Berdir’s picture

Status: Active » Needs review
FileSize
1.49 KB

The attached patch fixes the tests for PostgreSQL. I did not care about the results at all, the only important thing of that test was to confirm that having() stuff was added to HAVING and not WERE, see #467984: DBTNG bugfixes

I'm not sure about the validate stuff. We could only validate dynamic queries anyway, there is nothing that stops you from writing non-standard SQL in db_query().

Crell’s picture

Once the bot approves it I'd go ahead and commit #11. The current code is a bug, whatever else we are going to do.

Crell’s picture

Status: Needs review » Reviewed & tested by the community

The bot likes it. Bug fixing is good.

cburschka’s picture

It's good, but the order of those components seems non-semantic. In an SQL query, the "group by" comes before the "having", which comes before the "order". Here order precedes both group and having.

However, the order already preceded having before the patch, so that's something for a separate clean-up.

Josh Waihi’s picture

correct me if I'm wrong, but the order doesn't matter so long as the components are present when the string is compiled right?

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD. Thanks!

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

grub3’s picture

Status: Closed (fixed) » Active

This issue was fixed with PostgreSQL 8.4.

PostgreSQL 8.4 does not require GROUP By when using HAVING.

Also it incorporates other fixes.
Especialy deleting on several tables at once.

We should drop PostgreSQL 8.3 support and require PostgreSQL 8.4.

PostgreSQL 8.4 is the stable release.
It is shipping with Debian SID.

Crell’s picture

Status: Active » Closed (fixed)

Please do not reopen long-closed issues.

Also, 8.4 is way too new for us to be able to reasonably require it for all Drupal installs. No one in his right mind runs Debian Sid as a production server.

grub3’s picture

I am not of that oppinion and I think it should be more discussed:

  • PostgreSQL 8.4 is the stable release of PostgreSQL. No matter what Debian offers.
  • People can always compile by hand when needed.
  • Migration from 8.3 to 8.4 is faily easy. I could migrate a 500.000 posts forums + 500.000 D6 pages in 1 hour.
  • PostgreSQL 8.4 was developped to fix some common SQL problems which slow down conversion from MySQL. Including for example GROUP BY, ORDER BY common problems. In the case of Drupal, it suits perfectly our needs.

I believe that Drupal compatibility should include only PostreSQL 8.4.

Crell’s picture

A closed issue about a syntax bug that has been fixed is not the right place to discuss version support questions. If you want to debate the support status of PostgreSQL, the place to do so is on the Groups.Drupal.Org Group. For the most part I will defer to the Postgres maintainers (Damien and fiasco) on this question.

However, your argument doesn't hold water anyway. PHP 5.3 is the stable release of PHP, too, but we still support PHP 5.2, despite the many improvements PHP 5.3 has. Just because it's the current stable doesn't mean it's what the vast majority of people have installed, and being easy to install and use without compiling stuff from source is a main goal of the Drupal project.