When a module is used that makes use of the node_access table, node_db_rewrite_sql rewrites any queries that have nid as the primary field, changing "SELECT n.nid" to "SELECT DISTINCT(n.nid)" with an added INNER JOIN and WHERE clause (if the user doesn't have administer nodes permission).

If a module (e.g. image_gallery) does a query that does a sort on anything other than n.nid (e.g. when pulling a gallery thumbnail), we get the following query, which is invalid in PostgreSQL:

SELECT DISTINCT(n.nid) FROM ... WHERE ... ORDER BY n.sticky DESC, n.created DESC LIMIT 1;

pg_query(): Query failed: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list in /usr/local/drupal-4.7.2/includes/database.pgsql.inc on line 84.

This can be solved cleanly for PostgreSQL by rewriting the code to use a subquery instead of a DISTINCT JOIN WHERE, as shown in the patch. I'm not sure what this will do to MySQL, though, so I'm not sure how simple this will be to fix.

My relevant setup:
- Drupal 4.7.2
- PostgreSQL
- image, image_gallery, and simple_access modules enabled
- at least one gallery created
- simple_access module initialized
- at least one picture in a gallery (may not be required, but you currently get another error if there isn't one, due to a bug in the image_gallery module)
- go to /image, and you should get the error

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Steve Simms’s picture

Also, hopefully this is clear from the description, but this bug isn't particular to the image, image_gallery, or simple_access modules. Any module that uses node_access, when combined with another module that sorts a list of nodes, will exhibit this problem on a PostgreSQL database.

drumm’s picture

Could this be solved by simply adding relevant columns, such as n.created, to the SELECT column list?

sammys’s picture

I see the merit of finding an automatic solution to a problem that may creep up on any module developer without pgsql knowledge. I believe the concerns of core maintainers will be the added weight of the subquery to frequently executed queries. This is where i'm inclined to lean towards @drumm's suggestion.

We better get some MySQL people to test your patch (maybe comment on the impact as well) and we'll take it from there.

--
Sammy Spets
Synerger
http://www.synerger.com

killes@www.drop.org’s picture

Status: Needs review » Needs work
Steve Simms’s picture

I just did a little testing:

@drumm: Yes, adding the columns does work. I'm concerned that it's very unintuitive, though -- the developer would have to know that db_rewrite_sql is going to break the query if node_access is being used.

@sammys: I haven't done exhaustive testing by any stretch, but in my sample tests, the patched code is giving results in about 1/6th the time of the unpatched code on PostgreSQL (5.3ms -> 0.9ms), so it's actually a performance enhancement as well.

That said, I don't know that this would be the case for MySQL -- I'm pretty sure subqueries are a newer feature for it.

If it turns out to be slower for MySQL, I could redo the patch to switch on the database type, so that MySQL remains unchanged, but PostgreSQL will work without burdening the module developers.

What do you think?

sammys’s picture

I've experienced similar problems in other modules where columns aren't added to the select list and the query breaks in pgsql. I don't think the problem will go away simply. Most notably the problem exists in views. But I digress.

Though I don't know how large your dataset is I better mention this. Performance tests on small datasets are generally filled with inaccuracies because DBMS internal code timings (such as query interpreting) weigh more heavily in the query timing. In addition, the first query will usually take longer than subsequent queries on the same table(s) because memory resident block buffers will be filled during the first query. That said, we have to get hold of a MySQL tester to bench the subquery version.

I believe it's Drupal policy (written or not) that everything outside includes/database..inc and *.install files should be DBMS generic. That means the code you're writing better work for both systems or a new function must be written into all includes/database..inc files. If you can come up with a fantastic way to abstract it and build a useful function then by all means.

Let's get the MySQL version benched and we'll take it from there.

Cheers!

Steve Simms’s picture

Re: problem not going away simply -- agreed. I took a brief look at the development list yesterday, and saw that db_rewrite_sql is causing others problems as well. I think the best way to work around this will be to minimize the amount of rewriting that's being done wherever possible.

Re: performance. It was a small dataset, but probably not atypical for an average blog. Both test queries would've been fully in RAM. My point was more that it wasn't an obvious performance hit than anything else. The subquery uses "EXISTS", so it'll quit as soon as it finds the first result, vs. a JOIN that has to do all of the work first to make the tables line up before it can filter through them. I would guess that this would become slower on larger tables than the subquery, which could conceivably be done last after the rest of the filtering is done (though that's an issue for the database performance folks, and not Drupal).

Re: Drupal policy. Ok, so if this particular fix makes MySQL noticeably slower, the task becomes abstracting the rewriting code in node.module into the database files somehow, since the current version isn't actually database-generic (it's relying on a MySQL quirk, so it'll most likely break any other database -- is there anyone using Drupal who's not on MySQL or PG?).

How can we go about flagging down some MySQL people to test/time this?

sammys’s picture

Version: 4.7.2 » 4.7.3

Best bet is to send a mail to the dev list. I think this will be low on their priorities if there are other issues with this function. It most certainly needs attention, but there is something terribly wrong with the function if there are so many troubles. The situation needs someone with experience to look at the problems carefully and solve them with an elegant solution. Core code is like that unfortunately.

Someone needs to tackle this. Could you mail me URLs for the other issues you've found for node_db_rewrite_sql() and i'll send a mail to the dev list about it. If you're ok with sending the mail to the dev list yourself, then by all means go for it.

killes@www.drop.org’s picture

Version: 4.7.3 » x.y.z

moving to CVS as it has the same code.

sammys’s picture

db_rewrite_sql is not working for product.module in the ecommerce suite as well. It appears you can't even add the extra columns to the query select list before you call db_rewrite_sql. humph

sammys’s picture

had a chat with @chx and @killes about this. there are two ways to deal with the problem. 1) fork a db_rewrite_sql for each dbms or 2) fix the query problem in db_rewrite_sql.

@killes and I believe 2) is the way to go seeing there really isn't much in it. I've hit a fork in the road with the DISTINCT() syntax on pgsql. Unfortunately it doesn't support it unless it's placed at the beginning of the select list (which is possible) like this:

SELECT DISTINCT ON (n.nid) field2, field3 FROM ...

mysql equivalent:

SELECT DISTINCT (n.nid), field2, field3 FROM ...

We _could_ put a db_type clause in there, but i've posed the question, "Is there any reason we can't put the distinct across the entire set of selected values?" So far I haven't been given a use case where the select list has distinct nid with non-distinct select fields. Hopefully someone comes up with a use case otherwise this problem will require some code auditing with grep.

*yorn*

sammys’s picture

Assigned: Unassigned » sammys
Status: Needs work » Needs review
FileSize
4.95 KB

Ok... progress has been made on this issue.

First of all, the DISTINCT ON syntax above is incorrect. It should be
SELECT DISTINCT ON (n.nid) n.nid ... ORDER BY n.nid,...

The ORDER BY part only needs to be there if an ORDER BY expression has been supplied.

@chx and I have consulted and the end result was we create a DBMS specific function (i.e a new function in each includes/database.??sql* file). I've done just that and called it db_distinct_field().

There is one major benefit to the new function (for pgsql): ORDER BY fields don't need to be supplied in the SELECT field list anymore!

There is one major downside for pgsql: it doesn't convert queries already specifying DISTINCT to the DISTINCT ON syntax. I figure we'll cross that bridge _if_ we come to it.

chx’s picture

Status: Needs review » Reviewed & tested by the community

I think this one is a go. Applies with a slight offset.

Dries’s picture

Yay, the db_rewrite_sql() functionality gets messier every day. I guess this is necessary evil.

Dries’s picture

Is this a problem in Drupal 4.7 as well?

Gerhard Killesreiter’s picture

yeah, I think it is.

sammys’s picture

It most definitely is a problem on 4.7 as well.

drumm’s picture

Status: Reviewed & tested by the community » Fixed

Looks relatively straightforward (while not fun to look at) and functional. Committed to HEAD.

chx’s picture

Version: x.y.z » 4.7.4
Status: Fixed » Needs review

this is a problem with 4.7 as well. not tested whether the patch applies...

sammys’s picture

Patch applies to 4.7 with only an offset for one of the files. If we're assuming that operationally it's the same as in 5.0 then testing is not necessary. RTBC in that case. I won't set this to RTBC unless we all concur.

killes@www.drop.org’s picture

Status: Needs review » Fixed

applied. Hope it doesn't break anything

Anonymous’s picture

Status: Fixed » Closed (fixed)
wedge’s picture

Sorry about the late feedback and I'm not 100% sure it's related but I think this change is causing problems in some edge cases. Please look at #6 here http://drupal.org/node/89776

wedge’s picture

Status: Closed (fixed) » Active

Guess I have to reopen the issue as well.

Steve Simms’s picture

Version: 4.7.4 » 5.0

#89776 looks like the same problem. If wedge is having this issue with the patched code, it doesn't look like this got fixed. (I haven't upgraded yet, and can't confirm it for myself.)

Also, the patch breaks ordering by anything other than the distinct column, if I'm reading the new code correctly, by always putting the distinct column (e.g. n.nid) first in the ORDER BY list. This would explain the other problem that wedge mentions in #89776, and would be highly critical because it would break (mis-sort) every date-sorted list of nodes when access control is turned on, if I'm remembering the code correctly.

Off the top of my head at 3:30am, a fix would be to add the distinct column at the very end of the ORDER BY list, rather than the beginning. That would be a moderately painful regex to write, and there may be a better way. SQL-wise, LIMIT, OFFSET, and FOR are the only other clauses that should follow an ORDER BY, and FOR is probably going to be rare enough to safely ignore, if you want to work backwards from the end rather than going forward from the ORDER BY statement.

drumm’s picture

Status: Active » Closed (fixed)

Please open separate issues or reclassify other issues as needed. Having fresh, well-described bugs makes this easier to follow.