PostgreSQL supports 2 different kinds of distinct in a query:

SELECT DISTINCT ON (field) field, ...

or

SELECT COUNT(DISTINCT (field)), ...

Right now, the syntax for any distinct field is being converted to the former. But it means that if you DO need the latter, you will get an SQL error. ie:

* warning: pg_query() [function.pg-query]: Query failed: ERROR: syntax error at or near "ON" LINE 1: SELECT t.tid, COUNT( DISTINCT ON (n.nid) n.nid) AS c FROM te... ^ in /var/www/dev.goth.net/htdocs/drupal/includes/database.pgsql.inc on line 125.
* user warning: query: SELECT t.tid, COUNT( DISTINCT ON (n.nid) n.nid) AS c FROM term_node t INNER JOIN node n ON t.nid = n.nid WHERE n.status = 1 AND n.type = 'faq' GROUP BY t.tid in /var/www/dev.goth.net/htdocs/drupal/includes/database.pgsql.inc on line 144.

My solution is to change the code from:

function db_distinct_field($table, $field, $query) {
  $field_to_select = 'DISTINCT ON ('. $table .'.'. $field .") $table.$field";
  // (?<!text) is a negative look-behind (no need to rewrite queries that already use DISTINCT).
  $query = preg_replace('/(SELECT.*)(?:'. $table .'\.|\s)(?<!DISTINCT\()(?<!DISTINCT\('. $table .'\.)'. $field .'(.*FROM )/AUsi', '\1
  $query = preg_replace('/(ORDER BY )(?!'.$table.'\.'.$field.')/', '\1'."$table.$field, ", $query);
  return $query;
}

to:

function db_distinct_field($table, $field, $query) {
  if (preg_match('/\s*[A-Za-z0-9_]\s*\(\s*'.$table.'\.'.$field.'\s*\)/', $query))
  {
      $field_to_select = 'DISTINCT ('. $table .'.'. $field .")";
  }
  else
  {
      $field_to_select = 'DISTINCT ON ('. $table .'.'. $field .") $table.$field";
      // (?<!text) is a negative look-behind (no need to rewrite queries that already use DISTINCT).
      $query = preg_replace('/(SELECT.*)(?:'. $table .'\.|\s)(?<!DISTINCT\()(?<!DISTINCT\('. $table .'\.)'. $field .'(.*FROM )/AUsi', '\1
      $query = preg_replace('/(ORDER BY )(?!'.$table.'\.'.$field.')/', '\1'."$table.$field, ", $query);
  }
  return $query;
}

This will make any function usage (COUNT(), MAX(), etc), including user-defined ones still work. However if a distinct is needed outside a function, the 'DISTINCT ON (field) field' syntax is used.

Comments

salvis’s picture

Status: Active » Closed (duplicate)

This is a duplicate of #128846: Rewritten PostgreSQL queries fail and a patch is being worked on there.

Please help review there.