Problem/Motivation

Over many years Drupal has held off checking queries for reserved words in column names, table names and aliases because it was a design decision. The idea is that we just just should not use reserved words. However, time has made it clear that this position (as stated in the long discussions in #371: resolve ANSI SQL-92/99/2003 reserved words conflict in query statements) is not really tenable. Here are the problems:

  • The list of reserved words is different for different database drivers even those the support the various ANSI SQL standards. For example OFFSET is reserved in PostgreSQL but not in MySQL
  • Database drivers add new reserved keywords in new versions. MySQL has been doing this in every new version for example 5.7 and 8

Proposed resolution

  • Add a new syntax square bracket syntax to the query builder to know what identifiers (column names/ aliases) are. Leverage work in #371: resolve ANSI SQL-92/99/2003 reserved words conflict in query statements suggested using [COLUMN_NAME] to match {TABLE_NAME}. This supports db_query(), addExpression() and addWhere()
  • Automatically quote fields and aliases in escapeField() and escapeAlias().
  • All identifiers should be quoted. This helps us ensure that the database API is properly used when query building and will make edge-cases less likely.

Using [] to identify identifiers (field names / aliases)

There is precedence - this is the quote character in MSSQL.

The one consideration is that :variable[] is used to identify and expand array variables.

Discarded resolution(s)

We could deprecate SELECT, UPDATE, INSERT queries via \Drupal\Core\Database\Connection::query() since the methods like \Drupal\Core\Database\Connection::select() require fields to be specified. If we do that we have to worry about performance - see #1067802: Compare $connection->query() and $connection->select() performance. Also this does not work for columns added using ->addExpression() or ->addWhere(). Therefore this resolution has been discarded.

Remaining tasks

Fix the blockers:

File follow-ups:

  • add coder rule to try to check for missing []
  • Ensure core uses new square bracket syntax

User interface changes

None

API changes

At the moment \Drupal\Core\Database\Connection::escapeField() and \Drupal\Core\Database\Connection::escapeAlias() in their core implementations are re-entrant. And some core code relies on this. That said the docs hint that this should not be case. In the new implementation methods also now quote identifiers BUT because the code removes the quotes before adding them they still are re-entrant. The new implementations result in strings like users_data.uid becoming "users_data"."uid".

\Drupal\Core\Database\Query\ConditionInterface::condition()'s first argument $field must be a field and cannot be (ab)used to do ->condition('1 <> 1'). The way to do this is to use the new API added by #2986334: Add a way to enforce an empty result set for a database condition query

\Drupal\Core\Database\Connection::prepareQuery() now takes additionally takes a boolean that is derived from a query's $options['allow_square_brackets'] value that allows specific queries to use square brackets and to not them for identifier quotes. The default value for $options['allow_square_brackets'] is FALSE.

\Drupal\Core\Database\Connection::prefixTables now quotes tables so {node} becomes "node".

Data model changes

None

Release notes snippet

All identifiers (table names, database names, field names and aliases) are quoted in database queries. Column names or aliases should be wrapped in square brackets when using Connection::query(), db_query(), ConditionInterface::where() or SelectInterface::addExpression. Custom or contrib database drivers should read https://www.drupal.org/node/2986894 to see what changes are necessary.

CommentFileSizeAuthor
#138 drupal-allow-dash-in-table-name-2986452.patch739 bytesAlexey.Samsonov
#124 2986452-9.0.x-123.patch68.4 KBalexpott
#124 122-123-interdiff.txt1.07 KBalexpott
#122 2986452-9.0.x-122.patch68.14 KBalexpott
#122 120-122-interdiff.txt289 bytesalexpott
#121 2986452-9.0.x-test-without-entity-condition-changes-116-8.patch67.66 KBalexpott
#121 2986452-9.0.x-test-without-query-condition-changes-106-13.patch68.14 KBalexpott
#120 2986452-9.0.x-120.patch68.6 KBalexpott
#120 118-120-interdiff.txt1.48 KBalexpott
#118 2986452-9.0.x-118.patch68.44 KBalexpott
#118 115-118-interdiff.txt4.54 KBalexpott
#115 2986452-9.0.x-115.patch67.34 KBalexpott
#115 113-115-interdiff.txt1.3 KBalexpott
#113 2986452-9.0.x-113.patch66.66 KBalexpott
#113 2986452-8.9.x-113.patch66.66 KBalexpott
#113 109-113-interdiff.txt3.32 KBalexpott
#109 2986452-8.9.x-109.patch63.61 KBalexpott
#109 105-109-interdiff.txt5.45 KBalexpott
#105 2986452-8.9.x-104.patch64.58 KBalexpott
#105 100-104-interdiff.txt895 bytesalexpott
#100 2986452-8.9.x-100.patch63.71 KBalexpott
#100 97-100-interdiff.txt5.6 KBalexpott
#99 95-97-interdiff.txt1.81 KBalexpott
#97 2986452-8.9.x-97.patch60.2 KBalexpott
#97 95-97-interdiff.txt60.2 KBalexpott
#95 2986452-95.patch58.46 KBkostyashupenko
#92 2986452-92.patch58.23 KBandypost
#92 interdiff.txt1.36 KBandypost
#91 2986452-91.patch58.24 KBandypost
#91 interdiff.txt518 bytesandypost
#90 2986452-90.patch58.04 KBandypost
#90 interdiff.txt1.9 KBandypost
#88 2986452-88.patch57.94 KBandypost
#88 interdiff-reroll.txt11.36 KBandypost
#82 2986452-2-82.patch57.77 KBalexpott
#82 69-82-interdiff.txt1.43 KBalexpott
#69 2986452-69.patch57.76 KBalexpott
#63 2986452-63.patch58.37 KBalexpott
#63 46-63-interdiff.txt2.26 KBalexpott
#46 2986452-46.patch58.55 KBalexpott
#46 43-46-interdiff.txt768 bytesalexpott
#43 2986452-44.patch58.54 KBalexpott
#43 42-44-interdiff.txt1.91 KBalexpott
#41 2986452-42.patch58.54 KBalexpott
#41 sjerdo-interdiff.txt2.8 KBalexpott
#36 34-36-interdiff.txt7.62 KBalexpott
#36 2986452-36.patch61.81 KBalexpott
#34 2986452-34.patch55.98 KBalexpott
#34 32-34-interdiff.txt1.83 KBalexpott
#32 2986452-32.patch55.18 KBalexpott
#32 30-32-interdiff.txt15.19 KBalexpott
#30 2986452-30.patch46.02 KBalexpott
#30 26-30-interdiff.txt2.95 KBalexpott
#26 2986452-26.patch46.11 KBalexpott
#26 25-26-interdiff.txt1.08 KBalexpott
#25 2986452-25.patch46.1 KBalexpott
#25 23-25-interdiff.txt7.19 KBalexpott
#23 2986452-22.patch42.95 KBalexpott
#22 2986452-22.patch42.95 KBalexpott
#22 20-22-interdiff.txt1.8 KBalexpott
#20 2986452-20.patch42.42 KBalexpott
#18 14-18-interdiff.txt14.13 KBalexpott
#14 12-14-interdiff.txt653 bytesalexpott
#12 9-12-interdiff.txt2.03 KBalexpott
#19 18-19-interdiff.txt2.03 KBalexpott
#20 19-20-interdiff.txt1.58 KBalexpott
#18 2986452-18.patch40.84 KBalexpott
#12 2986452-12.patch28.49 KBalexpott
#19 2986452-19.patch42.29 KBalexpott
#9 2986452-9.patch27.52 KBalexpott
#14 2986452-14.patch28.64 KBalexpott
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

alexpott created an issue. See original summary.

alexpott’s picture

Title: Database reserved keywords » Database reserved keywords need to be quoted as per the ANSI standard
alexpott’s picture

For me the SQLite manual has this right

SQLite adds new keywords from time to time when it takes on new features. So to prevent your code from being broken by future enhancements, you should normally quote any identifier that is an English language word, even if you do not have to.

https://www.sqlite.org/lang_keywords.html

catch’s picture

suggested using [COLUMN_NAME] to match {TABLE_NAME}

This doesn't seem like the worst suggestion given all the other options.

Not really keen on completely deprecating \Drupal\Core\Database\Connection::query(), the query builder already supports where/expression for gnarly syntax stuff and that can include columns just as much (say a REPLACE(foo, 'bar', 'baz') query in MySQL).

alexpott’s picture

Issue summary: View changes

@catch yep totally agree with #4 -

alexpott’s picture

Issue summary: View changes
alexpott’s picture

I think the one remaining design choice before it's worth starting to write patches is: do we use a dictionary of reserved words?

On the plus side that would mean that the query strings are not inundated with quotes. However I feel that the SQLite docs in #3 have it right. We quote all identifiers and then we don't have to worry about changes in the list of reserved words. We have less to maintain and writing contrib and custom database drivers becomes easier because you know that all identifiers are quoted.

alexpott’s picture

Another benefit of always quoting and not using a dictionary is that bugs like #2986539: \Drupal\user\Plugin\EntityReferenceSelection\UserSelection::entityQueryAlter() should escape the fake condition column on replacement will become much harder to have.

alexpott’s picture

Status: Active » Needs review
FileSize
27.52 KB

Here's a first cut. I don't expect it to be green. Trying on MySQL first and then once that is green I'll test queuing up runs on the other database types.

Patch includes:
#2986334: Add a way to enforce an empty result set for a database condition query
#2986539: \Drupal\user\Plugin\EntityReferenceSelection\UserSelection::entityQueryAlter() should escape the fake condition column on replacement

alexpott’s picture

Issue summary: View changes

Discussed the reserved word dictionary with @catch and @amateescu in slack. After much discussion we decided that we should

  1. introduce the [] syntax for identifiers (column names, aliases) in db_query(), SelectInterface::addExpression() and ConditionInterface::where()
  2. always escape the identifiers and the table names

This resolves #7

alexpott’s picture

Issue summary: View changes
alexpott’s picture

Forgot one thing. We can't mix escaped field and table name in the same static now. Because they have slightly different strategies due to table name prefixing.

alexpott’s picture

+++ b/core/lib/Drupal/Core/Database/Query/Select.php
@@ -852,7 +852,7 @@ public function __toString() {
-      $query .= $table_string . ' ' . $this->connection->escapeTable($table['alias']);
+      $query .= $table_string . ' ' . $this->connection->escapeAlias($table['alias']);

This is the type of super tricky bug that this change will help us catch.

alexpott’s picture

Fixing escaping/quoting where the field name is empty.

The last submitted patch, 12: 2986452-12.patch, failed testing. View results

The last submitted patch, 9: 2986452-9.patch, failed testing. View results

Status: Needs review » Needs work

The last submitted patch, 14: 2986452-14.patch, failed testing. View results

alexpott’s picture

Okay... think I've fixed MySQL and Postgres and SQLite both install. Let's see what's borked.

alexpott’s picture

Whoops small mistake on the hacky real_square_brackets option.

alexpott’s picture

We need a bot with Postgres > 9.5 - my local is but on less than that a different upsert class is used.

Also lol SQLite supports square brackets as quotes cause MSSQL so although the implementation is wrong it is not totally broken.

The last submitted patch, 18: 2986452-18.patch, failed testing. View results

alexpott’s picture

I was not testing locally what I thought I was lolz. Postgres should be greener now.

alexpott’s picture

alexpott’s picture

Issue summary: View changes

Adding some of the API implications of this change.

I wonder if we should make escapeField() and escapeAlias() re-entrant - i.e only add quotes if the first character is not a quote.

alexpott’s picture

Fixing postgres and more cleanup since the postgres driver implementation is not API no point deprecating things.

alexpott’s picture

The last submitted patch, 25: 2986452-25.patch, failed testing. View results

alexpott’s picture

So I decided to track down why the where in PreviewTest was not being quoted. In \Drupal\views\Plugin\views\argument\NumericArgument::query() we see:

    $placeholder = $this->placeholder();
    $null_check = empty($this->options['not']) ? '' : " OR $this->tableAlias.$this->realField IS NULL";

    if (count($this->value) > 1) {
      $operator = empty($this->options['not']) ? 'IN' : 'NOT IN';
      $placeholder .= '[]';
      $this->query->addWhereExpression(0, "$this->tableAlias.$this->realField $operator($placeholder)" . $null_check, [$placeholder => $this->value]);
    }
    else {
      $operator = empty($this->options['not']) ? '=' : '!=';
      $this->query->addWhereExpression(0, "$this->tableAlias.$this->realField $operator $placeholder" . $null_check, [$placeholder => $this->argument]);
    }

This dumping in of $this->tableAlias.$this->realField into a query without passing through escapeAlias() and escapeField() is a problem. I think we should consider this part of the followup that is going to ensure we use the new square bracket syntax. So this will become [$this->tableAlias].[$this->realField] and then views will quote identifiers properly. Atm if the table name had had an upper case letter in it then postgres would be broken :(

catch’s picture

Quick review. Generally this looks good and the amount of crap it removes from the postgresql driver shows what we'd be dealing with if we went the dictionary route.

  1. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -980,10 +1047,12 @@ public function escapeTable($table) {
    +      $parts = array_map([$this, 'escapeAlias'], $parts);
    

    Why is it using escapeAlias here when the method is escapeField()? If escapeAlias() is doing two different things (actually escaping aliases, and also escaping fields) then it should probably be renamed.

  2. +++ b/core/lib/Drupal/Core/Database/Driver/pgsql/Connection.php
    @@ -48,28 +48,6 @@ class Connection extends DatabaseConnection {
    -  protected $postgresqlReservedKeyWords = ['all', 'analyse', 'analyze', 'and',
    

    Nice.

  3. +++ b/core/lib/Drupal/Core/Database/Driver/pgsql/Upsert.php
    @@ -24,6 +24,11 @@ public function execute() {
         $insert_fields = array_merge($this->defaultFields, $this->insertFields);
    +    // The fields are escaped and quoted when adding to the insert query so this
    +    // is not necessary and actually harmful.
    +    // $insert_fields = array_map(function ($field) {
    +    //   return $this->connection->escapeField($field);
    +    // }, $insert_fields);
     
    

    cruft.

alexpott’s picture

alexpott’s picture

Made a start on the change record - https://www.drupal.org/node/2986894

+++ b/core/lib/Drupal/Core/Database/Connection.php
@@ -323,6 +350,14 @@ protected function setPrefix($prefix) {
+  /**
+   * @return string
+   */
+  protected function identifierQuote() {
+    @trigger_error('In Drupal 9.0 this method will be abstract and contrib and custom drivers will have to implement. See @todo');
+    return '';
+  }

@@ -341,6 +376,27 @@ public function prefixTables($sql) {
+  /**
+   * Quotes all identifiers in a query.
+   *
+   * Queries sent to Drupal should wrap all unquoted identifiers in square
+   * brackets. This function searches for this syntax and replaces them with the
+   * database specific identifier. In ANSI SQL this a double quote.
+   *
+   * Note that :variable[] is used to denote array arguments but
+   * Connection::expandArguments() is always called first.
+   *
+   * @param string $sql
+   *   A string containing a partial or entire SQL query.
+   *
+   * @return string
+   *   The string containing a partial or entire SQL query with all identifiers
+   *   quoted.
+   */
+  protected function quoteIdentifiers($sql) {
+    return str_replace(['[', ']'], $this->identifierQuote(), $sql);
+  }

Help on naming these methods would be great.

alexpott’s picture

Adding some unit test coverage (as well as moving some) and resolve some of the @todos I added.

Status: Needs review » Needs work

The last submitted patch, 32: 2986452-32.patch, failed testing. View results

alexpott’s picture

Fixing the tests.

alexpott’s picture

Bored of fighting with PreviewTest - opened #2987009: Remove unnecessary space in queries generated by Select::__toString() to make it a bit simpler.

alexpott’s picture

Rerolled on top of #2966523: MySQL 8 Support - and now we're able to remove the private code added by that workaround for 8.6.x.

alexpott’s picture

Issue summary: View changes
alexpott’s picture

Issue summary: View changes
alexpott’s picture

Issue summary: View changes
sjerdo’s picture

Some review comments:

  1. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -143,10 +143,33 @@
    +  /**
    +   * List of escaped table names, keyed by unescaped names.
    +   *
    ...
    +   */
    +  protected $escapedFields = ["" => ""];
    

    This should be "List of escaped field names, keyed..."

  2. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -254,6 +277,11 @@ public function destroy() {
    +   * - allow_square_brackets: By default, queries which contain square brackets
    +   *   will have them replaced with the identifier quote character for the
    +   *   database type. In rare cases, such as creating an SQL function, []
    +   *   characters might be needed and can be allowed by changing this option to
    +   *   TRUE.
        *
        * @return array
        *   An array of default query options.
    

    The default value (FALSE) isn't returned in the defaultOptions() method.
    Since allow_delimiter_in_query also has a default value of FALSE and is returned by the method, this default value also should be add.

  3. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -393,12 +462,21 @@ public function getFullQualifiedTableName($table) {
    +   * @param array $options
    +   *   An associative array of options to control how the query is run. The
    +   *   given options will be merged with self::defaultOptions(). See the
    +   *   documentation for self::defaultOptions() for details.
    +   *   Typically, $options['return'] will be set by a default or by a query
    +   *   builder, and should not be set by a user.
    ...
        */
    -  public function prepareQuery($query) {
    

    Documentation is incorrect: The $options array isn't merged with self::defaultOptions() in this method. I suggest adding $options += $this->defaultOptions();

alexpott’s picture

Rerolled now that 8.7.x has the blockers committed.

@sjerdo thanks for the review.

  1. Good point - fixed
  2. Nice idea - fixed. This is also great because we no longer have to check for emptiness.
  3. I think this highlights the fact we shouldn't be passing the whole options array in here. In an ideal world prepareQuery would be more internal to the DB API. So I've changed this to just passing a boolean along which has the nice side effect of tying in nicely with prepareQuery's other task of prefix tables.

Status: Needs review » Needs work

The last submitted patch, 41: 2986452-42.patch, failed testing. View results

alexpott’s picture

Status: Needs work » Needs review
FileSize
1.91 KB
58.54 KB

Whoops.

The last submitted patch, 41: 2986452-42.patch, failed testing. View results

Status: Needs review » Needs work

The last submitted patch, 43: 2986452-44.patch, failed testing. View results

alexpott’s picture

alexpott’s picture

Issue summary: View changes

Updating issue summary to reflect recent changes.

borisson_’s picture

alexpott credited bleen.

alexpott credited daffie.

alexpott credited kriboogh.

alexpott’s picture

Crediting people who worked on #1426084: Provide backtick escaping for MySQL in DB abstraction layer as I've mark it as a duplicate.

alexpott’s picture

kriboogh’s picture

Thanks for the crediting. I did part of the patching on the backtick escaping for MySQL layer. (#1426084)

Does the patch in this issue also take into account that the prefix replacement patterns also incorrectly replace curly brackets in data, for example if a query is run with JSON data in it's value. In the #1426084 latest patch we added more complex regex's to tackle that problem (it's commented in the code what the regex does).

alexpott’s picture

@kriboogh whilst that sounds like a valid issue is also sounds like scope-creep and something that can be tackled in a follow-up issue. This issue is purely attempting to quote columns, tables and aliases properly so a database's particular set of reserved words do not cause Drupal problems.

kriboogh’s picture

Ok good to know than. I also saw this patch also doesn't allow a dash (-) in the table and column names, which is allowed in MySQL as long as your names are quoted (backticked). Would you consider that also a followup issue, or something that needs to be added in this patch?

alexpott’s picture

@kriboogh tricky. That's on the edge for me. I'd be happy to do as follow-up. But it does suggest that we should use ` as the quote identifier for MySQL. But also we need to account for postgres and sqlite support. So it could be good to do in follow-up so we can focus on just that one issue and add good test coverage.

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Connection.php
@@ -546,6 +232,15 @@ public function queryTemporary($query, array $args = [], array $options = []) {
+  /**
+   * {@inheritdoc}
+   */
+  protected function identifierQuote() {
+    // The database is using the ANSI option on set up so use ANSI quotes and
+    // not MySQL's custom backtick quote.
+    return '"';
+  }

ie. I think we should do a separate issue to explore all the ramifications of changing this. It's annoying that the " and ` don't work the same when the ANSI_QUOTE option is on.

This suggests that we should probably re-open #2767595: Support "-" in database table names

alexpott’s picture

@kriboogh scratch what I said in #58. Because we set the SQL_MODE to ANSI in Connection.php yes this patch does take care of - in table and database names.

With the mysql client...

mysql> set SQL_MODE = "ANSI";
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> create table "test-test-test" as select * from node;
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

Yay.

kriboogh’s picture

That's good to know, but the thing is, the escapeTable method in the base Connection.php doesn't take the dash into account when escaping the table names, so the - will be filtered out if not overridden. To allow it in the MySQL connection, the extended MySQL Connection should include the '-' then

public function escapeTable($table) {
    if (!isset($this->escapedNames[$table])) {
      $this->escapedNames[$table] = preg_replace('/[^A-Za-z0-9_.-]+/', '', $table);
    }
    return $this->escapedNames[$table];
  }

(untested)

alexpott’s picture

@kriboogh good point. I think then given that this issue get's us most of the way to supporting dashes in table names we should do #60 in #2767595: Support "-" in database table names so we can address all three db at the same time and ensure we're not introducing more problems by allowing them. I've changed the status of that issue from closed to postponed.

borisson_’s picture

Status: Needs review » Needs work

Just nitpicks, nothing functional. Not sure what is holding this issue back.

  1. +++ b/core/lib/Drupal/Core/Command/DbDumpCommand.php
    @@ -256,7 +256,9 @@ protected function getTableIndexes(Connection $connection, $table, &$definition)
    +    // Note that table names are now always quoted.
    

    The now here is good for this patch (because it makes it clear that this is a change that happened here), but I don't think it should end up in core.

    Something like this seems better I think?

    We alway quote table names to make sure that we allow for tables that are reserved keywords.

  2. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -306,14 +336,19 @@ protected function setPrefix($prefix) {
    +        // $val can point to another database like 'database.users'. In this
    +        // instance we need to quote the identifiers correctly.
    ...
    +    // $this->prefixes['default'] can point to another database like
    +    // 'prefix.'. In this instance we need to quote the identifiers correctly.
    

    We use another example of another database in these comments, should we align those?

  3. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -323,6 +358,20 @@ protected function setPrefix($prefix) {
    +    @trigger_error('In Drupal 9.0 this method will be abstract and contrib and custom drivers will have to implement. See https://www.drupal.org/node/2986894', E_USER_DEPRECATED);
    

    will have to implement it?

  4. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -341,6 +390,27 @@ public function prefixTables($sql) {
    +   *   A string containing a partial or entire SQL query.
    ...
    +   *   The string containing a partial or entire SQL query with all identifiers
    

    Does this work for non-query statements as well? It seems like it does, so we should probably update the language here as well?

  5. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -488,7 +565,11 @@ public function getLogger() {
    +    // @todo oh dear this is not pretty. Also I'm not sure what would happen
    +    //   here if the prefix replacement resulted in a table name with dot in it.
    +    //   I think this might be a problem in HEAD for Postgres and table
    +    //   prefixing that results using different databases.
    

    Should we figure out the postgres thing for HEAD in this issue as well? Or should we postpone this on that? Or can that be a followup?

    In any case, this comment isn't really helpful.

alexpott’s picture

Status: Needs work » Needs review
FileSize
2.26 KB
58.37 KB

1. Yep - the point here is that the we need to strip the quotes so updated the comment. The interesting thing here is the hard-coding of the identifier quote. Might need to think on this one. More work to do here.
2. The examples are based on the type of things you'd expect as values. I.e this is about table specific prefixes (database.users) vs generic whole database prefixes like (prefix or test12345678).
3. Fixed
4. Non-query statements - I'm not sure I get why you'd be passing non-query strings to this?
5. Opened #3002184: \Drupal\Core\Database\Connection::makeSequenceName() looks like it might have interesting behaviour on Postrgres if the database prefix has a dot in it and yep you are right no need to add an @todo here - that's all pretty theoretical.

alexpott’s picture

Done some more thinking about #63.1

At the moment DbDumpCommand is hard coding the identifier quote so it can remove it from the result of $connection->prefixTables(). This patch also adds similar code to \Drupal\Core\Database\Driver\pgsql\Schema. \Drupal\Core\Database\Driver\pgsql\Schema is an interesting class because it already has plenty of examples of stripping identifier quotes from things - see \Drupal\Core\Database\Driver\pgsql\Schema::indexExists() for example. This file also hardcodes " all over the place.

So a couple of thoughts emerge:

  • Should we make Connection::identifierQuote() public so any random piece of code can get this detail and do what they want to.
  • Or should we add a method to Connection to strip a string of identifier quotes?
  • Or given that table names are the only new place they can turn up (Postgres has been adding them to fields for the whole of 8.x) do we add a parameter to prefixTables() they allows the return value to be unquoted?
  • Or should we leave as is. Hardcoding a postgres's schema class and a db dumpcommand specific to mysql is not that bad
  • Is it acceptable for a database's Schema class to hardcode its identifier quote? (I think it might be)

I'm not sure which way to go atm.

neclimdul’s picture

While I understand the intention, {} already makes it incredibly annoying to take queries in and out of Drupal to build and debug them. Adding invalid syntax to support a processed escaping of columns breaks syntax linters and development tools like PhpStorm. [] compounds that, do we _really_ _absolutely_ _positively_ have to degrade site builder and developer experience here and re-invent the wheel? Like this is really bad. :(

alexpott’s picture

@neclimdul if you stringify the query all the escaping will already be done. The table prefixes get replaced much later in the process so it's more harmful there. But for example the views queries displayed on the views preview have plenty of quotes now but they'll still be runnable against a db if you replace the {}. Also [] is the quoting syntax in mssql. The problem is different SQL flavours have different quoting strategies so if we want to support multiple database engines then we need something. Plus [] is only really needed for db_query() which in most circumstances is not really the correct API to use. Nearly all instances in core especially where columns are involved should be replaced with db_select(). Also there are not that many run-time usages in core. So I'm not really sure that #65 is an issue.

kriboogh’s picture

Tested and verified patch #63 with our project and seems to work.

For our particular situation to work I also added a followup issue for #50 (dash issue) and #60 (prefix in json data) based on the latest patch (#63) of this issue and provided a patch for it at https://www.drupal.org/project/drupal/issues/3011987

kristiaanvandeneynde’s picture

The new reserved keyword 'groups' in MySQL 8.0.2 completely breaks the Group module which uses said keyword as its main table name (because it never was an issue until mySQL 8.0.2). So as far as I'm concerned, this issue is quite critical :o

alexpott’s picture

@kristiaanvandeneynde reviews welcome - and yes I agree that for a site experiencing a problem caused by not quoting identifiers this is totally a critical issue.

Status: Needs review » Needs work

The last submitted patch, 69: 2986452-69.patch, failed testing. View results

kristiaanvandeneynde’s picture

Status: Needs work » Needs review

The test fail seems unrelated to the issue at hand (order of array values).

While I can't speak for the correctness of the patch as a whole - I've not worked with the inner layers of the database API - I can say that the code additions, removals and changes in the patch seem to make sense. They are well documented, the tests seem to reflect what has changed nicely and I simply love the removal of all the reserved keywords in favor of simply specifying an identifier quote.

+1 from me as far as code quality and sanity goes. I'll leave the RTBC up to someone who knows the ins and outs of the database layer more than I do.

nevergone’s picture

larowlan’s picture

+++ b/core/lib/Drupal/Core/Database/Connection.php
@@ -143,10 +143,33 @@
+   * @deprecated in Drupal 8.7.x, to be removed before Drupal 9.0.0. This is no
+   *   longer used. Use \Drupal\Core\Database\Connection::$escapedTables or
+   *   \Drupal\Core\Database\Connection::$escapedFields instead.

should we remove this property and add a __get to trigger a deprecated error?

alexpott’s picture

@larowlan initial thought was good idea as this is a base class that is built for extension. But isn't there an issue with scope as implementing __get() would make $connection->escapedNames work whereas before it wouldn't - admittedly we'd only return an empty array so there's nothing really leaking but it feels strange to do this - especially if contrib / custom is still use the property because then stuff can leak out of the class.

larowlan’s picture

From slack:

Is there a chance that a sub-class could end up with unescaped stuff if they rely on the old property?
In which case, we could hard throw an exception from __get, which would also prevent leakage

alexpott’s picture

Issue summary: View changes

@larowlan if they were accessing $escapedNames directly then they really are doing it wrong. Because you'd never been able to assume that something was in there. So you'd always access via the escapeTable() / escapeField() / escapeDatabase() methods - alias escaping didn't use this.

Given that you'd always have to handle the empty case I think you're likely to have code like

  if (!isset($this->escapedNames[$database])) {
    $this->escapedNames[$database] = preg_replace('/[^A-Za-z0-9_.]+/', '', $database);
   }
   // Do something with $this->escapedNames[$database];

In your implementation. If you do that right now this patch doesn't break you and you are still quoting - note I'm not sure this has much to do with security - it is about avoid clashes with database reserved words. If we implement __get() and an exception I think we are being unnecessarily disruptive because we'll break working code unnecessarily.

I've added a release notes section.

larowlan’s picture

yep, fair enough

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

joachim’s picture

Status: Needs review » Needs work

A whole bunch of issues about escaping or quoting database identifiers have been closed as duplicates, but either they were closed incorrectly, or the patch here doesn't go far enough.

Migrations into a D8 database that has hyphens in the DB name (e.g. 'drupal-8') fails because of #2845333: can't JOIN to a foreign database whose name contains a hyphen. The patch there fixes the problem, but the patch here does not.

joachim’s picture

Ok, the patch on this issue is currently definitely NOT handling #2767595: Support "-" in database table names, because dumping the contents of $this->escapedTables in Connection::escapeTable() is showing me this:

> "drupal-8.migrate_map_client" => "drupal8.migrate_map_client"

Where's my hyphen??

joachim’s picture

Ok so the problem is partly with DB API and partly with Migrate.

Migrate's SqlBase does this:

        $alias = $this->query->leftJoin($this->migration->getIdMap()
          ->getQualifiedMapTableName(), 'map', $map_join);

which boils down to something like this:

        $alias = $this->query->leftJoin('drupal-8.migrate_map_foo', 'map', $map_join);

Note how we are joining BACK to the Drupal 8 site's database, in a query that is being run on a different connection.

Several problems here:

1. If the local Drupal 8 site uses a database table prefix, then the query will be broken, because the {} don't get added to the 'drupal-8.migrate_map_foo' identifier. Presumably, that is because DB API doesn't touch it because it supposes it's an external database. And normally it would be right! So I think DB API needs to have a way of joining to a 2nd database by specifying its connection rather than giving its actual database name. (But that's a whole other issue.)

2. It looks like this patch is breaking Migrate in several ways.

- 2a. When migrate's map query joins to the foreign database, it needs to wrap the 'database.table' in '[]', if I am correctly reading the docs in this patch?

- 2b. With this patch applied, the migrate map query looks like this, which isn't valid SQL:

SELECT "n"."nid" AS "nid", "n"."type" AS "type", "n"."language" AS "language", "n"."status" AS "status", "n"."created" AS "created", "n"."changed" AS "changed", "n"."comment" AS "comment", "n"."promote" AS "promote", "n"."sticky" AS "sticky", "n"."tnid" AS "tnid", "n"."translate" AS "translate", "nr"."vid" AS "vid", "nr"."title" AS "title", "nr"."log" AS "log", "nr"."timestamp" AS "timestamp",....
alexpott’s picture

@joachim is there any chance you can create a test case that replicates #81.2 - because if this does break migrate map joining then we definitely are missing test coverage.

Also #79. This patch doesn't really claim to fix database names - it is mainly focused on table names and other identifiers such as indexes using reserved words. It does solve that case and has tests to prove it. It does quote database names which might solve some issues but I think it is okay to re-open that issue and add further fixes and test coverage there.

Re-rolled for 8.8.x - it would be great to get this issue in as it solves real world problems that people are having and prevents a heap of future problems as DBs add reserved words.

alexpott’s picture

Looking into the cross database query issue... it seems that the interesting interplay is between \Drupal\Core\Database\Connection::getFullQualifiedTableName()and \Drupal\Core\Database\Query\Select::__toString()

      else {
        $table_string = $this->connection->escapeTable($table['table']);
        // Do not attempt prefixing cross database / schema queries.
        if (strpos($table_string, '.') === FALSE) {
          $table_string = '{' . $table_string . '}';
        }
      }

It looks extremely worth making one of the related issues - perhaps #2767595: Support "-" in database table names to fully test multi database querying because it looks like this code path is extremely lightly tested. Whilst I can't see how changes made make the situation in HEAD worse there's no doubt that this can be improved and perhaps resolve issues like #2767595: Support "-" in database table names

alexpott’s picture

@joachim fyi the code snippet also points to why hyphen dbs are struggling with migrate btw - $table_string = $this->connection->escapeTable($table['table']); with strip the hyphens. This is not the fault of this patch and is currently occurring in HEAD - as pointed out in #83 we need better test coverage for testing multiple databases via fully qualified identifiers over a single connection.

kriboogh’s picture

@alexpott and @joachim if you want hyphens in a table name there is a patch for that in:
https://www.drupal.org/project/drupal/issues/3011987

joachim’s picture

> @joachim is there any chance you can create a test case that replicates #81.2 - because if this does break migrate map joining then we definitely are missing test coverage.

It looks like there is test coverage of the query that joins to a map table, for example in /core/modules/migrate/tests/src/Kernel/HighWaterTest.php. A dump of $query->__toString() in SqlBase::initializeIterator() gets me:

SELECT m.id AS id, m.title AS title, m.changed AS changed, map.sourceid1 AS migrate_map_sourceid1, map.source_row_status AS migrate_map_source_row_status
FROM
{high_water_node} m
LEFT OUTER JOIN drupalaccounts8.test55734627migrate_map_high_water_test map ON id = map.sourceid1
WHERE (map.sourceid1 IS NULL) OR (map.source_row_status = :db_condition_placeholder_0) OR (changed &gt; :db_condition_placeholder_1)
GROUP BY id, title, changed, map.sourceid1, map.source_row_status
ORDER BY changed ASC; Array
(
    [:db_condition_placeholder_0] =&gt; 1
    [:db_condition_placeholder_1] =&gt; 3

With the patch from #82 applied, that test fails for me with:

Migration failed with source plugin exception: SQLSTATE[42S02]: Base table or view not found: 1146 Table &#039;drupalaccounts8.test77520843migrate_map_high_water_test&#039; doesn&#039;t exist: SELECT &quot;m&quot;.&quot;id&quot; AS &quot;id&quot;, &quot;m&quot;.&quot;title&quot; AS &quot;title&quot;, &quot;m&quot;.&quot;changed&quot; AS &quot;changed&quot;, &quot;map&quot;.&quot;sourceid1&quot; AS &quot;migrate_map_sourceid1&quot;, &quot;map&quot;.&quot;source_row_status&quot; AS &quot;migrate_map_source_row_status&quot;
FROM
{high_water_node} &quot;m&quot;
LEFT OUTER JOIN drupalaccounts8.test77520843migrate_map_high_water_test &quot;map&quot; ON id = map.sourceid1
WHERE (&quot;map&quot;.&quot;sourceid1&quot; IS NULL) OR (&quot;map&quot;.&quot;source_row_status&quot; = :db_condition_placeholder_0) OR (&quot;changed&quot; &gt; :db_condition_placeholder_1)
GROUP BY id, title, changed, map.sourceid1, map.source_row_status
ORDER BY &quot;changed&quot; ASC; Array
(
    [:db_condition_placeholder_0] =&gt; 1
    [:db_condition_placeholder_1] =&gt; 3
)

but then it's also failing for me locally WITHOUT the patch. Not sure why!

(Incidentally, and off-topic, the tests for migrate look like they could do with some work, as MigrateTestBase for instance has a TODO referencing an issue now closed, and many references to simpletest despite being a phpunit test class.)

andypost’s picture

queued for all databases

+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Connection.php
@@ -64,277 +64,6 @@ class Connection extends DatabaseConnection {
-  private $reservedKeyWords = [

+++ b/core/lib/Drupal/Core/Database/Driver/pgsql/Connection.php
@@ -48,28 +48,6 @@ class Connection extends DatabaseConnection {
-  protected $postgresqlReservedKeyWords = ['all', 'analyse', 'analyze', 'and',

reserved words are removed, so it makes useless #3020839: Drupal\Core\Database\Driver\mysql\Connection::quoteIdentifier in_array performance improvements

Status: Needs review » Needs work

The last submitted patch, 88: 2986452-88.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

andypost’s picture

Fix CS for messages, the second one still needs proper wording

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

idebr’s picture

Status: Needs review » Needs work
Issue tags: +Needs reroll

Patch no longer applies against 8.9.x

kostyashupenko’s picture

Status: Needs review » Needs work

The last submitted patch, 95: 2986452-95.patch, failed testing. View results

alexpott’s picture

Status: Needs work » Needs review
FileSize
60.2 KB
60.2 KB

Fixing the test. It shows we have some work to do on ensure that joins are properly quoted.

daffie’s picture

@alexpott: I think that your 95-97-interdiff.txt is the same as your patch file.

alexpott’s picture

FileSize
1.81 KB

Whoops here's the right interdiff

alexpott’s picture

This addresses the entity query issue with joins not being quoted. I've also looked at all the other addJoin() in core and none of the others are dynamic.

Status: Needs review » Needs work

The last submitted patch, 100: 2986452-8.9.x-100.patch, failed testing. View results

daffie’s picture

The views module uses a lot of $query->where(), expressions, and joins with sql strings as input. Only in the patch there is nothing about the views module? If you create a view with preserved word as a table name or column the query will not quote the table names and column names.

alexpott’s picture

@daffie - sure and that's the same as now. We can handle that in follow-ups. This patch is not making the situation worse - and it gives us a path to make it better.

andypost’s picture

Maybe query() method (or kind of it) of database layer could use assert to test it properly for the most of queries?

alexpott’s picture

Hmmm.... so if we change the queries then any alters or funky stuff like \Drupal\workspaces\EntityQuery\Tables might break... but this stuff is very very fragile :(

daffie’s picture

My review of the change record:

  1. All the join methods and Connection::queryRange() should be added to the list. db_query() can be removed.
  2. With the given examples can we get how it is now and what it should become.
  3. Can we add text about what happens when you do not add those square brackets to your code.

My review for the patch:

  1. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -341,6 +390,27 @@ public function prefixTables($sql) {
    +  public function quoteIdentifiers($sql) {
    +    return str_replace(['[', ']'], $this->identifierQuote(), $sql);
    +  }
    

    There is no testing for this new method.

  2. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -306,14 +336,19 @@ protected function setPrefix($prefix) {
    +    // 'prefix.'. In this instance we need to quote the identifiers correctly.
    

    What is exactly wrong with 'prefix.'? Is it the name prefix or is it the dot or both?

  3. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -323,6 +358,20 @@ protected function setPrefix($prefix) {
    +  protected function identifierQuote() {
    +    @trigger_error('In drupal:9.0.0 this method will be abstract and contrib and custom drivers will have to implement it. See https://www.drupal.org/node/2986894', E_USER_DEPRECATED);
    +    return '';
    +  }
    

    Why do we not just create the method and let contrib database driver override this method when they need to? Now the impression is that the default value is an empty string.

  4. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -265,6 +293,7 @@ protected function defaultOptions() {
    +      'allow_square_brackets' => FALSE,
    

    I would like to see a lot more testing for this option being TRUE. At least database tests and tests for situations where one might use this option being TRUE.

  5. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -341,6 +390,27 @@ public function prefixTables($sql) {
    +    return str_replace(['[', ']'], $this->identifierQuote(), $sql);
    

    Can we add a comment what this code actually does.

  6. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -387,18 +457,25 @@ public function getFullQualifiedTableName($table) {
    +   * @param bool $quote_identifiers
    +   *   (optional) Quote any identifiers enclosed in square brackets. Defaults to
    +   *   TRUE.
    ...
    +  public function prepareQuery($query, $quote_identifiers = TRUE) {
    

    Can we add testing for when the $quote_identifiers = FALSE.

  7. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -494,7 +571,9 @@ public function getLogger() {
    +    return str_replace($this->identifierQuote(), '', $this->prefixTables('{' . $table . '}_' . $field . '_seq'));
    

    Can we do in 2 lines of code to increase readability.

  8. The amount of testing for the method Connection::setPrefix() is very limited. See: Drupal\Tests\Core\Database\ConnectionTest::testPrefixRoundTrip(). Can we do more?
  9. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -958,30 +1037,32 @@ public function schema() {
        *   The sanitized database name.
    ...
       public function escapeDatabase($database) {
    ...
    +    return $this->identifierQuote() . $database . $this->identifierQuote();
    

    Now we are not only sanitizing the database name, but also adding quotes to it. This is a public method. This is a BC break!

  10. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -998,10 +1079,14 @@ public function escapeTable($table) {
       public function escapeField($field) {
    ...
    +      $this->escapedFields[$field] = $identifier_quote . str_replace('.', $identifier_quote . '.' . $identifier_quote, $escaped) . $identifier_quote;
    

    Now we are not only sanitizing the field name, but also adding quotes to it. This is a public method. This is a BC break!

  11. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -1022,7 +1107,7 @@ public function escapeAlias($field) {
         if (!isset($this->escapedAliases[$field])) {
    ...
    +    return $this->identifierQuote() . $this->escapedAliases[$field] . $this->identifierQuote();
    

    Now we are not only sanitizing the alias name, but also adding quotes to it. This is a public method. This is a BC break!

  12. +++ b/core/lib/Drupal/Core/Database/Driver/pgsql/Schema.php
    @@ -553,7 +553,7 @@ public function renameTable($table, $new_name) {
    -    $old_full_name = $this->connection->prefixTables('{' . $table . '}');
    +    $old_full_name = str_replace('"', '', $this->connection->prefixTables('{' . $table . '}'));
    

    Why this change. Is it because method Connection::prefixTables() has changed. If so, then that would be a BC break!

  13. +++ b/core/lib/Drupal/Core/Database/Query/Condition.php
    @@ -123,7 +123,6 @@ public function where($snippet, $args = []) {
    -      'operator' => NULL,
    

    Why this change? And why in this patch?

  14. +++ b/core/lib/Drupal/Core/Database/Query/Select.php
    @@ -820,7 +820,7 @@ public function __toString() {
    +      $fields[] = $this->connection->escapeField((isset($field['table']) ? $field['table'] . '.' : '') . $field['field']) . ' AS ' . $this->connection->escapeAlias($field['alias']);
    

    Can we change this from a single line to multi line to improve readability.

  15. +++ b/core/modules/system/tests/modules/database_test/database_test.install
    @@ -313,20 +313,18 @@ function database_test_schema() {
    -      'offset' => [
    -        'description' => 'A column with preserved name.',
    -        'type' => 'text',
    -      ],
    -      'function' => [
    -        'description' => 'A column with reserved name in MySQL 8.',
    

    This removes the testing with the special column names "offset" and "function". Please do not change this test, just add another test.

  16. +++ b/core/modules/workspaces/src/EntityQuery/Tables.php
    @@ -93,7 +93,7 @@ protected function addJoin($type, $table, $join_condition, $langcode, $delta = N
    +      list($base_table, $id_field) = explode('.', str_replace(['[', ']'], '', $condition_parts[1]));
    

    Can we change this from a single line to multi line to improve readability.

  17. +++ b/core/tests/Drupal/Tests/Core/Database/ConnectionTest.php
    @@ -297,4 +263,98 @@ public function testFilterComments($expected, $comment) {
    +    $mock_pdo = $this->createMock(StubPDO::class);
    +    $connection = new StubConnection($mock_pdo, [], $identifier_quote);
    

    Why not use a real connection?

  18. +++ b/core/tests/Drupal/Tests/Core/Database/ConnectionTest.php
    @@ -297,4 +263,98 @@ public function testFilterComments($expected, $comment) {
    +  public function providerEscapeTables() {
    ...
    +  public function providerEscapeAlias() {
    ...
    +  public function providerEscapeFields() {
    ...
    +  public function testEscapeTable($expected, $name, $identifier_quote = '"') {
    ...
    +  public function testEscapeAlias($expected, $name, $identifier_quote = '"') {
    ...
    +  public function testEscapeField($expected, $name, $identifier_quote = '"') {
    

    Nit: Can we put the provider methods next to their test methods. Improves readability.

  19. $saved_value = $this->connection->query('SELECT [update] FROM {select} WHERE id = :id', [':id' => 2])->fetchField();
    Solutions like the above piece of code, with sometimes the square brackets and sometimes the curly brackets, looks like to me as a ugly drupalism.
    Other developers are not going to like this.
daffie’s picture

What happens to the current code that has not been updated to use the square brackets? Can we add testing for that so we a sure that contrib/custom code does not break!
In the patch are a lot of little changes that can break existing contrib/custom code. I also think they will be very nasty little bugs to debug.
I know that the alternatives like do nothing or deprecate all methods that take raw sql strings as a parameter are not very attractive.
I am not sure what the best solution is.

daffie’s picture

The method SelectInterface::having() is not mentioned in the CR and properly some work needs to be done on the patch.

alexpott’s picture

@daffie thanks for the review.

Re #106

  1. This is literally called on every query. Adding a standalone test for this feels silly.
  2. It's the fact that other_db.table needs to be converted into "other_db"."table" and not "other_db.table"
  3. In future versions of drupal this should be abstract - because each db driver needs to think about how it quotes its identifiers. Adding this as deprecated allows us to do that. Returning an empty string means that the current strings prepared by contrib db drivers are unchanged so they get to adopt this when they like
  4. Yep we should add explicit testing - it's test 1000's of times on a postgres test run but what happens if we remove postgres.
  5. I think that is unnecessary. This method has very explicit documentation already. Ie.
       * Queries sent to Drupal should wrap all unquoted identifiers in square
       * brackets. This function searches for this syntax and replaces them with the
       * database specific identifier. In ANSI SQL this a double quote.
    
  6. Sure - that would be covered by the other testing of allow_square_brackets = TRUE - imo methods like this a public but really internal to the DB API and should never be called from outside it.
  7. Sure - done
  8. Will try to
  9. I disagree. We are now properly escaping it!
  10. I disagree. We are now properly escaping it!
  11. I disagree. We are now properly escaping it!
  12. Yes. But I disagree about BC breaking - because in my opinion these methods are not really part of our API. They all should be @internal
  13. Because it was breaking something - I'm going to have to dig around to see what.
  14. Sure
  15. I disagree. The new test is better.
    +++ b/core/modules/system/tests/modules/database_test/database_test.install
    @@ -313,20 +313,18 @@ function database_test_schema() {
    +      'update' => [
    +        'description' => 'A column with reserved name.',
             'type' => 'text',
    

    This is reserved in all SQL dialects.

  16. Done but I'm not really a fan of this change.
  17. Because this is unit test
  18. Done
  19. So what - the curly brackets are a drupalism sure... but actually in MSSQL square brackets are the identifier quotes. If they don't like it use the proper DB API (ie. db_select() and not db_query()). There is a price you have to pay for abstraction and magically working on a wide variety of databases and we have to pay it somewhere. Either we pay it with obscure and impossible to fix bugs or we start using the databases' own ways of coping with this - ie. quoting all identifiers. This functionality exists in MySQL, PostgreSQL, SQLite etc... for a reason.

Re #107 - if you don't use square brackets nothing changes and nothing breaks unless you are already broken on an alternate db driver you are not testing against.

andypost’s picture

catch’s picture



+++ b/core/modules/workspaces/src/EntityQuery/Tables.php
+++ b/core/modules/workspaces/src/EntityQuery/Tables.php
@@ -93,7 +93,8 @@ protected function addJoin($type, $table, $join_condition, $langcode, $delta = N

@@ -93,7 +93,8 @@ protected function addJoin($type, $table, $join_condition, $langcode, $delta = N
       // If those two conditions are met, we have to update the join condition
       // to also look for a possible workspace-specific revision using COALESCE.
       $condition_parts = explode(' = ', $join_condition);
-      list($base_table, $id_field) = explode('.', $condition_parts[1]);
+      $condition_parts_1 = str_replace(['[', ']'], '', $condition_parts[1]);
+      list($base_table, $id_field) = explode('.', $condition_parts_1);
 
       if (isset($this->baseTablesEntityType[$base_table])) {
         $entity_type_id = $this->baseTablesEntityType[$base_table];

What would happen if workspaces didn't make this change? Query failure? Error?

Trying to figure out what the impact on contrib would be - not that many contrib modules do anything as evil as workspaces has to.

alexpott’s picture

Re #111 We need to remove the square brackets from the identifiers so that:

  • $base_table is found in the list of base tables ie. if (isset($this->baseTablesEntityType[$base_table])) {
  • $id_field correctly matches if ($id_field === $revision_key || $id_field === 'revision_id') {
  • $base_table is found in the list of workspace tables ie. $workspace_association_table = $this->contentWorkspaceTables[$base_table];

Note that \Drupal\workspaces\EntityQuery\Tables::addJoin() is working with raw sql so it's always a bit fraught.

alexpott’s picture

Patches attached:

Still to address from #106 - point 13. and #108.

alexpott’s picture

The CR now mentions the having method on the select interface so #108 is addressed.

alexpott’s picture

More allow_square_brackets tests. Let's work on 9.0.x for now as whatever this change has to go there first.

daffie’s picture

Thank you @alexpott on working on this issue. The patch looks better. I did some more reviewing:

  1. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -144,9 +144,32 @@ abstract class Connection {
    +   * @deprecated in drupal:8.8.0 and is removed from drupal:9.0.0. This is no
    

    This is going into 9.0 first, it therefor cannot be removed in 9.0.

  2. +++ b/core/lib/Drupal/Core/Command/DbDumpCommand.php
    @@ -264,7 +264,9 @@ protected function getTableIndexes(Connection $connection, $table, &$definition)
    +    $table = trim($connection->prefixTables('{' . $table . '}'), '"');
    

    Why removing only the double qouotes and not others like backticks?

  3. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -306,14 +336,19 @@ protected function setPrefix($prefix) {
    +        // $val can point to another database like 'database.users'. In this
    +        // instance we need to quote the identifiers correctly.
    ...
    +    // $this->prefixes['default'] can point to another database like
    +    // 'other_db.'. In this instance we need to quote the identifiers correctly.
    

    And what is then the correct way of quoting those? I know what it is, but can we add an example to the comments.

  4. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -323,6 +358,20 @@ protected function setPrefix($prefix) {
    +    @trigger_error('In drupal:9.0.0 this method will be abstract and contrib and custom drivers will have to implement it. See https://www.drupal.org/node/2986894', E_USER_DEPRECATED);
    

    In Drupal 10.0.0 instead of 9.0.0.

  5. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -956,30 +1036,32 @@ public function schema() {
    +    $database = preg_replace('/[^A-Za-z0-9_]+/', '', $database);
    +    return $this->identifierQuote() . $database . $this->identifierQuote();
    

    Escaped tables, fields and aliases are stored in an array for fast retrieval. Should we not do the same for database names?

  6. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -996,10 +1078,14 @@ public function escapeTable($table) {
    +      $this->escapedFields[$field] = $identifier_quote . str_replace('.', $identifier_quote . '.' . $identifier_quote, $escaped) . $identifier_quote;
    
    @@ -1020,7 +1106,7 @@ public function escapeAlias($field) {
    +    return $this->identifierQuote() . $this->escapedAliases[$field] . $this->identifierQuote();
    

    Why are with escaped fields storing the value with the identifier quotes and aliases without? Please be consistent.

  7. +++ b/core/lib/Drupal/Core/Database/Driver/pgsql/Schema.php
    @@ -550,7 +550,7 @@ public function renameTable($table, $new_name) {
    +    $old_full_name = str_replace('"', '', $this->connection->prefixTables('{' . $table . '}'));
    
    @@ -866,8 +866,10 @@ protected function introspectIndexSchema($table) {
    +    $full_name = str_replace('"', '', $this->connection->prefixTables('{' . $table . '}'));
    

    Can we use the method identifierQuote() instead of a hard coded double quote.

  8. +++ b/core/lib/Drupal/Core/Entity/Query/Sql/Condition.php
    @@ -50,6 +50,13 @@ public function compile($conditionContainer) {
    +        if ($field === 0) {
    +          // Only add a condition if necessary.
    +          if ($condition['value'] != 0) {
    +            $conditionContainer->alwaysFalse();
    +          }
    +          continue;
    +        }
    

    Why is this change added, it looks like it does not belong in this patch.

  9. +++ b/core/tests/Drupal/KernelTests/Core/Database/SchemaTest.php
    @@ -1252,4 +1252,27 @@ public function testFindTables() {
    +  public function testUpperCaseTableName() {
    

    Can we add a provider method and add different kinds of allowed table names. Mixing it up shall we say. Should we also do the same for other schema stuff like fields and indexes. Or should we create a followup for that.

  10. +++ b/core/tests/Drupal/Tests/Core/Database/ConnectionTest.php
    @@ -93,49 +96,12 @@ public function providerTestPrefixTables() {
         $mock_pdo = $this->createMock('Drupal\Tests\Core\Database\Stub\StubPDO');
    

    If we are touching this test then can we change it to StubPDP::class

  11. +++ b/core/tests/Drupal/Tests/Core/Database/ConnectionTest.php
    @@ -297,4 +263,98 @@ public function testFilterComments($expected, $comment) {
    +  public function providerEscapeTables() {
    ...
    +  public function providerEscapeAlias() {
    ...
    +  public function providerEscapeFields() {
    

    Can we testing with backticks.

  12. +++ b/core/tests/Drupal/KernelTests/Core/Entity/EntityQueryTest.php
    index 8325079986..fb0ead458d 100644
    --- a/core/tests/Drupal/Tests/Core/Database/ConnectionTest.php
    

    I am missing testing for the method escapeDatabase().

daffie’s picture

@alexpott said in comment #109.12 the following:

They all should be @internal

Ok. Lets document them like that.

alexpott’s picture

Re #117 I think we can add @internal to new methods here - but documenting the rest of the DB API with @internal should be a followup created #3118629: Add @internal to public database API methods where appropriate

@daffie thanks for review in #116.

  1. I think with \Drupal\Core\Database\Connection::$escapedNames - given it is no longer used I'd be in favour of removing completely in 9.0.x. TBH I think we should target this patch at 9.0.x and forget about 8.x.x and say that as longer as you use the db API as intended everything will work with no changes. Changing deprecation here but will bring this up with release management.
  2. Because this script is MySQL only. See \Drupal\Core\Command\DbDumpCommand::getTableSchema - we only need to handle double quotes.
  3. Fixed
  4. Fixed
  5. I don't think so - this is rarely used - feels like premature optimisation.
  6. Nice catch
  7. Nope - it's on the connection object and it is protected. However I think we should revisit that decision and have the quote character as a constant on the connection so we can use it in schema classes - or have it on the schema classes so we can use in connection - or something like that.
  8. It does belong - will have to dig for the reason. But I'll do that with the NULL question you had in previous reviews and will document this in the issue summary once I've found out.
  9. I don't mixing it up is necessary - testing mixed cases etc is not worth it. We have a gazillion lowercase names we just need to prove we handle upper case tables names consistently and this does that.
  10. This appears at least 7 times in this test changing it in one place makes it in inconsistent and I don't think changing the rest here is in scope
  11. Sure
  12. Added

Okay now I'm going to work on #116.7 and the fix suggested above.

daffie’s picture

@alexpott: The changes in #118 look good to me.

For 118.9: Mixing it up in the sense of testing with table names that have both lower and upper case characters. We have only testing for lower case table names. Can we add testing for uppercase table names and for table names with both lower and upper case table names.

Also still open #106.13 and #108.

I shall keep reviewing this patch because we need it for #2746541: Migrate D6 and D7 node revision translations to D8.

alexpott’s picture

#108 has been addressed - the change record now mentions ::having().

Re mixing the table name case I don't think there is any value in that. We're not doing different code paths depending on the case. The point of the test is to prove that any upper case characters in a table name are respected and it does that.

So I've looked into making \Drupal\Core\Database\Connection::identifierQuote() public or adding a constant to the Schema or Connection classes. I've come to the conclusion that both would be wrong moves because:

  • Making \Drupal\Core\Database\Connection::identifierQuote protected is nice - it means the whole quoting thing stays as much as possible internal to the drivers (as it should be)
  • The schema and connection concrete classes for each database type already contain a lot of db specific code (obviously) and tonnes of hard coded assumptions. For example the postgres schema already has things like $saveIdentifier = '"drupal_' . $this->hashBase64($identifierName) . '_' . $tag . '"'; - and I think that's okay. I don't want to see all the instances of this become $saveIdentifier = $this->connnection->identifierQuote() . 'drupal_' . $this->hashBase64($identifierName) . '_' . $tag . $this->connnection->identifierQuote(); or $saveIdentifier = self::identifierQuote . 'drupal_' . $this->hashBase64($identifierName) . '_' . $tag . self::identifierQuote;.

I think the patch makes the right balance between exposing the complexity and doing the simplest thing in the internals of a db driver.

That said I think in code outside the DB driver we need to document clearly what we're copying so I've added more docs here.

Also this isn't necessary for #2746541: Migrate D6 and D7 node revision translations to D8 - it would mean we could remove a strtolower() is test code but /shrug. This is important because it helps make our DB API more robust and as when adopt newer versions of databases they will add and remove reserved words so we need to quote identifiers. What we're adding is inline with other projects see Propel and Doctrine (that said Doctrine is using keyword lists and only quotes when identifiers are in the list).

alexpott’s picture

#106.13 was introduced in #18 and I've run all the tests that failed in #14 and they pass without it and they weren't any update tests - so it looks okay.

#116.8 was also introduced in #18. However Drupal\KernelTests\Core\Entity\EntityQueryTest fails.

Looking at the fail in EntityQuery test it's quite a bit of fun. In HEAD (i.e. not this patch) the sql generated is:

SELECT base_table.revision_id AS revision_id, base_table.id AS id
FROM
test53917321entity_test_mulrev base_table
INNER JOIN test53917321entity_test_mulrev_property_data entity_test_mulrev_property_data ON entity_test_mulrev_property_data.id = base_table.id
WHERE (entity_test_mulrev_property_data.id IN ('1', '3', '5')) AND (0 = '0')
ORDER BY base_table.id ASC

The (0 = '0') is nonsense. It is due to ->condition("id.%delta", 0, '=') note that later in this test we do ->condition("id.%delta", 1, '=') which generates the following SQL in HEAD

SELECT base_table.revision_id AS revision_id, base_table.id AS id
FROM
test49113081entity_test_mulrev base_table
INNER JOIN test49113081entity_test_mulrev_property_data entity_test_mulrev_property_data ON entity_test_mulrev_property_data.id = base_table.id
WHERE (entity_test_mulrev_property_data.id IN ('1', '3', '5')) AND (0 = '1')
ORDER BY base_table.id ASC

(0 = '1') nice :)

With the patch these queries become:

SELECT "base_table"."revision_id" AS "revision_id", "base_table"."id" AS "id"
FROM
"test10140436entity_test_mulrev" "base_table"
INNER JOIN "test10140436entity_test_mulrev_property_data" "entity_test_mulrev_property_data" ON "entity_test_mulrev_property_data"."id" = "base_table"."id"
WHERE "entity_test_mulrev_property_data"."id" IN ('1', '3', '5')
ORDER BY "base_table"."id" ASC

and

SELECT "base_table"."revision_id" AS "revision_id", "base_table"."id" AS "id"
FROM
"test10140436entity_test_mulrev" "base_table"
INNER JOIN "test10140436entity_test_mulrev_property_data" "entity_test_mulrev_property_data" ON "entity_test_mulrev_property_data"."id" = "base_table"."id"
WHERE ("entity_test_mulrev_property_data"."id" IN ('1', '3', '5')) AND ((1 = 0))
ORDER BY "base_table"."id" ASC

respectively.

Specifically this appears to be about testing what happens when \Drupal\Core\Entity\Query\Sql\Tables::addField hits line 230 and does return 0 which is funky because the interface says:

   * @return string
   *   The return value is a string containing the alias of the table, a dot
   *   and the appropriate SQL column as passed in. This allows the direct use
   *   of this in a query for a condition or sort.

So whilst this is a mess caused by #2384459: Add entity query condition for delta in EFQ this patch is preserving the current behaviour and is doing the right thing.

alexpott’s picture

FWIW it's not really a mess. It's just very odd stuff to code around. It's all about people using %delta on single value field in the entity query system. What's happening here is that we should only return values where delta is 0. And we do. So all good. And the using of alwaysFalse() makes sense for values other than 0 for a single value field.

Let's add some comments.

Note this code is completely analogous to code already in HEAD in \Drupal\Core\Entity\Query\Sql\Tables::addField()

          // If this is a numeric specifier we're adding a condition on the
          // specific delta. Since we know that this is a single value base
          // field no other value than 0 makes sense.
          if (is_numeric($next)) {
            if ($next > 0) {
              $this->sqlQuery->alwaysFalse();
            }
            $key++;
            $next = $specifiers[$key + 1];
          }

Note we can't solve this in addField() because we don't have the condition value there.

daffie’s picture

All my points are addressed. Thank you @alexpott for taking the time to explain everything to me.
The patch looks good.
I am still a bit worried about sites out there that are programmed by people that are still learning Drupal/PHP shall we say. To me the need to fix the problems that are stated in the IS are more important then poorly programmed that might/will fail.
The change record is also in order.
For me the patch is RTBC.
Thanks everybody for working on this issue!

catch’s picture

Status: Reviewed & tested by the community » Fixed

The original issue where this was brought up, is nearly old enough to drink and vote in the UK. https://www.drupal.org/project/drupal/issues/371

As discussed above, there is a risk of disruption to contributed modules with this change, however as core shows, it should only be contrib modules doing complex SQL string manipulation (workspaces is the only module doing this in core, even views isn't touched here), or ones not using the database API properly at all.

Since both of those cases are impossible for us to detect, that makes this a major-only change - although it's technically allowable in the same sense that we might change queries that other modules are trying to alter in minor releases.

+++ b/core/tests/Drupal/KernelTests/Core/Database/QueryTest.php
@@ -150,4 +150,14 @@ public function testNumericExpressionSubstitution() {
 
+  /**
+   * Tests quoting identifiers in queries.
+   */
+  public function testQuotingIdentifiers() {
+    // Use the tabled named an ANSI SQL reserved word with a column that is as
+    // well.
+    $result = $this->connection->query('SELECT [update] FROM {select}')->fetchObject();
+    $this->assertEquals('Update value 1', $result->update);
+  }
+

s/tabled/table/ - fixed on commit.

Committed 85202b4 and pushed to 9.0.x. Thanks!

daffie’s picture

#371: resolve ANSI SQL-92/99/2003 reserved words conflict in query statements is from 2002. Almost the same as 2020. Just 2 numbers reversed. :)
Just be patient. Your issue will be fixed in the end.

  • catch committed 3e9a5a5 on 9.0.x
    Issue #2986452 by alexpott, andypost, kostyashupenko, daffie, kriboogh,...
andypost’s picture

Is it going to be backported to 8.9?
CR needs update accordingly

catch’s picture

I don't think we should backport this to 8.9, due to the potential disruption to query rewriting.

mondrake’s picture

+++ b/core/tests/Drupal/KernelTests/Core/Entity/EntityQueryTest.php
@@ -1228,9 +1228,9 @@ public function testToString() {
     $expected = $connection->select("entity_test_mulrev", "base_table");
     $expected->addField("base_table", "revision_id", "revision_id");
     $expected->addField("base_table", "id", "id");
-    $expected->join("entity_test_mulrev__$figures", "entity_test_mulrev__$figures", "entity_test_mulrev__$figures.entity_id = base_table.id");
-    $expected->join("entity_test_mulrev__$figures", "entity_test_mulrev__{$figures}_2", "entity_test_mulrev__{$figures}_2.entity_id = base_table.id");
-    $expected->addJoin("LEFT", "entity_test_mulrev__$figures", "entity_test_mulrev__{$figures}_3", "entity_test_mulrev__{$figures}_3.entity_id = base_table.id");
+    $expected->join("entity_test_mulrev__$figures", "entity_test_mulrev__$figures", '"entity_test_mulrev__' . $figures . '"."entity_id" = "base_table"."id"');
+    $expected->join("entity_test_mulrev__$figures", "entity_test_mulrev__{$figures}_2", '"entity_test_mulrev__' . $figures . '_2"."entity_id" = "base_table"."id"');
+    $expected->addJoin("LEFT", "entity_test_mulrev__$figures", "entity_test_mulrev__{$figures}_3", '"entity_test_mulrev__' . $figures . '_3"."entity_id" = "base_table"."id"');
     $expected->condition("entity_test_mulrev__$figures.{$figures}_color", ["blue"], "IN");
     $expected->condition("entity_test_mulrev__{$figures}_2.{$figures}_color", ["red"], "IN");
     $expected->isNull("entity_test_mulrev__{$figures}_3.{$figures}_color");
diff --git a/core/tests/Drupal/Tests/Core/Database/ConnectionTest.php b/core/tests/Drupal/Tests/Core/Database/ConnectionTest.php

this has wrongly assumed that " would always be the character for quoting identifiers, which is the case in core, but not necessarily in contrib. Will file a followup.

daffie’s picture

@mondrake: There are more places in the patch where the double quote is hardcoded.

mondrake’s picture

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

jibran’s picture

Ever since this issue is committed. DER 2.x triggers are broken on D9 see #3129590: Make DER green for D9 branch. Can I please get a patch review over there?

daffie’s picture

Created #3130579: Make Drupal\Core\Database\Schema work with reserved keywords for naming to fix working with reserved keywords and the class Drupal\Core\Database\Schema.

TR’s picture

Issue tags: +MySQL 8
Alexey.Samsonov’s picture

effulgentsia’s picture

Folks here might be interested in #3218978: MySQL driver allows settings.php to remove ANSI_QUOTES from sql_mode, but doesn't work when it is. I don't know to what extent that issue was affected by this one. It might have already existed, and either made more exposed or not by this one.

Chi’s picture

It there a way to escape brackets in SQL query? Seems like this change imposed severe restrictions on using JSON and array data types in Drupal.