Problem/Motivation

testExpressionUpdate

fail: [Other] Line 878 of modules/simpletest/tests/database_test.test:
Number of affected rows are returned.
  /**
   * Test updating with expressions.
   */
  function testExpressionUpdate() {
    // Set age = 1 for a single row for this test to work.
    db_update('test')
      ->condition('id', 1)
      ->fields(array('age' => 1))
      ->execute();

    // Ensure that expressions are handled properly.  This should set every
    // record's age to a square of itself, which will change only three of the
    // four records in the table since 1*1 = 1. That means only three records
    // are modified, so we should get back 3, not 4, from execute().
    $num_rows = db_update('test')
      ->expression('age', 'age * age')
      ->execute();
    $this->assertIdentical($num_rows, 3, 'Number of affected rows are returned.' . " num_rows: $num_rows");

Steps to reproduce

With PostgreSQL $num_rows is 4 so the test fails.

Proposed resolution

Probably backport changes made in #805858: Affected rows inconsistent across database engines

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

mcdruid created an issue. See original summary.

mcdruid’s picture

I see @poker10 already commented in #805858: Affected rows inconsistent across database engines.

David Rothstein's summary in that issue in #45 (slightly simplified to ignore really old MySQL):

In Drupal 7, this used to be inconsistent [and still is] ...

- MySQL / InnoDB: changed rows only
- MySQL / MyISAM: changed rows only
- PostgreSQL: all rows matched by the query
- SQLite: changed rows only

One way to address this would be to change the pgsql driver to try and return the number changed rows to make it consistent with the other dbs / drivers.

However, I'm not certain that's a great idea at this stage in D7's lifecycle.

The simplest solution would be to change the test to match D8/9 so that passes.

That would, however, mask the fact that the db drivers don't behave consistently.

poker10’s picture

Yesterday I have added comment in that related issue (as I thought that it was "needs work" for D7). But we can consider that issue done (commited D8) a keep open only this issue.

But I am a little bit confused about the direction to solve this in D7. We cannot easily backport D8 fix, as this fix uses new MySQL flag PDO::MYSQL_CLIENT_FOUND_ROWS which was introduced in PHP 5.3.5. But D7 should also support PHP 5.2.4 (or greater), according to INSTALL.txt.

One way is to fix the test only.

Another way to fix this is to go a way with the patch available for D7 (drupal-affected-rows-behavior-805858-51.patch) which adds a workaround for PostgreSQL (like it is in SQLite). But this will introduce new conditions to all update queries and this seem like an overkill to me. It can easily introduce additional complexity of queries and reduce their speed.. We should consider this even if they wanted to apply it only to PostgreSQL.

Also if we go with this patch, then the behavior will be different in D8/D9 and D7, which I do not think is somthing what we want (especially when this PostgreSQL behavior is not really a bug in the true sence, it is a regular PostgreSQL behavior).

So this one is a little bit more complicated as I have seen.

mcdruid’s picture

Yes there are a few options and none seem absolutely ideal.

I don't think we need to be too concerned about supporting PHP 5.2 any more.

However, I don't think using a PDO flag to change the behaviour of MySQL makes sense as that could introduce unexpected / unwanted regressions on lots of sites.

It also wouldn't solve the inconsistency unless we also change the SQLite driver.

I think the choice is between leaving the drivers as they are (inconsistent, but only PostgreSQL behaving differently) and changing the PostgreSQL driver so that it returns the number of changed rows as the other drivers do.

We could make that change configurable so that sites could opt out if they wish.

mcdruid’s picture

Patch based on #805858-52: Affected rows inconsistent across database engines

This moves the unselectNoOpRows() logic that's already in the sqlite driver into the parent UpdateQuery class.

The sqlite driver continues to call that for UPDATE queries so long as the option has not been overridden on the query. I'm not sure if anyone actually uses that and I don't see it documented in D7 core, but we should preserve the way it works.

The pgsql driver will now also call unselectNoOpRows() for UPDATE queries but there's an opt-out at the db connection level e.g.:

$databases['default']['default'] = array(
  'driver' => 'pgsql',
  'database' => 'drupal7',
  'username' => 'postgres',
  'password' => 'topsecret',
  'host' => 'localhost',
  'prefix' => '',
  'pgsql_return_matched_rows' => TRUE,
);

I think it makes sense to have this here in order to make it simple for sites to opt out from settings.php without changing any other code.

A few remaining todos:

I've not added unselectNoOpRows() in the parent UPDATE query class unlike the patch this is based on. It was suggested there may be performance benefits if this is added for MySQL but I'm not sure it's worth causing potential regressions by changing the way driver works on the vast majority of D7 sites here.

This change at the end of \UpdateQuery_pgsql::execute() in the original patch looks like it makes sense in terms of the params of the query method, but I'm not actually sure why it's in this patch or is necessary:

  -    $this->connection->query($stmt, $options);
  +    $this->connection->query($stmt, array(), $options);

The comments could do with being tidied up:

  • They mention MySQL in the unselectNoOpRows() method but this patch doesn't actually change the MySQL driver.
  • We could provide an example of the opt-out option in default.settings.php

I've manually tested the opt-out. It makes DatabaseUpdateTestCase::testExpressionUpdate fail again with PostgreSQL where this patch makes that test pass without it.

mcdruid’s picture

Status: Active » Needs review
poker10’s picture

For the PostgreSQL part: don't you think it will be better to have the opt-in instead of changing the current behavior automatically? Maybe this could be more "user-friendly" solution, as I am not sure if this issue causes any troubles in the real world right now (for me it seems like that it only creates the test failure).

But I do not know if we can (in case of opt-in solution) force the testbot to use opt-in - maybe if we update the default settings.php, so all new installations will be using this "fix" and all existing installations will remain without this fix, unless explicitly set in settings.php. Would it work like this?

On the other side, SQLite installations seems to be using this patch for some time, so there the opt-out makes sense.

I also wonder if this existing "typo" is causing any problems right now? I cannot find any bug report to this, so I am curious to try this change what will happen in update queries.

-    $this->connection->query($stmt, $options);
+   $this->connection->query($stmt, array(), $options);

I also agree that we do not want to change anything in MySQL in this Drupal 7 stage, so this patch looks like better than #805858-52.

poker10’s picture

There are some new failures after this patch, like this:

exception: [Uncaught exception] Line 2284 of includes/database/database.inc:
PDOException: SQLSTATE[22P02]: Invalid text representation: 7 ERROR:  invalid input syntax for type bytea
LINE 2: WHERE ( (name = 'site_name') )AND ( (value  's:14:"Test; \...
                                                      ^: UPDATE simpletest463076variable SET value=:db_update_placeholder_0
WHERE ( (name = :db_condition_placeholder_1) )AND ( (value  :db_condition_placeholder_0) OR (value IS NULL ) ) ; Array()
 in variable_set() (line 1310 of /var/www/html/includes/bootstrap.inc).

This was also mentioned in the last comment in #805858: Affected rows inconsistent across database engines. Worth check if the reason is this change:

-     $this->connection->query($stmt, $options);
+   $this->connection->query($stmt, array(), $options);
poker10’s picture

The patch has a problem with bytea fields. It adds <> operator to the new conditions, but on the bytea field this operator is not valid. It seems like that the bytea fields should be skipped in unselectNoOpRows().

poker10’s picture

Status: Needs review » Needs work

Ok, what I have found so far:

-    $this->connection->query($stmt, $options);
+    $this->connection->query($stmt, array(), $options);

This was changed also in D8 commit here, so it should be safe to fix it: #2167507: Fix rowCount query usage in pgsql and sqlite drivers . But it is questionable, if we do not have to backport all changes to have it work correctly, especially this:

-     return $stmt->rowCount();

The new test exception caused by patch is unrelated to this query() call change, but it is caused by bytea condition, as I have pointed out in #9.

In the final I still incline to opt-in solution for PostgreSQL - if it is possible to solve the bytea problem (not sure we know the field datatype we are querying).

mcdruid’s picture

Thanks.

Here's a rough first pass at skipping blob (/ bytea) fields in unselectNoOpRows().

We'll definitely have to consider the opt-in/opt-out question.

mcdruid’s picture

Status: Needs review » Needs work

Wow. Well that seems to have made testExpressionUpdate pass, but caused a ton of other tests to fail.

Something's very wrong...

poker10’s picture

Hmm, it seems like that in case the bytea field is in query, we need to skip the whole condition addition block for that query. Because if we are updating two columns and one of them is bytea, this code:

      if (in_array($field, $skip_fields)) {
        continue;
      }

will unset only the bytea column, but the second column will be still added as exclusion, so the query will be broken in the final.

This is the original query:

UPDATE simpletest393784test_serialized SET name=:db_update_placeholder_0, info=:db_update_placeholder_1
WHERE  (id = :db_condition_placeholder_1)

And this is the modified query after unselectNoOpRows() (only the name column in being checked additionally):

UPDATE simpletest393784test_serialized SET name=:db_update_placeholder_0, info=:db_update_placeholder_1
WHERE  (id = :db_condition_placeholder_1) AND ( (name <> :db_condition_placeholder_0) OR (name IS NULL ) ) 

But the correct expression should be this, as of patch #5 (which is unfortunately not working as the second column is bytea):

UPDATE simpletest393784test_serialized SET name=:db_update_placeholder_0, info=:db_update_placeholder_1
WHERE  (id = :db_condition_placeholder_2) AND ( (name <> :db_condition_placeholder_0) OR (name IS NULL ) OR (info <> :db_condition_placeholder_1) OR (info IS NULL ) ) 

So in case the bytea column is present in the query, it should not be modified. The result in this case should be:

UPDATE simpletest393784test_serialized SET name=:db_update_placeholder_0, info=:db_update_placeholder_1
WHERE  (id = :db_condition_placeholder_2)
poker10’s picture

mcdruid’s picture

Okay, thanks for the very clear explanation.

I think this patch should completely skip modifying UPDATEs that will affect any blob (/ bytea) fields.

However, I wonder if it's actually worth making this change at all as it's going to result in very inconsistent behaviour.

Perhaps we're back to modifying the test so that it doesn't fail for PostgreSQL and leaving the default rowCount behaviour alone?

If we do that, I think we should tweak the tests so that they check for the predicted behaviour of the different db engines, even if that is inconsistent (it'd remain as outlined in #2).

[edit: cross posted with @poker10 - I think our patches should do about the same thing... but do we actually want to do this?]

poker10’s picture

Yes, it seems like that this is maybe not the correct way. With these patches the PostgreSQL will still behave differently when bytea field will be in the query. So the inconsistency will be here anyway.

I have also tried replacing the <> with NOT LIKE (which is afterwards modified by prepareQuery() to cast bytea fields to text), but this will cause another new failures on numeric/integer fields. Also the NOT LIKE behavior is not the same as <>, especially if the string contains specific characters.

I am adding a new patch where only test is modified. If we decide to go this way, something like this should work. No interdiff here, as this is a new approach from zero.

mcdruid’s picture

poker10’s picture

Testbot results seems good :)

testExpressionUpdate failure is gone. Apart from some weird/random upgrade tests failures which still needs to be looked at (they also slightly differs in each testbot run), there are only three additional test modules failures. Two are waiting for Fabianx/Framework manager review (#3264826, #3264866) and the other is FileFieldWidgetTestCase (#3264750).

  • mcdruid committed ed3e399 on 7.x
    Issue #3264471 by mcdruid, poker10: DatabaseUpdateTestCase::...
mcdruid’s picture

Status: Needs review » Fixed

I got RTBC from Fabianx before committing this.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.