Problem/Motivation

While working on #2746541: Migrate D6 and D7 node revision translations to D8 it was discovered that Sql::getHighestId() fails if any of the destination ids are not of type 'integer'.

The master node migration, to be introduced in #2746541: Migrate D6 and D7 node revision translations to D8 3 destination ids; nid, vid and langcode. Currently, getHighestId() throws a LogicException because langcode is a string. Irregardless of the master node migration getHighestId() should not fail if a destination id is not an integer, having a langcode as an identifier seems quite reasonable.

Proposed resolution

Just allow Sql::getHighestId() to work if at least one id is of type integer, not all of them need to be integers.

Remaining tasks

Patch
Add a test
Review

CommentFileSizeAuthor
#45 3086238-45.patch4.95 KBquietone
#45 interdiff-41-45.txt1.79 KBquietone
#41 3086238-41.patch4.8 KBquietone
#41 interdiff-38-41.txt3.4 KBquietone
#38 3086238-38.patch6.27 KBquietone
#38 interdiff-35-38.txt2.13 KBquietone
#35 3086238-35.patch4.84 KBquietone
#35 interdiff-33-35.txt1.09 KBquietone
#33 interdiff-29-33.txt2.71 KBquietone
#33 3086238-33.patch4.82 KBquietone
#29 3086238-29.patch4.85 KBquietone
#29 3086238-29-fail.patch3.74 KBquietone
#26 3086238-26.patch12.1 KBquietone
#26 interdiff-22-26.txt5.06 KBquietone
#22 3086238-22.patch12.1 KBquietone
#22 interdiff-20-22.txt9.96 KBquietone
#20 interdiff-18-20.txt603 bytesquietone
#20 3086238-20.patch9.01 KBquietone
#18 3086238-18.patch8.48 KBquietone
#18 interdiff-14-18.txt1.81 KBquietone
#14 3086238-14.patch6.99 KBquietone
#14 interdiff-12-14.txt372 bytesquietone
#12 interdiff-6-12.txt1.45 KBquietone
#12 3086238-12.patch7.07 KBquietone
#6 3086238-6.patch6.37 KBquietone
#6 interdiff-4-6.txt673 bytesquietone
#4 interdiff-2-4.txt3.54 KBquietone
#4 3086238-4.patch6.3 KBquietone
#2 3086238-2.patch4.9 KBquietone
#2 3086238-2-fail.patch3.85 KBquietone
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

quietone created an issue. See original summary.

quietone’s picture

Status: Active » Needs review
FileSize
3.85 KB
4.9 KB

The last submitted patch, 2: 3086238-2-fail.patch, failed testing. View results

quietone’s picture

Opted to add a way to select which destination id to get the highest value of. This allows one to pass in the name of the id, as in 'nid' or 'vid', and then that is used in the query. I've also changed the query to use MAX, not sure why it currently doesn't.

Status: Needs review » Needs work

The last submitted patch, 4: 3086238-4.patch, failed testing. View results

quietone’s picture

Status: Needs work » Needs review
FileSize
673 bytes
6.37 KB

Rework the query.

quietone’s picture

Issue summary: View changes
quietone’s picture

Gábor Hojtsy’s picture

Title: getHIghestId() should not fail when there is a destination id with type string » getHighestId() should not fail when there is a destination id with type string

Is this a new feature (too). The first part of the issue summary sounds like it is an existing bug, but then "we should also add this feature while we are here". Not sure what is the relation between the two?

heddn’s picture

Issue summary: View changes
+++ b/core/modules/migrate/src/Plugin/migrate/id_map/Sql.php
@@ -970,15 +970,11 @@ protected function getMigrationPluginManager() {
+    // Ensure that at least one Id is an integer.

Nit: Id should be ID.

heddn’s picture

Status: Needs review » Needs work
+++ b/core/modules/migrate/src/Plugin/migrate/id_map/Sql.php
@@ -970,15 +970,11 @@ protected function getMigrationPluginManager() {
+  public function getHighestId($id_name = NULL) {

This needs an update on HighestIdInterface as well.

quietone’s picture

Issue summary: View changes
Status: Needs work » Needs review
FileSize
7.07 KB
1.45 KB

Updated the IS in the hopes of clarifying the questions in #9
Fixes for #10 amd #11.

One thing that I wonder about is the changes I've made to the query. I've skimmed through the original issue, #2876085: Before upgrading, audit for potential ID conflicts. In comment #154 heddn agrees that this should handle multiple destination IDs and the query was changed in https://www.drupal.org/project/drupal/issues/2876085#comment-12316496 to order by all destination fields. Since it was only returning the highest value for the first field it didn't matter but if we want the second field, like 'vid', then that sorting isn't guaranteed to produce the result we want. That is why I switched to using MAX. Just want to be clear about that change.

Status: Needs review » Needs work

The last submitted patch, 12: 3086238-12.patch, failed testing. View results

quietone’s picture

Status: Needs work » Needs review
FileSize
372 bytes
6.99 KB

Oops. Was thinking about catching a bus and not what I was typing.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

dinarcon’s picture

Hi @quietone, is it intentional that in HighestIdInterface::getHighestId() that docblock includes `@param string $id_name` while the method signature does not include the `id_name` parameter? I noticed that in `Sql::getHighestId()` the optional param is present and defaults to `NULL`.

Also, in the issue summary it says "it was discovered that Sql::getHighestId() fails if any of the destination ids are not of type 'string'." Shouldn't the last part say "if any of the destination ids are not of type 'integer'."

quietone’s picture

Issue summary: View changes

@dinarcon, thx. I have corrected the IS. Now, off to bed.

quietone’s picture

@dinarcon, How about this?
Although I guess this is not BC since it changes the interface.

Status: Needs review » Needs work

The last submitted patch, 18: 3086238-18.patch, failed testing. View results

quietone’s picture

Status: Needs work » Needs review
FileSize
9.01 KB
603 bytes

Hopefully just missed one.

larowlan’s picture

I think the BC way to do this is to deprecate the method without the argument and add a new method with the argument.

An example of where we've done that before - https://www.drupal.org/node/3060969

quietone’s picture

@larowlan, thanks that helped a lot.

Now, let's see if I got it right. Well, mostly, this still needs a CR and adding that to deprecation messages.

larowlan’s picture

Thanks, we should probably update the IS too

Code changes look good, will leave it to migrate folks to iron out the best name for the new method.

We also need a deprecation test to ensure the error is triggered.

And obviously, needs to use the actual change notice link, nice work @quietone 😎

quietone’s picture

@larowlan, Thanks! Pleased I got so much correct.

One thing I am not clear on. Are you saying there is a test for the expectedDeprecation message and one for the trigger error message?

+++ b/core/modules/migrate/tests/src/Unit/MigrateSqlIdMapTest.php
@@ -1071,4 +1071,175 @@ public function testMapTableCreation() {
+   * @expectedDeprecation getHighestId() is deprecated in drupal:8.9.0 and is removed from drupal:9.0.0. Use getMaximalId() instead. See http://drupal.org/node/the-change-notice-nid
+   */
+  public function testGetHighestId() {

Is this not sufficient?

larowlan’s picture

Oh, I missed that in the patch - all good

quietone’s picture

Added draft change record and updated the notices in the patch with its node id.

quietone’s picture

Issue summary: View changes
Issue tags: -Needs issue summary update

Updates the IS.

And as I wrote that I wonder if there should be a check to force the first destination id to be an integer. That would limit someone that has a string id first, followed by an integer id. Obviously not a use case in core and a very unlikely case in the wild.

quietone’s picture

This may not be the way to go. The changes don't help make using the AuditInterface usable when the migration has multiple ids that should be checked.

quietone’s picture

I've decided to reduce the scope of this to only what is needed for #2746541: Migrate D6 and D7 node revision translations to D8. At the time it seemed worth doing both here but alas, no. My local testing of it show that adding the ability to select the destination id to get the highest value of isn't really flexible enough. New issue being made for the other part.

The patch here just fixes it so that a destination id can be a string and one can still use getHighestId(). I have not made an interdiff because the patch is rather small.

edit: clarify first paragraph

quietone’s picture

Issue summary: View changes

The last submitted patch, 29: 3086238-29-fail.patch, failed testing. View results

benjifisher’s picture

Status: Needs review » Needs work

Mostly nits, but I think the first item needs to be fixed.

  1. +++ b/core/modules/migrate/src/Plugin/migrate/id_map/Sql.php
    @@ -970,15 +970,11 @@ protected function getMigrationPluginManager() {
    ...
    -  public function getHighestId() {
    ...
    +  public function getHighestId($id_name = NULL) {
        

    I think we want to leave the function signature unchanged until #3091004: getHighestId() should be able to use any integer destination id.

  2. +  public function getHighestId($id_name = NULL) {
    +    // Ensure that at least one ID is an integer.
    +    if (array_search('integer', array_column($this->migration->getDestinationPlugin()->getIds(), 'type')) === FALSE) {
    +      throw new \LogicException('Cannot determine the highest migrated ID without an integer ID column');
    +    };
        

    I like using array_search() instead of array_filter(). But I do not like the long lines. The one where we throw an exception is OK, but can we rewrite the other one? Something like

        $key_types = array_column($this->migration->getDestinationPlugin()->getIds(), 'type');
        if (array_search( 'integer', $key_types) === FALSE) {
          throw new \LogicException('Cannot determine the highest migrated ID without an integer ID column');
        };
        

    is a lot easier to read.

  3. +++ b/core/modules/migrate/tests/src/Unit/MigrateSqlIdMapTest.php
    @@ -1071,4 +1071,136 @@ public function testMapTableCreation() {
    ...
    +  public function testGetHighestId(array $destination_ids, array $rows, $expected) {
    +    $this->database = $this->getDatabase([]);
    +    $this->sourceIds = $destination_ids;
    +    $this->destinationIds = $destination_ids;
    +    $db_keys = [];
    +    for ($i = 1; $i <= count($destination_ids); $i++) {
    +      $db_keys[$i] = 'sourceid' . $i;
    +    }
    +    for ($i = 1; $i <= count($destination_ids); $i++) {
    +      $db_keys[] = 'destid' . $i;
    +    }
    +    $id_map = $this->getIdMap();
    +    foreach ($rows as $row) {
    +      $values = array_combine($db_keys, $row);
    +      $source_values = array_slice($row, 0, count($destination_ids));
    +      $values['source_ids_hash'] = $id_map->getSourceIdsHash($source_values);
    +      $this->saveMap($values);
    +    }
        

    Two suggestions, feel free to leave it as is. Use "sourceid$i" and "destid$i" instead of explicit concatenation. Since count($destination_ids) is used three times, maybe make a variable for it.

  4. +  /**
    +   * Data provider for testGetHighestIdInvalid().
    +   *
    +   * Scenarios to test:
    +   * - Destination id type string
    +   * - Destination id type integer and string.
    +   *
    +   * @return array
    +   *   An array of data values.
    +   */
    +  public function getHighestIdInvalidDataProvider() {
        

    Maybe I am confused, but that does not seem like an accurate description of the two scenarios. We have one or two keys, all of type string.

    I am not sure how the coding standards for comments apply to lists, but be consistent in ending punctuation. I think I would make each item a complete sentence with ending punctuation.

    As in #10, I think it should be "Destination ID".

quietone’s picture

Status: Needs work » Needs review
FileSize
4.82 KB
2.71 KB

@benjifisher, thanks for the review
1. Oops, that is a mistake. Fixed
2, 3, and 4. Fixed

benjifisher’s picture

Status: Needs review » Needs work

@quietone:

Thanks for those changes!

This all looks good, except for one mismatch between the comments and the code. This confused me:

  /**
   * Data provider for testGetHighestIdInvalid().
   *
   * Scenarios to test:
   * ...
   * - Destination id type integer and string.
   * ...
   */
  public function getHighestIdInvalidDataProvider() {
    return [
      // ...
      [
        'ids' => [
          'nid' => [
            'type' => 'int',
          ],
          'language' => [
            'type' => 'string',
          ],
        ],
      ],
    ];

It took me several minutes to figure out why this throws the expected exception: the code is looking for "integer", not "int". This difference is easy to miss, especially for a PHP coder: (int) "17" === (integer) "17".

Please make the comment match the code, and emphasize the distinction between "int" and "integer": something like

   * Scenarios to test:
   * - Destination ID type string.
   * - Destination ID types int (not integer) and string.

As in #10, I am using "ID", not "id", in both scenarios. The second scenario has two types, hence the plural.

quietone’s picture

Status: Needs work » Needs review
FileSize
1.09 KB
4.84 KB

@benjifisher, thanks again. You've got a good eye for detail.

Fixes here for #34.

alexpott’s picture

Status: Reviewed & tested by the community » Needs work

I'm not entirely sure about this. I think the we need to ensure that the first field returned by $this->destinationIdFields() is an integer - the rest are not significant looking at the way the query is done...

    // Get the highest id from the list of map tables.
    $ids = [0];
    foreach ($map_tables as $map_table) {
      // If the map_table does not exist then continue on to the next map_table.
      if (!$this->getDatabase()->schema()->tableExists($map_table)) {
        continue;
      }

      $query = $this->getDatabase()->select($map_table, 'map')
        ->fields('map', $this->destinationIdFields())
        ->range(0, 1);
      foreach (array_values($this->destinationIdFields()) as $order_field) {
        $query->orderBy($order_field, 'DESC');
      }
      $ids[] = $query->execute()->fetchField();
    }

This seems to have been considered in #27 but if you did have

a string id first, followed by an integer id

the change here is not going to produce the expected results.

Also this shows that we should be adding test coverage of having a string ID as the first ID.

quietone’s picture

Status: Needs work » Needs review
FileSize
2.13 KB
6.27 KB

I've added a change in getHighestId() to make sure the first destination id used is of type integer. This isn't done in destinationIdFields() since it only getHighestId() that requires the first id to be an integer.

heddn’s picture

+++ b/core/modules/migrate/src/Plugin/migrate/id_map/Sql.php
@@ -1004,11 +1004,20 @@ public function getHighestId() {
+      // Ensure the first destination id is an integer.
+      $destination_id_fields = $this->destinationIdFields();
+      foreach ($this->migration->getDestinationPlugin()->getIds() as $destination_id => $type) {

I'm not sure that we want to silently disregard non integer destination ids. We can't be sure the destination is even a Drupal entity that uses numeric serial destination ids. What about UUIDs on non standard destinations.

alexpott’s picture

I think I should have been clearer... we should only support getHighestId when the first ID is an integer.

quietone’s picture

Yea, I was uncomfortable with that but I thought it was the intention. This patch ensures the first ID is an integer.

alexpott’s picture

+++ b/core/modules/migrate/tests/src/Unit/MigrateSqlIdMapTest.php
@@ -1071,4 +1071,137 @@ public function testMapTableCreation() {
+      [
+        'ids' => [
+          'nid' => [
+            'type' => 'int',
+          ],
+          'language' => [
+            'type' => 'string',
+          ],
+        ],
+      ],

I had to stare at this test case for a while to work out why we expected it to fail. It's because int != integer. Do we ever expect the type to be int or was this a made up value for more explicit testing?

quietone’s picture

benjifisher asked that in #34 and the comment was updated accordingly. Honestly, I probably used 'int' because I have made that mistake before.

alexpott’s picture

+++ b/core/modules/migrate/tests/src/Unit/MigrateSqlIdMapTest.php
@@ -1071,4 +1071,137 @@ public function testMapTableCreation() {
+   * - Destination ID type string.
+   * - Destination ID types int (not integer) and string.

Ah I see... is these were array keys of the tests then it make PHPUnit output great and the scenarios really obvious. We could do the same with the other data provider.

I.e.

  /**
   * Data provider for testGetHighestIdInvalid().
   *
   * @return array
   *   An array of data values.
   */
  public function getHighestIdInvalidDataProvider() {
    return [
      'Destination ID type string' => [
        'ids' => [
          'language' => [
            'type' => 'string',
          ],
        ],
      ],
      'Destination ID types int (not integer) and string' => [
        'ids' => [
          'nid' => [
            'type' => 'int',
          ],
          'language' => [
            'type' => 'string',
          ],
        ],
      ],
    ];
  }
quietone’s picture

Fixes for #44.

heddn’s picture

Status: Needs review » Reviewed & tested by the community

I think we be good again.

benjifisher’s picture

I have not had a chance to review the updates since #37, but this part of #42 makes me feel better:

I had to stare at this test case for a while to work out why we expected it to fail. It's because int != integer.

I had the same problem.

alexpott’s picture

Version: 8.9.x-dev » 8.8.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

Committed and pushed 02f1c210f7 to 9.0.x and a10cb97a8e to 8.9.x. Thanks!

Going to ask other committers about backporting this to 8.8.x

  • alexpott committed 02f1c21 on 9.0.x
    Issue #3086238 by quietone, alexpott, heddn, larowlan, benjifisher,...

  • alexpott committed a10cb97 on 8.9.x
    Issue #3086238 by quietone, alexpott, heddn, larowlan, benjifisher,...
quietone’s picture

+++ b/core/modules/migrate/src/Plugin/migrate/id_map/Sql.php
@@ -971,14 +971,11 @@ protected function getMigrationPluginManager() {
+    // Ensure that at the ID an integer.

Look what I found this morning! This should be 'Ensure that the first ID is an integer'. Sorry, not able to fix it now.

alexpott’s picture

@quietone indeed. Hotfixed...

  • alexpott committed 72772f5 on 9.0.x
    Issue #3086238 follow-up by quietone: getHighestId() should not fail...

  • alexpott committed b6404c0 on 8.9.x
    Issue #3086238 follow-up by quietone: getHighestId() should not fail...
quietone’s picture

@alexpott, thank you!

catch’s picture

+++ b/core/modules/migrate/src/Plugin/migrate/id_map/Sql.php
@@ -971,14 +971,11 @@ protected function getMigrationPluginManager() {
-      }
-    );
+    // Ensure that at the ID an integer.

This comment isn't a sentence.

alexpott’s picture

Status: Patch (to be ported) » Fixed

@catch +1'd the backport and as per #52 that non-sentence has been hotfixed.

  • alexpott committed a6028c4 on 8.8.x
    Issue #3086238 follow-up by quietone: getHighestId() should not fail...
  • alexpott committed e6561bb on 8.8.x
    Issue #3086238 by quietone, alexpott, heddn, larowlan, benjifisher,...
Wim Leers’s picture

Published the change record: https://www.drupal.org/node/3089609

EDIT: Unpublished, because it looks like the committed changes deviate from the change record? It seems that we don't need a change record anymore, since it's now simply a bugfix?

quietone’s picture

Yes, the change record is not needed.

Wim Leers’s picture

Cool, deleted it :)

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.