Problem/Motivation

For custon migration it's a very common request to filter the migrated items. Some real use cases are:
- migrate nodes older than {date};
- skip blocked users or users with certain emails or users that never accessed the website;
- migrate users with certain roles;

It would be great to be able to use conditions to narrow down your SQL source data for content/config entity types. The resulting functionality would provide much more capabilities to how you can migrate nodes one content type at a time, and the same for taxonomy terms -- but for users (by role), menu links (by menu), and so on.

The functionality would let you narrow your data during the source/extraction phase, instead of during the process/transformation phase (which you can do using skip_on_value/empty + method: row). Benefits would include general tidiness of data :) And! Speed/performance!

Proposed resolution

Introduce new configuration properties in SqlBase base class: conditions, joins and distinct:
conditions - to filter the results;
joins - to join additional tables (required to filter by fields/properties stored in separate tables);
distinct - to avoid duplicated results when joining one-to-many records;

If such configuration is present, it's being applied to the query in prepareQuery method, when the base query is already defined.

Remaining tasks

  1. Review and validate the approach;
  2. Resolve #2833060: SqlBase::prepareQuery() should be called also on count, which is kind of a blocker (the source count is not updated after applying conditions, because the count() ignoring prepareQuery method);
  3. Add more tests;
  4. Write a change record;
CommentFileSizeAuthor
#93 drupal_core-3069776-SQL-source-plugins-allow-defining-conditions-and-join-in-migration-yml-2.patch14.05 KBtatianag
#85 drupal_core-3069776-SQL-source-plugins-allow-defining-conditions-and-join-in-migration-yml.patch12.6 KBxurizaemon
#84 drupal_core-3069776-SQL-source-plugins-allow-defining-conditions-and-join-in-migration-yml.patch14.35 KBxurizaemon
#60 interdiff_55-58.txt1.53 KBvsujeetkumar
#58 3069776_58.patch10.71 KBvsujeetkumar
#55 3069776_55.patch10.79 KBayushmishra206
#55 interdiff_52-55.txt3.53 KBayushmishra206
#52 interdiff_38-52.txt287 bytesvsujeetkumar
#52 3069776_52.patch10.67 KBvsujeetkumar
#48 interdiff_42-44.txt1.01 KBshaktik
#48 3069776-48.patch13.91 KBshaktik
#47 interdiff_46-47.txt3.7 KBhkirsman
#47 3069776-47.patch13.99 KBhkirsman
#46 interdiff_45-46.txt597 bytesjoey-santiago
#46 3069776-46.patch13.69 KBjoey-santiago
#45 interdiff_43-45.txt519 bytesjoey-santiago
#45 3069776-45.patch13.59 KBjoey-santiago
#43 interdiff_42-43.txt910 bytesjoey-santiago
#43 3069776-43.patch13.3 KBjoey-santiago
#42 interdiff_40-42.txt3.25 KBjoey-santiago
#42 3069776-42.patch13.29 KBjoey-santiago
#40 interdiff_38-40.txt3.62 KBjoey-santiago
#40 3069776-40.patch12.43 KBjoey-santiago
#38 interdiff_35-38.txt676 bytesvsujeetkumar
#38 3069776_38.patch10.68 KBvsujeetkumar
#35 3069776-sqbasewithjoins-35.patch10.67 KBjoey-santiago
#33 interdiff_29-33.txt3.82 KBchandrashekhar_srijan
#33 3069776-33.patch9.78 KBchandrashekhar_srijan
#29 interdiff_27-29.txt2.54 KBheddn
#29 3069776-29.patch7.27 KBheddn
#28 interdiff_24-27.txt1.64 KBheddn
#27 3069776-27.patch7.56 KBravi.shankar
#24 interdiff_21_24.txt2.38 KBSkabbkladden
#24 drupal-migrate-sql-source-conditions-3069776-24.patch7.58 KBSkabbkladden
#21 interdiff_8-21.txt764 bytesdaggerhart
#21 drupal-migrate-sql-source-conditions-3069776-21.patch7.77 KBdaggerhart
#8 interdiff_5-8.txt4.82 KBweekbeforenext
#8 drupal-migrate-sql-source-conditions-3069776-8.patch7.1 KBweekbeforenext
#6 interdiff_4-5.txt671 bytesdaggerhart
#6 drupal-migrate-sql-source-conditions-3069776-5.patch7.2 KBdaggerhart
#4 interdiff_2-4.txt6.27 KBweekbeforenext
#4 drupal-migrate-sql-source-conditions-3069776-4.patch7.22 KBweekbeforenext
#2 drupal-migrate-sql-source-conditions-3069776-2.patch2 KBweekbeforenext

Issue fork drupal-3069776

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

alisonjo2786 created an issue. See original summary.

weekbeforenext’s picture

As suggested, I have moved my changes from migrate_plus issue #3077445: Add Conditions to Table Source Plugin to Drupal core migrate module, SqlBase.php. This will allow for a conditions array in migration YAML files.

This is an example of the syntax that can be included in the source plugin configuration for a migration:

source:
  conditions:
    -
      field: entity_type
      value: article
    -
      field_name: language
      value: en
      operator: <>

The "value" and "operator" values are optional. They will default to the values specified in the ConditionInterface, condition method.

public function condition($field, $value = NULL, $operator = '=');

heddn’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Nice work. Some small points to consider below. Plus needs some tests. A simple test with a condition on node type/bundle and asserting all results only are that node type would probably be sufficient.

  1. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -29,6 +29,11 @@
    + * - conditions: (optional) Conditons that should be included in the query. This
    

    Let's provide a rational example of what this would look like. Maybe filtering by node type as a good example.

  2. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -242,6 +248,13 @@ protected function prepareQuery() {
    +    if (isset($this->configuration['conditions'])) {
    

    Let's throw a InvalidPluginException if the config for conditions doesn't match the expected format. But don't do in in prepareQuery. Rather do it in the constructor. Then we can be assured that conditions is an array and has field, value and operation all set per expectation.

weekbeforenext’s picture

I updated the SqlBase.php documentation with an example of 'conditions' and added conditions in the constructor to catch malformed conditions data.

I added a new test of the constructor to SqlBaseTest.php, to test the new conditions added. Doing so, broke the existing tests.

I'm not sure if I'm on the right track, so I wanted to see if I could get some direction on fixing.

The previous test uses a defined class "TestSqlBase" that extends SqlBase. Originally, the constructor was sort of disabled for the testing of the mapJoinable() method. Modifying the constructor method to call the parent constructor is causing the following errors:

Drupal\migrate\Plugin\MigrationInterface::getIdMap() was not expected to be called.

Drupal\migrate\Plugin\MigrationInterface::getIdMap() was not expected to be called more than once.

Should we create another class that extends SqlBase to test the constructor?

Status: Needs review » Needs work

The last submitted patch, 4: drupal-migrate-sql-source-conditions-3069776-4.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

daggerhart’s picture

Status: Needs work » Needs review
FileSize
7.2 KB
671 bytes

Before #4, TestSqlBase's constructor didn't call the parent constructor of SqlBase. After #4, TestSqlBase calls its parent SqlBase, which calls its parent SourcePluginBase, which itself always calls `$this->migration->getIdMap();`

I've updated the patch to pass tests by changing the MigrationInterface mock object to expect to call getIdMap "atLeastOnce()".

heddn’s picture

Status: Needs review » Needs work
  1. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -29,6 +29,20 @@
    + * - conditions: (optional) Conditions that should be included in the query. This
    ...
    + *   value is not defined, defaults to NULL. If operator is not defined, defaults
    

    Nit: 80 chars.

  2. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -108,6 +122,18 @@
    +    if (isset($this->configuration['conditions'])) {
    

    The doxygen says it defaults to an empty array. But then here we do we assume it might not. Wny not += merge in 'conditions' as an array into the configuration?

  3. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -447,7 +481,7 @@ protected function mapJoinable() {
    -      ) {
    +    ) {
    

    Stray change. Not needed.

  4. +++ b/core/modules/migrate/tests/src/Unit/SqlBaseTest.php
    @@ -18,6 +19,82 @@
    +      [
    +        TRUE,
    +        '',
    ...
    +      // Source conditions is not a multidimensional array.
    +      [
    +        TRUE,
    +        [''],
    +      ],
    +      // Source conditions condition does not specify 'field'.
    +      [
    +        TRUE,
    +        [['']],
    +      ],
    

    It would help a lot w/ this data provider if the keys were the variable names of what they are.

    Example:
    'not array' => [
    'expected' => TRUE,
    'conditions' => '',
    ],

  5. +++ b/core/modules/migrate/tests/src/Unit/SqlBaseTest.php
    @@ -162,9 +244,12 @@ class TestSqlBase extends SqlBase {
    +    return TRUE;
    

    I don't think a return TRUE is needed.

weekbeforenext’s picture

Worked with @daggerhart to make the recommended changes.

Status: Needs review » Needs work

The last submitted patch, 8: drupal-migrate-sql-source-conditions-3069776-8.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

weekbeforenext’s picture

Status: Needs work » Needs review

Looking at the failed test, I think the failure is unrelated to the work here. Let me know if I'm missing something.

Status: Needs review » Needs work

The last submitted patch, 8: drupal-migrate-sql-source-conditions-3069776-8.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

heddn’s picture

Re-ran the tests, I think there must be something in this patch that is causing the fails.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

alison’s picture

Question: the "type" is used as an example -- I thought we can already migrate nodes one content type at a time, can't we? Would this functionality change how that existing functionality works -- or am I mistaken and it doesn't exist yet -- or is it unrelated -- or?

weekbeforenext’s picture

@alisonjo315, I think you are right if we are using the "d7_node" source plugin.

I was originally repurposing logic used with the "table" source plugin in migrate_plus. I wasn't sure what example would make most sense for more general use.

I'll give it some more thought, but if you have an idea, feel free to update the patch or share in a comment.

Teamwork :)

alison’s picture

@weekbeforenext aaaaaaaahhhhhh ok, that makes sense, thanks for explaining -- it's been all of like, two months since I've worked on migrations, and, man it's always a surprise how many thing fall all the way out of your brain in so little time :)

I'll give it some thought, but I think it makes total sense now that you've explained it. Thank you!

alison’s picture

Re: tests -- sorry if this is stating the obvious, I'm awfully close to illiterate when it comes to unit testing, but trying to dip my toe in the water... anyway:

The one failure is in core/modules/migrate_drupal/tests/src/Kernel/d6/FieldDiscoveryTest.php (that link will take you directly to the relevant line) -- which is test in the migrate_drupal module, not directly in the migrate module:

Drupal\Tests\migrate_drupal\Kernel\d6\FieldDiscoveryTest::testGetAllFields
Failed asserting that 23 is identical to 22.

(assertSame(23, count($field_info));)

I'm not sure why it's getting a count of 23 for the field_info "parts" -- I'm not sure what in this patch would cause a different number...?

Is there something about the use of field things in this new functionality that's different in D6, when compared to the D7 source data......?? (...I only say that b/c the d7 version of the test didn't fail -- it's different, obviously, but -- here's the d7 version of testGetAllFields, just in case...)

OR... is the new functionality in this patch actually adding to the field_info structure, in the context of testing, so this line of FieldDiscoveryTest.php needs to be changed to assertSame(23, count($field_info)); .....?


There's a very good chance I'm saying nonsense, but, that's what a first go-around is I guess. One major problem is that idk how to step backwards (aka "backtrace," I suppose) from testGetAllFields() -- if anyone can explain that part to me, I can take another swing at it!)
freelock’s picture

Huh, I just stumbled on this issue from the Slack thread...

I recently implemented a few "standard" yaml config keys for a generic source plugin that I reused on 2 different sites -- with just a little more config support, I would not have even needed a custom source!

In my custom source, I implemented these additional configurations keys:

- table
- fields (list of field names)
- not_empty (list of fields to exclude rows where this field empty)
- condition (Dictionary with the field name as the key, and then value/operator fields in a map, which I think looks slightly cleaner than the example above)
- keys (dictionary of keys with type and alias)

For example, here's a source config:

source:
  id: ci_item_variation
  plugin: mas_sync
  table: CI_Item
  fields:
    - ItemCode
    - ItemCodeDesc
    - StandardUnitPrice
    - SalesUMConvFctr
    - SalesUnitOfMeasure
    - SalesUMConvFctr
    - InactiveItem
    - LastReceiptDate
  not_empty:
    - ItemCode
  condition:
    LastReceiptDate:
      value: '1970-02-01'
      operator: '>='
  keys:
    ItemCode:
      type: string
      alias: a

... I was able to do a couple dozen migrations using the same source plugin, just altering these fields -- and I was also able to use basically the same source plugin for both a MySQL and a MS SQL Server source.

So perhaps worth expanding to cover these additional items?

heddn’s picture

re #18: I think that type of a source plugin works really well in contrib plus (migrate_plus). But in core, such a plugin wouldn't be needed. We'd be better off with what we have here in core, then taking #18 and posting that as a patch to contrib.

freelock’s picture

Is it even possible to do a migration without migrate_plus :-)

Ok, given that SqlBase is a parent class of DrupalSqlBase, I can see the benefit of adding Conditions here, without the rest of my config options... and migrate_plus seems like a fine place to put a generic SqlMigration class that implements these...

Cheers,
John

daggerhart’s picture

Status: Needs work » Needs review
FileSize
7.77 KB
764 bytes

Worked this out with @weekbeforenext

The right thing to do was update the d6 FieldDiscoveryTest to look for 23 keys in the field_info.

Between #5 and #8 we made it so that $this->configuration['conditions'] has the default value of an empty array when not provided. This means that field configurations will always now have 1 more item than before this patch.

This error doesn't show up for d7 migration tests because the d7 FieldDiscoveryTest doesn't assert for a specific number of field_info keys the way that the d6 test does.

heddn’s picture

Status: Needs review » Needs work
Issue tags: -Needs tests +Novice, +Amsterdam2019

Tagging novice because the feedback is pretty easy to respond.

  1. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -108,6 +122,17 @@
    +    $this->configuration['conditions'] = isset($this->configuration['conditions']) ? $this->configuration['conditions'] : [];
    

    We don't need to support PHP < 7 at this point in the core release cycle. So lets switch to null coalescing operator.

  2. +++ b/core/modules/migrate/tests/src/Unit/SqlBaseTest.php
    @@ -18,6 +19,82 @@
    +    $migration = $this->getMockBuilder('Drupal\migrate\Plugin\MigrationInterface')
    ...
    +    $state = $this->getMockBuilder('Drupal\Core\State\State')
    
    @@ -61,14 +138,19 @@ public function testMapJoinable($expected_result, $id_map_is_sql, $with_id_map,
    +    $state = $this->getMockBuilder('Drupal\Core\State\State')
    

    Use Class:class syntax for greater refactoring in IDEs.

  3. +++ b/core/modules/migrate/tests/src/Unit/SqlBaseTest.php
    @@ -18,6 +19,82 @@
    +      // Source conditions is not an array.
    +      'not array' => [
    ...
    +      // Source conditions is not a multidimensional array.
    +      'not multidimensional array' => [
    ...
    +      // Source conditions condition does not specify 'field'.
    +      'field not specified' => [
    

    These comments are no longer needed now with the array key names :)

Skabbkladden’s picture

I'm looking at this at DrupalCon Amsterdam 2019

Skabbkladden’s picture

weekbeforenext’s picture

Status: Needs work » Needs review
heddn’s picture

Status: Needs review » Needs work
+++ b/core/modules/migrate/tests/src/Unit/SqlBaseTest.php
@@ -55,12 +55,12 @@
+    $migration = $this->getMockBuilder(Drupal\migrate\Plugin\MigrationInterface::class)
...
+    $state = $this->getMockBuilder(Drupal\Core\State\State::class)

@@ -139,7 +136,7 @@
+    $state = $this->getMockBuilder(Drupal\Core\State\State::class)

These should all use import statements via use instead of the full path inline.

ravi.shankar’s picture

Status: Needs work » Needs review
FileSize
7.56 KB

I have changed it.

heddn’s picture

Status: Needs review » Needs work
Issue tags: -Novice +Needs tests
FileSize
7.27 KB
2.54 KB

Here I've cleaned up the existing tests a little. But this still needs more work. Re-adding tests tag.

This still seems to be missing any tests of valid conditions. We've got edge case testing of the constructor, but not of anything else. Let's add that test coverage please.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

benjifisher’s picture

Would the current approach on this issue make it possible for contrib modules (such as Migrate Tools) to alter the SQL query based on something other than the migration YAML? If so, then we might not need #2780839: Optimize migration of specific source IDs for SQL sources.

chandrashekhar_srijan’s picture

I am looking into this.

chandrashekhar_srijan’s picture

Status: Needs work » Needs review
FileSize
9.78 KB
3.82 KB

I have added the test cases for valid conditions. Kindly check.

joey-santiago’s picture

Would it also be possible with this patch to do a join?

for example:

we are currently migrating a site in which most likely we have files that are not used in any fields, so we would like to avoid getting unused files in. Our idea would be to join the field_data_field tables for the fields we want to keep and then migrating only the files we need. Does it make sense?

SELECT fid FROM file_managed JOIN field_data_field_attachments WHERE file_managed.fid = field_data_field_attachments.field_attachments_fid;

thanks for helping me out.

joey-santiago’s picture

Hello,

so i added some rough code that seems to provide the ability to do a join... does it make any sense? Sure, i am not adding anything to the tests for now, but i'd like to hear if the approach makes sense or not.

heddn’s picture

The addition to add joins seems reasonable.

Status: Needs review » Needs work

The last submitted patch, 35: 3069776-sqbasewithjoins-35.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

vsujeetkumar’s picture

Fixing test.

benjifisher’s picture

@vsujeetkumar:

Thanks for the fix. When uploading a new patch, remember also to update the issue status from Needs Work (NW) to Needs Review (NR). I would do that for you, but I checked the test results: although the tests pass, the testbot reports two coding-standards violations, so the issue status should still be NW.

joey-santiago’s picture

I wonder would it be too much to also add subqueries to the possibilities?

i achieved the functionality, it seems to work with this patch, but i'm a little bit doubtful.

joey-santiago’s picture

This is the use case i have in my mind for subqueries:

SELECT u.*
FROM
{users} u
WHERE (u.uid > 0) AND (uid IN (SELECT node.uid AS uid
FROM
{node} node
WHERE type <> 'my_unwanted_type'));

So i want to migrate only users who own nodes excluding some type. The problem is this would also need a condition in the subquery... Dunno, is it too complicated? Am i the only one having this kind of needs?

About joins, an example is:

source:
  plugin: d7_file
  scheme: private
  joins:
    -
      table: 'field_data_field_file'
      condition: 'f.fid=t.field_file_fid'
      alias: 't'

This way i am migrating only files that are used in a certain field. I think it makes sense making these migrations more 'self contained' in stead of migrating a million files at once.

Thank you for you effort on this, i think this saved me a lot of headaches!

joey-santiago’s picture

And here's the new patch. This allows pretty easily to, for example, get in only users who are owners of published nodes of type page:

subqueries:
    -
      table: node
      field: uid
      mainfield: uid
      options: distinct
      operator: 'IN'
      conditions:
        -
          field: type
          value: 'page'
          operator: '='
        -
          field: status
          value: 1
          operator: '='

Again, i think the main point is tailoring the migration to only what needs to be ported to the new site instead of getting everything carried over.

joey-santiago’s picture

Sorry, the patch was wrong, here's a better one. checks are a little bit tricky to do :).

joey-santiago’s picture

Ouch!
big question.
It seems that the total amount of rows for the migration is not calculated on the query with conditions, joins and subqueries added to it, so this makes it impossible to require this migration from another one.
example:

----------------------- -------------- -------- ------- ---------- ------------- --------------------- 
  Group                   Migration ID   Status   Total   Imported   Unprocessed   Last Imported        
 ----------------------- -------------- -------- ------- ---------- ------------- --------------------- 
  Import from SITE   site_user       Idle     9829    14         9815          2020-09-10 07:18:27

the thing is i very happily migrated only the 14 users that own nodes and i really don't want to take in the other 9815.
Hacking the count function this way, the counter is fixed, but will i break something else?

public function count($refresh = FALSE) {
  return (int) $this->prepareQuery()->countQuery()->execute()->fetchField();
}
joey-santiago’s picture

Added patch for the last comment: seems to solve the counter issue.

joey-santiago’s picture

Adding the option for a distinct to be set from the joins settings.

hkirsman’s picture

I was having warnings.

 [warning] Invalid argument supplied for foreach() SqlBase.php:350
 [warning] Invalid argument supplied for foreach() SqlBase.php:358
 [warning] Invalid argument supplied for foreach() SqlBase.php:350
 [warning] Invalid argument supplied for foreach() SqlBase.php:358

Adding patch to check if the variables exists.

shaktik’s picture

Status: Needs work » Needs review
FileSize
13.91 KB
1.01 KB

Please ignore interdiff_42-44.txt file.
correct interdiff_47-48 below.
fixed testcase, kindly check.

index f4af3d508d..673b4dc9ac 100644
--- a/core/modules/migrate_drupal/tests/src/Kernel/d6/FieldDiscoveryTest.php
+++ b/core/modules/migrate_drupal/tests/src/Kernel/d6/FieldDiscoveryTest.php
@@ -283,7 +283,7 @@ public function testGetAllFields() {
     foreach ($actual_fields['node'] as $bundle => $fields) {
       foreach ($fields as $field_name => $field_info) {
         $this->assertArrayHasKey('type', $field_info);
-        $this->assertCount(23, $field_info);
+        $this->assertCount(25, $field_info);
         $this->assertEquals($bundle, $field_info['type_name']);
       }
     }
xurizaemon’s picture

I don't want to cool your contribution effort @joey-santiago and my comments are that of just another Drupal user here :)

In my opinion while simple filters are a good fit for Migrate core (clearly expressed in Migrate yaml), more complex examples like your subquery in #41 are better suited to a custom Migrate source plugin which extends the existing source plugin and modifies the query there.

That keeps Migrate's yaml and docs simple and clear, and still offers a flexible (more readily flexible, IMO) interface for people to customise in filters like "only users with a current account ... who've logged in in the last two years ... and who made a donation at least once ... and ..."

My 2c, feel free to keep at it if you / others feel worthwhile. Just my opinion here :)

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

quietone’s picture

Issue summary: View changes
Status: Needs review » Needs work
Issue tags: +Needs issue summary update

Yes, I agree with xurizaemon, I don't think we should be adding sub-queries in core. Let's keep this simple.

That means this needs to return to the patch in #38 and bring it up to date for 9.2.x. It will need coding standard fixes see comment #39.

This also needs an issue summary update to indicate what is actually being changed here. Currently it is a list of ideas.

I reckon this is going to need a change record as well.

vsujeetkumar’s picture

Status: Needs work » Needs review
FileSize
10.67 KB
287 bytes

Fixed the two coding standard issue as mentioned in #51 and give the interdiff using #38 patch

quietone’s picture

Status: Needs review » Needs work

@vsujeetkumar, thanks.

I reviewed the patch but not the test. Mostly doc and comment changes.

  1. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -29,6 +29,25 @@
    + * - conditions: (optional) Conditions that should be included in the query.
    + *   This should be in array format with each array item providing values for
    + *   field, value (optional) and operator (optional). Defaults to an empty
    

    This should be a statement of what conditions are not what they should be. For example, 'Conditions to add to the query'.

  2. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -29,6 +29,25 @@
    + *   Example:
    

    Add a new line here to separate the Example from the description.

  3. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -29,6 +29,25 @@
    + *     joins:
    

    This, 'joins' needs documentation above.

  4. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -29,6 +29,25 @@
    + *   @endcode
    

    There needs to be an explanation of what this example does. What are the results when using these conditions and joins?

  5. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -108,6 +127,28 @@
    +      throw new \InvalidArgumentException('Source "conditions" should be an array of conditions including field, value (optional), and operator (optional) values.');
    ...
    +      throw new \InvalidArgumentException('Source "joins" should be an array of conditions including table, alias and condition.');
    

    The words 'conditions' and 'joins' should be in single quotes.

  6. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -108,6 +127,28 @@
    +        throw new \InvalidArgumentException('Source condition must be an array including table, alias and condition.');
    

    I think that condition here is meant to be join.

  7. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -231,10 +272,10 @@ protected function select($table, $alias = NULL, array $options = []) {
    +   * Adds tags, metadata, and conditions to the query.
    

    It also add joins. How about 'Adds tags, metadata, and configured conditions and joins to the query.'

    Similar changes in the @return description.

  8. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -242,6 +283,16 @@ protected function prepareQuery() {
    +    // Add conditions to the query.
    

    Still want to highlight that the conditions and joins are configuration. So, 'Add any configured conditions.' And similar for the joins comment a few lines below

  9. +++ b/core/modules/migrate_drupal/tests/src/Kernel/d6/FieldDiscoveryTest.php
    @@ -283,7 +283,7 @@ public function testGetAllFields() {
    -        $this->assertCount(22, $field_info);
    +        $this->assertCount(24, $field_info);
    

    Why has this changed?

Also, this still needs an issue summary update. At minimum, update the proposed resolution.

ayushmishra206’s picture

Assigned: Unassigned » ayushmishra206
ayushmishra206’s picture

Assigned: ayushmishra206 » Unassigned
Status: Needs work » Needs review
FileSize
3.53 KB
10.79 KB

Addressed:
53.1
53.2
53.5
53.7
53.8
Please review.

The remaining changes are:
53.3
53.4
53.6
53.9

Status: Needs review » Needs work

The last submitted patch, 55: 3069776_55.patch, failed testing. View results

quietone’s picture

+++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
@@ -131,7 +132,7 @@ public function __construct(array $configuration, $plugin_id, $plugin_definition
+      throw new \InvalidArgumentException('Source 'conditions' should be an array of conditions including field, value (optional), and operator (optional) values.');

@@ -141,7 +142,7 @@ public function __construct(array $configuration, $plugin_id, $plugin_definition
+      throw new \InvalidArgumentException('Source 'joins' should be an array of conditions including table, alias and condition.');

The outer quotes need to be double quotes.

vsujeetkumar’s picture

Updated outer quotes as mentioned in #57.

vsujeetkumar’s picture

Status: Needs work » Needs review
vsujeetkumar’s picture

FileSize
1.53 KB

Interdiff added, Please have a look.

quietone’s picture

Status: Needs review » Needs work
+++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
@@ -242,6 +284,16 @@ protected function prepareQuery() {
+    // Add any configured conditions.

Looks like this was missed from 53.8. This is configuring joins, not conditions.

Setting back to NW for #53: 3, 4, 6 and 9.

This also needs an issue summary update, see Write an issue summary for an existing issue for guidance on doing that.

alison’s picture

@quietone I can try to update the issue summary -- before I dive in, is it because of formatting/structure issue, or content/substance?

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Matroskeen made their first commit to this issue’s fork.

Matroskeen’s picture

Assigned: Unassigned » Matroskeen
Issue tags: +Needs change record

I'll try to review and address the remaining points. In the meantime, feel free to edit the issue summary.

We would also need to add a change record describing a new feature - adding a tag.

quietone’s picture

@alisonjo315, thanks.

I default to thinking in lists, so this will be brief.

  • Problem/Motivation - this fine, it is easy to understand.
  • Proposed resolution - This shows several options as the resolution. If there has been agreement on the option then this should show just that and I like to link to the comment where the approach was agreed to. That is really helpful on very long issues. This section is what the reviewer and committer compare the code with to determine if the patch is doing what it should.
  • Remaining tasks - just check that this is up to date
  • User interface changes - The boilerplate code needs to be removed. I often add a N/A to indicate that I did look at the section and other times I delete it.
  • API changes - Same as the 'User interface changes' section
  • Data model changes - Same as the 'User interface changes' section
  • Release notes snippet - Same as the 'User interface changes' section

Matroskeen’s picture

@quietone the answer for #53.9 is in #21.


I opened a merge request with some changes on top of #58:
1) Improved documentation and came up with some real examples (at least I hope so);
2) Added a bit more superpower by switching from join() to addJoin(), which can be used to specify a type of join. I think it mght be helpful in some cases and defaults to INNER, which is the same as join();
3) Improved SqlBase constructor a bit;

I think we need more tests with real source plugins. Is it fine to add more test cases to the existing source plugin tests like Drupal\Tests\node\Kernel\Plugin\migrate\source\d7\NodeTest?

The other important thing is the issue mentioned in #44. Currently, the conditions are not affecting the source count, which is gonna be confusing.

If I have 10 nodes in D7 site, and I'm adding a condition, I'm still seeing 10 as Total, which makes me think conditions don't work (but they actually do).

I don't want to postpone this issue on #2833060: SqlBase::prepareQuery() should be called also on count, but I think we cannot introduce this feature without fixing the source count.

Leaving "needs work" for:
1) Issue summary update;
2) More tests;
3) Change record;

Matroskeen’s picture

Assigned: Unassigned » Matroskeen
Matroskeen’s picture

Assigned: Matroskeen » Unassigned

Added distinct option support and borrowed example from this issue: #3057685: Plugin to only migrate users with certain roles.
Still needs work for items mentioned in #68.

Matroskeen’s picture

Title: SQL source plugins: allow defining conditions in migration yml » SQL source plugins: allow defining conditions and join in migration yml
Issue summary: View changes
Issue tags: -Needs issue summary update

Updated the issue summary.

Liam Morland made their first commit to this issue’s fork.

Matroskeen’s picture

Looking at #3063778: Source plugin to only pull in certain URL aliases, we may consider adding support for condition groups. Thoughts?

Matroskeen’s picture

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

quietone’s picture

Discussed at #3253983: [meeting] Migrate Meeting 2021-12-16 2100Z. mikelutz pointed out that this would reduce the number of source plugins in core. For myself, this needs more testing. I think testing using an existing Migration Kernel test as a template and using the various conditions available. Also tests proving that this works when someone uses the new conditions as well as existing configuration properties that alter the query. One example is 'node_type' in Node.php.

HitchShock made their first commit to this issue’s fork.

HitchShock’s picture

Applied the same behaviour to countQuery().
When we use conditions/joins/distinct the command drush migrate-status doesn't take this into account to get the Total value.
This means that the Total count will always be greater than the one that will actually be migrated: Total > Imported. Thus, in the end, we will still have the Unprocessed items. This in turn will mean that the migration is still incomplete.
This is very critical if such a configuration is a required dependency for another configuration.

Matroskeen’s picture

@HitchShock I totally agree that this issue should be fixed before we land this feature.
For this purpose, there is another issue: #2833060: SqlBase::prepareQuery() should be called also on count

I'm glad that you came up with the same approach. We also have a test coverage over there, so you can take a look.

I think we can keep these changes in the merge request, but eventually, we'll have to revert when #2833060 is committed.

HitchShock’s picture

Allowed to add extra fields to the query. This is useful when using joins - we can retrieve an extra migration source from the joined table, e.g. when the joined table is a custom one.

dww made their first commit to this issue’s fork.

quietone’s picture

This was discussed at #3267552: [meeting] Migrate Meeting 2022-03-10 2100Z and I just want to ask the question that mikelutz raised, sort of. Why do this in prepareQuery instead of initializeIterator?

nsciacca’s picture

With the patch from #58 I get these warnings on some pages, I'm sure it's a conflict with other source plugins I'm running for Field Collections to Paragraphs, but it would be nice if there were checks around the loops to see if the conditions / joins existed first.

Notice: Undefined index: conditions in Drupal\migrate\Plugin\migrate\source\SqlBase->prepareQuery() (line 290 of core/modules/migrate/src/Plugin/migrate/source/SqlBase.php).
Warning: Invalid argument supplied for foreach() in Drupal\migrate\Plugin\migrate\source\SqlBase->prepareQuery() (line 290 of core/modules/migrate/src/Plugin/migrate/source/SqlBase.php).
Notice: Undefined index: joins in Drupal\migrate\Plugin\migrate\source\SqlBase->prepareQuery() (line 295 of core/modules/migrate/src/Plugin/migrate/source/SqlBase.php).
Warning: Invalid argument supplied for foreach() in Drupal\migrate\Plugin\migrate\source\SqlBase->prepareQuery() (line 295 of core/modules/migrate/src/Plugin/migrate/source/SqlBase.php).

Throwing it out here in case anyone can make the change easily, otherwise I'll add to the existing work when I get a free moment.

xurizaemon’s picture

As Gitlab MR URLs are subject to unexpected changes (ref #3204538), I'm attaching a patch copy of MR 849 as at f4c123cf.

xurizaemon’s picture

*sigh* and a copy of an earlier MR state which applies against Drupal 9.2 for reasons. Sorry for the noise -.-

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Matroskeen’s picture

@quietone:

This was discussed at #3267552: [meeting] Migrate Meeting 2022-03-10 2100Z and I just want to ask the question that mikelutz raised, sort of. Why do this in prepareQuery instead of initializeIterator?

We want to do this in prepareQuery method to reflect this information in the source count method, which calls only query() for now. In another issue, we're trying to make it smarter and call prepareQuery(): #2833060: SqlBase::prepareQuery() should be called also on count.

mikelutz’s picture

We should not be doing this in prepareQuery(). prepareQuery should only be used to add tags and metadata to the query. There should be nothing in prepareQuery() to affect the count. (Adding a tag that later causes the query to be altered in a query alter hook is a separate consideration, which I will comment on in #2833060: SqlBase::prepareQuery() should be called also on count

The addition of these conditions to the query should be offloaded to a protected method on SqlBase, and if they exist that method should be called in both initializeIterator, and count (or doCount, I guess)

Matroskeen’s picture

We should not be doing this in prepareQuery(). prepareQuery should only be used to add tags and metadata to the query.
There should be nothing in prepareQuery() to affect the count.

Why? It is common to add query tags and then apply some conditions:

function hook_query_alter(Drupal\Core\Database\Query\AlterableInterface $query) {
  if ($query->hasTag('something')) {
    $query->condition('do', 'something');
  }
}

Why wouldn't it affect the count if the query is being modified this way?

The addition of these conditions to the query should be offloaded to a protected method on SqlBase, and if they exist that method should be called in both initializeIterator, and count (or doCount, I guess)

I don't mind moving this into another method, but why wouldn't we call it in prepareQuery()? In other words, why do we need to be able to call query() without prepareQuery()?

When I was looking into it, it felt like query should be even protected and called only within prepareQuery().

mikelutz’s picture

I'm still going back and forth on whether/how to support query alters in the count function. Clearly, in the original design the concept of tag alters affecting the count was not considered. Ultimately, you may be right, there may be little choice but to include prepareQuery in count and toString, and everywhere else that query is called. Or we may document that tag alters on the query aren't supported in count (which I also agree, seems wrong. They should be included somehow). My main concern is that there are so many places to override this query that it becomes much too easy to override the query in multiple places in incompatible ways. This is already an issue with the highwater implementation in initialzeIterator being incompatible with queries that have sorts, for example.

While I don't know what the ultimate roles of query, prepareQuery, SqlBase:initializeIterator, and tag alters will be, or if we will change these around to better define their roles, I think that at the end of the day there is more in this system to clean up than just simply adding prepareQuery to count, and whatever we do needs to be thought through and designed and named in a better way than it is now. At that point, it's likely that prepareQuery will play the role of a precount_query_alter and the stuff currently in initializeIterator should go in a sort of postcount_query_alter (conditions used for filtering unneeded rows in a particular run while keeping the actual count unaffected).

Regardless of that bigger discussion and ultimate resolution, it's not what we currently have. Core uses no tag alters, and a case can be made that if you make a tag alter on a migration query, you are responsible for adjusting or extending your source to override your count function if you care about it. prepareQuery is documented to only be for tags and metadata and is not supposed to add conditions to affect the count directly, and at the moment and until we settle on a new organization, I think we need to respect that distinction in this issue. Keeping the conditions in a separate method will allow us to move them into a precount query alter later easily enough if that's the route we go, but for the purposes of this issue, I think having initializeIterator and doCount being aware of and processing them is the right way to go.

jwilson3’s picture

I found this issue by searching for "migrate left join" on Google.

I just want to clarify that this approach does not allow you to magically reference the fields left joined to the migration source: clause from within the migration's process: clause. For this you still need to implement custom Process plugins and execute any necessary database joins again on every row.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

tatianag’s picture

As Gitlab MR URLs are subject to unexpected changes (ref #3204538), I'm attaching a patch copy of MR 849 as at f4c123cf.

- This commit is excluded from the updated patch: 7b72898b Applied the same bahaviour for countQuery()
- The change in function doCount() was already applied in #2833060: SqlBase::prepareQuery() should be called also on count
- Otherwise, patch for #3069776 and patch for #2833060 don't work together as I'm getting the reject error on that 7b72898b commit.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

alison’s picture

I know this is both old and has up-in-the-air elements to it, BUT, I successfully applied MR #849 to a Drupal 10.1 10.2 site, and then successfully migrated users of just one role. (I only applied the changes in MR #849, I didn't grab anything from #2833060.) (EDIT: I forgot we're on 10.2 now; just fixed the core version in my note.)

I did get warnings when I ran drush mim, but it still ran successfully -- but I'll share them anyway:

 [warning] Undefined array key "conditions" SqlBase.php:365
 [warning] foreach() argument must be of type array|object, null given SqlBase.php:365
 [warning] Undefined array key "joins" SqlBase.php:370
 [warning] foreach() argument must be of type array|object, null given SqlBase.php:370
 [warning] Undefined array key "fields" SqlBase.php:375
 [warning] foreach() argument must be of type array|object, null given SqlBase.php:375
 [warning] Undefined array key "conditions" SqlBase.php:365
 [warning] foreach() argument must be of type array|object, null given SqlBase.php:365
 [warning] Undefined array key "joins" SqlBase.php:370
 [warning] foreach() argument must be of type array|object, null given SqlBase.php:370
 [warning] Undefined array key "fields" SqlBase.php:375
 [warning] foreach() argument must be of type array|object, null given SqlBase.php:375

My very rough take on those warnings is that they mean my conditions syntax could/should be better. (I copied from one of the code examples in SqlBase.php, but maybe I certainly may have misinterpreted it.)

Speaking of which, here's the "source" section of my user migration YML:

source:
  plugin: d7_user
  joins:
    -
      table: users_roles
      alias: ur
      condition: u.uid = ur.uid
  conditions:
    -
      field: ur.rid
      value: [6]
      operator: IN
  distinct: TRUE