Problem motivation

Follow-up to #2933620: Polish migrate.api.php API documentation.

Stubbing concept was discussed from comment #48 onwards. It was concluded in #54 that we should move the stubbing concept under its own sub heading and include also other fundamental key concepts such as rollbacks and updates / incremental migrations.

Proposed solution

Document also the other key migrate concepts.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

masipila created an issue. See original summary.

masipila’s picture

Version: 8.6.x-dev » 8.5.x-dev
Status: Active » Needs review
FileSize
3.58 KB

Here's the first patch for review. It covers the following key concepts:

  • Stubs
  • Map tables
  • Highwater marks
  • Rollbacks

Since this is documentation change only, I'm proposing that it would be cherry picked to 8.5.x.

Cheers,
Markus

phenaproxima’s picture

Status: Needs review » Needs work

Magnificent work. I only request a few minor verbiage changes, then this is good to go.

  1. +++ b/core/modules/migrate/migrate.api.php
    @@ -82,6 +79,39 @@
    + * parent term has not yet been migrated. Migrate API addresses this 'chicken or
    + * the egg' dilemma by creating a stub term for the parent so that the child
    

    Should be 'chicken and egg'.

  2. +++ b/core/modules/migrate/migrate.api.php
    @@ -82,6 +79,39 @@
    + * source ID and the hash in the map allow for tracking changes for continuous
    

    Let's remove 'in the map'.

  3. +++ b/core/modules/migrate/migrate.api.php
    @@ -82,6 +79,39 @@
    + * establishing relations between records.
    

    'relations' should be 'relationships'.

  4. +++ b/core/modules/migrate/migrate.api.php
    @@ -82,6 +79,39 @@
    + * only content that has been created or updated in the source since the
    

    Let's change 'content' to 'data'.

  5. +++ b/core/modules/migrate/migrate.api.php
    @@ -82,6 +79,39 @@
    + * migrated so far. If the source system has for example a timestamp that
    + * indicates when a node was created or last updated, that timestamp could be
    + * used as a highwater property. If we execute the migration again, only those
    

    Let's rephrase this whole sentence: "For example, a timestamp property that indicates when a row of data was created or last updated would make an excellent highwater property."

  6. +++ b/core/modules/migrate/migrate.api.php
    @@ -82,6 +79,39 @@
    + * nodes that have a higher timestamp than in the previous migration would be
    

    'nodes' should be 'rows'; this documentation doesn't need to be Drupal-specific.

  7. +++ b/core/modules/migrate/migrate.api.php
    @@ -82,6 +79,39 @@
    + * It is quite typical that when developing a migration, the first version does
    

    Let's rephrase: "When developing a migration, it is quite typical..."

  8. +++ b/core/modules/migrate/migrate.api.php
    @@ -82,6 +79,39 @@
    + * migration, adjust the migration and then execute it again.
    

    Can this be slightly rephrased? "...undo a migration, then execute it again after adjusting it."

heddn’s picture

  1. +++ b/core/modules/migrate/migrate.api.php
    @@ -82,6 +79,39 @@
    + * parent term has not yet been migrated. Migrate API addresses this 'chicken or
    + * the egg' dilemma by creating a stub term for the parent so that the child
    

    This is more typically called "Chicken or egg" in English. Ahh, I see that @phenaproxima saw the same thing.

  2. +++ b/core/modules/migrate/migrate.api.php
    @@ -82,6 +79,39 @@
    + * term can establish a reference to it. When the parent term is eventually
    + * being migrated, Migrate API updates the previously created stub with the
    

    is eventually migrated. Loose the word "being", I think it will read smoother.

masipila’s picture

Status: Needs work » Needs review
FileSize
3.53 KB
3.31 KB

This API doc improvement issue was left behind when things went crazy in my day job. Let's finalize this now so that we have one issue less.

This patch should address all feedback from @phenaproxima and @heddn.

Cheers,
Markus

heddn’s picture

Status: Needs review » Needs work
+++ b/core/modules/migrate/migrate.api.php
@@ -82,6 +79,38 @@
+ * Once a migrated row is saved and the destination ID is known, Migrate API
+ * saves the source ID, destination ID and the row hash into a map table. The
+ * source ID and the hash allow tracking changes for continuous migrations.

There might be more than one source or destination id. So we need to account for plurality. I think this is important to document as folks get confused and think they can only use a single id. Which isn't correct, multiple are supported.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.6 was released on August 1, 2018 and is the final bugfix release for the Drupal 8.5.x series. Drupal 8.5.x will not receive any further development aside from security fixes. Sites should prepare to update to 8.6.0 on September 5, 2018. (Drupal 8.6.0-rc1 is available for testing.)

Bug reports should be targeted against the 8.6.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.8.x-dev

Drupal 8.6.x will not receive any further development aside from security fixes. Bug reports should be targeted against the 8.8.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.9.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.7 was released on June 3, 2020 and is the final full bugfix release for the Drupal 8.8.x series. Drupal 8.8.x will not receive any further development aside from security fixes. Sites should prepare to update to Drupal 8.9.0 or Drupal 9.0.0 for ongoing support.

Bug reports should be targeted against the 8.9.x-dev branch from now on, and new development or disruptive changes should be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

quietone’s picture

Version: 8.9.x-dev » 9.2.x-dev
Status: Needs work » Needs review
FileSize
2.08 KB
3.54 KB

Using plurals as suggested in #6.

Also change a sentence in the highwater section to remove use of 'we'.

jibran’s picture

Just one question

+++ b/core/modules/migrate/migrate.api.php
@@ -82,6 +79,38 @@
+ * so far. For example, a timestamp property that indicates when a row of data
+ * was created or last updated would make an excellent highwater property. If we

How can one set the current timestamp as hwm in the migration template? Maybe we can add a code example here.

quietone’s picture

FileSize
1.82 KB
3.6 KB

Wow, that question would never occur to me! I don't think a code example is needed because the only thing the user needs to do is declare the high_water_property. I have reworded that section, hoping to emphasize that point.

jibran’s picture

The rewording looks good now. About the example, I had to refresh my memory about hwm usage by reading \Drupal\Tests\migrate\Kernel\HighWaterTest and high_water_test.yml. :D

I think we can say something like, e.g. for D7 node migration 'changed' source property can be used to set up the highwater mark

source:
  plugin: d7_node
  high_water_property:
    name: changed
quietone’s picture

FileSize
756 bytes
3.79 KB

Example added.

jibran’s picture

+++ b/core/modules/migrate/migrate.api.php
@@ -107,6 +107,15 @@
+ *   plugin: source_plugin
...
+ *     name: changed

Without the source plugin 'd7_node' 'changed' is missing the context here.

+++ b/core/modules/migrate/migrate.api.php
@@ -82,6 +79,48 @@
+ * In this example, the row property 'changed' is the high_water_property.

How about
In this example, the row property 'changed' is the high_water_property. The row property 'changed' contains the last updated timestamp for each row which will be used to setup highwater mark.
or something along these lines.

quietone’s picture

Thanks. I did try something like that extra explanation but failed (have a headache which is distracting) but I could work with your suggestions and do something useful.

jibran’s picture

Thank you for updating the patch. It looks good now.

jhodgdon’s picture

@jibran asked me to review this patch for wording/clarity/proofreading. I think it is looking great! It is very clear and it seems like a great addition to the documentation.

Here are 4 very small suggestions:

a)

+ * saves the source IDs, destination IDs and the row hash into a map table. The

The Drupal project uses serial commas: like "apples, oranges, and bananas" (comma before "and"). So there needs to be a comma after "IDs" in this line.

b) next line down:

+ * saves the source IDs, destination IDs and the row hash into a map table. The
+ * source IDs and the hash allow tracking changes for continuous migrations.

For that sentence, how about "facilitate" instead of "allow"?

c)

+ * In this example, the row property 'changed' is the high_water_property. If
+ * the value of 'changed' is greater than the current highwater mark the row
+ * is processed and the value of the highwater mark is updated to the value
+ * 'changed'.

Suggest putting "of" before that last word "changed".

d)

+ * When developing a migration, it is quite typical that the first version does
+ * not provide correct results for all scenarios. Rollbacks allow you to undo a
+ * migration and then execute it again after adjusting it.

Maybe instead of the word "scenarios" here, substitute something like "migrated data"?

quietone’s picture

@jhodgdon, thanks for taking the time to review this.

It is very clear and it seems like a great addition to the documentation.

Credit goes to masipila, who made great documentation contributions to Migrate.

#18 a, b, c, and d) Fixed as suggested. For d) I wondered if it should be 'source data' instead of 'migrated data'. In the end I decided that either get the meaning across so used the suggestion by the more experienced writer.

jhodgdon’s picture

Thanks! I will not mark it RTBC since I am not sure about the technical correctness, but to me it looks great.

jibran’s picture

Status: Needs review » Reviewed & tested by the community

Thank you @jhodgdon for the review. Yeah, the patch is ready IMO.

quietone’s picture

@jibran and @jhodgdon, thanks for the quick reviews. Feels good to get a 3 years old issue to RTBC in two days.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 19: 2944846-19.patch, failed testing. View results

quietone’s picture

Status: Needs work » Reviewed & tested by the community

Random test failure, on Media_library.Drupal\Tests\media_library\FunctionalJavascript\WidgetUploadTest #3066447: [random test failure] Random failures building media library form after uploading image (WidgetUploadTest)

Setting back to RTBC

Matroskeen’s picture

Status: Reviewed & tested by the community » Needs review

Great job here! I really like documentation improvements, especially for migrate API.

I checked the proposed changes via Grammarly and it suggested replacing "When a term is being migrated, it is possible that its parent term has not yet been migrated." with something more simple. Can it be something like "When a term is being migrated, its parent term can be not migrated yet."?

jhodgdon’s picture

Status: Needs review » Reviewed & tested by the community

That suggestion changes the meaning of the sentence and it wouldn't make any sense at all with the remaining text, so I think we need to leave it as it is.

  • catch committed 589f814 on 9.2.x
    Issue #2944846 by quietone, masipila, jibran, jhodgdon, heddn,...

  • catch committed dc6ffa6 on 9.1.x
    Issue #2944846 by quietone, masipila, jibran, jhodgdon, heddn,...
catch’s picture

Version: 9.2.x-dev » 9.1.x-dev
Status: Reviewed & tested by the community » Fixed

Good improvements.

Committed/pushed to 9.2.x and cherry-picked to 9.1.x, thanks!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.