Problem/Motivation

Currently, the destination bundle is set via a process plugin default_value (or some such manner). This results in some very nasty things and the near impossibility of determining what is the eventual destination bundle. This blocks both core (#2630732: Implement Entity::fields() for migration destinations and contrib (#2642612: Create Entity Lookup & Generate process Base plugins) from determining the destination bundle.

The destination bundle currently can be set in a prepareRow, with a default_value process plugin or a static value.

Proposed resolution

The destination plugin should accept the destination bundle as so:
entity:entity_type:bundle (e.g. entity:node:article)

I'd like to propose that if no bundle is provided, then an exception is thrown (or some rather strong warning)

We still need to account for the possibility of the destination bundle being dynamically set in the process phase, but 95% of the time it should be set in the destination plugin. Use case for proces plugin: term migration that is splitting up a source data of 'tags' into multiple destination vocabularies.

Remaining tasks

Some of the logic in #2642612: Create Entity Lookup & Generate process Base plugins could be stolen and used here to jump start things.

User interface changes

API changes

Yes, add a bundle to the destination plugin definition in the yml
Provide a getter (and associated interface updates) for the bundle. Probably should add an isBundleable() for good measure.

Data model changes

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

heddn created an issue. See original summary.

heddn’s picture

Priority: Normal » Major

Bumping to major since this is blocking some fairly sizable follow-up issues.

heddn’s picture

Charlotte17’s picture

Added some notes on issues this is blocking. There are some fairly significant core and contrib tasks that are blocked so assigning the critical tag.

vasi’s picture

Would it make sense to make 'bundle' another attribute of the destination? I'm not sure we really want to further overload the name of the destination.

xjm’s picture

Thanks @Charlotte17 and @heddn! Discussed with @alexpott, @mikeryan, et al at the Migrate triage sprint and we agreed to keep this as a Migrate critical for now since itis blocking so much other work for the API. It may not be a critical bug for the core migrations per se but it is definitely worth keeping the visibility.

mikeryan’s picture

Working on this now. In #5 vasi suggests an alternative, so let's compare.

destination:
  plugin: entity:node:article

vs.

destination:
  plugin: entity:node
  bundle: article

I have a mild preference for the former, I think vasi's preference for the latter is also mild - any other opinions?

mikeryan’s picture

Status: Active » Needs review
FileSize
4.56 KB

Having weird things happen in apparently-unrelated areas when trying to test manually, let's see what the testbot does with this.

mikeryan’s picture

Status: Needs review » Needs work

Looking more carefully, I was missing the role of the MigrateEntity derivative - vasi's idea is looking better to me now, we shouldn't mess with the destination plugin ID...

The last submitted patch, 8: destination_bundle_set-2700581-8.patch, failed testing.

mikeryan’s picture

Status: Needs work » Needs review
Issue tags: +Needs tests
FileSize
3.85 KB

I think there's some other broken stuff getting in the way of my manual testing, but here's the basic patch. Thinking it might be nice if the derivers would configure the destination plugins with the bundles...

Status: Needs review » Needs work

The last submitted patch, 11: destination_bundle_set-2700581-11.patch, failed testing.

mikeryan’s picture

Status: Needs work » Needs review
FileSize
3.89 KB
566 bytes

Well, this should be a bit better...

heddn’s picture

An important test is to see if we can still pass the destination bundle from the source or in a process step.

mikeryan’s picture

That is being tested - most of the migrations in core are still setting the bundle in the process pipeline (e.g. type: type).

When/where/how would the destination bundle be set in the source?

mikeryan’s picture

Status: Needs review » Needs work

OK, time to add explicit tests. If I do have the derivers make use of this, maybe we should make explicit a test for the type:type pattern...

+++ b/core/modules/migrate/src/Plugin/migrate/destination/Entity.php
@@ -6,6 +6,7 @@
+use Drupal\migrate\MigrateException;

@@ -82,6 +83,8 @@ public static function create(ContainerInterface $container, array $configuratio
+   * @throws \Drupal\migrate\MigrateException

Leftover from my attempt to implement plugin: entity:type:bundle (throwing an exception if malformed), I will remove.

mikeryan’s picture

Status: Needs work » Needs review
FileSize
6.2 KB
2.98 KB

Actually, first, let's see what (if anything) breaks if the node derivers set the bundle in the destination...

vasi’s picture

This looks good so far! I don't think you missed any relevant content types.

+++ b/core/modules/migrate/src/Plugin/migrate/destination/Entity.php
@@ -112,6 +122,10 @@ protected function getEntity(Row $row, array $old_destination_id_values) {
+      if ($bundle = $this->getEntityBundleId()) {
+        $row->setDestinationProperty($this->getKey('bundle'), $bundle);
+      }
+

This will override anything set by process. We should decide if that's the behaviour we want, instead of eg: throwing if they conflict. We should also document this behaviour.

What tests do you envision needing? Maybe:

  • Unit: can set bundle as a destination configuration
  • Unit: can still set bundle via process plugins
  • Unit: behviour if process/config values conflict
  • Integration tests for the above should be already covered by the existing migrate tests
  • Since stubbing is one of the use cases, maybe ensure that's covered?
  • Are there any existing tests that need to be updated?
  • Anything else?
heddn’s picture

Status: Needs review » Needs work

I think we are missing vocabulary terms? They have a vid.

vasi’s picture

@heddn, there's only one migration for terms from all vocabularies. The vid has to come from the process in that case.

vasi’s picture

Status: Needs work » Needs review
mikeryan’s picture

Time to work on tests...

This will override anything set by process. We should decide if that's the behaviour we want, instead of eg: throwing if they conflict. We should also document this behaviour.

Perhaps process should override, since it's more specific (i.e., at the granularity of the row), and the destination bundle should function as a default bundle. This could accommodate a scenario where you want to divert just a few special cases to a different bundle. I'll do this in the next patch.

Unit: can set bundle as a destination configuration
Unit: can still set bundle via process plugins
Unit: behaviour if process/config values conflict
Integration tests for the above should be already covered by the existing migrate tests

Ayup.

Since stubbing is one of the use cases, maybe ensure that's covered?

Can you explain the stubbing use case? This doesn't really seem to affect processStubRow(), which just uses the bundle destination property directly without referencing process or destination configuration.

Are there any existing tests that need to be updated?
Anything else?

mikeryan’s picture

Issue tags: -Needs tests
FileSize
5.45 KB

Here we go...

mikeryan’s picture

Ummm, the actual patch is helpful...

benjy’s picture

Status: Needs review » Needs work
  1. +++ b/core/modules/block_content/migration_templates/d7_custom_block.yml
    @@ -17,6 +14,7 @@ process:
    +  bundle: basic
    
    +++ b/core/modules/menu_link_content/migration_templates/menu_links.yml
    @@ -49,6 +46,7 @@ process:
    +  bundle: menu_link_content
    

    I think we should call this default_bundle, which is more honest since you can technically have migrations that import multiple bundles.

  2. +++ b/core/modules/migrate/src/Plugin/migrate/destination/Entity.php
    @@ -89,6 +89,16 @@ protected static function getEntityTypeId($plugin_id) {
    +  protected function getEntityBundleId() {
    +    return isset($this->configuration['bundle']) ? $this->configuration['bundle'] : '';
    
    @@ -112,6 +122,12 @@ protected function getEntity(Row $row, array $old_destination_id_values) {
    +      $bundle_key = $this->getKey('bundle');
    +      // Apply any configured bundle if the bundle has not already been set.
    +      if (($bundle = $this->getEntityBundleId()) && empty($row->getDestinationProperty($bundle_key))) {
    +        $row->setDestinationProperty($bundle_key, $bundle);
    +      }
    

    Lets make this getBundle(Row $row) and just let the getBundle() method figure out which to use.

Rest looks good.

heddn’s picture

I'm not sure if this is addressable here or in a follow-up. Part of my motivation of seeing this get in is the ability to determine in a standard manner the destination bundle during source processing. If *that* is achievable, then I'd be handing out karma points right & left.

We need to know that bundle for so many reasons. If we cannot know the bundle at some point in the source processing, then we don't solve anything in this issue and actually make things more complicated by adding yet another way to set the bundle. https://xkcd.com/927 seems apt.

benjy’s picture

You can have migrations that import entities of many different bundles in the one migration, in that case, the bundle is derived per row. Unless we remove that functionality, what we have here is the best outcome.

IMO, the UI could have just shown all fields for the entity type on all bundles and left it to the user to map the correct fields, that's how Panels works in D7 when placing fields and it seems fine.

mikeryan’s picture

Part of my motivation of seeing this get in is the ability to determine in a standard manner the destination bundle during source processing.

Having the source know anything about the destination feels very wrong to me. We're trying (for example in #2695297: Refactor EntityFile and use process plugins instead) to better decouple the migration stages and keep them focused on their specific duties, and the source's responsibility is simply to deliver the raw source data in a canonical form - it should not need to know anything about where the data is going.

Can you describe the use case? How would you use the destination bundle information at source time?

IMO, the UI could have just shown all fields for the entity type on all bundles and left it to the user to map the correct fields, that's how Panels works in D7 when placing fields and it seems fine.

With the D7 migration tools, it is really very helpful to see the specific fields that are relevant to the migration (which is always bundle-specific in that case). Migrating multiple bundles in a single migration is an edge case, and as a best practice I always discourage it - we should optimize for the common case, which is one migration per bundle.

mikeryan’s picture

Status: Needs work » Needs review
FileSize
11.21 KB
5.26 KB

Addressing benjy's feedback.

heddn’s picture

mikeryan’s picture

@heddn: The implementation of Entity::fields() for destinations will be able to use the destination default_bundle just fine.

As for the lookup/generate process plugins (I see now you were talking about process plugins, not source plugins) - the idea is that they need the field instance, not just the field storage, for some reason? What precisely do they need from the field instance?

heddn’s picture

+++ b/core/modules/migrate/src/Plugin/migrate/destination/Entity.php
@@ -89,6 +89,23 @@ protected static function getEntityTypeId($plugin_id) {
+ protected function getBundle(Row $row) {

Should this be up on MigrateDestinationInterface as well?

From the field instance you can determine the destination bundle for an ER field. Because the destination bundle is stored on the instance level, not in field storage.

Additionally, process plugins technically run during the source phase. Just very late in the flow. We know this because after process runs the source is frozen.

benjy’s picture

Should this be up on MigrateDestinationInterface as well?

I think it's too entity specific to go on the generic destination interface? I've made the method public, we probably need a new interface.

Additionally, process plugins technically run during the source phase. Just very late in the flow. We know this because after process runs the source is frozen.

This isn't a great way to think about, sure process plugins are applied one row at a time to the source data but the two things are decoupled. Otherwise, we'd be back into D7 days where all processing happens in prepareRow().

Attached a new patch, I actually meant to have a method to calculate the bundle but still do the set onto the row in getEntity().

vasi’s picture

Status: Needs review » Reviewed & tested by the community

This looks great to me.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 33: 2700581-33.patch, failed testing.

mallezie’s picture

Status: Needs work » Reviewed & tested by the community

Retested, cause i can't see any failures.

alexpott’s picture

Status: Reviewed & tested by the community » Fixed

What about existing migrations in contrib and the wild? Will any of the changes here break them? Looking at https://www.drupal.org/core/experimental we're still in alpha so nothing to worry about here.

Committed 3e5ef21 and pushed to 8.1.x and 8.2.x. Thanks!

  • alexpott committed a3947a9 on 8.2.x
    Issue #2700581 by mikeryan, benjy, heddn, vasi: Destination bundle set...

  • alexpott committed 3e5ef21 on 8.1.x
    Issue #2700581 by mikeryan, benjy, heddn, vasi: Destination bundle set...
benjy’s picture

We still support setting the bundle in process so they should be fine.

chx’s picture

Additionally, process plugins technically run during the source phase. Just very late in the flow. We know this because after process runs the source is frozen.

The source freezing happens before the process plugins run. This was a very important design decision, if it would be otherwise there would be no reason to freeze.

The only way to change the source is in prepare row.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.