Migrate comes along with a few great examples. They helped me enormously.

Unfortunately, there's no information to be found anywhere on the process of creating stubs. Could someone copy/paste some code showing how this actually works? I haven't been able to wrap my head around it yet. It could also be used to fill out the documentation page at http://drupal.org/node/1013506.

Comments

tiyberius’s picture

I second that motion! I read about it here: http://drupal.org/node/915102 but then when I went to do it, I realized I had no idea how to actually go about writing the code for it. An example of this would be greatly appreciated.

moshe weitzman’s picture

Status: Active » Fixed

This is a FAQ. We need some docs on it.

I could have sworn that stub nodes were created automatically by the nodereference integration in the migrate module. I don't see that code anymore. Instead, there is an API where migrations can implement stub support. Start at the Migration::handleSourceMigration() method.

However, usually you don't need any stub nodes. If you are migrating music albums and tracks, you should simply import your albums and in a subsequent migration, import your tracks. That way, assuming the node reference points from track to album, your referenced nodes are guaranteed to already exist. Stub nodes come into play when you have a self-reference like when albums have a node reference called 'related albums'. You can't easily assure that all the related albums exist by the time you create a given album.
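To make the self-reference case concrete, here is a minimal standalone sketch in plain PHP. It does not use the Migrate API; import_albums() and its array layout are made up for illustration. It only shows the idea: a placeholder is created when a referenced album has not been imported yet, and the real import later overwrites that placeholder while keeping its ID.

```php
<?php
// Standalone sketch in plain PHP; it does not call the Migrate API.
// $rows maps a source ID to a title and a list of related source IDs.
function import_albums(array $rows) {
  $map = array();    // source ID => destination ID (stands in for a map table)
  $nodes = array();  // destination ID => node data
  $next_id = 1;

  foreach ($rows as $source_id => $row) {
    // Reuse the ID of a stub created earlier for this source ID, if any.
    if (!isset($map[$source_id])) {
      $map[$source_id] = $next_id++;
    }
    $related = array();
    foreach ($row['related'] as $related_source_id) {
      if (!isset($map[$related_source_id])) {
        // Referenced album not imported yet: create a placeholder (stub).
        $map[$related_source_id] = $next_id++;
        $nodes[$map[$related_source_id]] = array(
          'title' => 'Stub',
          'related' => array(),
          'stub' => TRUE,
        );
      }
      $related[] = $map[$related_source_id];
    }
    // Saving the real data overwrites the stub but keeps its ID.
    $nodes[$map[$source_id]] = array(
      'title' => $row['title'],
      'related' => $related,
      'stub' => FALSE,
    );
  }
  return $nodes;
}

$nodes = import_albums(array(
  10 => array('title' => 'Album A', 'related' => array(20)),
  20 => array('title' => 'Album B', 'related' => array(10)),
));
```

Here Album A references Album B before B exists, so B first appears as a stub with ID 2 and is filled in when its own row is processed.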

Hope this helps.

tiyberius’s picture

Stub nodes come into play when you have a self-reference like when albums have a node reference called 'related albums'.

This is quite similar to my situation!

Thank you for pointing out Migration::handleSourceMigration. Unfortunately, it is not clear to me how to use it to create stub node(s). Are there plans to put an example in the documentation?

mikeryan’s picture

Category: support » task
Status: Fixed » Active

Yes, we will be filling out the documentation over time. The missing piece here is that you need to define a createStub() method in the destination migration (i.e., the migration corresponding to the sourceMigration() argument), to create the stub node and return its nid.

  /**
   * Create a stub node for a so-far-unresolved node reference.
   */
  protected function createStub() {
    migrate_instrument_start('create stub');
    $node = new stdClass;
    $node->title = t('Stub');
    $node->body = t('Stub body');
    $node->type = $this->destination->getBundle();
    // Default to admin account, unpublished
    $node->uid = 1;
    $node->status = 0;
    node_save($node);
    migrate_instrument_stop('create stub');
    if (isset($node->nid)) {
      return array($node->nid);
    }
    else {
      return FALSE;
    }
  }
tiyberius’s picture

Perfect. This is exactly what I was looking for. Many thanks!

rp7’s picture

Awesome, thanks, it works! If it's OK with you, I will fill out the documentation a bit (once I find the time).

1 more question though (hope this is the correct place - it's about stubs).

I'm importing a node reference field which can reference nodes of various content types (TypeX, TypeY, TypeZ). See code below, which works (XML source btw - sourceid's of the to-reference nodes are XML attributes).

$this->addFieldMapping('field_related_nodes', 'RelatedNodes')
  ->sourceMigration(array('TypeXNode', 'TypeYNode', 'TypeZNode'));

public function prepareRow(&$current_row) {
  if (!empty($current_row->xml->RelatedNodes)) {
    $related_nodes = array();
    foreach ($current_row->xml->RelatedNodes->RelatedNode as $key => $value) {
      $attributes = $value->attributes();
      $related_nodes[] = (string) $attributes['Id'];
    }
    $current_row->RelatedNodes = $related_nodes;
  }
}

Stubs for nodes that haven't been imported yet are being created. The problem I'm currently facing is that the stubs are always of type 'TypeX' (handled by TypeXNode). How do you define whether the createStub() function in TypeXNode, TypeYNode or TypeZNode should be used?

Been digging in the code for some time now, no luck so far. Not giving up though!

moshe weitzman’s picture

Component: Documentation » Code
Category: task » bug

Looks like a bug to me. The Stub system has no way to know which source_migration won. From handleSourceMigration():

if ($destids = $source_migration->createStubWrapper(array($source_key), $migration)) {

Would be good if Mike confirmed this.

mikeryan’s picture

Category: bug » feature

It's true, the handling of multiple sourceMigrations will call the first sourceMigration's createStubWrapper when the ID is not found in any of the map tables. I don't see how handleSourceMigration() can choose in a general way - presumably there is something in the outer migration's source query that indicates what destination type (and thus what sourceMigration) to use, but it's very likely that the value (if it is indeed a single field) is not the name of the desired sourceMigration.

So, how do we get the necessary information into handleSourceMigration? Perhaps the field mapping sourceMigration() method could have an optional second argument, $stub_migration_field - if present, then the value in the source row field of that name would be taken by handleSourceMigration as the name of the migration to call for creating stubs. With clever SQL you may be able to map your source data values into migration names, but more likely you would use prepareRow() to translate into that field. Let's see how this might look in practice:

...
  $this->addFieldMapping('field_related_nodes', 'RelatedNodes')
       ->sourceMigration(array('TypeXNode', 'TypeYNode', 'TypeZNode'), 'RelatedTypes');
...
public function prepareRow($current_row) {
  $type_mappings = array(
    'type_x' => 'TypeXNode',
    'type_y' => 'TypeYNode',
    'type_z' => 'TypeZNode',
  );

  if (!empty($current_row->xml->RelatedNodes)) {
    $related_nodes = array();
    $related_types = array();
    foreach ($current_row->xml->RelatedNodes->RelatedNode as $key => $value) {
      $attributes = $value->attributes();
      $related_nodes[] = (string)$attributes['Id'];
      $related_types[] = $type_mappings[(string)$attributes['Type']];
    }   
    $current_row->RelatedNodes = $related_nodes;
    $current_row->RelatedTypes = $related_types;
  }
}
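The selection step this proposal implies could look roughly like the standalone sketch below. It is plain PHP, not Migrate code: pick_stub_migrations() is a hypothetical helper, and the fallback to the first source migration simply mirrors the current behaviour described above.

```php
<?php
// Standalone sketch; pick_stub_migrations() is hypothetical and not part of
// the Migrate module. $stub_migrations is the parallel array built in
// prepareRow() (the 'RelatedTypes' field in the proposal).
function pick_stub_migrations(array $source_migrations, array $source_ids, array $stub_migrations) {
  $chosen = array();
  foreach ($source_ids as $index => $source_id) {
    $candidate = isset($stub_migrations[$index]) ? $stub_migrations[$index] : NULL;
    // Fall back to the first source migration when the row names no
    // migration, or names one that is not among the mapped source migrations.
    if ($candidate === NULL || !in_array($candidate, $source_migrations, TRUE)) {
      $candidate = $source_migrations[0];
    }
    $chosen[$source_id] = $candidate;
  }
  return $chosen;
}

$chosen = pick_stub_migrations(
  array('TypeXNode', 'TypeYNode', 'TypeZNode'),
  array('101', '102', '103'),
  array('TypeYNode', 'TypeZNode', NULL)
);
```

The result maps each source ID to the migration that would be asked to create its stub; the third ID has no type information and therefore falls back to 'TypeXNode'.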

Thoughts?

rp7’s picture

Looks good to me!

q0rban’s picture

To solve this problem, we filtered the source migrations based on the node type by overriding handleSourceMigration() in our abstract migration class. We store the source node type in $this->sourceNodeType on each migration.


  /**
   * Look up a value migrated in another migration.
   *
   * @param mixed $source_migrations
   *   An array of multiple source migrations, or string for a single migration.
   * @param mixed $source_values
   *   An array of source values, or string for a single value.
   * @param mixed $default
   *   The default value, if no ID was found.
   */
  protected function handleSourceMigration($source_migrations, $source_values, $default = NULL) {
    // Filter out $source_migrations based on the source node type. Otherwise we
    // end up with stubs being created with an incorrect node type. To do this,
    // first query the source database for the node type.
    $query = self::getSourceConnection()
        ->select('node', 'n')
        ->fields('n', array('type'))
        ->condition('n.nid', $source_values);
    $result = $query->execute();

    if (($source_node_type = $result->fetchField())
      && $source_migration = self::migrationFromSourceNodeType($source_node_type, (array) $source_migrations))
    {
      // If we have a node type, and that node type has an associated migration,
      // set the $source_migrations to that one migration.
      $source_migrations = array($source_migration);
    }

    return parent::handleSourceMigration($source_migrations, $source_values, $default);
  }

  /**
   * Return the migration associated with a source node type.
   *
   * @param string $source_node_type
   *   The source node type.
   * @param array $migrations
   *   An array of migration machine names to check. If empty, all migrations will
   *   be checked. Not recommended for performance reasons.
   * @return mixed
   *   The string machine name of the migration, or FALSE if none was found.
   */
  public static function migrationFromSourceNodeType($source_node_type, $migrations = NULL) {
    static $map;

    if (!isset($map[$source_node_type])) {
      migrate_instrument_start(__FUNCTION__);
      if (!isset($migrations)) {
        $migrations = array_keys(migrate_migrations());
      }
      foreach ($migrations as $machine_name) {
        $migration = MigrationBase::getInstance($machine_name);
        if (!empty($migration->sourceNodeType)) {
          $map[$migration->sourceNodeType] = $machine_name;
        }
      }
      migrate_instrument_stop(__FUNCTION__);
    }

    return isset($map[$source_node_type]) ? $map[$source_node_type] : FALSE;
  }
Corentor’s picture

Title: Creating stubs: any help? » Stubs for self-referencing records (and not only tables)

Hello everyone,

sorry for reopening this thread. I'm brand new to both Drupal AND Migrate, and I have to say that both are really great pieces of software. I don't know whether what I'm trying to do is possible, but I thought this would be a good place to ask.

I'm dealing with a database table that references itself (e.g. a table of individuals that may reference other individuals). So far that's nothing new, and it has been covered widely here and there. Stubs do the job, and do it quite well, as long as records reference OTHER records in the table, but not THEMSELVES. As an example use case, individuals may be insured by other individuals, or may be their own insurants.

I've noticed that with my naive implementation of a stub (simply taking the code sample given above or on the chicken-and-egg page), self-referencing nodes usually do not reference themselves; instead, a dummy stub is created and stays in the Drupal database (one dummy stub per self-referencing node). There is one exception: when the self-referencing node was already referenced by a previous node, it doesn't need to create its stub. In that case, the self-referencing node seems to be built correctly.

I know that my use case may not be very common and may seem a bit contrived, but if anyone has a solution for this, I would be very glad to hear about it.

Many thanks to everybody

Corentor’s picture

Title: Creating stubs: any help? » Creating stubs : any help?
Corentor’s picture

Title: Creating stubs : any help? » Creating stubs: any help?
mikeryan’s picture

Title: Stubs for self-referencing records (and not only tables) » Need means to choose which of multiple sourceMigrations to use for creating stubs
Version: 7.x-2.0-rc3 » 7.x-2.x-dev
mirsoft’s picture

Here is my solution to the problem described in #8, using a minimalistic approach (though it may not be the nicest in the world).

Assumption: in this example the source data lives in a relational database outside Drupal. Because of that I couldn't use the solution in #10, since there is no source Drupal node table to read the source node type from. Each external source table contains the data for one destination content type. Data from all these source tables is imported into one common Drupal Node Reference field, while keeping each reference tied to the correct content type.

In Migration class:

...
  $this->addFieldMapping('field_related_nodes', 'RelatedNodes')
       ->sourceMigration(array('TypeXNode', 'TypeYNode', 'TypeZNode'));
...
public function prepareRow($current_row) {

  // You should of course replace example arrays of IDs with your own SQL query which returns IDs of matched records from child tables (something like SELECT id FROM TypeXTable WHERE related_id=$current_row->id)
  $current_row->RelatedNodes['TypeXNode'] = array(1,2,3,4); // Example result IDs from the TypeXTable
  $current_row->RelatedNodes['TypeYNode'] = array(3,4,5);  // example result IDs from the TypeYTable
  $current_row->RelatedNodes['TypeZNode'] = array(4,5,7);  // example result IDs from the TypeZTable
}

The implementation of handleSourceMigration() that supports this syntax is as follows (insert this method into your migration class or into an abstract migration class):

	/**
	 * Enhanced handleSourceMigration, which supports multiple source migrations in $source_keys variable
	 * To match this enhancement, $source_keys array should be in this form
	 * array(
	 * 		'SourceMigration1' => array of IDs from SourceMigration1,
	 * 		'SourceMigration2' => array of IDs from SourceMigration2,
	 * 		etc.
	 * )
	 * You can format the $source_keys in your prepareRow() method
	 *
	 * @param mixed $source_migrations
	 * @param mixed $source_keys
	 * @param null $default
	 * @param null $migration
	 * @return array|mixed|null
	 */
	protected function handleSourceMigration($source_migrations, $source_keys, $default = NULL, $migration = NULL) {
		$result = array();

		// If we get source migrations in array and we've got set the array $source_keys[$source_migrations[0]],
		// we assume we've got associative source migrations specification in input
		if ( (is_array($source_migrations) && (!empty($source_keys[$source_migrations[0]])) ) ) {
			foreach($source_keys as $source_migration=>$values) {
				$partial_result = parent::handleSourceMigration($source_migration, $values, $default, $migration);

				// If we get string as result, add it to the result array, otherwise merge resulting array together
				if (!is_array($partial_result)) {
					$result[] = $partial_result;
				} else {
					$result = array_merge($result, $partial_result);
				}
			}
		// else Legacy migration run, don't solve anything special here
		} else {
			$result = parent::handleSourceMigration($source_migrations, $source_keys, $default, $migration);
		}

		return $result;
	}

rich.3po’s picture

I had a very similar problem and the approach in #15 worked for me, i.e. overriding the handleSourceMigration() method and manually determining which source migration should be used for each source key. Thanks for the code, mirsoft.

Going forward it would be great if the framework supported use of a callback function to determine the source migration - for ultimate flexibility. This might be used in the following way:

$this->addFieldMapping('target_field', 'source_field')
  ->sourceMigrationCallback('mymodule_determine_source_migration');

..and it would be the responsibility of mymodule_determine_source_migration() to return the source migration machine name, given the source data
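A callback of that kind might look like the sketch below. Everything here is hypothetical: sourceMigrationCallback() does not exist in Migrate, and the callback's signature (source value plus the source row as an array) and the 'type' key are assumptions made only for illustration.

```php
<?php
// Hypothetical callback for the proposed sourceMigrationCallback() method.
// The signature and the 'type' row key are assumptions, not Migrate API.
function mymodule_determine_source_migration($source_value, array $row) {
  // Map source type values to migration machine names.
  $type_mappings = array(
    'type_x' => 'TypeXNode',
    'type_y' => 'TypeYNode',
    'type_z' => 'TypeZNode',
  );
  $type = isset($row['type']) ? (string) $row['type'] : '';
  // Return FALSE when the type is unknown so the caller can fall back.
  return isset($type_mappings[$type]) ? $type_mappings[$type] : FALSE;
}

// How a framework might invoke it for one source value.
$known = call_user_func('mymodule_determine_source_migration', '42', array('type' => 'type_y'));
$unknown = call_user_func('mymodule_determine_source_migration', '43', array('type' => 'other'));
```

Returning FALSE for unknown types leaves it to the caller to decide whether to skip the value or fall back to a default migration.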

Cheers

nerdcore’s picture

Issue summary: View changes

I am attempting to use migrate_d2d (https://drupal.org/project/migrate_d2d) to migrate Drupal 6 Menu Links into Drupal 7. Node migrations have been run first, but there are many of them, and some node types are not going to be migrated.

If a source menu link doesn't have a properly migrated node, I'd like to simply skip that link, perhaps with a message in the migrate_message_menulinks table. Unfortunately when a source node was encountered which had not been migrated, the whole thing blew up on me with this somewhat cryptic message:

Invalid argument supplied for foreach()
File /var/www/drupal7/sites/all/modules/contrib/media/includes/media.filter.inc, line 110 in MigrationBase->errorHandler() (line 519 of
/var/www/drupal7/sites/all/modules/contrib/migrate/includes/base.inc).

After some debugging, I found that what was happening was that the handleSourceMigration() function, upon discovering that the node wasn't migrated in any one of a number of possible source migrations, tried to call createStub() on each of the source migrations, and immediately bailed on the first one. The migration stopped entirely with the above error.

In my case, I don't want a stub created. Ever. I just want the bulk of the menu links imported and if a few get missed that's okay with me - a message would be nice to track errors. This code stopped the migration dead in its tracks:

      if (!$destids) {
        foreach ($source_migrations as $source_migration) {
          // Break out of the loop if a stub was successfully created.
          if ($destids = $source_migration->createStubWrapper($source_key, $migration)) {
            break;
          }
        }
      }

I'm unsure how to approach this use case. Perhaps handleSourceMigration() could take an argument telling it not to create stubs; if that argument is TRUE, it would skip this chunk of code and proceed directly to assigning the $default value?
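As a rough illustration of that idea, the standalone sketch below imitates the lookup with a flag that disables stub creation. It is plain PHP, not a patch: lookup_destination_id(), its parameters, and the message text are all hypothetical.

```php
<?php
// Plain PHP sketch. lookup_destination_id() and all of its parameters are
// hypothetical; they only imitate the map lookup in handleSourceMigration().
function lookup_destination_id(array $maps, $source_key, $create_stubs, $stub_creator, array &$messages, $default = NULL) {
  // Check every source migration's map (source ID => destination ID).
  foreach ($maps as $map) {
    if (isset($map[$source_key])) {
      return $map[$source_key];
    }
  }
  // Only try to create a stub when the caller allows it.
  if ($create_stubs && is_callable($stub_creator)) {
    $dest_id = call_user_func($stub_creator, $source_key);
    if ($dest_id) {
      return $dest_id;
    }
  }
  // No match and no stub: record a message and fall back to the default.
  $messages[] = 'No destination found for source ID ' . $source_key;
  return $default;
}

$maps = array(
  'ArticleNode' => array(5 => 105),
  'PageNode' => array(7 => 207),
);
$messages = array();
$found = lookup_destination_id($maps, 7, FALSE, NULL, $messages);
$missing = lookup_destination_id($maps, 9, FALSE, NULL, $messages);
```

With stub creation switched off, the unmigrated ID 9 yields the default value and leaves one message behind instead of stopping the run.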

wheatpenny’s picture

mirsoft, #15 was exactly what I needed. Thank you for sharing. You made my Saturday.

I had to make one small change to the handleSourceMigration() function, because a single-value field gave me the "Invalid argument supplied for foreach()" error that nerdcore mentioned. To account for that, I also check that $source_keys is an array.

   /**
     * Enhanced handleSourceMigration, which supports multiple source migrations in $source_keys variable
     * To match this enhancement, $source_keys array should be in this form
     * array(
     *         'SourceMigration1' => array of IDs from SourceMigration1,
     *         'SourceMigration2' => array of IDs from SourceMigration2,
     *         etc.
     * )
     * You can format the $source_keys in your prepareRow() method
     *
     * @param mixed $source_migrations
     * @param mixed $source_keys
     * @param null $default
     * @param null $migration
     * @return array|mixed|null
     */
    protected function handleSourceMigration($source_migrations, $source_keys, $default = NULL, $migration = NULL) {
        $result = array();
        // If we get source migrations in array and we've got set the array $source_keys[$source_migrations[0]],
        // we assume we've got associative source migrations specification in input
        if ( (is_array($source_migrations) && (is_array($source_keys)) && (!empty($source_keys[$source_migrations[0]])) ) ) {
            foreach($source_keys as $source_migration=>$values) {
                $partial_result = parent::handleSourceMigration($source_migration, $values, $default, $migration);
                // If we get string as result, add it to the result array, otherwise merge resulting array together
                if (!is_array($partial_result)) {
                    $result[] = $partial_result;
                } else {
                    $result = array_merge($result, $partial_result);
                }
            }
        // else Legacy migration run, don't solve anything special here
        } else {
            $result = parent::handleSourceMigration($source_migrations, $source_keys, $default, $migration);
        }
        return $result;
    }
mpdonadio’s picture

I have a fairly sub-optimal workaround for this, but I think it hints at a possible solution.

In the site I am working on now, which is a D6-to-D7 migration, I ended up implementing

protected function createStub(Migration $migration, array $source_id) {
    return FALSE;
}

This prevents the stubs from being created, which also means that my pointer (entityreference) fields won't be populated. I then manually reset the highwater mark and rerun the node migrations. At that point the dependent nodes already exist, so my entityreference fields get set.

In my particular case, I tried something similar to

protected function createStub(Migration $migration, array $source_id) {
    $type = Database::getConnection('default', $this->sourceConnection)
      ->select('node', 'n')
      ->fields('n', array('type'))
      ->condition('nid', $source_id[0])
      ->execute()
      ->fetchField();

    ...
    $node->type = $type;

but had problems with this, because the way the parent class calls this in the D2D hierarchy placed the new node in the wrong migration map, and I couldn't figure out how to get it into the right map.

I think this means (maybe for just the D2D case) two possibilities.

1. Decouple the maps from the migrations, and have a single map object (yay! a singleton!) manage the maps.
2. Add a parameter to ->sourceMigration to prevent stubs from being created, which will defer the field from being populated. When population is deferred, do the necessary bookkeeping so the destination will be updated the next time the migration runs. Otherwise, populate as normal.

I think #2 is the most straightforward, but I don't understand how the classes work well enough to propose a patch.
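The bookkeeping behind option 2 can be illustrated with a small standalone sketch. This is plain PHP under stated assumptions, not Migrate code: import_with_deferred_references() and its data layout are hypothetical, and the second pass stands in for "the next time the migration runs".

```php
<?php
// Plain PHP sketch of deferring unresolved references instead of creating
// stubs. The function name and array layout are hypothetical.
function import_with_deferred_references(array $rows) {
  $map = array();       // source ID => destination ID
  $nodes = array();     // destination ID => node data
  $deferred = array();  // list of array(destination ID, referenced source ID)
  $next_id = 1;

  // Pass 1: create every node; resolve what already exists, queue the rest.
  foreach ($rows as $source_id => $row) {
    $dest_id = $next_id++;
    $map[$source_id] = $dest_id;
    $nodes[$dest_id] = array('title' => $row['title'], 'references' => array());
    foreach ($row['references'] as $ref_source_id) {
      if (isset($map[$ref_source_id])) {
        $nodes[$dest_id]['references'][] = $map[$ref_source_id];
      }
      else {
        $deferred[] = array($dest_id, $ref_source_id);
      }
    }
  }

  // Pass 2: fill in queued references that can now be resolved.
  $unresolved = array();
  foreach ($deferred as $item) {
    list($dest_id, $ref_source_id) = $item;
    if (isset($map[$ref_source_id])) {
      $nodes[$dest_id]['references'][] = $map[$ref_source_id];
    }
    else {
      $unresolved[] = $item;
    }
  }
  return array($nodes, $unresolved);
}

list($nodes, $unresolved) = import_with_deferred_references(array(
  1 => array('title' => 'First', 'references' => array(2, 99)),
  2 => array('title' => 'Second', 'references' => array(1)),
));
```

Note that deferred references are appended after the ones resolved in the first pass, so the order of a multi-value field can differ from the source; references to rows that never get imported (99 here) are simply reported as unresolved.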

RoSk0’s picture

I had the same issue as in #19, but for a D7-to-D7 migration, and the fix appears to be simple:

$type = Database::getConnection('default', $this->sourceConnection)
      ->select('node', 'n')
      ->fields('n', array('type'))
      ->condition('nid', $source_id[0])
      ->execute()
      ->fetchField();

if ($type == $this->destinationType) {
      // Good to go. Lets create this stub.
      $node = new stdClass();
      // $source_id is an array of key values, so use its first element.
      $node->title = t('Stub for @nid', array('@nid' => $source_id[0]));
      $node->body = array(LANGUAGE_NONE => array(array("value" => t('Stub body'))));
      $node->type = $type;
      $node->uid = 1;
      node_save($node);
}

Put simply, createStub() was implemented only in the base class shared by the different bundles, which is why, no matter which source migrations were specified, the first one created a stub with an incorrect bundle.

I'm still trying to figure out whether there is a way to avoid running this query on every attempt to create a stub.

milos.kroulik’s picture

#20 Were you using Migrate D2D when defining your migration? If so, how did you get $source_id parameter, when this method is defined as http://www.drupalcontrib.org/api/drupal/contributions!migrate_d2d!node.i... ? Thanks in advance.

RoSk0’s picture

tssarun’s picture

@Rosk0

https://www.drupal.org/node/1096132#comment-10750874: Good tip, this solved my problem. I would like to optimize the createStub() query further. I have some 32 referenced nids across 4 different bundles, so 32 extra queries are executed (inside createStub()). I need to understand how to optimize this. Any ideas?

Thanks
tssarun
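One possible direction for the open question above (one query per stub) is to cache the source node types and load them in batches. The sketch below is plain PHP with made-up names: SourceTypeCache is hypothetical, and the loader closure stands in for a single query that selects nid and type for a list of nids. Where warm() would be called in a real migration (for instance from prepareRow(), if the row's reference IDs are available there) is an assumption.

```php
<?php
// Plain PHP sketch; SourceTypeCache is hypothetical. The loader stands in
// for one query fetching nid => type for a whole list of nids at once.
class SourceTypeCache {
  protected $types = array();
  protected $loader;
  public $queries = 0;

  public function __construct($loader) {
    $this->loader = $loader;
  }

  // Load all IDs that are not cached yet with a single loader call.
  public function warm(array $nids) {
    $missing = array_values(array_diff($nids, array_keys($this->types)));
    if ($missing) {
      $this->queries++;
      $loaded = call_user_func($this->loader, $missing);
      foreach ($missing as $nid) {
        // Cache misses as FALSE so unknown IDs are not looked up again.
        $this->types[$nid] = isset($loaded[$nid]) ? $loaded[$nid] : FALSE;
      }
    }
  }

  // Return the cached type, loading it on demand if necessary.
  public function get($nid) {
    if (!array_key_exists($nid, $this->types)) {
      $this->warm(array($nid));
    }
    return $this->types[$nid];
  }
}

$source = array(1 => 'article', 2 => 'page', 3 => 'article');
$cache = new SourceTypeCache(function (array $nids) use ($source) {
  return array_intersect_key($source, array_flip($nids));
});
$cache->warm(array(1, 2, 3));
$first = $cache->get(1);
$second = $cache->get(2);
$unknown = $cache->get(4);
```

After the single warm() call, lookups for the three known nids cost no further loader calls; only the unknown nid 4 triggers one more.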