A key feature of the Migrate module is the tracking of relationships between source records and the Drupal objects that have been created from them. This tracking is managed by the map classes - the abstract class MigrateMap and the concrete class MigrateSQLMap.

MigrateMap defines the API to be provided by concrete map classes (and, in hindsight, really should have been a PHP interface rather than an abstract class). While in theory different representations of the map and message data could be implemented, in practice MigrateSQLMap (described below) is the only known implementation, so we will focus on usage of that class.

MigrateSQLMap maintains two tables for each defined migration - a map table and a message table. For each source record that is processed by the import function (whether or not it is successfully imported), a row is added to the migration's map table, keyed by the source record's unique ID. If import of the record was successful, the unique ID of the resulting Drupal object is also stored there. As a result, we know which Drupal objects have been created via migration (and therefore which ones can be rolled back), and when updating imported content we know which source records to update it from. Critically, when migrating relationships between objects, we can also use the map table to rewrite those relationships so they are properly maintained on the Drupal side.
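
To make the role of the map table concrete, here is a minimal conceptual sketch in plain PHP. It is not how MigrateSQLMap is implemented: an ordinary array stands in for the map table, and the IDs and the helper name example_lookup_destination() are invented for illustration.

```php
<?php
// Conceptual model only: a plain array stands in for the map table.
// Keys are source IDs; values are destination IDs, or NULL when the
// source record was processed but its import failed.
$map = array(
  101 => 12,
  102 => NULL,
  103 => 14,
);

// Rollback candidates: the Drupal objects the migration actually created.
$created = array_values(array_filter($map, function ($dest_id) {
  return $dest_id !== NULL;
}));

/**
 * Hypothetical helper: translates a legacy ID into the ID of the
 * corresponding Drupal object, or NULL if no object was created for it.
 */
function example_lookup_destination(array $map, $source_id) {
  return isset($map[$source_id]) ? $map[$source_id] : NULL;
}

// Relationship rewriting: a legacy record pointing at source ID 103
// should point at Drupal object 14 after migration.
$parent = example_lookup_destination($map, 103);
```

In an actual migration this lookup is normally requested declaratively, for example by calling sourceMigration() on a field mapping, rather than coded by hand.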

Usually, in developing migrations you only need to worry about constructing the map object in your migration constructor:

    $source_key = array(
      'pgid' => array('type' => 'int',
                      'unsigned' => TRUE,
                      'not null' => TRUE,
                      'description' => t('Source ID'),
                     )
    );
    $this->map = new MigrateSQLMap($this->machineName, $source_key,
      MigrateDestinationNode::getKeySchema(), 'legacy', array('track_last_imported' => TRUE));

The constructor parameters are:

  1. Machine name - this may actually be any string, but is usually the machine name of the migration. This string will be appended to migrate_map_ and migrate_message_ to form the map and message table names. If your migration machine names are very long, you may prefer to pass a shorter string here.
  2. A Drupal schema definition for the source key - that is, a field or set of fields that is unique for each distinct record to be imported from the source. The keys (pgid in the example above) must correspond to fields in the raw data - Migrate will pull the values (e.g., $row->pgid) to save as the source IDs in the map and message tables. When the key field name appears in more than one table of the source query, you need to add an 'alias' value to the key schema to disambiguate it. The alias is the alias of the table the field comes from.
  3. A Drupal schema definition for the destination key - that is, a field or set of fields that is unique for each Drupal object (such as uid for users, nid for nodes, etc.). While you need to build this array by hand for the source key, because Migrate knows nothing about the legacy data, in this case the destination object usually knows the correct Drupal schema and you can call its getKeySchema() method to obtain it.
  4. Connection key - this identifies the Drupal database connection where the map and message tables should be created and referenced. It defaults to 'default' - the Drupal database itself - but could be set to another connection, in particular the source connection for MigrateSourceSQL migrations. The advantage of this is that if the map table is on the same connection as the source data, the MigrateSourceSQL class can speed things up by adding the map table to the source query.
  5. Options array - you can pass array('track_last_imported' => TRUE) to have the map table record when each individual row was last imported. There is currently no built-in feature for displaying or using this information in Migrate, but the data is available by querying the table directly. Also, you may set array('cache_map_lookups' => TRUE) to have lookups of source or destination IDs against the map table statically cached, which can help performance on sites with a large volume of relationships.
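
The parameters above can be combined as in the following sketch, which uses a source key made of two fields. The field names site_id and page_slug, the table alias p, and the helper function names are all hypothetical, and the 'legacy' connection is assumed to be defined in your site's database settings.

```php
<?php
/**
 * Hypothetical two-field source key: a record is unique per
 * (site_id, page_slug). The 'alias' value 'p' assumes the source query
 * selects these fields from a table aliased as "p".
 */
function example_source_key() {
  return array(
    'site_id' => array(
      'type' => 'int',
      'unsigned' => TRUE,
      'not null' => TRUE,
      'alias' => 'p',
    ),
    'page_slug' => array(
      'type' => 'varchar',
      'length' => 255,
      'not null' => TRUE,
      'alias' => 'p',
    ),
  );
}

/**
 * Builds the map object. Intended to be called from a migration
 * constructor: $this->map = example_build_map($this->machineName);
 * Requires the Migrate module to be loaded.
 */
function example_build_map($machine_name) {
  return new MigrateSQLMap(
    $machine_name,
    example_source_key(),
    MigrateDestinationNode::getKeySchema(),
    // Assumed connection key; omit to use the default Drupal database.
    'legacy',
    // Both documented options passed together in one array.
    array('track_last_imported' => TRUE, 'cache_map_lookups' => TRUE)
  );
}
```

Both fields carry the same alias here because they are assumed to come from the same table; a key field taken from a joined table would use that table's alias instead.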

The map and message tables are lazily created by the MigrateSQLMap constructor - if the map table does not exist when the constructor is called, both tables are created based on the source and destination key parameters. It is important to understand that if you change the schema you pass to the constructor - for example, you decide that the source key must be a varchar rather than an int - the tables are not automatically changed. In this instance, you must make sure the migration is fully rolled back and manually drop the map and message tables so they can be recreated with the new schema.
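
The manual cleanup described above might look like the sketch below. The helper names are hypothetical. The table names assume the migrate_map_ and migrate_message_ prefixes followed by the lowercased string passed as the first constructor parameter; check the actual table names in your database, since very long names or table prefixes may differ. Run it only after the migration has been fully rolled back.

```php
<?php
/**
 * Hypothetical helper: derives the map and message table names from the
 * string passed to the MigrateSQLMap constructor. Lowercasing is an
 * assumption; truncation of long names is not handled.
 */
function example_migrate_table_names($machine_name) {
  $suffix = strtolower($machine_name);
  return array(
    'map' => 'migrate_map_' . $suffix,
    'message' => 'migrate_message_' . $suffix,
  );
}

/**
 * Hypothetical helper: drops both tables so the next MigrateSQLMap
 * constructor call recreates them with the new key schema. Uses the
 * Drupal 7 database API, so it must run inside a bootstrapped site
 * and applies to the default connection.
 */
function example_migrate_drop_tables($machine_name) {
  foreach (example_migrate_table_names($machine_name) as $table) {
    if (db_table_exists($table)) {
      db_drop_table($table);
    }
  }
}
```

If the map was created on a non-default connection, the tables live in that database and would have to be dropped there instead.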

Comments

seanbfuller

This has tripped me up on three different migrations, so I figured I'd add a note. As discussed in https://drupal.org/node/1123504, if you get a database warning about ambiguous keys, you may need to add an alias to the source key definition. The alias it is looking for is the primary table alias. For example, if you're migrating from an older Drupal database and your source query's primary table is "node" with an alias of "n", then your source key alias would be "n".

    // Set the mapping
    $source_key = array(
      'nid' => array(
        'type' => 'int',
        'unsigned' => TRUE,
        'not null' => TRUE,
        'alias' => 'n',
      )
    );
    $this->map = new MigrateSQLMap(
      $this->machineName,
      $source_key,
      MigrateDestinationNode::getKeySchema()
    );

--------------------
Sean B. Fuller
www.seanbfuller.com

opdavies

I was having the same issue. Thanks for posting!

Shyghar

Thank you for sharing!! I was stuck here ^^