Last updated July 2, 2013. Created on January 2, 2011.
Edited by mikeryan, joetsuihk, gaas. Log in to edit this page.

The Migration class is built on the fundamental assumption that there is a one-to-one correspondence between source rows coming in and destination objects going out. What if on the source side you have multiple values spread among multiple database rows for one or more of your fields? For example, if your source were a Drupal site, your first stab at a source query to migrate nodes and related terms might look like

<?php
$query
= db_select('node', 'n')
         ->
fields('n', array('nid', 'title')
         ->
fields('tn', array('tid'));
$query->leftJoin('term_node', 'tn', 'n.vid=tn.vid');
?>

The problem is, if you have multiple terms for a node, you'll end up with multiple rows for the same node feeding into the migration:

1 First node  23
1 First node  89
2 Second node 23
...

That won't work - you'll create 'First node' with term 23, then the second row will be skipped because nid 1 has already been migrated. What you want is a way to deal with one node at a time, and pull in all related terms with that node. Here are a couple of options:

GROUP_CONCAT

If your source data is in a MySQL database, you can use GROUP_CONCAT to pull the multiple values into a single row:

<?php
$query
= db_select('node', 'n')
         ->
fields('n', array('nid', 'vid', 'title');
$query->leftJoin('term_node', 'tn', 'n.vid=tn.vid');
$query->addExpression('GROUP_CONCAT(DISTINCT tn.tid)', 'tag_list');
$query->groupBy('n.nid');
?>

The results of your query then look like:

1 First node  23,89
2 Second node 23

Make sure you tell the field mapping to expect multiple comma-separated values (and, since we're migrating tids rather than term names, to expect tids):

<?php
$this
->addFieldMapping('field_tags', 'tag_list')
     ->
separator(',');
$this->addFieldMapping('field_tags:source_type')
     ->
defaultValue('tid');
?>

Query in prepareRow

An alternative (and the only option if your source data isn't in MySQL) is to pull the data in after the initial query.

<?php
$this
->addFieldMapping('field_tags', 'tag_list')
     ->
description('Populated in prepare()');
$this->addFieldMapping('field_tags:source_type')
     ->
defaultValue('tid');
...
public function
prepareRow($row) {
 
$row->tag_list = db_select('term_node', 'tn')
                   ->
fields('tn', array('tid'))
                   ->
condition('vid', $row->vid)
                   ->
execute()
                   ->
fetchCol();
}
?>

$row->tag_list will be array(23, 89) for node 1, array(23) for node 2, etc.

Looking for support? Visit the Drupal.org forums, or join #drupal-support in IRC.

Comments

pthurmond’s picture

So how would this work for file fields? I don't get that one and I would really like to know.

Patrick Thurmond
patrickthurmond.com

smichel’s picture

MySQL has a size limit on text returned by GROUP_CONCAT. The default value is 1024, so if you are querying to get a large number of items, it can truncate the results on you. It doesn't warn you, of course, and I couldn't even find anything in the error logs about this.

It's explained at in the MySQL documentation.. I had a case where the site I was migrating from had hundreds of files attached to nodes, and it took a while to figure out why it wasn't importing them all.

The solution is to edit my.cnf, and add a line like:

group_concat_max_len = 16K

Which worked for me.

bdimaggio’s picture

Shouldn't the call to prepare() above be prepareRow()? I haven't used either yet, but the docs here indicate that we'd need an $entity as the first argument if you meant prepare.

gaas’s picture

Yes. I've fixed it now.

Bhanu’s picture

@bdimaggio

As in link, prepareRow() method is called by the source class next() method so it needs to call first then comes prepare, where prepare() method is called by the destination class prepare() method.

Regards,
Bhanuji

lpeabody’s picture

This isn't really a valid example and should cause PHP to fatally error. Specifically I'm referring to building up the SQL query. Building the query in this example should generate:

PHP Fatal error: Call to a member function fields()

<?php
$query
= db_select('node', 'n'// returns a new SelectQuery object
        
->leftJoin('term_node', 'tn', 'n.vid=tn.vid'// operates on SelectQuery but returns a string (alias of the joined table I believe)
        
->fields('n', array('nid', 'title'// fails because it's trying to operate on a string
        
->fields('tn', array('tid'));
?>

I would recommend updating the example with a valid solution.

adaddinsane’s picture

The idea of running a query for every row is fairly bad - I'm importing 6000 users currently and it's managing about 60/minute, and that's *not* running a query for every user to fetch their roles (damn I wish I'd read about GROUP_CONCAT before).

Anyway as a better alternative, run one query in the __construct() to fetch all the data you need into an array, and then in prepareRow() just extract the data from the array. (I should say that I'm migrating data from a Drupal 6 database to Drupal 7.)

<?php
 
protected $userRoles = array();

  function
__construct() {
    ...
   
// Separately collect and store user roles
   
$this->collectUserRoles();
    ...
?>

calls this:

<?php
 
protected function collectUserRoles() {
   
$result = Database::getConnection('default', 'for_migration')
      ->
select('users_roles', 'ur')->fields('ur', array('uid', 'rid'))
      ->
orderBy('ur.uid')->execute();
    while (
$row = $result->fetchAssoc()) {
     
$this->userRoles[$row['uid']][$row['rid']] = $row['rid'];
    }
  }
?>

and then:

<?php
 
public function prepareRow($row) {
   
// Build any extra roles for this user
    // (and remove the existing setting which is a dummy)
   
$row->rids = array_key_exists($row->uid, $this->userRoles)
      ?
implode(',', $this->userRoles[$row->uid])
      :
'';
    return
TRUE;
  }
?>

And that's it.

msbrar’s picture

Thanks. This helped me. I was getting a csv feed where user name of the author was passed in the csv but i did not have the uid. The users had been migrated seperately out of the migrate process. I used the above snippet like this

public function __construct() {
...
$columns = array(
<>
array('User', 'Author Name'),
array('uid', 'Author Id') //this column was introduced even though it was not there in the csv
);
//called a method to collect user ids and user names from table .
$this->collectUserIds();
....
$this->addFieldMapping('uid', 'uid');
...
}

public function prepareRow($row) {
// Build any extra roles for this user
// (and remove the existing setting which is a dummy)
$row->uid = array_key_exists($row->User, $this->userIds)
? $this->userIds[$row->User]
: 1;
return TRUE;
}
protected function collectUserIds() {
$result = db_select('users', 'u')
->fields('u', array('name', 'uid'))
->execute();
while ($row = $result->fetchAssoc()) {
$this->userIds[$row['name']] = $row['uid'];
}
}

adaddinsane’s picture

ovidenov’s picture

Tip:
Don't forget to use groupBy, if you are using GROUP_CONCAT. Failing to do so will result in only one result row.

Oleg Videnov