Using MigrateListJSON and MigrateItemJSON

Migration from data encoded in JSON can be accomplished using the MigrateListJSON and MigrateItemJSON classes, based on the MigrateList pattern.

Consider this scenario:

  $list_url = 'http://www.example.com/list.json';
  $item_url = 'http://www.example.com/item.json?id=:id';
  $fields = array(
    'id' => 'Unique ID',
    'title' => 'Title of the object',
    'content' => 'Text of the object'
  );
  $this->source = new MigrateSourceList(
    new MigrateListJSON($list_url), 
    new MigrateItemJSON($item_url, array()), 
    $fields
  );

In the above example, $list_url is a JSON endpoint that contains a list of document id's that should be migrated. $item_url represent the endpoint at the source that contains a json representation of the source object. The $item_url contains the ':id' element that will be replaced by a source identifier.

$fields contains an array with field names available in the source objects.

Using list pattern with a single JSON file

Let's consider a different scenario, where the JSON input consists of a single file with an array of objects to be imported:

[
  {
    "id":"3595062:Page:24924",
    "title":"This is our second page",
    "description":"<p>This is the second page body.</p>",
    "createdDate":"2012-05-11T23:39:27.433Z",
  },
  {
    "id":"3595062:Page:24863",
    "title":"The first page",
    "description":"<p>Hello World!</p>",
    "createdDate":"2012-04-05T21:43:35.894Z",
  }
]

In this case, the same file serves as both the ID listing and the data source, and the assumptions made by the JSON classes don't work here - we need to override them.

class ExampleListJSON extends MigrateListJSON {
  /**
   * The default implementation assumes the IDs are top-level array elements, 
   * but the array elements are the data items - we need to look inside them
   * for the IDs.
   */
  protected function getIDsFromJSON(array $data) {
    $ids = array();
    foreach ($data as $item) {
      $ids[] = $item['id'];
    }
    return $ids;
  }
}

class ExampleItemJSON extends MigrateItemJSON {
  protected $data = array();
  /**
   * Two problems with the default getItem() - it reads and parses the JSON on
   * each call, which we don't want to do for multiple items in the file; and,
   * we need to retrieve a given item from the file according to its 'id' element.
   */
  public function getItem($id) {
    // We cache the parsed JSON at $this->data.
    if (empty($this->data)) {
      $data = $this->loadJSONUrl($this->itemUrl);
      if ($data) {
        // Let's index the array by the ID for easy retrieval.
        foreach ($data as $item) {
          $this->data[$item->id] = $item;
        }
      }
      else {
        // Error-handling here....
      }
    }
    // Return the requested item
    if (isset($this->data[$id])) {
      return $this->data[$id];
    }
    else {
      return NULL;
    }
  }
}

Then, we create the migration source using our extended classes:

    $fields = array(
      'id' => 'Unique page ID',
      'title' => 'Page title',
      'description' => 'Page body',
      'createdDate' => 'Page creation date',
    );
    $json_file = '/data/sources/example-pages.json';
    $this->source = new MigrateSourceList(new ExampleListJSON($json_file),
      new ExampleItemJSON($json_file, array()), $fields);

Importable data not at the top level

When JSON is returned from an API, the desired data may not be at the top level but rather wrapped inside a response. Often an API will return this:

{
  "success":true,
  "results":[
  {
    "id":"3595062:Page:24924",
    "title":"This is our second page",
    "description":"<p>This is the second page body.</p>",
    "createdDate":"2012-05-11T23:39:27.433Z",
  },
  {
    "id":"3595062:Page:24863",
    "title":"The first page",
    "description":"<p>Hello World!</p>",
    "createdDate":"2012-04-05T21:43:35.894Z",
  }
]
}

In this use case the above classes will not work as written and will need to be modified.

class MultiListJSON extends MigrateListJSON {
  /**
   * Originally this class assumed that the results would be all that was in the JSON
   * object. This class has been modified to look inside the JSON object and find the
   * expected data within the ['results'] member. 
   */
  protected function getIDsFromJSON(array $data) {
    $ids = array();
    foreach ($data['results'] as $item) {
      $ids[] = $item['id'];
    }
    return $ids;
  }
  
  /**
   * If this is the case then the count will be off as well and we will have to overriden as well.
   */
  public function computeCount() {
    $count = 0;
    if (empty($this->httpOptions)) {
      $json = file_get_contents($this->listUrl);
    }
    else {
      $response = drupal_http_request($this->listUrl, $this->httpOptions);
      $json = $response->data;
    }
    if ($json) {
      $data = drupal_json_decode($json);
      if ($data) {
        $count = count($data['results']);
      }
    }
    return $count;
  }
}

Using MigrateSourceJSON

The above classes use the drupal_json_decode() function (which wraps PHP's json_decode). While this works fine in many cases, it's important to understand that it takes the entire JSON file and decodes it into an in-memory structure. Thus, if you have a very large JSON file, as in Ning exports, you're likely to hit the PHP memory limit. The MigrateSourceJSON source class provides an alternative which incrementally parses the JSON file with an analog to the PHP XMLReader class (which addresses this issue for XML). To be able to incrementally parse the file does require some assumptions about how the JSON is organized - specifically, out-of-the-box it assumes the JSON contains an array of objects to be imported - but by providing an override to the MigrateJSONReader class you can handle different structures.

Consider the following example data, an array of simple objects:

[
  {"id":"53","field1":"value1"},
  {"id":"54","field1":"value2"}
]

This is delivered from the URL http://example.com/services/data.json. So, to migrate from this file, your source setup in your migration constructor will look like:

// Document the source fields
$fields = array(
  'id' => 'Unique ID for each source data row',
  'field1' => 'The actual source data',
);
// Second parameter is the unique ID from the source file
$this->source = new MigrateSourceJSON('http://example.com/services/data.json', 'id', $fields);
...
$this->addFieldMapping('field_example_data', 'field1');
...

Quite simple, if your data fits the default structure. What if your data is not a simple array of objects, with the data fields at the top level of each object? Consider this example:

[{
  "Release": {
    "id": "3263",
    "name": "Debut",
    "release_date": "2008-06-10",
    "artist_id": "1080",
  },
  "Artist": {
    "id": "1080",
    "name": "Jane Roe"
  },
},
{
  "Release": {
    "id": "3368",
    "name": "Sophomore Slump",
    "release_date": "2009-03-24",
    "artist_id": "1080",
  },
  "Artist": {
    "id": "1080",
    "name": "Jane Roe"
  }
}]

We're trying to import the releases, but the data for the release is a layer deeper - the "fields" that would be available for mapping by default would be Release and Artist rather than id, name, etc. We can address this by overriding the underlying class that does the actual parsing and building of the data rows, MigrateJSONReader:

class ExampleJSONReader extends MigrateJSONReader {
  /**
   * Implements MigrateJSONReader::next().
   */
  public function next() {
    // Do the default fetch.
    parent::next();
    // At this point, $this->currentElement (which will eventually become our $row object
    // in the migration class) contains Release and Artist objects. We want to move the
    // Release fields up to the top level.
    if (is_object($this->currentElement)) {
      foreach ($this->currentElement->Release as $field_name => $field_value) {
        $this->currentElement->$field_name = $field_value;
      }
      // Get rid of the Release object, we won't be referencing it. Leave the Artist object - it
      // will be available in prepareRow(), prepare(), etc. should we want to reference it.
      unset($this->currentElement->Release);
    }
  }
}

Then, we tell the source class to substitute our version of the JSON reader class:

$this->source = new MigrateSourceJSON('http://example.com/services/data.json', 'id', $fields, array('reader_class' => 'ExampleJSONReader'));
...
$this->addFieldMapping('title', 'name');
$this->addFieldMapping('field_release_date', 'release_date');
...

Comments

basilbaby6’s picture

Any idea on how to to define a complex json source from an api return. I want to map the below json retun from an api.
I need to map the all the values inside entrydata->text->0.

Please help us

{
"@position": "2",
"@unid": "34234234",
"@noteid": "9A33A",
"@siblings": "1511",
"entrydata": [
{
"@columnnumber": "0",
"@name": "Date",
"text": {
"0": "02/02/2013"
}
},
{
"@columnnumber": "1",
"@name": "POLAR_Code",
"text": {
"0": "S364"
}
},
{
"@columnnumber": "2",
"@name": "PC_Code",
"text": {
"0": "TA1564"
}
},
{
"@columnnumber": "3",
"@name": "MyDay",
"text": {
"0": "Day 01"
}
},
{
"@columnnumber": "4",
"@name": "Bus_Location",
"text": {
"0": "EKM"
}
},
{
"@columnnumber": "5",
"@name": "Event_Venue",
"text": {
"0": "Dummy Pool"
}
},
{
"@columnnumber": "6",
"@name": "Start_Time",
"text": {
"0": "03:15 PM"
}
},
{
"@columnnumber": "7",
"@name": "End_Time",
"text": {
"0": "03:45 PM"
}
},
{
"@columnnumber": "8",
"@name": "EventType",
"text": {
"0": "Safety"
}
},
{
"@columnnumber": "9",
"@name": "Event_Desc",
"text": {
"0": "GES"
}
},
{
"@columnnumber": "10",
"@name": "EventStatus",
"text": {
"0": "Posted"
}
},
{
"@columnnumber": "11",
"@name": "TajPick",
"text": {
"0": ""
}
},
{
"@columnnumber": "12",
"@name": "TajFlag",
"text": {
"0": ""
}
},
{
"@columnnumber": "13",
"@name": "IntranetFlag",
"text": {
"0": ""
}
},
{
"@columnnumber": "14",
"@name": "CreatedBy",
"text": {
"0": "DeputyTajDirector"
}
},
{
"@columnnumber": "15",
"@name": "DateCreated",
"datetime": {
"0": "20130129T113339,95-07"
}
},
{
"@columnnumber": "16",
"@name": "DateUpdated",
"datetime": {
"0": "20130129T113340,60-07"
}
},
{
"@columnnumber": "17",
"@name": "MAPS_Name",
"text": {
"0": "Carel Smit"
}
}
]
}

basilbaby6’s picture

I have tried with above item,list and reader classes , but not getting successful result.
Please provide the working examples.

diego.banchero’s picture

Hi All,
I had to build a migration using a list pattern with a single JSON file and I had to set the highwaterField. Without this configuration, the migration fails.

Can someone help me to understand better this kind of setting (highwaterField)? Or if I missed something?

Many thanks

basilbaby6’s picture

Here is the working examples for JASON migration

The sample JASON (out put of http://example.com/services)

[
    {
        "id": "1883",
        
        "uuid": "7e597e2e-19a4-4731-84d7-002fa6e50a3f",
        
        "field_section_text": [
            "<p>a</p>",
            "<p>b</p>",
            "<p>c</p>",
            "<p>d</p>"
        ],
        "field_section_text_format": [
            "full_html",
            "full_html",
            "full_html",
            "full_html"
        ],
        "field_section_text_language": "en",
        "field_points_of_interest": [
            "1",
            "2",
            "3"
        ],
        "field_points_of_interest_format": [
            "full_html",
            "full_html",
            "full_html"
        ],
        "field_points_of_interest_language": "en",
        "multilingual_name": [
            "Kerala-en",
            "kerala",
            "台北(基隆)"
        ],
        "multilingual_name_languages": [
            "en",
            "und",
            "zh"
        ],
        
        "parents": [
            "12"
        ]
    }
]

Migration class

<?PHP
class termsMigration extends Migration {

    public function __construct() {

        parent::__construct();
        $url = 'http://example.com/services';

        $fields = array(
            'id' => 'term id',           
            'uuid' => 'uuid',
            'multilingual_name' =>'multilingual name',
             'multilingual_name_languages' =>'multilingual name languages',
            'field_section_text' => 'field section text',
            'field_section_text_format' => 'field section text format',
            'field_section_text_language' => 'field section text language',
            'field_points_of_interest' => 'field points of interest',
            'field_points_of_interest_format' => 'field points of interest format',
            'field_points_of_interest_language' => 'field points of interest language',
            'parents' => 'parent id',
        );
        $this->map = new MigrateSQLMap($this->machineName,
                        array(
                            'id' => array(
                                'type' => 'int',
                                'description' => 'id',
                            )
                        ),
                        MigrateDestinationTerm::getKeySchema()
        );

        $reader = array('reader_class' => 'CommonJSONReader');
        $this->source = new MigrateSourceJSON($url, 'id', $fields, $reader);
        /// Set up our destination - term of type voyages
        $this->destination = new MigrateDestinationTerm('port');

        $this->addFieldMapping('uuid', 'uuid');
         $this->addFieldMapping('field_multi_name', 'multilingual_name');
        $this->addFieldMapping('field_multi_name:language', 'multilingual_name_languages');
        $this->addFieldMapping('field_section_text', 'field_section_text');
        $this->addFieldMapping('field_section_text:format', 'field_section_text_format');
        $this->addFieldMapping('field_section_text:language', 'field_section_text_language');

        $this->addFieldMapping('field_points_of_interest', 'field_points_of_interest');
        $this->addFieldMapping('field_points_of_interest:format', 'field_points_of_interest_format');
        $this->addFieldMapping('field_points_of_interest:language', 'field_points_of_interest_language');
        $this->addFieldMapping('parent', 'parents');
        $this->addFieldMapping('parent_name')
                ->issueGroup(t('DNM'));
    }
}
?>

Here I included the multiple value mapping, multilingual mapping and parent id mapping

tanout’s picture

How to build a JSON migration source using the example in the text below "Importable data not at the top level"?

There is a new class MultiListJSON for the ID listing but cannot find any class for the data source (MultiItemJSON?)

$this->source = new MigrateSourceList(new MultiListJSON($json_file),
new ????ItemJSON($json_file, array()), $fields);

Any help how to build the JSONsource?

Kenzomus’s picture

I am trying to use the MigrateSourceJSON but I am receiving an error "file_get_contents(data/sources/mylist.json): failed to open stream: [warning]
No such file or directory opportunities.inc:24
Migration failed with source plugin exception: fopen(): Filename [error]
cannot be empty
File
/Users/indiaye/bkup_oct_17_14/hjf/docroot/sites/all/modules/migrate/plugins/sources/json.inc,
line 266".
The file mylist.json exist , was validate wit sonlint.com but I am having problem to read it or open.
I will appreciate any help or knowledge base to solve it.
Thanks

zeeshan_khan’s picture