I'm currently creating a custom data parser for the JSON:API format. I came across two issues when using the default JSON data parser:

  • Paging
  • Relationships & included data

I could improve this by caching the response in 'nextUrls()' so that we don't have to execute the request again in 'getSourceData()'. But first I want to know whether this data parser is a good idea :)

Maybe we could add this as a default data parser in this module?

Comments

robin.ingelbrecht created an issue. See original summary.

weynhamz’s picture

I am working on exactly the same thing right now. I have also created a plugin that supports relationships and included data, and will share it soon.

robin.ingelbrecht’s picture

Version with cached responses:


use Drupal\migrate_plus\Plugin\migrate_plus\data_parser\Json;

/**
 * @DataParser(
 *   id = "json_api",
 *   title = @Translation("JSON API")
 * )
 */
class JsonApi extends Json {

  protected $cachedResponses = [];

  /**
   * JsonApi constructor.
   * @param array $configuration
   * @param $plugin_id
   * @param $plugin_definition
   */
  public function __construct(array $configuration, $plugin_id, $plugin_definition) {
    parent::__construct($configuration, $plugin_id, $plugin_definition);

    $urls = [];
    foreach ($this->urls as $url) {
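      // Determine the next urls dynamically (if any). Use array_merge() here:
      // the "+" operator would drop entries whose numeric keys collide.
      $urls = array_merge($urls, $this->getUrls($url));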
    }
    $this->urls = $urls;
  }

  /**
   * {@inheritdoc}
   */
  protected function getUrls($url) {
    $urls = [];

    do {
      $urls[] = $url;
      $response = $this->getDataFetcherPlugin()->getResponseContent($url);
      // Cache the response to use it later on in "getSourceData()"
      $this->cachedResponses[$url] = $response;
      $data = \Drupal\Component\Serialization\Json::decode($response);
      $url = isset($data['links']['next']) ? $data['links']['next'] : FALSE;
    } while ($url);

    return $urls;
  }

  /**
   * {@inheritdoc}
   */
  protected function getSourceData($url) {
    if (!$response = $this->getCachedResponse($url)) {
      // For some reason the response was not cached. Fetch it again.
      $response = $this->getDataFetcherPlugin()->getResponseContent($url);
    }

    // Convert objects to associative arrays.
    $source_data = json_decode($response, TRUE);

    // If json_decode() has returned NULL, it might be that the data isn't
    // valid utf8 - see http://php.net/manual/en/function.json-decode.php#86997.
    if (is_null($source_data)) {
      $utf8response = utf8_encode($response);
      $source_data = json_decode($utf8response, TRUE);
    }

    // Store the included data so we can later access it by using the selector /relationships/[FIELD_NAME].
    $included = [];
    if (!empty($source_data['included']) && !empty($source_data['data'])) {
      foreach ($source_data['included'] as $item) {
        $included[$item['id']] = $item;
      }
    }

    // JSON:API always provides the source data in the "data" child of the response.
    $source_data = !empty($source_data['data']) ? $source_data['data'] : [];

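    // Now include relationships to the data.
    foreach ($source_data as &$item) {
      // Not every resource has relationships; skip those that don't.
      if (empty($item['relationships'])) {
        continue;
      }
      foreach ($item['relationships'] as &$relationship) {
        if (isset($relationship['data'][0])) {
          // To-many relationship: a list of resource identifiers.
          foreach ($relationship['data'] as &$relation_item) {
            $id = $relation_item['id'];
            if (isset($included[$id])) {
              $relation_item['data']['attributes'] = $included[$id]['attributes'];
            }
          }
        }
        elseif (isset($relationship['data']['id'])) {
          // To-one relationship. Empty relationships have nothing to resolve.
          $id = $relationship['data']['id'];
          if (isset($included[$id])) {
            $relationship['data']['attributes'] = $included[$id]['attributes'];
          }
        }
      }
    }
    // Break the references left behind by the by-reference loops.
    unset($item, $relationship, $relation_item);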

    return $source_data;
  }

  /**
   * Returns cached response.
   *
   * @param $url
   * @return bool|mixed
   */
  protected function getCachedResponse($url) {
    return isset($this->cachedResponses[$url]) ? $this->cachedResponses[$url] : FALSE;
  }

}
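
One thing to watch when following 'links.next': the JSON:API specification allows a link to be either a plain string or a link object with an 'href' member, so a value read from 'links.next' is not always usable as a URL directly. A minimal sketch of a helper that handles both shapes follows. The function name and the example.com URL are hypothetical and not part of migrate_plus.

```php
<?php

/**
 * Returns the URL of the "next" page from a decoded JSON:API document.
 *
 * Hypothetical helper for illustration only. A JSON:API link is either a
 * string or a link object that carries the URL in its "href" member.
 */
function json_api_next_url(array $document): ?string {
  $next = $document['links']['next'] ?? NULL;
  if (is_string($next) && $next !== '') {
    return $next;
  }
  if (is_array($next) && isset($next['href']) && is_string($next['href'])) {
    return $next['href'];
  }
  // No further page.
  return NULL;
}

// Example with the link-object form (assumed response, not real data).
$page = [
  'data' => [],
  'links' => [
    'next' => ['href' => 'https://example.com/jsonapi/node/news?page[offset]=50'],
  ],
];
$next_url = json_api_next_url($page);
```

In 'getUrls()' the loop condition could then test the helper's return value instead of reading 'links.next' directly.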

weynhamz’s picture

Maybe we can join forces. Here is my version.

Currently implemented:

1. a new 'relationship' configuration key on a field names the relationship (the field key under 'relationships') to read from
2. the required relationships are automatically added to the API request as an 'include' parameter
3. 'selector' then gives the selector path inside the included object
4. paging through 'next' links is supported
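
Points 1 and 2 could be sketched roughly as below. This is not the code from my plugin, only an illustration: both function names are hypothetical, and it assumes the URL has no 'include' parameter and no fragment yet.

```php
<?php

/**
 * Collects the distinct "relationship" keys from the field definitions.
 *
 * Hypothetical helper; $fields has the shape of the "fields" list in the
 * migration source configuration.
 */
function collect_relationships(array $fields): array {
  $relationships = [];
  foreach ($fields as $field) {
    if (!empty($field['relationship'])) {
      // Use the name as key so each relationship is listed only once.
      $relationships[$field['relationship']] = TRUE;
    }
  }
  return array_keys($relationships);
}

/**
 * Appends an "include" query parameter for the given relationships.
 *
 * Assumes $url carries no "include" parameter and no fragment yet.
 */
function add_include_parameter(string $url, array $relationships): string {
  if (!$relationships) {
    return $url;
  }
  $separator = strpos($url, '?') === FALSE ? '?' : '&';
  // JSON:API expects one comma-separated list of relationship paths.
  return $url . $separator . 'include=' . implode(',', $relationships);
}

$fields = [
  ['name' => 'tid', 'selector' => '/attributes/tid'],
  ['name' => 'parent', 'selector' => '/attributes/tid', 'relationship' => 'parent'],
];
$url = add_include_parameter(
  'http://local.vagrant/jsonapi/taxonomy_term/news_type',
  collect_relationships($fields)
);
```

With the fields above, $url ends in '?include=parent'.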

An example:

JSON:API response

Configuration example

id: news_type
label: 'News Type'
source:
  plugin: url
  data_fetcher_plugin: http
  data_parser_plugin: jsonapi
  urls: http://local.vagrant/jsonapi/taxonomy_term/news_type
  ids:
    tid:
      type: integer
    langcode:
      type: string
  item_selector: data/
  fields:
    -
      name: tid
      selector: /attributes/tid
    -
      name: name
      selector: /attributes/name
    -
      name: parent
      selector: /attributes/tid
      relationship: parent
    -
      name: langcode
      selector: /attributes/langcode
  track_changes: true
process:
  uid: 1
  name: name
  langcode:
    plugin: skip_row_by_lang
    source: langcode
    default_value: und
    language_code: 'en'
  parent:
    -
      plugin: migration
      migration: news_type
      source: parent
      no_stub: true
    -
      plugin: default_value
      default_value: 0
destination:
  plugin: entity:taxonomy_term
  default_bundle: news_type
heddn’s picture

Status: Active » Closed (duplicate)
Related issues: +#2640516: Support paging through multiple requests

The other issue #2640516: Support paging through multiple requests already has some level of tests and has been open for a while. Let's consolidate our work on it.

weynhamz’s picture

@heddn, this is not just about paging for JSON:API. It also covers the JSON:API-specific handling of relationships and includes. Once #2640516 is committed, we can adapt this to the built-in paging support.

heddn’s picture

Status: Closed (duplicate) » Needs work

Expound on the concept of relationships? What is it all about? I'm not familiar with it.

weynhamz’s picture

@heddn, JSON:API exposes entity references as relationships. As you can see in my previous screenshot, the response has a 'relationships' key that gives the UUID of each referenced entity. JSON:API also supports an 'include' query parameter, which adds the referenced resources to the response under the 'included' key.

https://www.drupal.org/docs/8/modules/jsonapi/includes

My approach above adds the configuration below to the field definitions of the extended JSON parser. It says: for the parent field, extract /attributes/tid from the resource in the included section that the parent relationship points to. The parser finds every relationship referenced in the configuration and automatically adds it to the API request as an include query parameter.

      name: parent
      selector: /attributes/tid
      relationship: parent
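
To make the lookup concrete, here is a minimal sketch of how such a field could be resolved. It is an illustration, not the code from my plugin: the function names, the sample UUIDs and the 'type:id' index are assumptions, and for a to-many relationship it only reads the first linked resource.

```php
<?php

/**
 * Indexes the "included" resources of a document by "type:id".
 */
function index_included(array $document): array {
  $index = [];
  foreach ($document['included'] ?? [] as $resource) {
    $index[$resource['type'] . ':' . $resource['id']] = $resource;
  }
  return $index;
}

/**
 * Reads a slash-separated selector such as "/attributes/tid" from an array.
 */
function select_value(array $data, string $selector) {
  foreach (array_filter(explode('/', $selector), 'strlen') as $key) {
    if (!is_array($data) || !array_key_exists($key, $data)) {
      return NULL;
    }
    $data = $data[$key];
  }
  return $data;
}

/**
 * Resolves a field that is read from an included, related resource.
 */
function resolve_relationship_field(array $item, array $index, string $relationship, string $selector) {
  $linkage = $item['relationships'][$relationship]['data'] ?? NULL;
  if (isset($linkage[0])) {
    // To-many relationship: this sketch only reads the first linked resource.
    $linkage = $linkage[0];
  }
  if (!is_array($linkage) || !isset($linkage['type'], $linkage['id'])) {
    return NULL;
  }
  $resource = $index[$linkage['type'] . ':' . $linkage['id']] ?? NULL;
  return $resource === NULL ? NULL : select_value($resource, $selector);
}

// Assumed sample document: a child term whose parent is included.
$document = [
  'data' => [[
    'type' => 'taxonomy_term--news_type',
    'id' => 'uuid-child',
    'attributes' => ['tid' => 2, 'name' => 'Child'],
    'relationships' => [
      'parent' => ['data' => ['type' => 'taxonomy_term--news_type', 'id' => 'uuid-parent']],
    ],
  ]],
  'included' => [[
    'type' => 'taxonomy_term--news_type',
    'id' => 'uuid-parent',
    'attributes' => ['tid' => 1, 'name' => 'Parent'],
  ]],
];

$parent_tid = resolve_relationship_field(
  $document['data'][0], index_included($document), 'parent', '/attributes/tid'
);
```

Here $parent_tid is the tid of the included parent term (1 in the sample), not the tid of the child itself.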

The work here is still in progress. I also created a JSON:API fetcher today and added support for filtering; see https://www.drupal.org/docs/8/modules/jsonapi/filtering.

Basically, it tries to turn configuration like below

jsonapi_filters:
  groups:
    -
      key: ag
      conjunction: OR
    -
      key: bg
      conjunction: and
  conditions:
    -
      key: a
      path: value
      operator: STARTS_WITH
      value: todo
      memberOf: ag
    -
      key: b
      path: value
      operator: STARTS_WITH
      value: todo
      memberOf: ag
    -
      key: c
      path: value
      operator: STARTS_WITH
      value: todo
      memberOf: bg
    -
      key: d
      path: value
      operator: STARTS_WITH
      value: todo

to a valid JSON:API filter query parameter like this.

/jsonapi/node/news?
filter[a][condition][operator]=STARTS_WITH
&filter[a][condition][path]=value
&filter[a][condition][value]=todo
&filter[ag][group][conjunction]=OR
&filter[b][condition][operator]=STARTS_WITH
&filter[b][condition][path]=value
&filter[b][condition][value]=todo
&filter[bg][group][conjunction]=and
&filter[c][condition][operator]=STARTS_WITH
&filter[c][condition][path]=value
&filter[c][condition][value]=todo
&filter[d][condition][operator]=STARTS_WITH
&filter[d][condition][path]=value
&filter[d][condition][value]=todo
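
The conversion itself can be quite small. The sketch below is a simplified illustration rather than the fetcher code: the function name is hypothetical, and unlike the query string above it also emits the 'memberOf' member of each condition, which is how the JSON:API filtering documentation attaches a condition to a group.

```php
<?php

/**
 * Turns a "jsonapi_filters" configuration array into a nested filter array.
 *
 * Hypothetical helper; the result is meant to be passed to
 * http_build_query() under the "filter" key.
 */
function build_jsonapi_filter(array $config): array {
  $filter = [];
  foreach ($config['groups'] ?? [] as $group) {
    $filter[$group['key']]['group']['conjunction'] = $group['conjunction'];
  }
  foreach ($config['conditions'] ?? [] as $condition) {
    $key = $condition['key'];
    // Everything except the key becomes a member of the condition.
    unset($condition['key']);
    $filter[$key]['condition'] = $condition;
  }
  // Sort by filter key so the output order is stable.
  ksort($filter);
  return $filter;
}

$config = [
  'groups' => [
    ['key' => 'ag', 'conjunction' => 'OR'],
  ],
  'conditions' => [
    ['key' => 'a', 'path' => 'value', 'operator' => 'STARTS_WITH', 'value' => 'todo', 'memberOf' => 'ag'],
    ['key' => 'd', 'path' => 'value', 'operator' => 'STARTS_WITH', 'value' => 'todo'],
  ],
];

// urldecode() is only applied here to keep the brackets readable.
$query = urldecode(http_build_query(['filter' => build_jsonapi_filter($config)], '', '&'));
```

The resulting $query can be appended to a collection URL such as /jsonapi/node/news after a '?'.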

I will submit my patch later.

In the long term, I want to help give migrate_plus full JSON:API source support.

benjifisher’s picture

+1 for the idea of a data parser, or perhaps a source plugin, for JSON:API. Let's make it easy to import data that uses this standard.

This may be the future of migration: old system produces JSON:API and then the Migrate API sucks it into Drupal.

Now that Drupal core includes the JSON:API module, this could become the standard method for migrations from D8 to D8, or D8 to D9. It is not a new idea: see https://www.lullabot.com/articles/pull-content-from-a-remote-drupal-8-si...

ttamniwdoog’s picture

Looks like https://www.drupal.org/project/jsonapi_include took care of the relationships selector until recently https://www.drupal.org/project/jsonapi_include/issues/3057327 . Since jsonapi_include will not work, if I am reading the Normalizer's changes correctly here: https://www.drupal.org/project/jsonapi/issues/2923779#comment-12407443 , is there a way to "get" the referenced entity information without a separate "GET" request?

weynhamz’s picture

@benjifisher, I already have a working data_parser and data_fetcher for the url source plugin, and I was also thinking of creating a jsonapi source plugin. I will try to consolidate the work I have done and submit a patch for review by this weekend.

@ttamniwdoog, I think this is already covered by the jsonapi module in core: with the 'include' parameter, the response contains the referenced entity within the same request.

weynhamz’s picture

Just in case someone is interested, here are the data_parser and data_fetcher plugins that I created and use in one of my projects.

ttamniwdoog’s picture

Thanks @weynhamz. I think you are right but since I have more than one entity reference on this content type, when I use the include parameter in the URL, like this: https://[domain].tld/jsonapi/node/blog_post?include=field_tags.vid&include=field_category.vid I only see the last name/value pair named in the result set. I may need to write a source plugin for my use case but would like to know if I am mistaken before I take that road. Thanks again

weynhamz’s picture

@ttamniwdoog, I think it should be like this: https://[domain].tld/jsonapi/node/blog_post?include=field_tags,field_category

ttamniwdoog’s picture

Thanks @weynhamz. That works great. In your .yml file, which source item_selector are you able to use to get both "data" and "included" since they both are at the root level of our JSON?

zipme_hkt’s picture

I have published a new release of jsonapi_include that helps merge included data into the relationships: https://www.drupal.org/project/jsonapi_include.
You can now also toggle the included data with the query parameter jsonapi_include=1 or jsonapi_include=0.
This makes it easier to work with the Migrate module.

dmlb2000’s picture

Yet another project in this space...

https://git.drupalcode.org/project/migrate_source_jsonapi

It has some code similar to the above.