Follow-up to #2623012: Implement interfaces and base classes for URL-based sources

Right now the item_selector for JSON is an integer depth (e.g. 3). It would be nice to make it an xpath-like selector (e.g. response/items/item). Using the same pattern for field selectors would also help when pulling values that are not at the top level of the JSON item.
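To illustrate the idea, here is a minimal Python sketch of an xpath-like selector walking a decoded JSON document. It is not the module's PHP code; the function name `select_by_path` and the sample feed are made up for this example.

```python
from typing import Any


def select_by_path(data: Any, selector: str) -> Any:
    """Walk nested dicts/lists along a slash-delimited selector.

    Hypothetical helper for illustration only; the real plugin is PHP.
    """
    current = data
    for key in selector.strip("/").split("/"):
        if key == "":
            continue  # tolerate empty segments, e.g. a doubled slash
        if isinstance(current, list):
            current = current[int(key)]  # numeric segment indexes a list
        else:
            current = current[key]
    return current


# Invented sample feed matching the selector quoted in the summary.
feed = {"response": {"items": {"item": [{"id": 1}, {"id": 2}]}}}
items = select_by_path(feed, "response/items/item")
```

The same helper can be applied to a single item with a field selector, which is the second half of the request.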

Comment | File | Size | Author
#3 | implement_xpath_like-2640514-3.patch | 5.45 KB | mikeryan

Comments

mikeryan created an issue. See original summary.

mikeryan’s picture

Component: Source plugins » Plugins
mikeryan’s picture

Version: 8.x-1.x-dev » 8.x-2.x-dev
Status: Active » Needs review
Status | Size
new | 5.45 KB

Well, my current project has a more complex JSON feed than the depth-based selector can handle, so here goes... This patch maintains support for the depth method so existing JSON migrations won't break.
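As a rough illustration of keeping both styles working, the sketch below dispatches on the selector's type: an integer is treated as a depth, a string as a path. This is Python, not the committed patch, and the depth semantics here (collect every array or object found at that nesting level, counting the top-level keys as depth 0) are an assumption made for the example.

```python
from typing import Any, List, Union


def select_by_depth(data: Any, depth: int) -> List[Any]:
    """Collect every dict/list found at the given nesting depth.

    Assumption: children of the root are at depth 0.
    """
    found: List[Any] = []

    def walk(node: Any, level: int) -> None:
        children = node.values() if isinstance(node, dict) else node
        for child in children:
            if isinstance(child, (dict, list)):
                if level == depth:
                    found.append(child)
                else:
                    walk(child, level + 1)

    if isinstance(data, (dict, list)):
        walk(data, 0)
    return found


def select_by_path(data: Any, selector: str) -> Any:
    """Follow a slash-delimited selector through nested dicts."""
    for key in selector.strip("/").split("/"):
        if key:
            data = data[key]
    return data


def select_items(data: Any, item_selector: Union[int, str]) -> Any:
    """Integer selectors keep the old depth behaviour; strings are paths."""
    if isinstance(item_selector, int) and not isinstance(item_selector, bool):
        return select_by_depth(data, item_selector)
    return select_by_path(data, item_selector)


# Invented sample feed: both selector styles reach the same items.
feed = {"response": {"items": [{"id": 1}, {"id": 2}]}}
by_depth = select_items(feed, 2)
by_path = select_items(feed, "response/items")
```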

  • mikeryan committed ad60821 on 8.x-2.x
    Issue #2640514 by mikeryan: Implement xpath-like selectors for the JSON...
mikeryan’s picture

Status: Needs review » Fixed

Another patch I wanted committed before beta2...

ressa’s picture

Thanks for all your work on the Migrate module. I am trying to create a JSON test import, but I can't seem to get access to nested fields. Will I be able to do it with the code below and the latest dev version, 8.x-2.x-dev from 2016-Aug-05?

# Migration configuration for articles.
id: migrate_article
label: Migrate article
migration_group: Migrate articles
source:
  plugin: url
  data_fetcher_plugin: http
  data_parser_plugin: json
  urls: http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.geojson
  item_selector: features
  fields:
    -
      name: machine_name
      label: 'Unique position identifier'
      selector: id
    -
      name: place
      label: 'Place'
      selector: 'properties/place'
    -
      name: types
      label: 'Types'
      selector: 'properties/types'
  ids:
    machine_name:
      type: string

destination:
  plugin: entity:node
process:
  type:
    plugin: default_value
    default_value: article
  # Note that the source field names here (place and types) were
  # defined by the 'fields' configuration for the source plugin above.
  title: place
  body: types
migration_dependencies: {}
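For readers unsure how the selectors in this configuration combine, here is a small Python sketch: item_selector picks the list of items, then each field selector is resolved relative to one item. It is not the module's PHP code; the helper names and the sample feature values are invented for illustration.

```python
from typing import Any, Dict, List


def select_by_path(data: Any, selector: str) -> Any:
    """Follow a slash-delimited selector through nested dicts/lists."""
    for key in selector.strip("/").split("/"):
        if key:
            data = data[int(key)] if isinstance(data, list) else data[key]
    return data


def extract_rows(document: Any, item_selector: str,
                 fields: List[Dict[str, str]]) -> List[Dict[str, Any]]:
    """Build one row per item, resolving every field selector per item."""
    rows = []
    for item in select_by_path(document, item_selector):
        row = {}
        for field in fields:
            try:
                row[field["name"]] = select_by_path(item, field["selector"])
            except (KeyError, IndexError, TypeError, ValueError):
                row[field["name"]] = None  # selector did not match this item
        rows.append(row)
    return rows


# Invented stand-in for the feed; only the shape mirrors the config above.
document = {
    "features": [
        {"id": "example1",
         "properties": {"place": "Example place", "types": ",origin,"}},
        {"id": "example2", "properties": {"place": "Another place"}},
    ]
}
fields = [
    {"name": "machine_name", "selector": "id"},
    {"name": "place", "selector": "properties/place"},
    {"name": "types", "selector": "properties/types"},
]
rows = extract_rows(document, "features", fields)
```

In this sketch a field whose selector does not match simply comes back as None; how the actual plugin treats a missing nested value is not stated in this thread.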
ressa’s picture

To answer my own question: yes, it is possible. Just download the latest dev version of Migrate Plus with drush dl migrate_plus-8.x-2.x.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

mrmikedewolf’s picture

Awesome patch.

lquessenberry’s picture

Has this been rolled in by any chance? It is a wonderful patch and seems to be the only way I can get things to work properly with importers.

heddn’s picture

Yes, this issue is already closed, with links to where the patches were committed.

bobodrone’s picture

I have a somewhat complex JSON structure that (as far as I can see) cannot be parsed with either the depth or the path style right now. I need something like the JSONPath syntax. See: https://github.com/FlowCommunications/JSONPath

I wonder if it could be a good idea to also use this approach?

I have a nested array where I also need to iterate over each array of products to get all product_variations (commerce).
I cannot just use depth: 2, nor can I use products/fields/product_variations, since I need the parser to iterate over the outer array as well as the inner array. JSONPath is something of a combination of the depth and xpath styles... kind of...

But with the JSONPath syntax I could do either:

item_selector: $..product_variations
Get all nodes named product_variations from the root.

or:

item_selector: $.products.*.fields.product_variations
Use the * (or []) as a wildcard replacement for the outer array in my case.

Example JSON (I need to migrate all "product_variations" and nothing else):

{
  "products": [
    {
      "remote_product_id": "V-123456789",
      "fields": {
        "product_variations": [
          {
            "remote_product_variation_id": "V-123456789"
          },
          {
            "remote_product_variation_id": "V-123456789"
          }
        ]
      }
    },
    {
      "remote_product_id": "V-123456780",
      "fields": {
        "product_variations": [
          {
            "remote_product_variation_id": "V-123456780"
          },
          {
            "remote_product_variation_id": "V-123456781"
          }
        ]
      }
    }
  ]
}
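To make the two selectors concrete, here is a hand-rolled Python sketch of just the two features used above: a `*` wildcard step and a recursive search by key name. It is neither the FlowCommunications library nor a complete JSONPath implementation; the function names and the dotted-path notation without the leading `$.` are assumptions for this example.

```python
from typing import Any, List


def select_wildcard(data: Any, path: str) -> List[Any]:
    """Dotted path where '*' matches every list element or dict value."""
    nodes = [data]
    for segment in path.split("."):
        next_nodes: List[Any] = []
        for node in nodes:
            if segment == "*":
                if isinstance(node, dict):
                    next_nodes.extend(node.values())
                elif isinstance(node, list):
                    next_nodes.extend(node)
            elif isinstance(node, dict) and segment in node:
                next_nodes.append(node[segment])
        nodes = next_nodes
    return nodes


def find_all(data: Any, name: str) -> List[Any]:
    """Recursive search: every value stored under `name` at any depth."""
    found: List[Any] = []
    if isinstance(data, dict):
        for key, value in data.items():
            if key == name:
                found.append(value)
            found.extend(find_all(value, name))
    elif isinstance(data, list):
        for element in data:
            found.extend(find_all(element, name))
    return found


def variation(vid: str) -> dict:
    return {"remote_product_variation_id": vid}


# Same shape and IDs as the example JSON in this comment.
data = {"products": [
    {"remote_product_id": "V-123456789", "fields": {"product_variations": [
        variation("V-123456789"), variation("V-123456789")]}},
    {"remote_product_id": "V-123456780", "fields": {"product_variations": [
        variation("V-123456780"), variation("V-123456781")]}},
]}

# The trailing '*' flattens the inner arrays into individual variations.
via_wildcard = select_wildcard(data, "products.*.fields.product_variations.*")
# The recursive search returns the two inner arrays; flatten them by hand.
via_search = [v for group in find_all(data, "product_variations") for v in group]
```

Both approaches end up with the four variation objects in document order, which is what an item selector would need to hand to the migration.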

I have tried using that library in an extended version of the Json.php class in this module, and it seems to work. It would be interesting to hear what you think about this.

// Fredric

naveenvalecha’s picture

Adding a related issue.
@bobodrone,
There's a patch in #3007709: Add XPath-style filtering ability in JSON data parser plugin that provides a new JSON parser plugin built on https://github.com/FlowCommunications/JSONPath. Can you test your case with the new patch and see if anything needs to be improved?