How to explode a string? [#2781611]

I can import the "types" field from the JSON file as tags, as defined in this migrate_plus.migration.tags.yml file:

# Migration configuration for article tags.
id: migrate_article_tags
label: Migrate article
migration_group: Migrate articles
source:
  plugin: url
  data_fetcher_plugin: http
  data_parser_plugin: json
  urls: http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.geojson
  item_selector: features
  fields:
    -
      name: machine_name
      label: 'Unique position identifier'
      selector: id
    -
      name: types
      label: 'Types'
      selector: 'properties/types'
  ids:
    machine_name:
      type: string

destination:
  plugin: 'entity:taxonomy_term'

process:
  vid:
    plugin: default_value
    default_value: tags
  name: types

migration_dependencies: {}

They are imported as, for example, ",nearby-cities,origin,phase-data,scitech-link,". But if I try to separate the items by adding the explode plugin to the process section (see below), the migration fails silently, reporting only (0 created, 0 updated, 1 failed, 0 ignored) for each import:

process:
  vid:
#    plugin: default_value
    plugin: explode
    default_value: tags
    limit: 100
    delimiter: ,
  name: types

What am I doing wrong?

Comments

Comment #1

9 August 2016 at 20:45

ressa created an issue. See original summary.

Comment #2

mikeryan commented 5 September 2016 at 23:35

Status: Active » Postponed (maintainer needs more info)

First, let me explain what your current .yml is doing:

process:
  # vid is the vocabulary machine name, which you want to be 'tags'. The original .yml above was the
  # correct way to do this. What the below is doing is... nothing, because there is no source field, and
  # the explode plugin has no "default_value". The migration will fail, because no vocabulary is specified.
  vid:
#    plugin: default_value
    plugin: explode
    default_value: tags
    limit: 100
    delimiter: ,
  # This sets the term name to the comma-separated string.
  name: types

So, it would be tempting to do

process:
  vid:
    plugin: default_value
    default_value: tags
  name:
    plugin: explode
    source: types
    delimiter: ,
    limit: 100

This demonstrates proper usage of the explode plugin, yet it isn't going to work in this particular case. It generates an array (a list of term names), but the taxonomy term "name" field is a single-value string, not an array. What you want here is to create multiple terms, but it's inherent in how migrations work that each source row must correspond to a single destination object; that is, each pass through the processing pipeline creates only one item of the destination type.

There are two things you can do to get your "types" into the tags vocabulary:

1. Have a preprocessing step which reads the JSON, pulls all the types out, and saves them to a database table (or some other form of storage) so there is one unique row per distinct term. Then, your migrate_article_tags migration would read from that table and generate the terms quite simply.
2. Instead of having a distinct migration for your tags, you can use the entity_generate plugin provided in migrate_plus to have them generated as needed while migrating your articles.
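As a rough illustration of the first approach (a dedicated tags migration fed one row per distinct term), the sketch below uses core's embedded_data source plugin as a stand-in for the database table. The data_rows and the "name" source field are assumptions made for illustration; in practice they would come from whatever storage the preprocessing step writes to.

```yaml
# Hypothetical tags migration: one source row per distinct term.
# data_rows stands in for the output of the preprocessing step.
id: migrate_article_tags
label: Migrate article tags
source:
  plugin: embedded_data
  data_rows:
    -
      name: nearby-cities
    -
      name: origin
    -
      name: phase-data
  ids:
    name:
      type: string
process:
  vid:
    plugin: default_value
    default_value: tags
  # Each row carries exactly one term name, so a direct mapping is enough.
  name: name
destination:
  plugin: 'entity:taxonomy_term'
```

Because each term is created by its own tracked source row, terms created this way are recorded in the migration's map table.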

entity_generate can be used like this:

process:
...
  field_tags:
    -
      plugin: explode
      source: types
      delimiter: ,
      limit: 100
    -
      plugin: entity_generate

The first step of the field_tags processing will explode the types into an array of simple strings - each one will then be passed to entity_generate, which will see if a tag of that name already exists. If it does, it will return a reference to the existing term - if not, it will create the term, then return the reference.
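The same two-step pipeline can be sketched with the lookup settings spelled out. As far as I know, entity_generate normally works these out from the destination field's definition, so the explicit keys below (entity_type, bundle_key, bundle, value_key) are shown only to make the lookup visible, and their values assume a standard "tags" vocabulary. The compact "- plugin:" form is the same YAML list as the two-dash layout above.

```yaml
process:
  field_tags:
    # Step 1: split the comma-separated string into an array of names.
    - plugin: explode
      source: types
      delimiter: ','
      limit: 100
    # Step 2: runs once per array element. Looks up a term whose "name"
    # matches in the "tags" vocabulary and creates it when none exists.
    - plugin: entity_generate
      entity_type: taxonomy_term
      bundle_key: vid
      bundle: tags
      value_key: name
```

Each list item under field_tags is one plugin in the pipeline, and the output of one step becomes the input of the next, which is why only the first step names a source.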

Now, I foresee a little problem - with those leading and trailing commas, explode is going to give you empty strings, and you'll end up with a term with a zero-length name which all articles reference. Off the top of my head I'm not sure of a way to clean that up purely through YAML - it may be simplest to just let the migration create it and manually delete that term after the fact.
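One possible YAML-only cleanup, sketched under an assumption: it relies on the unpack_source option of core's callback process plugin, which is only available in core releases newer than the one current when this was written. The constant name "comma" is made up for illustration. The idea is to call PHP's trim() with "," as the character list before exploding.

```yaml
source:
  # ... existing source configuration ...
  constants:
    # Characters to strip from both ends of the string (hypothetical name).
    comma: ','
process:
  field_tags:
    # Equivalent to trim($types, ','): removes leading and trailing commas.
    - plugin: callback
      callable: trim
      unpack_source: true
      source:
        - types
        - constants/comma
    - plugin: explode
      delimiter: ','
      limit: 100
    - plugin: entity_generate
```

This only removes commas at the ends of the string; an empty item in the middle (two commas in a row) would still produce an empty name.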

One more note - when your migration reports failures, you can see the detailed messages with

drush mmsg migrate_article_tags

To see them while running your migration, add -d to your drush mi command.

Comment #3

ressa commented 6 September 2016 at 07:43

Thank you @mikeryan for the very detailed write up on how to migrate taxonomy terms, and also a big Thank you! for all your work on the Migrate modules. The information posted here will be very useful to me, as well as to many others, I am certain.

A follow-up question: if I use option #2 and generate tags while migrating my articles with the entity_generate plugin, is there a way to include the tags when rolling back my article import? Or is it only possible to empty the Tags vocabulary programmatically?

Comment #4

mikeryan commented 6 September 2016 at 14:09

Afraid that's the downside of entity_generate - because the referenced entities are generated as a side-effect rather than by their own migration, they aren't tracked in a map table and thus can't be rolled back via drush migrate-rollback. You'd need a custom process to clean them out (e.g., something that deletes tags which have no entity references pointing at them).

Comment #5

ressa commented 7 September 2016 at 09:51

Okay, I'll see if I can find a way to just wipe all Tags programmatically and post it here if I do.

EDIT: I found this snippet, which will erase all terms from a specific vocabulary, where "tags" is the name of the vocabulary:

$tids = \Drupal::entityQuery('taxonomy_term')
  ->condition('vid', 'tags')
  ->execute();

$controller = \Drupal::entityTypeManager()->getStorage('taxonomy_term');
$entities = $controller->loadMultiple($tids);
$controller->delete($entities);

From: http://drupal.stackexchange.com/questions/213256/how-do-i-delete-a-vocab...

Comment #6

mikeryan commented 11 October 2016 at 21:17

Status: Postponed (maintainer needs more info) » Fixed

Comment #7

25 October 2016 at 21:24

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Comment #8

leisurman commented 17 November 2016 at 16:35

Thank you!!!!!!!
I was able to pull in a CSV file with multiple terms like this:

  field_country:   # this is your entity reference field attached to your content type
    -
      plugin: explode
      source: country   # Your csv column name and the name of your drupal vocabulary
      delimiter: ,
      limit: 100
    -
      plugin: entity_generate

My csv file is:

"nid","country"
"1","usa,brazil"
"2","vietnam"

Comment #9

leisurman commented 17 November 2016 at 16:48

I did not know you could use two plugins for one field. Is that why you are using the two dashes?

Comment #10

leisurman commented 17 November 2016 at 16:58

I noticed that I was not using destination: plugin: 'entity:taxonomy_term'; I am using destination: plugin: entity:node. It still added my taxonomy terms to my nodes and to the Drupal vocabulary. Why does this work without plugin: 'entity:taxonomy_term'?

Below is my migration file

langcode: en
status: true
dependencies:
  enforced:
    # List here the name of the module that provided this migration if you want
    # this config to be removed when that module is uninstalled.
    module:
      - custom_migrate
# The source data is in CSV files, so we use the 'csv' source plugin.
id: migrate1e
label: CSV file migration
migration_tags:
  - CSV
source:
  plugin: csv
  # Full path to the file.
  path: 'public://csv/people.csv'
  delimiter: ','
  enclosure: '"'
  # The number of rows at the beginning which are not data.
  header_row_count: 1
  # This is the field name from the source file that uniquely identifies
  # each row - it will be stored in the migration map table as column
  # sourceid1.
  keys:
    - id
  # Here we identify the columns of interest in the source file. Each numeric
  # key is the 0-based index of the column. For each column, the key below
  # (e.g., "first_name") is the field name assigned to the data on import, to
  # be used in field mappings below. The value is a user-friendly string for
  # display by the migration UI.
  column_names:
    # So, here we're saying that the first field (index 0) on each line will
    # be stored in the id field in the Row object during migration, and
    # that name can be used to map the value below. "Identifier" will appear
    # in the UI to describe this field.
    0:
      id: Identifier
    1:
      first_name: First Name
    2:
      last_name: Last Name
    3:
      email: Email Address
    4:
      country: Country
    5:
      ip_address: IP Address
    6:
      date_of_birth: Date of Birth
process:
  # The content (node) type we are creating is 'people'.
  type:
    plugin: default_value
    default_value: people
  # Most fields can be mapped directly - we just specify the destination (D8)
  # field and the corresponding field name from above, and the values will be
  # copied in.
  title:
    plugin: concat
    source:
      - first_name
      - last_name
    delimiter: ' '
  field_first_name: first_name
  field_last_name: last_name
  field_email: email
  
  field_country:
    -
      plugin: explode
      source: country
      delimiter: ,
      limit: 100
    -
      plugin: entity_generate
    
  field_ip_address: ip_address

destination:
  # Here we're saying that each row of data (line from the CSV file) will be
  # used to create a node entity.
  plugin: entity:node
# List any optional or required migration dependencies.
# Required means that 100% of the content must be migrated.
# Optional means that the other dependency should be run first, but if there
# are items from the dependent migration that were not successful, it will still
# run the migration.
migration_dependencies:
  required: {}
  optional: {}
