Migrating data from a JCR source
The Migrate Source JCR module provides a source plugin for migrating from Java Content Repository (JCR) storage. It leverages Jackalope, an implementation of PHPCR, for queries.
How to import simple JCR nodes with the Migrate Source JCR plugin
This example shows how to import the Title and Body fields from a JCR repository into a Drupal node.
- Download the Migrate Source JCR module and enable it.
- Download Migrate Plus and enable it.
- Download Migrate Tools and enable it.
- Ensure you're using the latest version of Drush.
- Ensure your Drupal instance has a content type called "Blog" (machine name "blog") with at least two fields:
  - Title (title) - a plain text field (Drupal adds this to node types automatically)
  - Body (body) - a long formatted text field
- Ensure you have access to a JCR repository. If you do not have a JCR server available, see the JCR server setup section below. Take note of the following:
  - Host: The full URL to the JCR endpoint.
  - User: The username, if authentication is required.
  - Pass: The password, if authentication is required.
  - Workspace: The JCR workspace to read from.
- Populate the JCR source with the exact test data shown here. To do this:
  - Save https://gist.github.com/josephdpurcell/728dd8744fabd4fcf80798664e270df5#... as "migrate_source_jcr_import_sample_data.php" in your repository root.
  - Save the following XML file as "migrate_source_jcr_example.xml" in your repository root:
<?xml version="1.0"?>
<blog xmlns:jcr="http://www.jcp.org/jcr/1.0"
      xmlns:nt="http://www.jcp.org/jcr/nt/1.0"
      xmlns:sling="http://sling.apache.org/jcr/sling/1.0"
      jcr:primaryType="nt:unstructured"
      jcr:createdBy="admin"
      jcr:created="2019-10-16T12:34:56.123+00:00">
  <jcr:content jcr:uuid="f8db54ed-593a-420c-a291-d7cc650577eb">
    <blog>
      <node1>
        <jcr:content jcr:title="Node with no body" sling:resourceType="components/structure/page" jcr:uuid="c2ce1e97-51bf-48b2-ab8f-daa963b73aa8"/>
      </node1>
      <node2>
        <jcr:content jcr:title="Node with a body" sling:resourceType="components/structure/page" jcr:uuid="24146ada-9567-4455-a2b8-8b6582e78c36">
          <body jcr:primaryType="nt:unstructured" text="Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."/>
        </jcr:content>
      </node2>
    </blog>
  </jcr:content>
</blog>
  - Now, in your repository root, run this command from the command line:
php migrate_source_jcr_import_sample_data.php
- Copy the following YAML contents:
id: blog
label: Blog
source:
  plugin: jcr
  host: "http://localhost:8080/server"
  query: 'SELECT * FROM [nt:unstructured] AS node WHERE ISDESCENDANTNODE(node, "/migrate_source_jcr_example/blog") AND [sling:resourceType] = "components/structure/page"'
  type: "JCR-SQL2"
  user: "admin"
  pass: "admin"
  workspace: "default"
  keys:
    - title
  fields:
    - name: title
      subpath: ''
      property: 'jcr:title'
    - name: body
      subpath: body
      property: text
process:
  title:
    - plugin: skip_on_empty
      method: row
      source: title
    - plugin: get
      source: title
  body/value:
    - plugin: get
      source: body
  body/format:
    - plugin: default_value
      default_value: rich_text
destination:
  plugin: 'entity:node'
  default_bundle: blog
- Import the YAML you just copied as a migration. To do this:
  - Navigate to Administration > Configuration > Development > Synchronize (admin/config/development/configuration/single/import).
  - Select Migration as the Configuration type.
  - Paste the YAML migration definition, changing the Host, User, Pass, and Workspace values as needed. Click Import.
- Execute the migration. To do this run this command:
drush migrate:import blog
- Confirm you see 2 nodes imported. The drush command should output:
[notice] Processed 2 items (2 created, 0 updated, 0 failed, 0 ignored) - done with 'blog'
How to migrate Multilingual Content
NOTE: this section of documentation needs review!
Migrate Source JCR supports multilingual migrations because the Migrate API supports them. Writing such a migration should look similar to writing a multilingual migration from any other source. If you need a place to start, one suggestion is:
- Write a migration for your default language.
- Write a migration for each additional language, ensuring it uses the same entity ID as the default language migration.
Here's a sketch of what that might look like, assuming the JCR data has a property "jcr:language" containing a valid langcode:
Default language migration
id: blog_en
source:
  query: 'SELECT * FROM [nt:unstructured] AS node WHERE ISDESCENDANTNODE(node, "/migrate_source_jcr_example/blog") AND [sling:resourceType] = "components/structure/page" AND [jcr:language] = "en"'
  keys:
    - title
  ...
process:
  langcode:
    plugin: default_value
    default_value: 'en'
Spanish translation
id: blog_es
source:
  query: 'SELECT * FROM [nt:unstructured] AS node WHERE ISDESCENDANTNODE(node, "/migrate_source_jcr_example/blog") AND [sling:resourceType] = "components/structure/page" AND [jcr:language] = "es"'
  keys:
    - title
  ...
process:
  nid:
    plugin: migration_lookup
    source: title
    migration: blog_en
    no_stub: true
  langcode:
    plugin: default_value
    default_value: 'es'
  ...
migration_dependencies:
  required:
    - blog_en
There are plenty of variations on this approach. For example, you could turn the Spanish translation migration into a generic migration for all non-default languages by making the langcode a static map instead of a default value.
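The static map variation might look something like the sketch below. It is an untested illustration, not part of the module's documentation: the migration id "blog_translations", the source field name "language", and the langcodes in the map are all assumptions, and the connection settings are left out because they would match the blog example. It relies on the core static_map process plugin, which skips a row when the value is not in the map and no default value is given. The "translations: true" destination setting is also an addition beyond the sketch above; Drupal's entity destination uses it to save rows as translations of an existing entity.

```yaml
# Hypothetical generic translation migration (names are assumptions).
id: blog_translations
source:
  # plugin, host, type, user, pass and workspace as in the blog example.
  # Select every page whose language is not the default one.
  query: 'SELECT * FROM [nt:unstructured] AS node WHERE ISDESCENDANTNODE(node, "/migrate_source_jcr_example/blog") AND [sling:resourceType] = "components/structure/page" AND [jcr:language] <> "en"'
  keys:
    - title
  fields:
    # title and body fields as in the blog example, plus the language
    # property exposed as a source field so the process section can read it.
    - name: language
      subpath: ''
      property: 'jcr:language'
process:
  nid:
    plugin: migration_lookup
    source: title
    migration: blog_en
    no_stub: true
  langcode:
    # Map each JCR language value to a Drupal langcode.
    # Rows whose language is not listed here are skipped.
    plugin: static_map
    source: language
    map:
      es: es
      fr: fr
destination:
  plugin: 'entity:node'
  default_bundle: blog
  # Assumption: save each row as a translation of the looked-up node.
  translations: true
migration_dependencies:
  required:
    - blog_en
```

With this shape, supporting another language should only require a new entry in the map rather than a new migration.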
How to set up a JCR server
If you do not have access to a JCR server and want to try this module out, you can use the Jackrabbit standalone server. Remember, however, that while JCR is an open standard, not all JCR servers have the same features and data types. It's best to test with a copy of the real data you'll be migrating.
- Install the Java virtual machine. You can find this here: http://www.java.com/en/download/manual.jsp
- Download the latest stable standalone Jackrabbit server: http://jackrabbit.apache.org/jcr/downloads.html
- In the folder where you downloaded the JAR file, start the Jackrabbit server with this command:
java -jar jackrabbit-standalone-*.jar --port 8080
- You should now be able to access the server at: http://localhost:8080/