D7 migrate included abstract MigrateList and MigrateItem classes to support migration scenarios such as REST APIs where rather than iterating over a monolithic source, you obtain a list of items from one place (endpoint) and retrieve each individual item from another. I'll be working on this support in conjunction with #2429137: Implement basics of wine example in D8 (which demonstrated some examples of this pattern) and #2500955: Port MigrateSourceXML to Drupal 8 migrate source plugin.
Comment | File | Size | Author |
---|---|---|---|
#32 | 2608610-32.patch | 8.04 KB | grathbone |
#31 | migrate_plus_list_item_pattern-2608610-20.patch | 7.1 KB | emilorol |
#27 | migrate_plus_list_item_pattern-2608610-8.x.-4.1.patch | 7 KB | emilorol |
#18 | list_item_pattern-2608610-17.patch | 6.68 KB | dmouse |
#16 | list_item_pattern-2608610-16.patch | 6.7 KB | dmouse |
Comments
Comment #2
mikeryanSee also #2621490: [meta] Maximize consistency with JSON source plugin and #2621492: [meta] Maximize consistency with XML source plugin, which will be the major consumers. migrate_source_json does include some support for the list/item pattern, so I will be looking at if any of it can be generalized and moved into migrate_plus to be shared amongst JSON, XML, and potentially other source plugins.
Comment #3
mikeryanThis might be addressed as part of #2623012: Implement interfaces and base classes for URL-based sources.
Comment #4
mikeryan#2640508: Refactor fetcher/parser classes to make fetcher the iterator should make it fairly simple to support the list/item pattern as a fetcher - (re)postponing on that issue.
Comment #5
mikeryanComment #6
mikeryanThe "make fetcher the iterator" issue probably isn't going to proceed - this should be pursued on its own.
Comment #7
Grayside CreditAttribution: Grayside at Phase2 for Norwegian Cruise Line commentedGoing to start digging on this tomorrow.
Comment #8
Grayside CreditAttribution: Grayside at Phase2 for Norwegian Cruise Line commentedAfter thinking about this for awhile, creating a generic solution for this is pretty tricky, and here are a couple issues:
I don't want to spend time digging into those right now without a clearer sense of project goals in the space, so I am working on a change to the JSON parser.
Comment #9
Grayside CreditAttribution: Grayside at Phase2 for Norwegian Cruise Line commentedAttached is my work in progress, going to resume tomorrow with some live exercises.
Comment #10
Grayside CreditAttribution: Grayside at Phase2 for Norwegian Cruise Line commentedThis seems to be working for me.
Approach
I could not see a way to break apart the existing components without wildly expanding scope, so I focused on my JSON use case. Within the JSON parser's
fetchNextRow()
I saw an opportunity to check on config and use thegetSourceData
method to do a follow-up request for the item. This seems like a pattern that might work cleanly for other parser plugins, but not in a generic manner.Configuration Changes
As noted in the docblock for
getRemoteItemData()
, this introduces a new list configuration parameter for the JSON data parser, which is stuck directly on the top-level of the source plugin config. list can be set to TRUE for a default execution, or it can contain any of the following nested configuration:New/Refactored Utility Methods
As part of this work, I ended up with two static methods that seem like they don't belong in the Json parser anymore.
select()
is a consolidation of the xpath-like processing, and inspired me to file #2800295: Use external property access library in lieu of "xpath-like" implementation.getTemplatedUri()
is the "naive" templating solution. Seems like something core should provide, but I couldn't find an API for it.Problems
I noticed the HTTP data fetcher has a tendency to throw all errors as MigrationException. This makes sense for larger source URLs, but when we are making requests for individual item data it's a pain. I investigated the idea of setting a "row" context in which empty responses or ClientException (4xx errors) would throw a MigrateSkipRowException instead to make the migration more robust, but unfortunately these exceptions fall all the way up to
Drupal\migrate\MigrateExecution::import()
where it stops the migration.Not sure where we want to try catching a MigrateSkipRowException, or if it should be in this issue.
Potential Follow-ups
Comment #11
Grayside CreditAttribution: Grayside at Phase2 for Norwegian Cruise Line commentedDone for now.
Comment #12
mikeryanThanks for this patch - I'm going to need to think on this bit, just want to let you know I'm not ignoring it as I catch up on reviews...
Comment #13
Grayside CreditAttribution: Grayside at Phase2 for Norwegian Cruise Line commentedReroll.
Comment #14
Grayside CreditAttribution: Grayside at Phase2 for Norwegian Cruise Line commentedI suspect the current implementation of this or #2640516: Support paging through multiple requests is disrupting rollbacks. Got this second-hand so I do not have a lot of details, but likely related to item counts.
Comment #15
Grayside CreditAttribution: Grayside at Phase2 for Norwegian Cruise Line commentedMissed one spot in getSourceData() where $this->itemSelector => $item_selector.
Comment #16
dmouseI just update this patch with the 8.x-3.x branch
Comment #18
dmouseUpdated patch
Comment #19
heddnAny chance we could add a unit test or two? I think that would be super helpful to iron out any outstanding bugs to the interface.
Comment #20
Grayside CreditAttribution: Grayside at Phase2 for Norwegian Cruise Line commentedWe're pretty busy working to finish the project that necessitated this patch, so I doubt I'll be able to add tests in the next few weeks. Is the request a +1 that the basic approach is good?
Comment #21
heddnThere's enough complexity in what is being developed here I was hoping to see tests. From the deprecated code in http://cgit.drupalcode.org/migrate_source_json/tree/src/Plugin/migrate/s..., there was the concept of a single source and a multi source. And an implementation of JSONapi's recommendations. I feel like the list solution is going to vary widely for different sources. It might make sense to use a reader pattern that lets one easily swap out the list reader for another in an easy manner.
Comment #22
rv0 CreditAttribution: rv0 at Coworks.be commented@Grayside
Nice work.
I've been trying to use it for a few hours but I'm afraid I'm not fully getting the code.
On the bottom of getSourceData, you select the data using $this->itemSelector, then pass it on to the select() function, which uses the same selector to narrow it down again (which does not work of course, it's already narrowed down to the same selector).
I must be doing something wrong? I'm not sure if I get the whole $this->itemSelector vs $item_selector difference.
Comment #23
Grayside CreditAttribution: Grayside at Phase2 for Norwegian Cruise Line commentedgetSourceData() is used two different ways. First, it is used to pull the list of resources. Then as the source iterates across that list, it pulls a URL referenced by each of those rows.
Here's some example configuration.
Keep in mind the configuration key item_selector and method parameter item_selector are not hte same thing. The source config key item_selector becomes $this->itemSelector in the parser.
Comment #24
rv0 CreditAttribution: rv0 at Coworks.be commented@Grayside
Thank you for your quick reply. I tried all sorts of config options, either I'm missing something (I'm new to D8 and the D8 migrate API) or the code just won't work for my case. Perhaps it will help if I describe the data structure a bit (simplified):
There's a "list" url http://api/properties
returns an array with key "properties" and individual items "id, date"
There's an item url http://api/properties/id where id is a variable id
returns a property detail with key "property" and individual items "id, title, .."
This would leave me with structure like this I guess:
I've tried all sorts of variations, but usually end up with undefined index type of errors
Notice: Undefined index: properties in Drupal\migrate_plus\Plugin\migrate_plus\data_parser\Json::select()
Comment #25
pingwin4egAny progress on this issue, guys?
Comment #26
ilaz CreditAttribution: ilaz at Balidea commentedIt's not a solution for this specific issue, but I made this workaround that is related. Maybe is usefull for someone: Drupal Answer link.
Basically, for each row I call to a new URL.
Comment #27
emilorol CreditAttribution: emilorol commentedI couldn't apply patch #17 to 8.x-4.1 release.
Here is a new patch for the new release.
Comment #28
emilorol CreditAttribution: emilorol commentedSame patch with code standards applied.
Comment #29
emilorol CreditAttribution: emilorol commentedOne more time.
Comment #30
emilorol CreditAttribution: emilorol commented3rd time is charm, right?
Comment #31
emilorol CreditAttribution: emilorol commentedLast time
Comment #32
grathbone CreditAttribution: grathbone at VMLY&R commentedWas having a super rough time getting this patch to work due to issues where the selector found the data before being passed to ::select. I fixed those issues and as well created a couple more list: parameters to pass in to allow to specify whether the new item data should be merged with the original list's item data, should be placed in a nested key in the array, or should replace it.
Would love to get the functionality into the module. 2 year hiatus is a long time for what seems to be pretty useful functionality.
Comment #33
hodba CreditAttribution: hodba as a volunteer and at WITS Web Innovations & Technology Service commentedWrong file PLEASE IGNORE
Comment #34
hodba CreditAttribution: hodba as a volunteer and at WITS Web Innovations & Technology Service commented