Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
Over in #2202511: Implement migration groups we are adding group support and there was talk of adding a way to store configuration for the group.
Storage of configuration shared among related migrations, most commonly source information - migrate_d2d stores database connection info in the group, wordpress_migrate stores the location of the source WXR file, etc.
Proposed resolution
Discuss our options.
Remaining tasks
Write the patch.
User interface changes
n/a
API changes
Possible additions.
Comment | File | Size | Author |
---|---|---|---|
#43 | interdiff.txt | 1.75 KB | mikeryan |
#43 | support_shared-2302307-43.patch | 13.91 KB | mikeryan |
#24 | interdiff.txt | 617 bytes | benjy |
#24 | 2302307-24.patch | 11.94 KB | benjy |
Comments
Comment #1
mikeryanTime to discuss our options. I have a patch for migrate_drupal at #2404401: Use dependency injection for the database connection which takes advantage of #2404397: Support injection of database configuration in SqlBase to store database credentials in the active configuration for each migration - it would of course be much cleaner to store them in one place, the group, not least because it would be much simpler to update the credentials (moving the source db to a different server, belatedly deciding to use a proper read-only SQL user instead of root, etc.).
So, I see migration_group as a new configuration entity. It would support id (machine name) and label, and apart from that could store any configuration relevant to the particular application (db credentials for any SQL-based migrations, location of files for a migration from CSV or XML, etc.). I think this should be relatively straightforward.
The main question in my mind is where to maintain the relationship between migrations and their groups. I believe we are agreed there is value is allowing a migration to appear in multiple groups. Do we extend the current migration_groups support (using machine names instead of labels)
or list the migrations in the group (note the source information would be written to active configuration based on dynamically provided configuration, not provided in the original .yml file) - much like a manifest that's part of the active configuration?
Edit: wrapped added configuration under 'arguments'
Thinking about some use cases - migrate_upgrade loading the migrations to run, a dashboard UI or drush migrate-status loading migrations to display - it seems clearly more straight-forward to me to have the group contain the list of migrations, rather than scanning all active migrations to find a matching group. The first approach makes it simpler to see what groups a migration belongs to, but that should be a relatively rare need, not to mention since there will be far fewer groups than migrations there's less to scan for the answer.
Unless anyone sees a problem with the general configuration entity approach, I'll start working on a patch.
Comment #2
mikeryanI'm digging in, and have a hackish version of this sort of almost working with migrate_upgrade. One element I didn't account for above, however, is exactly when and how the group arguments get merged into the migration instance. If a migration is in more than one group, which one does it pull arguments from when loaded? My hack at the moment is to set migration_groups on each migration (in the active configuration) to the specific migration_group being used, then when loading the migration load the specified group and array_merge the group's arguments into the migration's configuration (not saving to active config, just using it locally). It's still not quite working right though.
I'm not seeing a clean way to deal with this at the moment (maybe just because it's Friday afternoon of a long week). Now I'm kind of feeling like addressing #2202475: Registering/instantiating migrations first - this part will make more sense if we're dealing with concrete instances. Guess I'll let it percolate over the weekend...
Comment #3
benjy CreditAttribution: benjy commentedWould this mean though that users writing migrations by hand then have to create a group as well as list the migrations in that group in a separate file?
It's a bit more complicated than simply adding a group name in the yaml file like we currently have.
Comment #4
mikeryanFinally got around to doing this. The attached patch adds MigrationGroupInterface/MigrationGroup. Group config contains a shared_configuration key, which provides default values to be applied to all migrations which reference the group through the active_groups key. When a Migration is constructed, if it contains an active_group key which is the machine name of a MigrationGroup instance, it will merge in any value in shared_configuration that is not already explicitly set. There are a couple of POC patches using this:
Ideally, I'd like the key pointing a specific migration instance to its group to be "migration_group" rather than "active_group". Right now, we have the "migration_groups" key, but that is being used more as tagging rather than full-fledged groups. Actually, in the current migrate_drupal usage, I'd rather see a migrate_drupal-specific key "drupal_versions", since what it's really doing is listing what Drupal versions the given migration configuration applies to. I tried implementing this but got stuck trying to figure out how to define and apply the configuration schema properly, so figured I should first focus on the basic group functionality.
Comment #5
andypostWhy migration_group a config entity?
It's more like content that describes a migration process with migrations ER field pointing to migration config entities.
Comment #6
mikeryanI don't see how the migration group would be content? Consider a custom migration project - you work on it in a development environment, tweaking the individual migrations' configuration (field mappings etc.) and the common configuration all migrations should have access to, and ultimately want to deploy that configuration to staging and ultimately production environments.
Comment #7
alexpottA migration group should have a module field to store the providing module and this should be used for dependency calculation. If migrations are config, then migration groups feel like config to me - especially since they are going to be storing configurable data.
Comment #8
alexpottA migration group should have a module field to store the providing module and this should be used for dependency calculation. If migrations are config, then migration groups feel like config to me - especially since they are going to be storing configurable data.
Comment #9
mikeryanTook a stab at adding the module field and accounting for it in the dependencies, but it doesn't seem to be working - when creating a group in migrate_upgrade I added
module: migrate_drupal
, but uninstalling migrate_drupal did not remove the group. What am I missing?Also, deleting a group should delete the migrations attached to it - added support for that (untested, since the group isn't getting deleted...).
Comment #10
mikeryanOops, accidentally committed some dpm() calls.
Note that calculateDependencies() is never called.
Comment #11
alexpott@mikeryan how did you create the migration group?
Comment #12
mikeryanI had migrate_upgrade doing this:
Looking in the config table, migrate.migration_group.upgrade_6 looked good.
Comment #13
alexpott@mikeryan you need to create an entity to use the config entity system. So something like:
Comment #14
mikeryanOK, so I had a false equivalence in my head between a configuration entity and the configuration that defines it. If we have a YAML file in config/install describing a configuration entity, that results in the creation of an entity, but that doesn't mean the entity and the configuration are the same thing. So creating the entity in PHP saves the configuration to the active configuration, but simply saving the configuration does not create an entity.
Basically, replacing the configFactory call with an entity_create did the job. When I uninstalled migrate_drupal, the group and all the migrations it contained were deleted. So, this patch seems to be on the right track - next step is adding some tests.
Thanks!
Comment #15
mikeryanOK, fully ready for review. No interdiff, it would be about the same size as the full patch. These are the changes:
Comment #16
benjy CreditAttribution: benjy at CodeDrop commentedI read the code, it all looks good, nice work. I'd still like to have a test with this patch applies but a few nitpicks below.
Why do groups need weight?
Missing docs
Comment #17
mikeryanPrimarily because of copypasta from the Migration definition... Actually, it would be nice for UIs to provide the ability to order groups for display, but given probably 90-95% of sites using migrate will have one migration group, and 99% of the remainder no more than three or four, sorting them isn't a priority, we can hold off till someone eventually opens a minor feature request.
Will add, updated patch coming, thanks.
Comment #18
mikeryanComment #19
benjy CreditAttribution: benjy at CodeDrop commentedNice work, I think this is RTBC for me. One small thing below:
We could inject this.
Comment #20
mikeryan@benjy: I'm not quite sure I follow - what would be injected, the migration_group storage manager?
Thanks.
Comment #21
benjy CreditAttribution: benjy at CodeDrop commentedYes exactly. See ConfigEntityStorage::createInstance() for an example.
Comment #22
benjy CreditAttribution: benjy at CodeDrop commentedOK, I've added that. I think this is RTBC.
Comment #24
benjy CreditAttribution: benjy at CodeDrop commentedDoh! I fixed this once and then un-did it by accident.
Comment #25
anavarre80 chars
s/overrideen/overridden (I *think* 80 chars doesn't apply for inline comments)
Also, shouldn't we replace entity_create() by
\Drupal::entityManager()->getStorage($entity_type)->create($values);
?Comment #26
mikeryan@benjy: Thanks for the injection!
@anavarre: Here you go.
Comment #27
benjy CreditAttribution: benjy at CodeDrop commentedGreat, I think this is ready. I only added the boiler plate injection so I think i'm still good to RTBC this.
Comment #31
mikeryanRestoring status after testbot glitch.
Comment #32
alexpottWhy is this needed?
Comment #35
benjy CreditAttribution: benjy at CodeDrop commentedWow, great spot. There were a few issues in the tests to do with schema, i've fixed them up.
Comment #36
ultimikeThis looks good - it'll be great to get this one committed.
thanks,
-mike
Comment #37
alexpottThis is going to be super confusing. Also I don't get how we're going to fix the dependency problem if the active group is changeable - what is the situation where a migration can belong to more than one group?
Comment #38
mikeryanre: confusing - see the followup issue #2466797: Rename migration_groups key to avoid confusion with MigrationGroup support.
A migration can't belong to more than one group (in the concrete sense of group we're using in this issue). A migration can have more than one tag (what migration_groups has been used for).
Comment #39
mikeryanCan we get this back to RTBC?
Thanks.
Comment #40
benjy CreditAttribution: benjy at CodeDrop commentedRTBC for me but I think #2466797: Rename migration_groups key to avoid confusion with MigrationGroup support should go in first.
Comment #42
mikeryanHrrm, let me see if I need to reroll...
Comment #43
mikeryanThe new CCK plugin manager injection conflicted with the migration group storage injection we're adding here - rerolled.
Comment #44
benjy CreditAttribution: benjy at CodeDrop commentedBack to RTBC.
Comment #45
xjmAssigning to @alexpott for his followup per his request.
Comment #46
alexpottIn a recent meeting we agreed that migration groups should be explored in contrib as part of the migrate UI. Postponing this issue as a result.
Comment #47
benjy CreditAttribution: benjy commentedI'm going to close this as part of some triage. It is currently in the migrate_plus module.