Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
In a typical Drupal upgrade migration, you'll end up with a few pairs of tables like migrate_map_56059480b192d/migrate_message_56059480b192d ("56059480b192d" appearing where you expect to see the ID of a migration), which are not actually used. These are coming from BuilderBase::getSourcePlugin() despite:
// By default, SqlBase subclasses will try to join on a map table. But in
// this case we're trying to use the source plugin as a detached iterator
// over the source data, so we don't want to join on (or create) the map
// table.
// @see SqlBase::initializeIterator()
$configuration['ignore_map'] = TRUE;
Obviously this is not working as intended...
Proposed resolution
Two resolutions were suggested by mikeryaņ #2.
- if Sql is the only implementation we can imagine, but maybe our answer here is to have an alternate "null" implementation of idmaps for use by the builder?
- Maybe we just make SourcePluginBase work without an idmap...
The first option, a custom null idMap plugin was implemented. This solution was supported by benjy, #8
Remaining tasks
Write tests
Review
etc
User interface changes
N/A
API changes
N/A
Data model changes
N/A
Comment | File | Size | Author |
---|---|---|---|
#48 | interdiff.txt | 713 bytes | quietone |
#48 | 2575135-48.patch | 7.79 KB | quietone |
#43 | interdiff.txt | 2.72 KB | quietone |
#43 | 2575135-43.patch | 7.91 KB | quietone |
#40 | interdiff.txt | 527 bytes | quietone |
Comments
Comment #2
mikeryanignore_map prevents *joining* to the map table, but joining is just an optimization, the map table is checked by the default code path to see if stuff has already been migrated. I was thinking at first that ignore_map should be used to prevent any reference at all to the idmap plugin, but while poking around I was looking at Migration::getIdMap() and its reference to $this->idMap. This is configuration for the idMap plugin, defaulting to [], which is never set and never can be set unless you extend the Migration class - meaning that the Sql idmap plugin is the only one that can be used as things are now. We've discussed previously whether there's any point in having idmaps as plugins if Sql is the only implementation we can imagine, but maybe our answer here is to have an alternate "null" implementation of idmaps for use by the builder?
Or, maybe we just make SourcePluginBase work without an idmap...
Comment #4
quietone CreditAttribution: quietone as a volunteer commentedWhile running test migrations for #2695297: Refactor EntityFile and use process plugins instead and working on #2603124: Sql::getRowBySource doesn't adhere to MigrateIdMapInterface::getRowBySource I noticed an over abundance of these dummy files, >700. Both of those are 8.2.x issues. So, I did some testing to see if this is just happening in 8.2.x.
Starting with a clean install of 8.1.x, I ran MigrateUpgrade6Test. That completed successfully with 9 dummy files. And then I switched to 8.2.x. Starting with a clean install, I ran MigrateUpgrade6Test. I didn't let it finish, I aborted the test when there were >400 dummy files.
Is this happening for anyone else, or is it just me?
Comment #6
ckaotikThis can quickly explode into very very very many tables very quickly. In my case, I've tracked it down to MigrationDeriverTrait::getSourcePlugin, which uses stub migrations to access source plugins. This gets called for migration derivers, e.g. D7NodeDeriver, on every definition cache clear.
I don't know if we really need to take this roundabout way of getting the source instance (probably not). We might also consider whether the fully functional SQL idMap is necessary for stub migrations (it probably is), or if we can at least check whether the migration seems legit (e.g. not
null
as the destination plugin).As a workaround, I've added a custom
null
idMap plugin and use that in the stub definition. I've attached this approach as a patch. I am currently unsure whether this is the whole issue, or just a single situation when things go awry.Comment #7
ckaotikSetting to needs review. May the testing commence ;)
Comment #8
benjy CreditAttribution: benjy at PreviousNext commentedThat seems like a pretty good fix and given it's green i'm presuming the map is not used with the fake source. Can we add a test for this?
Comment #9
quietone CreditAttribution: quietone as a volunteer commentedI get a WSOD with this installed and trying to run a migration from /upgrade. It occurs after entering db credentials.
Comment #10
benjy CreditAttribution: benjy at PreviousNext commentedYeah that became reserved in PHP7, we can just rename the class to something like NullIdMap
Comment #11
ckaotikOh, good to know about the reserved name! I've changed the class name to the suggested
NullIdMap
.However, I don't know how to go about writing tests, or even which cases we should test.
Comment #12
ckaotikAfter a little more thinking, I would love if the SQL tables were initialized only when the migration is actually run. As far as I can tell that is the way migrate worked in Drupal 7. This would also prevent mapping and messages tables being created for migrations whose requirements are not met (e.g. D6 nodes when migrating from a D7 legacy system).
Does anyone have an idea if this approach is reasonable with the way migrate & Co. currently work?
Comment #13
benjy CreditAttribution: benjy at PreviousNext commentedI believe $this->init() is only called when something actually uses the database or the map table name?
Comment #14
ckaotikYes, it is. However, the database gets used frequently, e.g. whenever status information is requested, such as processed counts. This happens in
migrate_tools
drush ms my_migration_name
or its migrations overview. I am currently trying to delay this initialization as long as possible but will open a new issue for that.While it seems sensible, I've noticed that source and id_map plugins are still very much intertwined. So I'll stick to the
NullIdMap
solution for this issue for now. :)Comment #15
quietone CreditAttribution: quietone as a volunteer commentedThx, ckaotik. I just ran a migration using d7_dump with the patch in #11 and didn't get WSOD or a fatal. There was one 'dummy' map and message table.
Comment #16
ckaotikWhich migrations did you run, @quietone? Do you use D6, D7 or something else as source system?
Comment #17
quietone CreditAttribution: quietone as a volunteer commented@ckaotik, I used db-tools.php to import the d7_dump test fixture to the db. I did a fresh install of D8 and then ran the full migration, from the UI.
If you haven't seen it already, Generating database fixtures for D8 Migrate tests has a little bit more info on working with the dumps.
Comment #18
ckaotikThat looks useful, indeed ;) Thanks for the link!
All right, I think I found the issue.
migrate_drupal
uses similar code for derived vocabularies. So do some source plugins from theuser
module. I've added these to the patch.Comment #19
quietone CreditAttribution: quietone as a volunteer commented@ckaotik, your welcome.
I ran it again and still get one map/message table pair. The map has 3 sourceids, Both d7_field_instance and d7_block have 3 source ids. Hope that helps.
Comment #20
ckaotikHrm, I've searched and couldn't find where this originates from. The only place that seems sensible is from the user migration (it queries
d7_field_instance
), which I've adjusted with the previous patch. Could this be a leftover from your previous test run? If so, the table name would still bemigrate_map_57ff429e55e18
.If it's not leftover, some debugging output in
Drupal\migrate\Plugin\migrate\id_map\Sql::ensureTables()
might lead to the source. As I don't have a test system at hand, I can't really check right now :/On a different topic, though: Do we need to adjust the existing tests to use the
null
map? Or can these be left unchanged?I've also updated the patch to make sure the basic
ArrayIterator
is used.Comment #21
quietone CreditAttribution: quietone as a volunteer commentedNo, not a leftover. Did a llittle debugging and the dummy table is created when the source plugin is d7_field_instance and the entity_type is user.
Comment #22
ckaotikThanks for checking! It was indeed the User migration.
The issue was just caused by a typo introduced in comment 18, when I used
id_map
instead ofidMap
. I've updated the patch accordingly (sorry, I have no interdiff at hand).It would be great if you could verify that it finally works, @quietone :)
Comment #23
mikeryanCode looks good to me, hopefully quietone can verify it works in practice.
Then, on to tests!
Comment #24
mikeryanNeeds work because needs tests.
Comment #25
quietone CreditAttribution: quietone as a volunteer commentedJust ran two migrations from the UI, one using d6_dump as the source and the other using d7_dump as the source. I am pleased to report that there are no dummy tables. Yea!
Comment #26
quietone CreditAttribution: quietone as a volunteer commentedJust noticed one small thing.
Missing doc block for method.
Comment #27
olegel CreditAttribution: olegel as a volunteer commentedI'm working on this..
Comment #28
olegel CreditAttribution: olegel as a volunteer commentedAdded @inheritdoc as mentioned in #26.
Comment #29
quietone CreditAttribution: quietone as a volunteer commentedChanging to Needs review for testing.
Comment #30
quietone CreditAttribution: quietone as a volunteer commentedForgot this still needs tests, back to needs work.
Comment #33
quietone CreditAttribution: quietone as a volunteer commentedUpdate IS
Comment #34
quietone CreditAttribution: quietone as a volunteer commentedHere is a test and a fail patch. I don't like that it relies on MigrateDrupal6TestBase. I reckon it can be done by creating the tables needed, but I started with what I know. To take this further will someone provide some direction, example etc. Thanks.
No interdiff since the change is solely the addition of the test.
Comment #37
quietone CreditAttribution: quietone as a volunteer commentedA namespace error in the test file. Let's try again.
Comment #39
heddnNit: code comment standards has {} around these.
Is this at all related to #2817833: Delay sql map table creation?
Comment #40
quietone CreditAttribution: quietone as a volunteer commentedFixed doc blocks.
I don't think is related to the other issue. That one is about where in the flow the tables are created. This is only about the dummy tables created by derivers. As explained by ckaotik in #6.
Comment #41
maxocub CreditAttribution: maxocub as a volunteer commentedThis looks good. Only one nit:
Line indented incorrectly.
This is not enough to go back to NW, this can be fixed at commit, right?
Comment #42
larowlanDo we need to implement ContainerFactoryPluginInterface if we're not injecting anything?
I don't think we do.
I think these can all use inheritdoc
We can use
assertCount
hereCan we fix #41 at the same time?
Thanks.
This looks like an annoying problem and an elegant solution - nice.
Comment #43
quietone CreditAttribution: quietone as a volunteer commentedFixes for #41 and #42.
Comment #44
heddnFeedback addressed
Comment #45
larowlanCrediting @mikeryan for proposing the solution in #2
Comment #46
larowlannit: No longer used - sorry :(
Comment #47
larowlanupdating issue credits to include myself for #42
Comment #48
quietone CreditAttribution: quietone as a volunteer commentedThanks for the review. Unused lines removed.
Comment #49
opdaviesComment #51
larowlanCommitted 694b95b and pushed to 8.5.x - thanks.