Problem/Motivation
Field names are limited to 32 characters, but profile field names may be longer on older sites. A Drupal 6 site with profile field names longer than 32 characters cannot be fully migrated.
Output from drush for a d6 site:
Attempt to create a field storage with an name longer than 32 characters: profile_expertise_electoralsystem [error]
(/www/cn8/core/modules/field/src/Entity/FieldStorageConfig.php:315)
Attempt to create a field storage with an name longer than 32 characters: profile_expertise_executivesystem [error]
(/www/cn8/core/modules/field/src/Entity/FieldStorageConfig.php:315)
Attempt to create a field storage with an name longer than 32 characters: profile_expertise_legislativesystem [error]
(/www/cn8/core/modules/field/src/Entity/FieldStorageConfig.php:315)
Attempt to create a field storage with an name longer than 32 characters: profile_expertise_naturalresources [error]
(/www/cn8/core/modules/field/src/Entity/FieldStorageConfig.php:315)
Processed 39 items (35 created, 0 updated, 4 failed, 0 ignored) - done with 'upgrade_user_profile_field' [status]
Proposed resolution
The resolution should take into account the field storage limitation to migrate Drupal 6/7 profile fields as field storage and fields, and map these new field names so that a migration of profile values will succeed.
All these requirements are met with make_unique_entity_field and setting a limit to 30 chars. That gives us 99 possible fields on a d6 user and surely is enough for the vast majority of sites.
Remaining tasks
- Update issue summary to make sure everything is clear going forward
- Fix test/patch
- Review
User interface changes
N/A
API changes
n/a
Data model changes
n/a
| Comment | File | Size | Author |
|---|---|---|---|
| #67 | 2797047-67.patch | 16.36 KB | jofitz |
| #67 | interdiff-65-67.txt | 841 bytes | jofitz |
| #65 | 2797047-65.patch | 16.21 KB | jofitz |
| #65 | interdiff-63-65.txt | 913 bytes | jofitz |
| #63 | interdiff.txt | 625 bytes | quietone |
Comments
Comment #2
mikeryanYes, the core user_profile_field.yml migration plugin should be using the dedupe_entity process plugin here.
Until that's fixed, in your own upgrade_user_profile_field you can use dedupe_entity to automatically truncate and maintain uniqueness of your field names:
If you'd like to control the resulting field names, you can use static_map:
Note that either way, you will of course need to change references to these fields in other migrations such as d6_profile_values.
Comment #3
niccottrell commentedThanks @mikeryan. I made these changes and recently tried a fresh migrate from the D6 site using the dedupe_entity plugin. I got a different error this time:
I've attached the config for this fullname field, which is not a required field.
I don't see why a simple textfield field should be a problem. What can I do to help track this down?
Comment #4
mikeryanCould you post your current migration configuration (.yml)?
Thanks.
Comment #5
niccottrell commentedThanks Mike - I've posted the relevant (I think) config files
Comment #6
mikeryanDon't forget to set the issue status back to "Active" when providing more info to a "Postponed (maintainer needs more info)".
Comment #7
mikeryanThe (immediate) problem is here in the field migration:
Please refer back to comment #2 above - the dedupe_entity plugin needs to be applied to field_name, not to type (which needs to be returned to its original configuration using static_map). You'll also need to apply the same transformation to field_name in the field_instance migration.
Now, that being said, I dug in a little farther and while the above should fix the creation of the profile fields, it looks like the migration of the actual values is using the names directly (not referring to the rewriting in the field migrations), so it doesn't look like that will work. Offhand, I don't see a way around that without writing an extension of the d6_profile_field_values source plugin.
Comment #9
mikeryanComment #10
quietone commentedComment #12
olarin commentedCapital letters in the field machine name will also break the migration. I assume both problems could both be addressed simultaneously? Otherwise that could be spun off into a separate issue of course.
Comment #13
olarin commentedOK, so I'm trying to pick at this myself and I understand at least some of what needs to be done here, but I may need some assistance for the rest:
First of all, user_profile_field.yml and user_profile_field_instance.yml need to be modified, as indicated above by mikeryan, to ensure that the field machine name is a valid drupal 8 machine name, and that there are no duplicates. I know how to do the first part - in the process section, this needs to be added:
and then that needs to chain into something to prevent duplicates - I presume that would be make_unique_entity_field which appears to have replaced the confusingly named dedupe_entity that was referenced above - but I'm not quite sure how to configure it (in particular I'm at a loss as to what the "field" parameter to make_unique_entity_field should be).
Then the second half, again as mentioned above, is to use migration_lookup to transform the field names when you're actually bringing in the values, in the d6_profile_values migration (and presumably also in some equivalent somewhere for Drupal 7, but I'm not yet sure what that is). This is the other place I get stuck right now, because d6_profile_values brings all the fields in at once in one row, so either the migration lookup needs to be called from the source plugin (which I'm unclear on how to do or even if it's valid to do), or the whole migration needs to be restructured somehow?
Comment #14
quietone commentedI haven't read the whole issue but maybe I can help clarify the two the make_unique_entity_field and migration_lookup process plugins.
The field property in the make_unique_entity_field process plugin. The field is just any field that the entity has. If the entity has properties 'id', 'name' and 'color', any of those can be used as the field you want to make unique. Obviously, context is important and making a color unique doesn't make much sense, but the process plugin would do it. The plugin is normally used to make sure that a key is unique.
In this excerpt from d6_taxonomy_vocabulary.yml, the 'vid' field is made unique, only when it is a migrated entity (existing non migrated vocabularies are not changed and limited to 32 characters. The resulting value is then piped to the forum_vocabulary process plugin.
The d6_user migrations has an example of migration_lookup :
Here the uid property of the existing row is the input to migration_lookup, which takes that input and determines what the result that a previous run of d6_user_picture_file returned for that same input uid. So, if the previous run d6_user_picture_file had an input of '5' and returned '24', then if the uid was 5 in this example, then 24 would be saved in the 'user_picture' field. This helps ensure that your destination site keeps things in sync, in this case the user picture fid for a user.
Here is a link to the api doc for all the process plugins, which may be useful; https://api.drupal.org/api/drupal/namespace/Drupal%21migrate%21Plugin%21...
Comment #15
olarin commentedOK, well apparently make_unique_entity_field is not the correct plugin, then. In this case, it's not a value *in* a field we want to ensure uniqueness of, it's the name of the field itself, which we're just now creating, that we want to ensure uniqueness of. (See, we might have had profile fields of both FirstName and firstname in Drupal 6, because it allowed machine names with capital letters. Drupal 8 doesn't, so we'll run the field names through the machine_name process plugin first, but now we risk duplicates (in this case two fields both named firstname), so we need to do something to make sure each field still gets a unique name (and also truncate them to 32 characters, which was the original point of this issue).)
As for migration_lookup, I did read the documentation and I do understand how it works. The problem is again that it's not a *value* we need to do a lookup for, it's actually the name of the field itself. (Once user_profile_field and user_profile_field_instance are modifying profile field machine names to be valid d8 machine names, d6_profile_values will need to make sure it's saving those values to the new names of the fields, not the old - possibly different and invalid - names.)
Comment #16
mradcliffeI talked with @Olarin about this issue, and we came to the following conclusions:
1. We need to create a new process plugin to guarantee length and uniqueness. MakeUniqueBase does not guarantee length and can output values greater than the provided length.
2. The profile field migration template needs to also have the machine_name plugin process the name to guarantee allowable characters.
3. We need to use the MigrationInterface object from the base class in ProfileFieldValues source plugin because process plugins only transform the value (field value), not the key (field name) and ProfileFieldValues source plugin explicitly sets the key.
Comment #17
mradcliffeSo after writing that and not writing my first instinct which was to inject migration plugin manager, I need to do that, because profile field isn't profile field values migration.
So maybe a trait makes sense to share code from MigrationLookup?
Comment #18
olarin commented@mradcliffe and I worked on this yesterday, and finally got something working. Here's my first attempt at compiling our work into a patch. This is against 8.4, I'll try to roll against 8.5 and 8.6 momentarily.
To review: we wrote a new process plugin for user_profile_field to use (along with the existing machine_name plugin) to ensure that profile field names are munged into something valid for Drupal 8, had user_profile_field_instance look up and use the mapping that user_profile_field created, and modified the source plugin for d6_profile_values to make sure it also looks up those mappings for the field names when it prepares the row of values to be saved.
I found that I also needed (a modified version of) the fix in https://www.drupal.org/project/migrate_upgrade/issues/2823414#comment-11... in order to actually get the values to import in my test setup. (For the record, I'm using migrate-upgrade --configure-only and then using migrate-import for the specific migrations I want, sometimes one at a time, which according to that issue queue might be part of why I'm still experiencing that bug.) I don't know if setting the destination property in the source plugin is a "correct" solution for that problem; if not I'd welcome suggestions in that regard.
Comment #20
olarin commentedHere's an 8.6.x patch, identical except for the migration_templates => migrations folder name change.
Comment #21
olarin commentedAaaand testing the same patch against 8.5.x.
Comment #22
heddnI'm not sure why 'make_unique_entity_field' won't work for this. Can we expound on that?
Based on https://www.drupal.org/core/issue-priority, I'm dropping the priority down to normal since this doesn't seem to effect a wide number of users.
Comment #23
heddnAlso needs work because needs tests.
Comment #24
olarin commentedFrom the documentation for MakeUniqueEntityField:
"Ensures the source value is made unique against an entity field."
Key word here being *value*. It's made to ensure that the values stored in a field are unique, and it in fact requires the name of the field as a configuration key. What we're trying to process in this issue is the *name* of the field (in user_profile_field), not the values we will later put in it (in d6_profile_values).
I'm not sure how else to expound upon it. Unless I'm completely failing to understand the entire situation (in which case I would of course welcome some education), make_unique_entity_field is made for a fundamentally different use case than what we're dealing with here.
(There's also the additional problem that make_unique_entity_field doesn't actually guarantee ouptut length; the length parameter is only used to truncate the input, not to limit the length of the output - the code that ensures uniqueness for make_unique_entity_field can still add an arbitrary number of characters to the output.)
Comment #25
olarin commentedPatch no longer applies against 8.5.x. Attempting to re-roll.
Comment #26
andypostComment #27
quietone commentedI've been watching this from afar and wondering why this has taken a different route than suggested by mikeryan at the beginning and hoping to finally see the problem. Didn't quite get there until tonight. The first thing I noticed is that it needs tests, so I studied the screenshots in #3 and updated the test fixture and wrote a test. The fixture now has two user profile fields, type checkbox, with very long similar names, deliberately chosen to verify the findings reported in the last paragraph of #24
With a failing test I then added what mikeryan suggested to user_profile_field and made the necessary changes to user_profile_field_instance. Then tested and found that setting the length to 32 wasn't guaranteeing the length of the final deduped value. I changed the length to 30 so there was room for the characters used to dedupe. Changing it to 30 means there are 2 characters available for deduping, which means there can be 99 unique values made for one base. Surely that is sufficient for user profile fields.
What are the concerns with this approach?
I can see how that adding characters to a string to dedupe it and as a result make the string longer than the length field provided is unexpected. And that is why the patch in #25 has a new process plugin, that will limit the deduped string to the length given in the configuration. Can someone chime in on the history of MakeUniqueBase and that behaviour? I just don't remember.
Comment #30
mradcliffeThanks for adding some test cases for profile field greater than 32, @quietone.
Let's see how the patch in #20 for 8.6.x fairs with those in this one (without any modification of the test)? It doesn't look like the approach in #27 worked.
I got a bit of the way to writing up an issue summary based on comments/questions asked by @heddn and @quietone and addressed by @Olarin, but I didn't quite get there so I'm adding the Needs issue summary update specifically in outlining the proposed resolutions.
Comment #32
quietone commentedThanks for taking on the IS update, that will be really helpful.
The patch in #27 is failing because the test fixture was changed. That will cause other tests to fail aside from the one I concentrated my effort on, MigrateUserProfileValuesTest.php. I just wanted to get that one correct and then, due to the late hour, I let testbot find the rest overnight.
This patch is based on #27 and now includes fixes for the other tests and includes an new assertion in MigrateUserProfileEntityDisplayTest.php. The changes are mostly to add a migration_lookup on the profile field name in all the profile migrations. Let's see how the tests fair now.
Comment #34
quietone commentedOops, forgot to update the entity counts in MigrateUpgrade6Test to account for the additional user profile field.
Comment #36
quietone commentedAnd missed another entity count
Comment #37
heddnI've reviewed the IS and updated it. I took a look at the code and we are now using make_unique_entity_field, which is what was suggested several times. Tests are added. All looks good.
This only seems to effect D6, so updated tags. And retitled.
Comment #39
mradcliffe@Olarin, can you test the current patch to see if it actually works on your data?
Comment #40
heddnComment #41
quietone commentedAdjusted the entity counts.
Comment #42
olarin commentedIn principle I don't like just hoping that two characters will be enough for the suffix, but in practice I admit that it's highly unlikely anyone will have more than 99 profile field names that begin with the same 30 characters (even after lowercasing). Improving MakeUniqueBase could be a separate, lower priority issue, if anyone cares enough. (I was wary of messing with it initially without knowing the background of where it came from and where else it might be used.)
The other (primary) reason I wrote my own processor, though, was because I did not understand how to use make_unique_entity_field for this use case. All the documentation on it seemed to be solely about using it to process field *values*. For future reference, where should I have been looking to understand that "field_storage_config" was a thing I could pass as an entity_type for make_unique_entity_field?
As for whether the latest patch is actually working on my data -
My test environment was still set up with 8.5.0 at the moment, against which the 8.6.x patch did not apply as is, but that problem appeared to just be limited to the tests - stripping the test modifications for the moment, I was able to get it to apply. When I ran it against the migration I've been working on, the profile fields were correctly generated, however, despite not reporting any failures or skips, d6_profile_values seems to have failed to bring in any actual data for those fields. I will try to investigate further when I get a chance, to narrow down the issue and/or confirm whether it's just user error on my part.
Comment #43
quietone commentedI don't see any changes to the patch in #42 so setting back to NR. And adding tag for manual testing.
@Olarin, I wish I could answer your question about where to look to understand that "field_storage_config" was a thing. I just picked it up along the way as I worked on migrations. There is the configuration api overview which may help.
Comment #44
heddnI disagree that manual testing is needed. We've got tests here and this isn't an area that requires human intervention to easily reproduce.
We have a small chance of issues if someone has more than 99 similarly named fields. But I think we're OK and are easily hitting the 99%+ mark in that most folks won't.
Comment #46
alexpottComment #47
jofitzPatch in #41 no longer applies. Re-rolled.
Comment #48
heddnDiff stats are the same. Back to RTBC.
Comment #50
jofitzUpdate test fixture to avoid duplicate keys.
Comment #51
olarin commentedI don't know why this keeps getting marked RTBC. As I indicated in #42 (and confirmed again just now with the patch in #50), the field *values* are still not getting properly inserted if the field names themselves change (because they had to be shortened and/or lowercased), because it's missing the part from the original patch that @mradcliffe worked on to call into migration_lookup in the d6_profile_field_values source plugin to get the updated field name. The tests aren't catching it because the fixtures have no data for the profile fields that have the extra-long names and the tests are just doing an assertNull on them.
Comment #52
olarin commentedSo here's my first attempt ever at modifying a D8 test. This patch is the same as #50 except that it modifies the d6 test fixture and the MigrateUserProfileValuesTest to demonstrate that D6 profile fields with capital letters or long names fail to migrate their values. If I did it correctly, the next patch I submit, which will include the modifications to the d6_profile_field_values that I mentioned from the earlier patches, should get it passing again.
Comment #53
olarin commentedAnd here's the promised patch, which combines #52 with the aforementioned previous fix to the d6_profile_values migration. This should pass all the tests, and also successfully migrates my actual real-world example data.
Also I did the interdiff for the previous patch wrong because I misread the instructions on interdiffing, so here's a corrected version, along with the interdiff for this one, and another interdiff between this one and #50 just because.
Comment #54
benjifisherI usually put a
skip_on_emptyimmediately after amigration_lookup. If themigration_lookupfails here, then theextractstep will throw a MigrateException, whereasskip_on_emptythrows a MigrateSkipRowException (which does not extend MigrateException). Maybe that does not matter much, but it feels like falling into an error instead of testing for it.I see the same pattern in the
user_profile_entity_displaymigration and the others in the User module.Comment #55
mradcliffeIt's probably a good follow-up. Make a test to see if a simpler user.module migration fails because of that order, and then a follow-up to fix all of them?
Comment #56
quietone commented@Olarin, thanks for the patch, particularly the changes to the test fixture because now I see the problem.
I suspect it would be simpler to accomplish this if the profile values were handled one per row but then that is probably not a good idea when there are a large number of users in combination with many profile fields.
I've not yet reviewed that patch.
Comment #57
quietone commentedThe approach taken in #53 is to calculate the destination profile field name in the source plugin. That is a potential problem because it may not agree with the name as created by the 'user_profile_field_instance' migration. Instead of doing that it would be better to use the migration_lookup process to get the correct profile field name. That too has problems because the migration is unusual in that it doesn't do one row at a time, that is it doesn't process one profile field per user at a time. Instead, it adds all the profile fields for a given user into the same row, assuming that the destination profile field name is the same as the source profile field name. And as Olarin has been saying that isn't true.
In order to use the migration_lookup process plugin additional information, specifically the source field name and the source fid for each profile field, needs to be added to the row. That is done in ProfileFieldValue::prepareRow() by building an associative array with keys for the name and the fid. Now a process plugin (profile_field_name_value) can be used to loop over all the profile fields and use migration_lookup to get the correct destination name and store the value. Obviously, the migration has been updated to include this new process plugin.
This is still a work in progress and if nothing else it needs comments. Since this is a different approach that #53 there is an interdiff against #53 and #50.
Comment #58
maxocub commentedAssinging for review
Comment #59
maxocub commentedLike @benjifisher mentioned in #54, putting the skip_on_empty plugin right after the migration_lookup and before the extract is a far more common pattern in core. No need to open a follow-up for that, this should be done here.
Extra line.
This could use some docs.
Can't this be achieved with existing process plugins? Like sub_process to iterate over each profile field and then do the migration_lookup? Otherwise I guess this plugin will also need a unit test.
Unused variable.
Could we test the actual field value here?
Comment #60
quietone commentedFirst, a reroll
Comment #61
quietone commented59.1. Yes, I missed that
59.2. Fixed
59.3, 4 and 5 New approach and this file has been removed. The work to get the correct destination property name is now done in ProfileValues::getProcess().
6. Values tested now.
Comment #62
maxocub commented@quietone: Thanks, I like the new approach with no new plugin. All my feedback have been addressed, so I think this is ready. I just found one little nit, an unused variable:
Comment #63
quietone commented@maxocub, thanks I like this better too. Sad thing is that I think I looked into that method sometime before and rejected the idea for some reason. Now it seems so obvious that this is the way to fix this.
This removes the unused variable as well as the unnecessary removal of a blank line from core/modeuls/user/src/Plugin/migrate/source/d6/ProfileFieldValues.php so that the patch no longer has changes to that file.
edit: fix grammar
Comment #64
maxocub commentedOups, I didn't saw this the first time around, but I think we should check if
$new_value[1]is set before accessing it.Comment #65
jofitzAdd
isset().Comment #66
maxocub commentedHmm, looking at this again, maybe we should do something if
$new_value[1]is not set. Like in the process pipelines where we usemigration_lookup, we useskip_on_emptyjust after, which log a message ifmigration_lookupdidn't return anything. I guess we could throw a MigrateSkipRowException so we don't skip it silently.Comment #67
jofitzAdded Exception.
Comment #68
maxocub commentedGood for me.
Comment #70
catchCommitted/pushed to 8.6.x and cherry-picked to 8.5.x. Thanks!