Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
Migrate has the concept of de-duplication, where it looks for things with identical machine names, and creates a copy if there's a match.
This is actually creating duplicates (or avoiding overwrites), but it's not really de-duplication.
Proposed resolution
Document how de-duplication works.
Remaining tasks
User interface changes
API changes
Data model changes
Comment | File | Size | Author |
---|---|---|---|
#67 | deprecatedMigrateModule.patch | 2.79 KB | DrupalMattS |
Comments
Comment #2
alexpottRenameOnClash?
Comment #3
mikeryanWell, it's actually the *IDs* getting deduplicated, not the entities. And entities are not (necessarily) duplicated... If you have an existing entity whose name is 'foobar', and an incoming entity also happens to have the name 'foobar', the presumption is that the incoming entity is a different entity, and to have it co-exist with the existing entity it needs to be renamed (its ID needs to be deduped).
Comment #4
mikeryanLet's just make this issue the child of #2776179: [meta] Add process plugin documentation to the codebase for the dedupe plugins.
Comment #5
quietone CreditAttribution: quietone as a volunteer commentedAnd here is a patch made from copying the DedupeEntity and DedupeBase changes from #2776179: [meta] Add process plugin documentation to the codebase.
Comment #6
heddnAssigning to myself for review this week.
Comment #7
heddndedupe plugin, no?
- 80 characters.
- 'The position at which to start reading.' is what substr uses. I think a variation of that would make more sense.
- 80 characters
- This isn't strictly accurate. The length limits the size of the substr call, not the returned value. The returned value also gets the postfix and the incremental digit added.
This doesn't implode.
Nit: Or property. It doesn't just support fields. Or do we call them the same thing in D8? Maybe we do...
If we are going to include this, then we should also include all the other config available from the base, not just this one.
Comment #8
heddnComment #9
quietone CreditAttribution: quietone as a volunteer commentedthx for the review,
1. Fixed
2 & 3. Changed to use the same text as in substr. It is good enough for that so why not?
4. Fixed.
5. Hmm not sure. No change since the configuration key is 'fields' it seems simpler to just stick with that?
6. Done.
Comment #10
quietone CreditAttribution: quietone as a volunteer commentedAnd uploading the patch helps.
Comment #11
heddnComment #13
MaskyS CreditAttribution: MaskyS at Google Code-In commentedComment #14
alexpottI've updated the issue summary to document the proposed solution. I still think we should consider the naming also I don't think the documentation really explains that this can result in things that feel exactly like duplicates.
Comment #15
catchAgain this is talking about avoiding overwrites not duplication, doesn't really address the issue summary and I really think we should consider a re-name and deprecating the old classes for clarity.
Comment #16
mikeryan@catch: What would we rename it to? What it does is dedupe IDs (or potentially other unique fields), I'm not sure how better to briefly describe that (other than dedupe_entity_id?).
Comment #17
mikeryanuniquify?
Comment #18
quietone CreditAttribution: quietone as a volunteer commentedWow, uniquify. It isn't in the dictionary. While I understand what it means, it is probably not helpful to people who aren't fluent in English.
And this adds some information:
http://english.stackexchange.com/questions/61939/uniquate-eliminate-or-r...
But I tend to the simple and direct. Since this is making any field unique how about 'Disambiguate' or the more wordy
'Make_unique'?
Comment #19
heddnWe discussed in the migrate weekly meeting and we will rename to 'make_unique' and update docs. Assigning to quietone for follow-up.
Comment #20
quietone CreditAttribution: quietone as a volunteer commentedThe name is changed to MakeUnique, documentation updated and the old classes deprecated (though not sure that I've done that correctly).
Comment #21
heddnInstead of deleting, I think we should mark as @deprecated. See #2187735: Add removal information to docblock of all @deprecated functions. And maybe you are already, but can you run the diff using -M?
git diff -M
I think it will more cleanly demonstrate the renames.Comment #22
quietone CreditAttribution: quietone as a volunteer commented@heddn, thanks for the review. Seems I uploaded the wrong patch. Let's try again.
Ad it looks I changed all the instance of dedupe:
Comment #23
heddnA couple of (hopefully) small nits.
Bikeshedding, but maybe MakeEntityUnique or MakeUniqueEntity to clarify this is for entities.
Instead of duplicated, let's state that this makes the entity field value unique. Here and elsewhere, where we use the phrase, "duplicate".
Comment #24
quietone CreditAttribution: quietone as a volunteer commented1. I thought about that. But since the field name that is to be made unique still has to be provided it could be any field and not the entity id. Why assume it is the entity id?
2. Yea, that phrasing works for me.
Comment #25
heddnWhat about 'make_unique_field_value'?
Or (another line of logic entirely) collapse the abstract & concrete classes into a single class. Someone can always extend a concrete class, why have the additional level of complexity if we aren't going to have some distinguishing aspect of MakeUnique. I think we either need to call it either MakeUnqiue or define what we are making unique.
Also, I cannot tell on point #2. Are you agreeing that we should change it or disagreeing and saying the current language is fine. I don't feel strongly enough to push the point. But if we are going to change the name of the thing, then I feel we should utilize language that emphasizes the new name, rather than vestiges of the old name.
Comment #26
quietone CreditAttribution: quietone as a volunteer commentedSince the exists method used is for entities, make_unique_entity_field_value., would be more accurate. But 'value' isn't used in other process plugin names, so maybe 'make_unique_entity_field'?
2. I agree with you.
This patch uses make_unique_entity_field and remove 'deduplicate' from the comments.
Sadly, the interdiff is larger than the patch, but attached for your reading pleasure.
Comment #28
quietone CreditAttribution: quietone as a volunteer commentedFixed typo.
Comment #30
quietone CreditAttribution: quietone as a volunteer commentedSeems to be my day for silly errors....
Comment #31
heddnI think this is good now. All feedback is now addressed.
Comment #33
quietone CreditAttribution: quietone as a volunteer commentedAdding rename to the title since this is also doing a rename.
Comment #34
xjmStraightforward code documentation improvements can always go into any patch release, alpha, beta, or RC, so please always file them against the production branch (currently 8.3.x). Thanks!
Comment #35
catchThis isn't a straightforward docs patch, it's also a rename. Moving back to 8.4.x but also switching the order of the title around to make that clearer.
Comment #37
jofitz CreditAttribution: jofitz at ComputerMinds commentedRe-roll.
Comment #39
quietone CreditAttribution: quietone as a volunteer commentedChanged filename from DedupeEntityTest.php to MakeUniqueEntityTest.php. And get an entity_type.manager instead of entity.query.
Comment #41
quietone CreditAttribution: quietone as a volunteer commentedThat filename change should have been to MakeUniqueEntityFieldTest.php.
No interdiff since the filename change didn't show up in it.
Comment #42
heddnReady again.
Comment #44
jofitz CreditAttribution: jofitz at ComputerMinds commentedRe-rolled.
Comment #45
joelpittetWas RTBC, let's set it back. I compared the two patches and all the changes are related to the recent coding standards for short array syntax patch.
#2776975: March 3, 2017: Convert core to array syntax coding standards for Drupal 8.3.x RC phase
Comment #47
heddnLegitimate failures. Tagging for reroll.
Comment #48
quietone CreditAttribution: quietone as a volunteer commentedDedupeEntity was incorrectly made an abstract class, which in combination to the addition of dedupe_entity in d7_taxonomy_vocabulary.yml, caused the errors.
d7_taxonomy_vocabulary.yml is now using make_unique_entity_field. DedupeEntity is no longer abstract.
Plus, I have restored the original DedupeEntity unit test, which would have caught the error.
Comment #49
heddnI don't see DedupeEntityTest in the patch, only the interdiff.
Comment #50
quietone CreditAttribution: quietone as a volunteer commentedPrevious patches moved DedupeEntityTest to MakeUniqueEntityFieldTest and then modified it, leaving only one test file. The latest patch copeid DedupeEntityTest to a new file MakeUniqueEntityFieldTest, leaving the original untouched and adding the new one.
Comment #51
quietone CreditAttribution: quietone as a volunteer commentedBut, I just realized that the original test needs to have deprecation added to it.
Comment #52
quietone CreditAttribution: quietone as a volunteer commentedNow with deprecation notice on the dedupe test. At least I hope, experiencing a bit of stress with canceled flights and delayed flights.
Comment #53
heddnNot a blocker per se, but I do have question/nit. Can we use sprintf and MakeUniqueEntityFieldBase::class to make this easier to read? Is there a pattern set that should be used for these trigger_error's?
Comment #54
quietone CreditAttribution: quietone as a volunteer commented@heddn, I think I implemented that as specified in the policy, https://www.drupal.org/node/2856615#how-class.
Tidied all the deprecation notices to be 8.4.x.
Comment #55
heddnThanks for the link. Then this is good to go.
Comment #56
phenaproximaCoding standards violation -- the . (concat operator) needs to be surrounded by a space on each side.
Again, the concat operator must have a space on either side of it.
I don't think we need to have a deprecation notice on a test...?
Comment #57
quietone CreditAttribution: quietone as a volunteer commentedFixed items in #56. And removed an unused use statement.
Beware if you copy the trigger error format from Drupal core deprecation policy that you need to correct the spacing.
Comment #58
ohthehugemanatee CreditAttribution: ohthehugemanatee as a volunteer commentedReviewed the changes, they make sense to me. Validated that dedupe is now only mentioned in BC stuff. Made the changes phenaproxima suggested.
Comment #59
ohthehugemanatee CreditAttribution: ohthehugemanatee as a volunteer commentedAh ffs - sorry to duplicate your work @quietone! The patches are effectively identical, except you caught another deprecation notice missing space, and the use statement. My review of the patch therefore stands... and we should use quietone's version. Please mark the issue as RTBC once tests come back green.
Comment #60
ohthehugemanatee CreditAttribution: ohthehugemanatee as a volunteer commentedComment #61
ohthehugemanatee CreditAttribution: ohthehugemanatee as a volunteer commentedI can't seem to show @quietone's patch. But that's the one we should accept. So, er... just look at that comment 57.
Comment #64
catchFixed these on commit:
Committed/pushed to 8.4.x and cherry-picked to 8.3.x, thanks!
Comment #65
quietone CreditAttribution: quietone as a volunteer commented@catch, thanks.
I did fix those errors in patch #57. But then I noticed there are 2 patches named 2824610-57.patch, one in comment #57 and one in #58. And only the later one appears in the file listing.
Comment #66
catchSomehow I managed to commit this from Code Needs Review yesterday (and missed the #58 patch with the coding style fixes). Since it was already RTBC a couple of comments before, and the changes were the same as I ended up making on commit anyway, going to leave status as is. Just wanted to leave a comment in case anyone gets confused. This might be my first ever direct from CNR -> Fixed commit, will try to make it the last one too.
Comment #67
DrupalMattS CreditAttribution: DrupalMattS commentedAdded @see records for:
/core/modules/migrate/src/Plugin/process/DedupeBase.php
/core/modules/migrate/src/Plugin/process/DedupeEntity.php
/core/modules/migrate/src/Plugin/process/MakeUniqueBase.php
/core/modules/migrate/src/Plugin/process/MakeUniqueEntityField.php
Comment #68
joelpittet@DrupalMattS If that is something we do (which I've yet to see, but doesn't mean we don't), it would be best to create a follow-up issue because this one is fixed. Click 'clone issue' in the top right of this page.
Comment #69
Mile23Probably un-deprecating DedupeEntityTest here, because tests can't really be deprecated and should just be replaced: #2488860: Bring phpunit bridge into drupal and use it for unit tests and simpletest to handle Deprecation