Problem/motivation
entity_id/type are required properties for a comment. So there are chances that weird things will start happening in migrations in the future, if we leave those fields empty for stub comments.
What prompted this:
#2318875: Redo CommentStatisticsInterface is going to change the function signature of CommentStatisticsInterface::update(), so Drupal\comment\Entity\Comment::postSave was going to contain this code:
// Update the {comment_entity_statistics} table
\Drupal::service('comment.statistics')->update($this->getCommentedEntity(), array($this->getFieldName()));
...and d6\MigrateCommentTest / MigrateDrupal6Test started failing with:
Argument 1 passed to Drupal\comment\CommentStatistics::update() must implement interface Drupal\Core\Entity\ContentEntityInterface, null given
Proposed resolution
The proposed patch was initially dubbed a 'cheat', because there is no clean way to derive which types of entities have already been imported. This is still true, but the approach has approval from Migrate people and the refined patch should work with > 99.9% of migrations in practice.
(Note: First I thought "let's make a property/method that specifies which destination fields must be filled for a stub entity". But while coding I ran into a wall: I don't know how to cleanly derive the source values connected to those destination fields.)
Comment | File | Size | Author |
---|---|---|---|
#52 | interdiff.txt | 3.23 KB | mikeryan |
#52 | fill_commented_entity-2340401-52.patch | 7.4 KB | mikeryan |
Comments
Comment #1
roderikComment #2
dawehnerNote: At the moment it already breaks in case you have RDF module enabled.
Maybe you can use this in order to provide a test case?
Comment #3
andypostTake a look at Comment::preCreate() we require comment_type, so entity type and field name should create stubs?
Comment #4
dawehnerI think adding a comment_type is fine for now?
In general the patch works fine, though I dont' really have a good idea how to test it. Let's see whether RDF fails.
Comment #6
roderikWe're already adding comment_type (the bundle name) in Drupal\migrate\Plugin\migrate\destination\Entity::getEntity() at the moment, so that's covered.
Only one test failure, it seems the insert 'duplicate key in node table' is caused not by adding rdf, but by the other change made in testComments(). I'm not sure; will just undo the change and let the testbot tell me.
I don't know how to test it 'properly' either. But when this goes in, I can reverse the 2nd code block in #2318875-13: Redo CommentStatisticsInterface, and that will be an implicit test because it will generate a "you are not allowed to pass NULL into \Drupal::service('comment.statistics')->update()" error when the stub comment gets inserted. Would that do, for now?
Comment #7
roderikSummarizing the test failure in #4: @dawehner so moving
from setUp() into testComments(), gives an
Integrity constraint violation: 1062 Duplicate entry '1' for key 'PRIMARY': INSERT INTO {node} "
. I don't know why but I'm not sure it's worth for me to investigate because it doesn't seem to be important for this issue.So, I think your comment/new patch implicitly validates my 'cheat' (a term I used in the issue summary) and this is good for inclusion?
That is, if the note made about an 'implicit test' of this patch in #6 (last paragraph) is OK.
Comment #8
andypostIt looks like quick hack but the cause is comment entity save, will try to dig in
needs doc block
fragile...
Comment #9
andypostmigration should set "comment type" (there's only one for d6&7 core's shipped)
Also comment type entity could not be skipped so because it's THE bundle of comment entity
Comment #10
roderik8.1 added doc block
8.2 correct, it's fragile. This is why I name it a 'cheat' and I will leave it to more experienced migrate-people to judge if this is even allowed. I didn't know a better way than this.
9.
This (setting the bundle) is done in the parent. Drupal\migrate\Plugin\migrate\destination\Entity::getEntity():
(If I did not understand something in #9, please tell me.)
Comment #11
andypostLet's get feedback from migrate pov.
RTBC +1 from me
Comment #12
benjy CreditAttribution: benjy commentedI see where you're coming from with this, the node migration has to run before comment and if there are no nodes there will be no comments so this should always work. One question, Is there any problems if the node id selected is of a bundle that doesn't have a comment field attached?
This isn't really doing anything since it's set as a constant in the migration anyway but I understand it would break if the entity type wasn't node. We should probably pass this stuff in as configuration because it would good if it was possible to make the entity type relevant but we could do that in a follow-up.
Yes, I think that is fine. It's not for Migrate to test Comment's API's and I presume you already have the appropriate test coverage. We could have added unit tests for the comment destination but very few other destinations have coverage so I don't think it would be fair to hold this issue up on that.
Comment #13
andypostComments uses "comment_type" as bundle so #9 explains the case when no bundle set (really bad because we need to find node with fields)
no comment no problem. Comments are displayed only with "comment field formatter" so when comment is bound to node without comment-field it will not shown
Comment #14
benjy CreditAttribution: benjy commentedIn #12, I was referring to the entity type the comment is attached too, nevertheless, I think this issue is ready.
I also ran MigrateCommentTest locally with nothing other than the RDF module enabled and the test does fail. Maybe a comment above $modules explaining why it's there would be good though?
Then, RTBC for me.
Comment #15
roderikOK, I hope this works for a comment.
Re #14 as far as I know a comment does not normally refer back to its commented entity's field configuration, by itself. So indeed, if a comment is attached to an entity without a comment field, that shouldn't be fatal.
Comment #16
benjy CreditAttribution: benjy commentedThat looks OK to me apart from it should have been "RDF" but I'm not gonna hold it up for that.
Comment #17
andypostLet's get @larowlan review
Comment #18
larowlanWhy does it store the entity type as a class property, seems hardwiring would be fine, it's always node.
$this->stubCommentedEntityType = 'node';
Comment #19
larowlanComment #20
andypost@larowlan That's right for d6 but d7 can carry different fields per node type
Comment #21
andypostIn #20 I mean a commentType but
stubCommentedEntityType
is always "node" so should be moved to constants.Once someone will try to migrate some entities to comments for that purpose he will need to hack core or override this
EntityComment
plugin... dx--I'm sure that parent node(entity) migration should run before the comment and when there's no parent at the time of comment import then this row should be simply skipped
I really can't get the reason to import comments that have no parent node imported.
Suppose better to throw exception
s/rdf/Rdf
Comment #22
benjy CreditAttribution: benjy commentedRegarding #18, we may as well remove it entirely, it's set in the migration YAML as a constant anyway.
I did also mention in #12, that we might want to move this to configuration for the destination in a follow-up. That would be better still.
Comment #23
roderik@benjy could you poke at this (extra) approach? I wanted to derive the "node" constant from the migration definition, but I'm not sure if there is a cleaner method than just going through the definitions as I did.
...in #18 vs #12 we have "hardwiring 'node' would be fine" vs "we could add a configuration option in a followup", but in practice, ~99% of migration definitions will define the entity type as a constant (in the source) already. We can refine the current approach by going with that. Points:
Also: throwing exception now, as noted by andypost/#21. With a MigrateSkipRowException you can get loads of error messages instead of one, but it seems the best solution.
Comment #24
roderikFix unnecessary line
Comment #25
andypost@roderik Thanx for working on this! The reasons for "node" makes sense only in context of migration from previous version of core. That means that comments before d8 are "sourced" from nodes only. But destination could be any
Comment #26
benjy CreditAttribution: benjy commentedFew small things and then I think this is looking good.
Can we put all this up on one line?
1 line
Can we inject this?
Comment #27
benjy CreditAttribution: benjy commentedOK, I re-rolled this myself and made the changes I suggested in #26 because we now have a new fatal coming out of the RDF module that the tests didn't catch but this patch fixes.
New test coming soon.
Comment #28
benjy CreditAttribution: benjy commentedFixed QueryFactory. We don't need a new test it was the same issue mentioned by dawehner relating to the RDF module, enabling that module in the test gives us an implicit test.
Comment #30
chx CreditAttribution: chx commentedWhy $this->stubCommentedEntityId and $this->stubCommentedEntityType instead of
$stub_comment_entity_id
. Why do we change the object state instead of just local variables?Comment #31
benjy CreditAttribution: benjy commentedIt looks like this was done to prevent having to recalculate the entity_id for every row, eg, we only need to know one valid id for which we can use in stubbing and that doesn't change.
Comment #32
benjy CreditAttribution: benjy commentedI simplified the code a little. I removed the clever way we were determining the entity_type field and simply just presumed it was at 'constants/entity_type'. I also removed the class property so we just recalculate the value each time, it's practically free anyway.
Comment #33
chx CreditAttribution: chx commentedI rarely push back on migrate patches but this is unacceptable. It flies against the migrate architecture completely: the only way data moves from the source to the destination is via the pipelines described in process. As far as I can see the stub row goes through normal process
$migrate_executable->processRow($stub_row, $process);
so there is no reason for this sort of violation.Comment #34
benjy CreditAttribution: benjy commentedI completely agree with #33, I should have spotted that myself :/
I've removed the fallback to node, I don't think we need that in core, see what the tests say.
Comment #36
chx CreditAttribution: chx commentedLet's put the stub entity handling in a separate method. It only needs $row anyways. Let's use a simple conditionless entity query. Despite the comment says we don't care about which entity id it is and we really don't we still pick the minimal. Let's just do the equivalent of
SELECT nid FROM node LIMIT 1
and then let the database pick whatever nid it wants. I feel like stubCommentedEntityId should be stubCommentedEntityIds an array keyed by entity type. Destination plugins are very generic; just because Drupal 6 happens to only use a single entity type, and node at that, once we get to Drupal 8 sources this will be problematic.Comment #37
benjy CreditAttribution: benjy commentedThanks for the feedback. New patch attached.
I missed this earlier when we discussed but the stub_row only contains the source ids. See the Migration process plugin. Relevant code:
I've left it as $this->getDestinationProperty() so the tests fail and put this back to NW.
Comment #39
chx CreditAttribution: chx commentedAh. That's a bug then. We should do
$stub_row = new Row($values, $source_ids + $migration->get('source'));
so that the constants are in.Comment #40
benjy CreditAttribution: benjy commentedHmm, I thought it was the values and the process array that would need the values. See attached interdiff?
Also, not sure if we have an explicit test for these changes, we do have an implicit test from the EntityComment destination.
Comment #41
chx CreditAttribution: chx commented#2411233: Stub in migration process plugin does not do complete process filed.
Comment #42
benjy CreditAttribution: benjy commentedPostponed on: #2411233: Stub in migration process plugin does not do complete process
Comment #43
benjy CreditAttribution: benjy commentedOK, this is ready to be worked on now #2411233: Stub in migration process plugin does not do complete process is in.
Comment #44
phenaproximaThis affects the Drupal 6 and 7 upgrade path as described in #2558923: Comment migrations produce fatal errors when RDF module is enabled (which is a workaround), which is Migrate critical. Therefore I think this should be Migrate critical as well.
Comment #45
mikeryanRerolling, starting with a fail patch.
Comment #46
mikeryanThis is just a straight reroll of the last patch here, doesn't actually work at the moment.
Comment #48
mikeryanHere we go... A few notes:
Comment #49
mikeryanOops, should have removed this - I switched to MigrateException instead of MigrateSkipRowException, more appropriate in the error case. I'll hold out till I see what other edits folks may have...
Also wanted to point out that, obviously, using RDF to test-by-side-effect is a little hinky. The alternative would be to add a test module implementing hook_comment_storage_load() checking for a valid entity_id - do we want to go there?
Comment #51
phenaproximaApart from these minor quibbles, I think this looks great!
Nit: s/Ids/IDs
Why do we need to check if $id_key is valid? All entity types should have an ID key, so we can always assume that getKey('id') will return something.
IMO, we should throw the MigrateException if $result is empty, rather than repeating the check of $this->stubCommentedEntityIds below.
Not sure about the use of the word "derive" here. Maybe "find" or "query"?
Nit: s/references/calls
Nit: Can we use $row->hasDestinationProperty() here?
Comment #52
mikeryanAll done, except:
Changed to "references a stubbed comment.".
No, we can't, that fails. I think there is actually an empty bundle coming in, which we want to replace...
Comment #53
phenaproximaParty on, Wayne.
Comment #55
webchickOk cool. It was explained to me that the reason we need to do this is comments are dependent on nodes, and so the stub entity needs to be complete. It looks like chx's concerns above were addressed by the latest patch, and this makes it much easier to test the migrations because you don't need to disable RDF module first.
Committed and pushed to 8.0.x. Thanks!