Problem/Motivation
When using Group module together with Content Moderation and when adding moderated content (e.g. a node) to a group before Content Moderation's hook_entity_insert() has run, the following exception is thrown:
PDOException: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'node-1-1-editorial-en' for key 'content_moderation_state__lookup'
This is because Group module (re-)saves the entity when it is being added to the group (see #2872697: Stop saving an entity when it gets added to or removed from a group for some background/discussion on that). That "inner" save then causes the content moderation state entity to be created, so that it already exists when content_moderation_entity_insert() is triggered for the original save. Content Moderation then tries to create a new content moderation state entity which causes the problem.
Content Moderation already tries to be smart and load any existing content moderation state for the given entity, but that fails because the loaded revision ID is not set at this point, so it looks for a database record with the content_entity_revision_id set to NULL but there is a record with content_entity_revision_id set to (e.g.) 1. Since it cannot find what it is looking for it creates a new entity to INSERT (instead of UPDATE) which then triggers the exception.
See \Drupal\content_moderation\Entity\ContentModerationState::loadFromModeratedEntity().
Steps to reproduce
- Add contrib Group module to codebase:
composer require drupal/group - Apply patch from #2872697-41: Stop saving an entity when it gets added to or removed from a group to Group module:
curl https://www.drupal.org/files/issues/2020-11-06/2872697-41.patch | git apply - Add a custom module that:
- Has a lower weight than Content Moderation (which has 0)
- Does
$group->addContent()inhook_node_insert()
- Install Content Moderation and Group:
drush en content_moderation group - Enable Content Moderation for a given node type
- Add a node of that type
Expected result: Node is created
Actual result: SQL Exception (see above)
Proposed resolution
When attempting to load the existing content moderation state entity when there is no loaded revision ID, but there is a revision ID, use the revision ID instead.
The ("not loaded") revision ID is what will be set on the content moderation state entity anyway, so it is what causes the duplicate key exception, so it seems appropriate to check for that.
Remaining tasks
Review the patch.
User interface changes
None.
API changes
None.
Data model changes
None.
Release notes snippet
-
| Comment | File | Size | Author |
|---|---|---|---|
| #19 | 3181439-19.patch | 8.51 KB | tstoeckler |
| #5 | 3181439-5--tests-only--no-syncing.patch | 6.61 KB | tstoeckler |
| #5 | 3181439-5--tests-only.patch | 7.04 KB | tstoeckler |
Comments
Comment #2
tstoecklerHere's a test for this. I tried to come up with the simplest example of a hook implementation that would trigger this. In the example of Group module described in the issue summary, instead of cloning the entity it is actually being reloaded (via accessing the
->entityproperty of an entity reference field). So while the cloning might seem a bit contrived here, the larger issue still stands. The recording of the state is not strictly necessary, it's just there to avoid a false positive, in case the test hook implementation would not get called for some reason as the test would then still pass (without the check for the state value).Also including a patch with the fix. I'm not too much of an expert on Content Moderation, so would be glad to have some input on the fix, but looking at surrounding code, I am fairly convinced that this is the correct way to fix it. Also, the inline comment could maybe be improved, I had difficulty explaining the situation properly without basically pasting the issue summary there.
Comment #4
larowlanCould this be a use case for isSyncing on a content entity?
E.g. content moderation ignores this operation if it is syncing, and then group sets it to syncing before it does the resave?
The patch looks good to me, but asking about an alternate approach.
Comment #5
tstoecklerIt's a very good question, unfortunately the answer is that no, that will not help.
In fact, the issue only occurs if the entity is set to syncing in the re-save. Group module "as is" does not set the entity to syncing when re-saving it, but I had #2872697-41: Stop saving an entity when it gets added to or removed from a group - that does exactly that - applied when finding this bug. I had forgotten about this detail, so thanks for bringing it up! Noting it in the issue summary, as well.
In the test module hook implementation in the patch - which is basically the simplified approximation of what Group module does - I'm already setting the entity to syncing. And the "test-only" patch does in fact fail without it, I had just assumed as much before but checked now that you brought it up.
It's all a bit confusing but it does (somewhat) make sense why that is the case: When the entity is not set to syncing, the re-save in fact triggers the creation of a new revision from Content Moderation. So the initial content moderation state entity gets added with a target revision ID of 2, because in this scenario the Content Moderation update hook for the resave is run before the insert hook for the initial save. When the latter is then run, it is not able to find the content moderation state that was just created so it attemps to create a new one, just like in the scenario described in the issue summary. Here, though, it attempts to create it with a target revision ID of 1, which does not cause the duplicate key exception so no error is thrown.
However, as I am writing this, I realize that means that without the
setSyncing()call, while the save (and re-save) does not trigger an exception, it in fact causes two content moderation state entities to be created. I verified this locally by commenting out thesetSyncing()part of the patch.So as part of the test I think we should be asserting that only a single content moderation state entity gets created in the process. This would make the test fail without the
setSyncing()call, so it avoids a false positive if there are any changes in the future.So here's an updated patch that does that. I also included a second tests-only patch (and interdiff) to prove that the
setSyncing()is needed now to make the test pass.Thanks again, really on point!
Comment #8
sam152 commentedGreat explanation of the problem, thanks!
Before reading the patch, I was wondering why saving a revision didn't in fact hydrate the loaded revision ID, but it seems it does.
Does the fact the entity needs to be cloned here indicate that this may be the location of the bug and when cloned entities are saved, they should have similar semantics to the original and CM just happens to be a module that exposes this issue? If this is the case, could the test case instead be asserting that saving a clone has the same
getLoadedRevisionIdbehaviour as saving the original?Comment #9
tstoecklerThanks @Sam152!
Well it does, however, that happens after
hook_entity_insertis called. See\Drupal\Core\Entity\ContentEntityStorageBase::doPostSave()for details.I'm not sure I fully understand the question and/or assertion. Saving the clone in principle does have the same the same behavior as saving the original. The issue here is with the in memory state being different which exposes this bug, in particular the loaded revision ID being 0 (in the original entity) while there is already a content moderation state (from the re-save).
I'll try to explain the flow of things a bit more in detail at the risk of being overly verbose.
Here is what happens just with Content Moderation without any re-saving:
hook_entity_insert()is invoked on the nodecontent_moderation_entity_insert()is called which ends up callingEntityOperations::updateOrCreateFromEntity().ContentModerationStateEntity::loadFromModeratedEntity($entity)correctly cannot find a content moderation state for the new node, soEntityOperations::updateOrCreateFromEntity()will create a new one. (Note that the fact that we add a condition oncontent_entity_revision_id = NULL(while weird and arguably incorrect) is not a problem here, because there is no content moderation state for the new node anyway, regardless of the target revision ID.)Here is the scenario in the test with the re-saving of a clone (but without the fix) with the differences to the scenario above in bold:
hook_entity_insert()is invoked on the nodecontent_moderation_test_resave_entity_insert()is called which clones the node, marks it as syncing and re-saves it which means the following happens:\Drupal\content_moderation\Entity\Handler\ModerationHandler::onPresave()). The revision ID is still 1 and the loaded revision ID is still not set.hook_entity_update()is invoked on the nodecontent_moderation_entity_update()is called which ends up callingEntityOperations::updateOrCreateFromEntity().ContentModerationStateEntity::loadFromModeratedEntity($entity)correctly cannot find a content moderation state for the new node, soEntityOperations::updateOrCreateFromEntity()will create a new one. (Note that the fact that we add a condition oncontent_entity_revision_id = NULL(while weird and arguably incorrect) is not a problemm here, because there is no content moderation state for the new node anyway, regardless of the target revision ID.)content_moderation_entity_insert()is called which ends up callingEntityOperations::updateOrCreateFromEntity().ContentModerationStateEntity::loadFromModeratedEntity($entity)incorrectly cannot find a content moderation state for the new node, soEntityOperations::updateOrCreateFromEntity()will create a new one. It cannot find one because of the condition oncontent_entity_revision_id = NULLwhile the actual state as a target revision of 1.This would be what happens without the cloning with the differences to the scenario with cloning in bold:
hook_entity_insert()is invoked on the nodecontent_moderation_test_resave_entity_insert()is called which clones the node, marks it as syncing and re-saves it which means the following happens:\Drupal\content_moderation\Entity\Handler\ModerationHandler::onPresave()). The revision ID is still 1 and the loaded revision ID is still not set.hook_entity_update()is invoked on the nodecontent_moderation_entity_update()is called which ends up callingEntityOperations::updateOrCreateFromEntity().ContentModerationStateEntity::loadFromModeratedEntity($entity)correctly cannot find a content moderation state for the new node, so\Drupal\content_moderation\EntityOperations::updateOrCreateFromEntity()will create a new one. (Note that the fact that we add a condition oncontent_entity_revision_id = NULL(while weird and arguably incorrect) is not a problem here, because there is no content moderation state for the new node anyway, regardless of the target revision ID.)content_moderation_entity_insert()is called which ends up calling\Drupal\content_moderation\EntityOperations::updateOrCreateFromEntity().ContentModerationStateEntity::loadFromModeratedEntity($entity)correctly finds the existing content moderation state for the new node, so\Drupal\content_moderation\EntityOperations::updateOrCreateFromEntity()will update it and not create a new one. It finds it because of the condition oncontent_entity_revision_id = 1So yes, the cloning is necessary to expose this bug, but it's not a problem with the cloning itself. As noted above, you could just as well do a re-load of the entity to get a separate entity object, but that would get into the whole topic of when the static cache is set and unset during the save process, so I think just cloning is clearer here.
Hope I haven't hopelessly confused everyone now with that wall of text!
Comment #10
sam152 commentedInteresting, so to summarise, you could say that CM incorrectly assumes the internal memory state of an entity during the invocation of the presave hook? And additionally, it does not make sense to re-hydrate that in-memory state when reloading or unserializing an entity?
My only feedback on the patch is that the tests depend on the internal entity, instead of verifying the public APIs behave as expected. Maybe it's worth the additional coverage though.
Comment #11
tstoecklerHmm..., not 100% sure how that is meant, I can read it two different ways. I would summarize as follows: Content Moderation assumes that new entities do not already have a content moderation state when
hook_insert()is called, but - while it is certainly the most "usual"/"common" case - that is not necessarily true.Sorry, not sure if I missed some part of the context, but I don't get what (re-)hydration has to do with this. Do you mean the loaded revision ID? No that state is being managed totally fine, I don't think anything needs to change when (re-)loading or unserializing an entity. Maybe you can elaborate.
Totally fine with removing the more explicit assertions in the test if that is the consensus, maybe a committer can weigh in on that as part of a sign-off (crossing my fingers... ;-).
Comment #12
sam152 commentedOkay, understood. I think based on your very thorough explanation and analysis, I'm comfortable with this and despite this particular bit of CM being a low level and risky component to modify, it solves a real problem with the groups module.
I've queued up a test against 9.x, which I suspect will fail because of the test modules info file, but if we can get it green I can RTBC.
Comment #13
sam152 commentedComment #15
tomefa commentedReroll patch for 9.2.5
Comment #16
tstoecklerThanks @Sam152 for following up with this, unfortunately I had already lost track of this when you replied. And nice catch with the info file, fixed that now so hopefully we can get this RTBC now (per #12).
And thanks @Tomefa for the re-roll and for the implicit reminder about this issue ;-)
Comment #17
gauravvvv commentedRe-rolled patch #16, attached interdiff for same.
Comment #19
tstoecklerThanks @Gauravmahlawat for that! I should have checked the errors before, sorry, I had just blindly assumed that it had something to do with the info file.
Fixed the deprecations now. And this time I actually ran the test locally to verify that everything is fine and dandy, so hoping this will be the last one ;-)
Comment #23
smustgrave commentedTriggering for 10.1
Comment #24
smustgrave commentedReviewing the patch
Everything looks good but will ask the committers is there a test module that could be leveraged or for this use case is it best to have it's own?
Comment #25
gauravvvv commentedUpdated attributions
Comment #29
catchI wondered about the separate test module too but it makes sense:
- we already have two one-off test modules in content_moderation
- we wouldn't want this running on every content_moderation test that uses a generic test module,and if we added a state key to enable/disable that adds complexity too
Committed/pushed to 10.1.x, cherry-picked to 10.0.x and 9.5.x, thanks!