Problem/Motivation

When using Group module together with Content Moderation and when adding moderated content (e.g. a node) to a group before Content Moderation's hook_entity_insert() has run, the following exception is thrown:

PDOException: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'node-1-1-editorial-en' for key 'content_moderation_state__lookup'

This is because Group module (re-)saves the entity when it is being added to the group (see #2872697: Stop saving an entity when it gets added to or removed from a group for some background/discussion on that). That "inner" save then causes the content moderation state entity to be created, so that it already exists when content_moderation_entity_insert() is triggered for the original save. Content Moderation then tries to create a new content moderation state entity which causes the problem.

Content Moderation already tries to be smart and load any existing content moderation state for the given entity, but that fails because the loaded revision ID is not set at this point, so it looks for a database record with the content_entity_revision_id set to NULL but there is a record with content_entity_revision_id set to (e.g.) 1. Since it cannot find what it is looking for it creates a new entity to INSERT (instead of UPDATE) which then triggers the exception.

See \Drupal\content_moderation\Entity\ContentModerationState::loadFromModeratedEntity().

Steps to reproduce

  • Add contrib Group module to codebase: composer require drupal/group
  • Apply patch from #2872697-41: Stop saving an entity when it gets added to or removed from a group to Group module: curl https://www.drupal.org/files/issues/2020-11-06/2872697-41.patch | git apply
  • Add a custom module that:
    1. Has a lower weight than Content Moderation (which has 0)
    2. Does $group->addContent() in hook_node_insert()
  • Install Content Moderation and Group: drush en content_moderation group
  • Enable Content Moderation for a given node type
  • Add a node of that type

Expected result: Node is created
Actual result: SQL Exception (see above)

Proposed resolution

When attempting to load the existing content moderation state entity when there is no loaded revision ID, but there is a revision ID, use the revision ID instead.

The ("not loaded") revision ID is what will be set on the content moderation state entity anyway, so it is what causes the duplicate key exception, so it seems appropriate to check for that.

Remaining tasks

Review the patch.

User interface changes

None.

API changes

None.

Data model changes

None.

Release notes snippet

-

Comments

tstoeckler created an issue. See original summary.

tstoeckler’s picture

Issue summary: View changes
Status: Active » Needs review
StatusFileSize
new5.99 KB
new7.38 KB

Here's a test for this. I tried to come up with the simplest example of a hook implementation that would trigger this. In the example of Group module described in the issue summary, instead of cloning the entity it is actually being reloaded (via accessing the ->entity property of an entity reference field). So while the cloning might seem a bit contrived here, the larger issue still stands. The recording of the state is not strictly necessary, it's just there to avoid a false positive, in case the test hook implementation would not get called for some reason as the test would then still pass (without the check for the state value).

Also including a patch with the fix. I'm not too much of an expert on Content Moderation, so would be glad to have some input on the fix, but looking at surrounding code, I am fairly convinced that this is the correct way to fix it. Also, the inline comment could maybe be improved, I had difficulty explaining the situation properly without basically pasting the issue summary there.

The last submitted patch, 2: 3181439-2--tests-only.patch, failed testing. View results

larowlan’s picture

Issue tags: +Bug Smash Initiative

Could this be a use case for isSyncing on a content entity?

E.g. content moderation ignores this operation if it is syncing, and then group sets it to syncing before it does the resave?

The patch looks good to me, but asking about an alternate approach.

tstoeckler’s picture

Issue summary: View changes
StatusFileSize
new4.05 KB
new7.04 KB
new1.59 KB
new6.61 KB
new8.42 KB

It's a very good question, unfortunately the answer is that no, that will not help.

In fact, the issue only occurs if the entity is set to syncing in the re-save. Group module "as is" does not set the entity to syncing when re-saving it, but I had #2872697-41: Stop saving an entity when it gets added to or removed from a group - that does exactly that - applied when finding this bug. I had forgotten about this detail, so thanks for bringing it up! Noting it in the issue summary, as well.

In the test module hook implementation in the patch - which is basically the simplified approximation of what Group module does - I'm already setting the entity to syncing. And the "test-only" patch does in fact fail without it, I had just assumed as much before but checked now that you brought it up.

It's all a bit confusing but it does (somewhat) make sense why that is the case: When the entity is not set to syncing, the re-save in fact triggers the creation of a new revision from Content Moderation. So the initial content moderation state entity gets added with a target revision ID of 2, because in this scenario the Content Moderation update hook for the resave is run before the insert hook for the initial save. When the latter is then run, it is not able to find the content moderation state that was just created so it attemps to create a new one, just like in the scenario described in the issue summary. Here, though, it attempts to create it with a target revision ID of 1, which does not cause the duplicate key exception so no error is thrown.

However, as I am writing this, I realize that means that without the setSyncing() call, while the save (and re-save) does not trigger an exception, it in fact causes two content moderation state entities to be created. I verified this locally by commenting out the setSyncing() part of the patch.

So as part of the test I think we should be asserting that only a single content moderation state entity gets created in the process. This would make the test fail without the setSyncing() call, so it avoids a false positive if there are any changes in the future.

So here's an updated patch that does that. I also included a second tests-only patch (and interdiff) to prove that the setSyncing() is needed now to make the test pass.

Thanks again, really on point!

The last submitted patch, 5: 3181439-5--tests-only.patch, failed testing. View results

The last submitted patch, 5: 3181439-5--tests-only--no-syncing.patch, failed testing. View results

sam152’s picture

Great explanation of the problem, thanks!

+++ b/core/modules/content_moderation/tests/modules/content_moderation_test_resave/content_moderation_test_resave.module
@@ -0,0 +1,30 @@
+    // Saving the passed entity object would populate its loaded revision ID,
+    // which we want to avoid. Thus, save a clone of the original object.
+    $cloned_entity = clone $entity;

Before reading the patch, I was wondering why saving a revision didn't in fact hydrate the loaded revision ID, but it seems it does.

Does the fact the entity needs to be cloned here indicate that this may be the location of the bug and when cloned entities are saved, they should have similar semantics to the original and CM just happens to be a module that exposes this issue? If this is the case, could the test case instead be asserting that saving a clone has the same getLoadedRevisionId behaviour as saving the original?

tstoeckler’s picture

Thanks @Sam152!

Before reading the patch, I was wondering why saving a revision didn't in fact hydrate the loaded revision ID, but it seems it does.

Well it does, however, that happens after hook_entity_insert is called. See \Drupal\Core\Entity\ContentEntityStorageBase::doPostSave() for details.

If this is the case, could the test case instead be asserting that saving a clone has the same getLoadedRevisionId behaviour as saving the original?

I'm not sure I fully understand the question and/or assertion. Saving the clone in principle does have the same the same behavior as saving the original. The issue here is with the in memory state being different which exposes this bug, in particular the loaded revision ID being 0 (in the original entity) while there is already a content moderation state (from the re-save).

I'll try to explain the flow of things a bit more in detail at the risk of being overly verbose.
Here is what happens just with Content Moderation without any re-saving:

  1. The node with its initial revision is inserted, receiving the revision ID 1 from the DB. The revision ID field is populated with this value but not yet the loaded revision ID.
  2. hook_entity_insert() is invoked on the node
  3. content_moderation_entity_insert() is called which ends up calling EntityOperations::updateOrCreateFromEntity().
  4. ContentModerationStateEntity::loadFromModeratedEntity($entity) correctly cannot find a content moderation state for the new node, so EntityOperations::updateOrCreateFromEntity() will create a new one. (Note that the fact that we add a condition on content_entity_revision_id = NULL (while weird and arguably incorrect) is not a problem here, because there is no content moderation state for the new node anyway, regardless of the target revision ID.)
  5. The new content moderation state entity is created using the revision ID field of the target entity so it has a target revision ID of 1.
  6. After the hook has been invoked the loaded revision is updated, which means it is set to 1.

Here is the scenario in the test with the re-saving of a clone (but without the fix) with the differences to the scenario above in bold:

  1. The node with its initial revision is inserted, receiving the revision ID 1 from the DB. The revision ID field is populated with this value but not yet the loaded revision ID.
  2. hook_entity_insert() is invoked on the node
  3. content_moderation_test_resave_entity_insert() is called which clones the node, marks it as syncing and re-saves it which means the following happens:
    1. The node with its initial revision is re-saved. Because the entity is marked as syncing Content Moderation will not force a new revision to be created (see \Drupal\content_moderation\Entity\Handler\ModerationHandler::onPresave()). The revision ID is still 1 and the loaded revision ID is still not set.
    2. hook_entity_update() is invoked on the node
    3. content_moderation_entity_update() is called which ends up calling EntityOperations::updateOrCreateFromEntity().
    4. ContentModerationStateEntity::loadFromModeratedEntity($entity) correctly cannot find a content moderation state for the new node, so EntityOperations::updateOrCreateFromEntity() will create a new one. (Note that the fact that we add a condition on content_entity_revision_id = NULL (while weird and arguably incorrect) is not a problemm here, because there is no content moderation state for the new node anyway, regardless of the target revision ID.)
    5. The new content moderation state entity is created using the revision ID field of the target entity so it has a target revision ID of 1.
    6. After the hook has been invoked the loaded revision is updated, which means it is set to 1. Because a clone was saved the loaded revision ID of the original entity is still not set.
  4. content_moderation_entity_insert() is called which ends up calling EntityOperations::updateOrCreateFromEntity().
  5. ContentModerationStateEntity::loadFromModeratedEntity($entity) incorrectly cannot find a content moderation state for the new node, so EntityOperations::updateOrCreateFromEntity() will create a new one. It cannot find one because of the condition on content_entity_revision_id = NULL while the actual state as a target revision of 1.
  6. The new content moderation state entity is created using the revision ID field of the target entity so it has a target revision ID of 1. This causes the SQL exception.

This would be what happens without the cloning with the differences to the scenario with cloning in bold:

  1. The node with its initial revision is inserted, receiving the revision ID 1 from the DB. The revision ID field is populated with this value but not yet the loaded revision ID.
  2. hook_entity_insert() is invoked on the node
  3. content_moderation_test_resave_entity_insert() is called which clones the node, marks it as syncing and re-saves it which means the following happens:
    1. The node with its initial revision is re-saved. Because the entity is marked as syncing Content Moderation will not force a new revision to be created (see \Drupal\content_moderation\Entity\Handler\ModerationHandler::onPresave()). The revision ID is still 1 and the loaded revision ID is still not set.
    2. hook_entity_update() is invoked on the node
    3. content_moderation_entity_update() is called which ends up calling EntityOperations::updateOrCreateFromEntity().
    4. ContentModerationStateEntity::loadFromModeratedEntity($entity) correctly cannot find a content moderation state for the new node, so \Drupal\content_moderation\EntityOperations::updateOrCreateFromEntity() will create a new one. (Note that the fact that we add a condition on content_entity_revision_id = NULL (while weird and arguably incorrect) is not a problem here, because there is no content moderation state for the new node anyway, regardless of the target revision ID.)
    5. The new content moderation state entity is created using the revision ID field of the target entity so it has a target revision ID of 1.
    6. After the hook has been invoked the loaded revision is updated, which means it is set to 1. Because it is the same object, the loaded revision ID of the original entity is now also 1.
  4. content_moderation_entity_insert() is called which ends up calling \Drupal\content_moderation\EntityOperations::updateOrCreateFromEntity().
  5. ContentModerationStateEntity::loadFromModeratedEntity($entity) correctly finds the existing content moderation state for the new node, so \Drupal\content_moderation\EntityOperations::updateOrCreateFromEntity() will update it and not create a new one. It finds it because of the condition on content_entity_revision_id = 1
  6. The new content moderation state entity is created using the revision ID field of the target entity so it has a target revision ID of 1. This causes the SQL exception.

So yes, the cloning is necessary to expose this bug, but it's not a problem with the cloning itself. As noted above, you could just as well do a re-load of the entity to get a separate entity object, but that would get into the whole topic of when the static cache is set and unset during the save process, so I think just cloning is clearer here.

Hope I haven't hopelessly confused everyone now with that wall of text!

sam152’s picture

Interesting, so to summarise, you could say that CM incorrectly assumes the internal memory state of an entity during the invocation of the presave hook? And additionally, it does not make sense to re-hydrate that in-memory state when reloading or unserializing an entity?

My only feedback on the patch is that the tests depend on the internal entity, instead of verifying the public APIs behave as expected. Maybe it's worth the additional coverage though.

tstoeckler’s picture

Interesting, so to summarise, you could say that CM incorrectly assumes the internal memory state of an entity during the invocation of the presave hook?

Hmm..., not 100% sure how that is meant, I can read it two different ways. I would summarize as follows: Content Moderation assumes that new entities do not already have a content moderation state when hook_insert() is called, but - while it is certainly the most "usual"/"common" case - that is not necessarily true.

And additionally, it does not make sense to re-hydrate that in-memory state when reloading or unserializing an entity?

Sorry, not sure if I missed some part of the context, but I don't get what (re-)hydration has to do with this. Do you mean the loaded revision ID? No that state is being managed totally fine, I don't think anything needs to change when (re-)loading or unserializing an entity. Maybe you can elaborate.

My only feedback on the patch is that the tests depend on the internal entity, instead of verifying the public APIs behave as expected. Maybe it's worth the additional coverage though.

Totally fine with removing the more explicit assertions in the test if that is the consensus, maybe a committer can weigh in on that as part of a sign-off (crossing my fingers... ;-).

sam152’s picture

Okay, understood. I think based on your very thorough explanation and analysis, I'm comfortable with this and despite this particular bit of CM being a low level and risky component to modify, it solves a real problem with the groups module.

I've queued up a test against 9.x, which I suspect will fail because of the test modules info file, but if we can get it green I can RTBC.

sam152’s picture

Version: 8.9.x-dev » 9.2.x-dev

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

tomefa’s picture

StatusFileSize
new8.45 KB

Reroll patch for 9.2.5

tstoeckler’s picture

StatusFileSize
new719 bytes
new8.44 KB

Thanks @Sam152 for following up with this, unfortunately I had already lost track of this when you replied. And nice catch with the info file, fixed that now so hopefully we can get this RTBC now (per #12).

And thanks @Tomefa for the re-roll and for the implicit reminder about this issue ;-)

gauravvvv’s picture

StatusFileSize
new8.45 KB
new2.28 KB

Re-rolled patch #16, attached interdiff for same.

Status: Needs review » Needs work

The last submitted patch, 17: 3181439-17.patch, failed testing. View results

tstoeckler’s picture

Status: Needs work » Needs review
StatusFileSize
new1.23 KB
new8.51 KB

Thanks @Gauravmahlawat for that! I should have checked the errors before, sorry, I had just blindly assumed that it had something to do with the info file.

Fixed the deprecations now. And this time I actually ran the test locally to verify that everything is fine and dandy, so hoping this will be the last one ;-)

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

smustgrave’s picture

Triggering for 10.1

smustgrave’s picture

Status: Needs review » Reviewed & tested by the community

Reviewing the patch

Everything looks good but will ask the committers is there a test module that could be leveraged or for this use case is it best to have it's own?

gauravvvv’s picture

Updated attributions

  • catch committed dd8f782c on 10.0.x
    Issue #3181439 by tstoeckler, Gauravvv, Tomefa, Sam152, smustgrave,...

  • catch committed 775b3cff on 10.1.x
    Issue #3181439 by tstoeckler, Gauravvv, Tomefa, Sam152, smustgrave,...

  • catch committed b327c94f on 9.5.x
    Issue #3181439 by tstoeckler, Gauravvv, Tomefa, Sam152, smustgrave,...
catch’s picture

Version: 10.1.x-dev » 9.5.x-dev
Status: Reviewed & tested by the community » Fixed

I wondered about the separate test module too but it makes sense:

- we already have two one-off test modules in content_moderation
- we wouldn't want this running on every content_moderation test that uses a generic test module,and if we added a state key to enable/disable that adds complexity too

Committed/pushed to 10.1.x, cherry-picked to 10.0.x and 9.5.x, thanks!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.