Problem/Motivation
The current way of associating a revision with a workspace has a couple of problems:
- the association history is lost when a workspace is published to Live, because we delete all entries from the workspace_association
tables
- when we introduce workspace hierarchies, operations on the workspace_association
entities will be either 1) very costly, because we have to update all the associations of a workspace as well as its descendants, or 2) bypass the Entity API completely, which makes the current implementation completely unnecessary overhead :)
Proposed resolution
- track the workspace in which a revision was created in a (revision metadata) base field
- convert the workspace_association
entity type to a custom index table
Remaining tasks
Discuss, agree on the proposed resolution, write upgrade path and tests.
User interface changes
Nope.
API changes
The workspace_association
will be a service in the container instead of an entity type.
Data model changes
All entity types supported by Workspaces will have an additional revision metadata (reference) field: workspace
Release notes snippet
TBD.
Comment | File | Size | Author |
---|---|---|---|
#54 | 3062434-54.patch | 99.81 KB | amateescu |
#53 | interdiff-53.txt | 953 bytes | amateescu |
#53 | 3062434-53.patch | 91.95 KB | amateescu |
#46 | interdiff-46.txt | 4.4 KB | amateescu |
#46 | 3062434-46.patch | 99.82 KB | amateescu |
Comments
Comment #2
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedHere's an initial patch to kick off the review process.
Some tests still need to be updated, and it needs an upgrade path and tests for it.
Comment #3
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedA small fix and moving to 8.8.x for now.
Comment #4
pmelab CreditAttribution: pmelab at Amazee Labs commentedFixed remaining tests.
Patch #3 only deleted the currently indexed (latest) revision for a workspace. But I suppose it should delete all revisions? I changed that in
Workspace::postDelete()
.Comment #5
pmelab CreditAttribution: pmelab at Amazee Labs commentedComment #6
pmelab CreditAttribution: pmelab at Amazee Labs commentedRe-rolled the broken patch. Discovered dorgflow. My life is a lot easier now.
Comment #7
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThe interdiff from #6 looks great, thanks @pmelab!
I did a few cosmetic fixes and a more substantial one:
We're not deleting the workspace field here, just the revision metadata key :) And it's still needed, so I brought back this code.
Comment #9
pmelab CreditAttribution: pmelab at Amazee Labs commentedI tried to get hold of the last failing test. But I can't tell what's wrong. It seems like the
EntityDefinitionUpdateManager
tries to remove the workspace field twice, and this results in a fatal error when uninstalling the module.Comment #10
pmelab CreditAttribution: pmelab at Amazee Labs commentedI finally was able to track down the problem:
Removing the revision metadata key in
workspaces_uninstall
causes the entity updated manager to thinkworkspace
is a not-revisionable field and tries to remove it from the regular data table, which in return causes the fatal error.My current approach to solve this is to remove the revision metadata key in
system_modules_uninstalled
instead. I was able to verify this manually, but unfortunately not in a test case since I was not able to get updated entity information. I'm happy for any suggestions!Comment #11
yogeshmpawarComment #12
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedI looked into the failures from #7 as well and I can confirm the problem found by @pmelab, apparently there is no way for modules to clean up the revision metadata keys that they are adding to other entity types. This will need a separate core issue to be discussed and fixed properly, but until then I think the workaround of doing it in system module is our only choice :/
Cleaned up the uninstall code a bit and I think we need to test only the definitions from the entity definition update manager, because that's what the storage and the table mapping are using to determine the table of a field.
Comment #13
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedI discussed this issue with @catch a few days ago and he was +1 on the general direction, so here's an update path and tests for it.
While writing this upgrade path it turned out we're blocked by #3056816: Installing a new field storage definition during a fieldable entity type update is not possible because we want to install the new
workspace
revision metadata field for the entity types that have been converted to revisionable in 8.7.0, and it's not possible without that fix.Comment #14
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedHm.. the patches above can't apply because of
core/modules/content_moderation/tests/fixtures/update/drupal-8.4.0-content_moderation_installed.php
, which contains serialized entity type definitions and git considers it a binary file.Here's a new set of patches generated with
--binary
, and also attaching just that file generated with--text
for easier reviewing :)Comment #16
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFiled #3067024: Add test coverage for uninstalling revisionable entity types whose code doesn't exist anymore for the failures of the combined patch from #14, so we have two dependencies now :/ Here's a new patch which includes both of them.
Comment #17
blazey CreditAttribution: blazey at Amazee Labs commentedWorkspaces cause
EntityStorageBase::loadMultiple
to return entities that weren't requested, i.e. their ids were not on the list, see #3068733: EntityStorageBase::loadMultiple returns unwanted entities when the static cache is warm. TheDrupal\workspaces\EntityOperations::entityPreload
is refactored in this issue but the problem is still there, so adding the test here as well.Comment #18
blazey CreditAttribution: blazey at Amazee Labs commentedRe-roll
Comment #19
blazey CreditAttribution: blazey at Amazee Labs commentedGit is hard :D
Comment #21
blazey CreditAttribution: blazey at Amazee Labs commentedOk, it looks like #3068733: EntityStorageBase::loadMultiple returns unwanted entities when the static cache is warm can be fixed in
EntityStorageBase::loadMultiple
and no changes toDrupal\workspaces\EntityOperations::entityPreload
are required. In this case these issues can probably be dispatched independently, so removing the files added after #16.Comment #22
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedIt seems that #3067024: Add test coverage for uninstalling revisionable entity types whose code doesn't exist anymore is not going to be easy, so let's drop that dependency and uninstall the
workspace_association
in a different way.We have only one blocker issue left: #3056816: Installing a new field storage definition during a fieldable entity type update is not possible.
Comment #23
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedForgot to roll the patches with
--binary
.Comment #24
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedYay, #3056816: Installing a new field storage definition during a fieldable entity type update is not possible was committed, so the non-combined patch from #23 is ready for final reviews!
Comment #26
Sam152 CreditAttribution: Sam152 at PreviousNext commentedPosting an initial review, leaving as NR to encourage others to take a look.
Should there be a @todo and a follow-up to create an API for modules to update schema after they are uninstalled? Maybe also needs to explain why hook_uninstall wasn't used?
Part of the motivation for this issue was to side-step entity API when managing workspaces, but doesn't the workspace field on each entity revision reintroduce that problem to some degree?
Should this be using the installed definition? Possibly causes issues on a site between deploy => updb or sites or where updates are stuck?
The docblock doesn't document NULL as a possible value.
Why would some code use the index and some code use the revision metadata field? As a broad question, why do both have to exist? Is it possible to answer that question through the implementation somehow?
With 7.1 being the new minimum supported version, I wonder if new interfaces can start making use of features like scalar type hints?
Is removing 'live' a blocker for this?
I'm surprised this works TBH, I've seen these updates blow up when you combine an 8.6 module with 8.0+ updates.
Why did this test case change?
How come workspaces have a large ID than
EntityTypeInterface::ID_MAX_LENGTH
?How come this is required?
Comment #27
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThanks @Sam152 for the review, great questions!
workspace_association_revision
table.The new
workspace_association
index table contains only the _latest_ associated revisions (just like the base table of the previousworkspace_association
entity type) and it's needed as a performance optimization. We tried to make the whole system use only the data available in the workspace reference field, but it turned out that it needed an extra join for each (parent) workspace in the hierarchy, per the feature proposed in #3062486: Add the ability to create sub-workspaces in order to enable the dev -> stage -> live workflow for content.Live
workspace, the workspace reference needs to beNULL
for default revisions.workspace_association_revision
, which tracked *all* associated revisions, and now we are querying theworkspace_association
service/index, which tracks only the latest associated revisions.EntityTypeInterface::ID_MAX_LENGTH
specifies the max length of the entity type ID (e.g.node
,taxonomy_term
), and that column stores workspace entity IDs (e.g.stage
).Comment #29
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented#3056816: Installing a new field storage definition during a fieldable entity type update is not possible was committed again so #27 should be green and waiting for more reviews :)
Comment #30
Sam152 CreditAttribution: Sam152 at PreviousNext commentedNice, changes and explanations look fine to me. Leaving NR for other reviews.
Comment #31
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedOpened an issue for #10 / #12 and linked it in a @todo.
This patch will fail because #3056816: Installing a new field storage definition during a fieldable entity type update is not possible has been reverted.. again :/
Comment #33
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented#3056816: Installing a new field storage definition during a fieldable entity type update is not possible is in again, hopefully for good this time. So we're ready for final reviews here :)
Comment #34
Leksat CreditAttribution: Leksat at Amazee Labs commentedI checked the code and did some manual testing (including the update process). Could not find any issues with the patch. Looks very good to me!
Comment #35
catchThe revision metadata field is a nice way to keep history and seems an OK trade-off having to maintain a custom index table for that.
entity entity
This could use a bit more documentation - i.e. that it's an index table and the canonical associations are in the revision metadata or similar.
I didn't realise we were doing this, but do we really want to have this revision purging logic in core? Is this a specific purge operation or does it happen whenever a workspace is deleted? Also how can a default revision be part of a workspace and why would we ever delete that here?
Comment #36
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThanks for the reviews!
@catch, re #35:
That purging happens every time a workspace is deleted, and I think it's needed in core because otherwise we wouldn't be able to handle the case when a workspace is deleted and then another one is created with the same ID/machine name. I'm happy to discuss alternatives in a separate issue though :)
As for default revisions being part of a workspace, I think that code is quite old and it was probably needed in some initial version of the contrib module which was tracking revisions for the Live workspace. However, since we no longer do that, we can definitely clean it up a bit, very good catch!
Comment #37
catchGot interrupted during the last review, here's a few more things.
Oof, but following the follow-up and don't think it should block this.
entity entity is still here (or maybe another instance?)
Minor, but there's no join here so why alias?
This looks like it's missing an ->accessCheck(FALSE)
Same question with the alias and no join. Or really why not just a static query in these two places since there's nothing dynamic about the query? But then asking that table name is a constant which makes a static query harder, but does it really need to be a constant? The thing that made me think about this in the first place was using the constant, then having to use a literal string for the alias immediately afterwards which slightly undermines having a constant.
Should be 'associated entities'.
This still says workspace_association entities. Use 'records' instead?
Since this is owned by the service, would it make sense to do the 'lazy table creation' logic that key/value and database cache backends do? We've slowly been getting rid of core usages of hook_schema().
When a revision is made 'live'/default, is it definitely never associated with a workspace at that point (except possibly the now-defunct 'live' workspace?) - if so this might be OK, but if not then deleting previously-default revisions feels really destructive. I realise this code is already there but it'd be good to open a spin-off issue to discuss more yeah.
(also patch isn't applying).
Comment #38
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedRe #37:
\Drupal\Core\Database\Query\SelectInterface::fields()
requires a$table_alias
as its first argument. Every select query needs to set an alias like this even if they don't need to join another table :)entries
, notentities
, but I agree thatrecords
is better :)Note that in #36 I removed the code which could delete default revisions, so we are only deleting pending deletions now. Also opened #3077202: Revaluate the deletion of pending revisions when a workspace is deleted to discuss this further.
The patch wasn't applying because I forgot to roll it with
--binary
:)Comment #39
catchThis brings up another question for me which is perhaps more relevant to the patch.
When we only stored workspace_association entities, and delete a workspace when it's published, the workspace association entities get deleted too. If you later delete the workspace itself, there are no associations, so the revisions won't get deleted.
But now, the workspace associations remain after the workspace is published, so if you delete an old, published workspace, then those revisions (which are the 'source' revisions for published content, not abandoned drafts) will get deleted. Or if that's not a correct summary what exactly does happen now?
Comment #40
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedVery good question! And you're right that with this patch, the "old" revisions that belong to a workspace that has been published and then deleted will be deleted as well.
I see two (quick) options:
- add another
workspace_association_revision
index table, which would store all the workspace-specific revisions that haven't been published yet- remove the code which deletes old revisions and deal with the "delete than recreate the same workspce ID" in #3077202: Revaluate the deletion of pending revisions when a workspace is deleted.
Comment #41
catchI think I prefer option #2. It should be possible to implement a retrospective revision purge logic if we eventually want to, but we can't bring data back. Marking needs work for that.
Comment #42
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedI started doing that but I found that there's quite a bit of code, tests and UI assumptions/constraints around the fact that we delete pending revisions and that it shouldn't be possible to create a workspace with the same ID as a deleted one, so I think it's safer for now to choose another approach: only delete pending revisions with an ID that's higher than the default revision ID. This way, we can never delete 'source' revisions for published content, but only pending 'draft' revisions.
This approach is possible at the moment because the
EntityWorkspaceConflict
constraint prevents editing an entity in any other workspace (or Live) if it has a pending draft revision in one workspace. In other words: if an entity has been edited in a workspace, it can only get to a default revision state if that workspace is published. This assumption will probably change when we get the conflict resolution API and the ability to merge workspaces, but in the meantime we can discuss all this in depth in #3077202: Revaluate the deletion of pending revisions when a workspace is deleted.Here's a test-only patch which shows the problem described in #39 with the patch from #38, and one that implements the proposal from this comment.
Comment #43
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedSigh, forgot
--binary
again :/Comment #45
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedRerolled after #2935780: Remove the concept of a 'live' default workspace.
Comment #46
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFinally understood what @catch meant in #37.3 :)
Comment #47
dixon_All points from previous reviews have been addressed. And to me the patch and the interdiff looks good as well!
I'm not too opinionated on how/if we delete revisions when a workspace is deleted, so I'm happy to discuss other solutions. But I like the current approach in the patch of only deleting the forward revisions when the workspace itself is deleted. It feels like a clean solution aligned with what I think most users would expect when deleting a workspace.
Comment #49
catchDeleting just the forward revisions seems reasonable here and we can discuss further changes in the follow-up.
Committed 17bd9ce and pushed to 8.8.x. Thanks!
Comment #50
mikelutzLooks like this broke PostGre on HEAD. https://www.drupal.org/pift-ci-job/1410286
Comment #52
catchReverted for now. Will kick off postgres and sqlite test runs here.
Comment #53
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThis should fix it :)
Comment #54
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedSigh.. forgot
--binary
again.Comment #55
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedAll green, back to RTBC :)
Comment #57
catchCommitted 6caa28c and pushed to 8.8.x. Thanks!
Comment #59
xjmThis probably needs at least a CR.
Comment #60
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedAdded a CR: https://www.drupal.org/node/3096868
Comment #61
jibranI was wondering should we enforce
max_length
here as workflow ID has->setSetting('max_length', 128)
but the question is can we enforce lenght on ER field target_id field?Comment #63
hchonovWe limit the length on the target_id property to 255 by default or to 32 when referencing a bundle entity type. Instead of hardcoding the value we could retrieve it from the target type ID definition. This will be not workspaces specific then. Could you please open an issue for this?
Comment #64
andypostThis hook should be moved to .module file so this condition is useless, faced in #3118155-20: Improve hook_module_preinstall & hook_module_preuninstall to include recommendations for config import and pass an $is_syncing argument